Commit Graph

421 Commits

Author SHA1 Message Date
Serge Matveenko
d51aa48a76
Limit Pydantic to V1 in dependencies (#24596)
* Limit Pydantic to V1 in dependencies

Pydantic is about to release V2 release which will break a lot of things. This change prevents `transformers` to be used with Pydantic V2 to avoid breaking things.

* more

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-01 00:04:03 +02:00
Yih-Dar
299aafe55f
Use protobuf 4 (#24599)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-30 20:56:55 +02:00
Yih-Dar
11cb6e0f7e
Unpin DeepSpeed and require DS >= 0.9.3 (#24541)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 14:01:22 +02:00
Yih-Dar
e84bf1f734
⚠️ Time to say goodbye to py37 (#24091)
* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 07:22:39 +02:00
Matt
8e164c5400
Improved keras imports (#24448)
* An end to accursed version-specific imports

* No more K.is_keras_tensor() either

* Update dependency tables

* Use a cleaner call context function getter

* Add a cap to <2.14

* Add cap to examples requirements too
2023-06-23 19:09:34 +01:00
Sylvain Gugger
26a2ec56d7
Clean up old Accelerate checks (#24279)
* Clean up old Accelerate checks

* Put back imports
2023-06-14 12:44:09 -04:00
Sylvain Gugger
8c5f306719
Update the pin on Accelerate (#24110) 2023-06-08 10:11:01 -04:00
Sylvain Gugger
ba695c1efd
v4.31.0.dev0 2023-06-07 16:49:00 -04:00
Zachary Mueller
5eb3d3c702
Up pinned accelerate version (#24089)
* Min accelerate

* Also min version

* Min accelerate

* Also min version

* To different minor version

* Empty
2023-06-07 16:21:51 -04:00
Sylvain Gugger
9193188276
Pin rhoknp (#23937) 2023-06-01 10:25:43 -04:00
Zachary Mueller
55451c66ce
Upgrade safetensors version (#23911)
* Upgrade safetensors

* Second table
2023-05-31 11:30:39 -04:00
Sanchit Gandhi
8f915c450d
Unpin numba (#23162)
* fix for ragged list

* unpin numba

* make style

* np.object -> object

* propagate changes to tokenizer as well

* np.long -> "long"

* revert tokenization changes

* check with tokenization changes

* list/tuple logic

* catch numpy

* catch else case

* clean up

* up

* better check

* trigger ci

* Empty commit to trigger CI
2023-05-31 14:59:30 +01:00
Nicolas Patry
9e8d7066e6
Making safetensors a core dependency. (#23254)
* Making `safetensors` a core dependency.

To be merged later, I'm creating the PR so we can try it out.

* Update setup.py

* Remove duplicates.

* Even more redundant.
2023-05-23 15:16:34 +02:00
Sylvain Gugger
9cf4a8b456
Build with non Python files (#23405)
* Add a test of the built release

* Polish everything

* Trigger CI
2023-05-16 14:23:10 -04:00
Yih-Dar
a3975f94f3
Only add files with modification outside doc blocks (#23327)
* min. version for pytest

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-05-12 16:35:15 +02:00
Sylvain Gugger
786b9cf5ca
Style 2023-05-11 14:40:38 -04:00
Lysandre Debut
71b19ee251
Agents extras (#23301)
* Agents extras

* Add to docs
2023-05-11 14:25:51 -04:00
José Ángel Rey Liñares
0c65fb7cfa
chore: allow protobuf 3.20.3 requirement (#22759)
* chore: allow protobuf 3.20.3

Allow latest bugfix release for protobuf (3.20.3)

* chore: update auto-generated dependency table

update auto-generated dependency table

* run in subprocess

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-10 20:22:56 +02:00
Sylvain Gugger
a0c0a78233
v4.30.0.dev0 2023-05-09 14:59:38 -04:00
Sylvain Gugger
94056b57be
New version of Accelerate for the Trainer (#23204) 2023-05-08 09:47:08 -04:00
Sylvain Gugger
3341bb41cd
Pin urllib3 2023-05-04 12:00:22 -04:00
Sylvain Gugger
4b6aecb48e
Pin numba for now (#23118) 2023-05-02 22:02:39 -04:00
amyeroberts
e5f3487190
Pin flax & optax version (#22895)
* Pin optax version

* Pin flax too

* Fixup
2023-04-20 17:30:14 +01:00
Zachary Mueller
aec10d162f
Update accelerate version + warning check fix (#22833) 2023-04-18 12:51:32 -04:00
Zachary Mueller
03462875cc
Introduce PartialState as the device handler in the Trainer (#22752)
* Use accelerate for device management

* Add accelerate to setup


Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-17 15:09:45 -04:00
Sylvain Gugger
888c4a2ae0
v4.29.0.dev0 2023-04-12 20:04:29 -04:00
Sylvain Gugger
6db23af50c
Revert migration of setup to pyproject.toml (#22658) 2023-04-07 15:08:44 -04:00
Nicolas Patry
1670be4bde
Adding Llama FastTokenizer support. (#22264)
* Adding Llama FastTokenizer support.

- Requires https://github.com/huggingface/tokenizers/pull/1183 version
- Only support byte_fallback for llama, raise otherwise (safety net).
- Lots of questions are special tokens

How to test:

```python

from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

if False:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids

    assert encoded == encoded2, f"{encoded} != {encoded2}"

    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)

    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```

The converter + some test script.

The test script.

Tmp save.

Adding Fast tokenizer + tests.

Adding the tokenization tests.

Correct combination.

Small fix.

Fixing tests.

Fixing with latest update.

Rebased.

fix copies + normalized added tokens  + copies.

Adding doc.

TMP.

Doc + split files.

Doc.

Versions + try import.

Fix Camembert + warnings -> Error.

Fix by ArthurZucker.

Not a decorator.

* Fixing comments.

* Adding more to docstring.

* Doc rewriting.
2023-04-06 09:53:03 +02:00
Xuehai Pan
4169dc84bf
[setup] migrate setup script to pyproject.toml (#22539)
* [setup] migrate setup script to `pyproject.toml`

* [setup] cleanup configurations

* remove unused imports
2023-04-03 14:03:41 -04:00
Xuehai Pan
80d1319e1b
[setup] drop deprecated distutils usage (#22531)
* [setup] drop deprecated `distutils` usage

* drop deprecated `distutils.util.strtobool` usage

* fix import order

* reformat docstring by `doc-builder`
2023-04-03 12:04:24 -04:00
Sylvain Gugger
2194943a34
Pin ruff (#22455) 2023-03-29 14:07:06 -04:00
Sylvain Gugger
4c295a265b
Update release instructions (#22454) 2023-03-29 14:05:42 -04:00
Joao Gante
88dae78f4d
TensorFlow: pin maximum version to 2.12 (#22364) 2023-03-24 18:45:03 +00:00
Sylvain Gugger
6587125c0a
Pin tensorflow-text to go with tensorflow (#22362)
* Pin tensorflow-text to go with tensorflow

* Make it more convenient to pin TensorFlow

* setup don't like f-strings
2023-03-24 10:54:06 -04:00
Stas Bekman
89a0a9eace
[deepspeed] offload + non-cpuadam optimizer exception doc (#22044)
* [deepspeed] offload + non-cpuadam optimizer exception doc

* deps
2023-03-21 17:00:05 -07:00
Ali Hassani
5990743fdd
Correct NATTEN function signatures and force new version (#22298) 2023-03-21 17:21:34 -04:00
Yih-Dar
67c2dbdb54
Time to Say Goodbye, torch 1.7 and 1.8 (#22291)
* time to say goodbye, torch 1.7 and 1.8

* clean up torch_int_div

* clean up is_torch_less_than_1_8-9

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-21 19:22:01 +01:00
Ali Hassani
3028b20a71
Fix natten (#22229)
* Add kernel size to NATTEN's QK arguments.

The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional
argument to the QK operation to allow optional RPBs.

This ends up failing NATTEN tests.

This commit adds NATTEN back to circleci and adds the arguments to get
it working again.

* Force NATTEN >= 0.14.5
2023-03-17 11:07:55 -04:00
Sylvain Gugger
ebdb185bef
v4.28.0.dev0 2023-03-14 13:49:10 -04:00
amyeroberts
3412f5979d
Use PyAV instead of Decord in examples (#21572)
* Use PyAV instead of Decord

* Get frame indices

* Fix number of frames

* Update src/transformers/models/videomae/image_processing_videomae.py

* Fix up

* Fix copies

* Update timesformer doctests

* Update docstrings
2023-03-02 12:30:38 +00:00
Sylvain Gugger
6b0257de42
Sort deps alphabetically 2023-02-16 13:27:27 -05:00
Stas Bekman
68eff4036d
Update setup.py (#21584)
* Update setup.py

* suggestions
2023-02-13 10:12:14 -08:00
Sylvain Gugger
1efe9c0b24
Fix inclusion of non py files in package (#21546)
* Fix inclusion of non py files in package

* No need for the **
2023-02-09 14:15:10 -05:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting (#21480)
* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies
2023-02-06 18:10:56 -05:00
NielsRogge
5451f8896c
Add DETA (#20983)
* First draft

* Add initial draft of conversion script

* Convert all weights

* Fix config

* Add image processor

* Fix DetaImageProcessor

* Run make fix copies

* Remove timm dependency

* Fix dummy objects

* Improve loss function

* Remove conv_encoder attribute

* Update conversion scripts

* Improve postprocessing + docs

* Fix copied from statements

* Add tests

* Improve postprocessing

* Improve postprocessing

* Update READMEs

* More improvements

* Fix rebase

* Add is_torchvision_available

* Add torchvision dependency

* Fix typo and README

* Fix bug

* Add copied from

* Fix style

* Apply suggestions

* Fix thanks to @ydshieh

* Fix another dependency check

* Simplify image processor

* Add scipy

* Improve code

* Add threshold argument

* Fix bug

* Set default threshold

* Improve integration test

* Add another integration test

* Update setup.py

* Address review

* Improve deformable attention function

* Improve copied from

* Use relative imports

* Address review

* Replace assertions

* Address review

* Update dummies

* Remove dummies

* Address comments, update READMEs

* Remove custom kernel code

* Add image processor tests

* Add requires_backends

* Add minor comment

* Update scripts

* Update organization name

* Fix defaults, add doc tests

* Add id2label for object 365

* Fix tests

* Update task guide
2023-01-31 10:43:10 +01:00
Sylvain Gugger
6eb3c66a96
Add cPython files in build (#21372) 2023-01-30 11:19:30 -05:00
Sylvain Gugger
7119bb052a
v4.27.0.dev0 2023-01-23 16:52:35 -05:00
Sylvain Gugger
05e72aa0c4
Adapt repository creation to latest hf_hub (#21158)
* Adapt repository creation to latest hf_hub

* Update all examples

* Fix other tests, add Flax examples

* Address review comments
2023-01-18 11:14:00 -05:00
Hao Wang
375801d5e6
update pyknp to rhoknp (#20890)
* update pyknp to rhoknp

* fix linter

* fix linter

* fix linter

* fix linter

* fix linter

* support rhoknp==1.1.0, fix testcase
2022-12-31 01:22:26 -05:00
Yih-Dar
7032e02032
Install sentencepiece in DeepSpeed CI image (#20795)
* Install sentencepiece in DS CI image

* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-16 18:23:46 +01:00