10237 Commits

Author SHA1 Message Date
Joao Gante
ec6cd7633f
TF: Add missing cast to GPT-J (#18201)
* Fix TF GPT-J tests

* add try/finally block
2022-07-19 15:58:42 +01:00
Yih-Dar
05ed569c79
Use next-gen CircleCI convenience images (#18197)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-19 15:43:05 +02:00
flozi00
9f12ec7d87
Typo in readme (#18195)
2022-07-19 15:28:37 +02:00
Sylvain Gugger
dc9147ff36
Custom pipeline (#18079)
* Initial work

* More work

* Add tests for custom pipelines on the Hub

* Protect import

* Make the test work for TF as well

* Last PyTorch specific bit

* Add documentation

* Style

* Title in toc

* Bad names!

* Update docs/source/en/add_new_pipeline.mdx

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Auto stash before merge of "custom_pipeline" and "origin/custom_pipeline"

* Address review comments

* Address more review comments

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-07-19 12:02:35 +02:00
Patrick von Platen
3bb6356d4d
[From pretrained] Allow download from subfolder inside model repo (#18184)
* add first generation tutorial

* [from_pretrained] Allow loading models from subfolders

* remove gen file

* add doc strings

* allow download from subfolder

* add tests

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply comments

* correct doc string

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-19 11:53:53 +02:00
Snehan Kekre
ce0152819d
Update docs README with instructions on locally previewing docs (#18196)
* Update docs README with instructions on locally previewing docs

* Add instructions to install `watchdog` before previewing the docs
2022-07-19 11:47:26 +02:00
orgoro
798384467b
bugfix: div-->dim (#18135)
2022-07-19 10:24:56 +02:00
Sylvain Gugger
e630dad555
Add vision example to README (#18194)
2022-07-19 09:46:18 +02:00
Duong A. Nguyen
4bea6584e3
Remove use_auth_token from the from_config method (#18192)
* remove use_auth_token from from_config

* restore use_auth_token from_pretrained run_t5_mlm_flax
2022-07-19 08:13:20 +02:00
Sylvain Gugger
29fd471556
Use smaller variant of BLOOM for doc to fix tests
2022-07-18 15:17:29 -04:00
Sourab Mangrulkar
bc8e30bab9
FSDP integration enhancements and fixes (#18134)
* FSDP integration enhancements and fixes

* resolving comments

* fsdp fp16 mixed precision requires `ShardedGradScaler`
2022-07-19 00:02:10 +05:30
Nicola Procopio
8e445ca51d
Translation/training: italian translation training.mdx (#17662)
* added training.mdx

* updated training.mdx

* updated training.mdx

* updated training.mdx

* updated _toctree.yml

* fixed typos after review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-18 19:21:07 +02:00
Younes Belkada
6a1b1bf7a6
BLOOM minor fixes small test (#18175)
* minor fixes

- add correct revision
- corrected docstring for test
- removed a test

* contrib credits

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
2022-07-18 19:18:19 +02:00
Nicola Procopio
c4cc894086
Translation italian: multilingual.mdx (#17768)
* added multilingual.mdx

* updated multilingual.mdx

* italian translation multilingual.mdx

* updated _toctree.yml

* fixed typos _toctree.yml

* fixed typos after review

* fixed error after review
2022-07-18 19:09:08 +02:00
Nicola Procopio
0a5b61d004
Added preprocessing.mdx italian translation (#17600)
* updated _toctree.yml

* added preprocessing

* updated preprocessing.mdx

* updated preprocessing.mdx

updated after review
2022-07-18 19:06:10 +02:00
SaulLu
ced1f1f5db
fix typo inside bloom documentation (#18187)
2022-07-18 17:43:52 +02:00
Sylvain Gugger
edadfc58af
Better default for offload_state_dict in from_pretrained (#18183)
2022-07-18 16:02:41 +02:00
Sylvain Gugger
aeeab1ffd0
Fix template for new models in README (#18182)
2022-07-18 16:01:51 +02:00
Ayan Sengupta
45255814a2
FIX: Typo (#18156)
2022-07-18 15:46:08 +02:00
Yih-Dar
6561fbcc6e
Update TF(Vision)EncoderDecoderModel PT/TF equivalence tests (#18073)
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-18 15:29:14 +02:00
Yih-Dar
cb19c2afdc
Fix expected loss values in some (m)T5 tests (#18177)
* fix expected loss values

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-18 15:26:21 +02:00
Wang, Yi
7417f3acb7
[HPO] update to sigopt new experiment api (#18147)
* [HPO] update to sigopt new experiment api
* follow https://docs.sigopt.com/experiments

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* [HPO] use new API if sigopt version >= 8.0.0

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-07-18 15:19:40 +02:00
gcheron
8c14b342aa
add ONNX support for LeVit (#18154)
Co-authored-by: Guilhem Chéron <guilhemc@authentifier.com>
2022-07-18 15:17:07 +02:00
Lysandre Debut
c1c79b0655
NLLB tokenizer (#18126)
* NLLB tokenizer

* Apply suggestions from code review - Thanks Stefan!

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* Final touches

* Style :)

* Update docs/source/en/model_doc/nllb.mdx

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* PR reviews

* Auto models

Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-18 08:12:34 -04:00
John Giorgi
a4f97e6ce0
Fix incorrect type hint for lang (#18161)
2022-07-18 09:53:18 +02:00
John Giorgi
c46d39f390
Fix check for falsey inputs in run_summarization (#18155)
2022-07-18 09:50:32 +02:00
Nicolas Patry
ccc0897804
Adding support for device_map directly in pipeline(..) function. (#17902)
* Adding support for `device_map` directly in `pipeline(..)` function.

* Updating the docstring.

* Adding a better docstring

* Put back type hints.

* Blacked. (`make fixup` didn't work ??!!)
2022-07-15 15:54:26 +02:00
Nicolas Patry
fca66ec4ef
Fixing a hard to trigger bug for text-generation pipeline. (#18131)
* Fixing a bug where attention mask was not passed to generate.

* Fixing zero-size prompts.

* Comment on top.
2022-07-15 15:54:07 +02:00
amyeroberts
8581a798c0
Add TF DeiT implementation (#17806)
* Initial TF DeiT implementation

* Fix copies naming issues

* Fix up + docs

* Properly same main layer

* Name layers properly

* Fixup

* Fix import

* Fix import

* Fix import

* Fix weight loading for tests whilst not on hub

* Add doc tests and remove to_2tuple

* Add back to_2tuple
Removing to_2tuple results in many downstream changes needed because of the copies checks

* Incorporate updates in Improve vision models #17731 PR

* Don't hard code num_channels

* Copy PyTorch DeiT embeddings and remove pytorch operations with mask

* Fix patch embeddings & tidy up

* Update PixelShuffle to move logic into class layer

* Update doc strings - remove PT references

* Use NHWC format in internal layers

* Fix up

* Use linear activation layer

* Remove unused import

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Move dataclass to top of file

* Remove from_pt now weights on hub

* Fixup

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-13 18:04:08 +01:00
Wei
7ea6ccc2b3
Enable torchdynamo with torch_tensorrt(fx path) (#17765)
* enable fx2trt

* Update perf_train_gpu_one.mdx

* Update perf_train_gpu_one.mdx

* add lib check

* update

* format

* update

* fix import check

* fix isort

* improve doc

* refactor ctx manager

* fix isort

* black format

* isort fix

* fix format

* update args

* update black

* cleanups

* Update perf_train_gpu_one.mdx

* code refactor

* code refactor to init

* remove redundancy

* isort

* replace self.args with args

Co-authored-by: Stas Bekman <stas@stason.org>
2022-07-13 12:43:28 -04:00
Sylvain Gugger
37aeb5787a
Make sharded checkpoints work in offline mode (#18125)
* Make sharded checkpoints work in offline mode

* Add test
2022-07-13 12:43:08 -04:00
Sylvain Gugger
0a21a48564
Revert "Make sharded checkpoints work in offline mode"
This reverts commit 3564c65786.
2022-07-13 10:53:25 -04:00
Sylvain Gugger
3564c65786
Make sharded checkpoints work in offline mode
2022-07-13 10:51:56 -04:00
lmagne
56e6487c40
add dataset split and config to model-index in TrainingSummary.from_trainer (#18064)
* added metadata to training summary

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-13 16:07:20 +02:00
John Giorgi
fde22c75a1
Add summarization name mapping for MultiNews (#18117)
* Add summarization name mapping for MultiNews
2022-07-13 08:19:20 -04:00
Sebastian Sosa
195133363e
supported python versions reference (#18116)
* supported python versions reference

* Update CONTRIBUTING.md

removing commit hash from link

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-13 08:18:44 -04:00
Joao Gante
20509ab0e0
TF: unpack_inputs decorator independent from main_input_name (#18110)
2022-07-13 10:43:41 +01:00
Joao Gante
fcefa200b2
TF: remove graph mode distinction when processing boolean options (#18102)
2022-07-12 19:05:31 +01:00
Niklas Muennighoff
bc34c21191
Fix BLOOM dtype (#17995)
* Add fp16 option

* Fix BLOOM dtype

* Formatting

* Remove torch_dtype arg

* Revert formatting

* Apply formatting

* Add n_embed backward compat
2022-07-12 10:36:08 -04:00
Joao Gante
981714efe1
CLI: reenable pt_to_tf test (#18108)
2022-07-12 13:38:05 +01:00
wei zhao
f5221c06e4
Report value for a step instead of epoch. (#18095)
* Report value for a step instead of epoch.

Report an objective function value to optuna for a step instead of an epoch.
I made this modification for the following reason:
If "eval_steps" is less than the number of steps per epoch, there may be warnings like this: "optuna/trial/_trial.py:592: UserWarning: The reported value is ignored because this `step` 0 is already reported.". So "step" is more appropriate than "epoch" here.

* MOD: make style.

Co-authored-by: zhaowei01 <zhaowei01@yuanfudao.com>
2022-07-12 08:18:35 -04:00
Sijun He
d4ebd4e112
speed up test (#18106)
2022-07-12 04:28:28 -04:00
jianan-gu
b7d8bd378c
Enhance IPEX integration in Trainer (#18072)
* enhance ipex import

* refine codes

* refine style

* add link

* style

Co-authored-by: Stas Bekman <stas@stason.org>
2022-07-11 21:34:09 -07:00
Younes Belkada
a462fc9232
Bloom Optimize operations (#17866)
* fix tolerance for a bloom slow test

* enhance alibi padding

- get rid of for loops
- deals better with padded batched input
- avoid useless cpu/gpu communication when creating alibi

Co-authored-by: justheuristic <justheuristic@gmail.com>

* optimize attention mask

* fix scaled softmax limit values

* optimize building alibi tensor

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fix attention_mask shape when it's None

* minor fixes

- fix docstring + arg names

* remove colons in docstring

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* apply suggestion

* remove unused arg

* refactor a bit

- use [:, None] for consistency

* refactor attention block

Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>

* quick fixes

* first attempt

* refactor attention block and fix all tests except "test_simple_generation"

- added comments to better explain attention block

* remove debug lines and add TODO comment

* change `torch.bmm` to `torch.baddbmm`
- fixes `test_simple_generation` but breaks `test_batch_generation_padd`

* styling

* all tests are passing now
- use `bmm`
- add explanation for `allow_fp16_reduced_precision_reduction`

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* styling

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fix support for accelerate

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove attn softmax in fp32

* refactor comments

* refactor a bit

- remove warning message
- remove print on test

* refer to pytorch t5

* change the slow tests

- do the tests in fp32
- remove some comments
- keep large comments

* update expected output for `test_simple_generation`
- we now test using fp32

* make style + change comments a bit

* fix dtype padd test

Co-authored-by: justheuristic <justheuristic@gmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-11 13:16:13 -04:00
Sylvain Gugger
5ff6f853d7
Mark slow test as such
2022-07-11 12:48:57 -04:00
Sylvain Gugger
b1b8222d80
Add filename to info displayed when downloading things in from_pretrained (#18099)
2022-07-11 12:45:06 -04:00
Sylvain Gugger
6c8017a5c8
Fix image segmentation and object detection pipeline tests (#18100)
2022-07-11 12:41:56 -04:00
Sylvain Gugger
b0520f594c
Skip failing tests
2022-07-11 10:16:54 -04:00
Duong A. Nguyen
1e8140caad
Fix RESOURCE_EXHAUSTED error when dealing with large datasets in Flax example scripts (#18069)
* Fix RESOURCE_EXHAUSTED error for large datasets on Flax example scripts

* using np.permutation for creating batch_idx

* train_samples_idx -> training_samples_idx

* fix type hints
2022-07-11 15:59:08 +02:00
Yih-Dar
ac98a88fbc
Fix torchscript tests for GPT-NeoX (#18012)
* fix dtype issue in _attn

* fix RotaryEmbedding

* fix RotaryEmbedding 2

* clean up

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-11 05:02:54 -04:00