Commit Graph

9472 Commits

Author SHA1 Message Date
Steven Liu
f553c3ce4c
Update summary of the tasks (#16528)
* 📝 add image/vision classification and asr

* 🖍 minor formatting fixes

* Fixed a typo in legacy seq2seq_trainer.py (#16531)

* Add ONNX export for BeiT (#16498)

* Add beit onnx conversion support

* Updated docs

* Added cross reference to ViT ONNX config

* call on_train_end when trial is pruned (#16536)

* Type hints added (#16529)

* Fix Bart type hints (#16297)

* Add type hints to PLBart PyTorch

* Remove pending merge conflicts

* Fix PLBart Type Hints

* Add changes from review

* Add VisualBert type hints (#16544)

* Adding missing type hints for mBART model (PyTorch) (#16429)

* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

* type hints update

* making dependent modesls coherent

Co-authored-by: matt <rocketknight1@gmail.com>

* Remove MBart subclass of XLMRoberta in tokenzier docs (#16546)

* Remove MBart subclass of XLMRoberta in tokenzier

* Fix style

* Copy docs from MBart50 tokenizer

* Use random_attention_mask for TF tests (#16517)

* use random_attention_mask for TF tests

* Fix for TFCLIP test (for now).

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Improve code example (#16450)

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>

* Pin tokenizers version <0.13 (#16539)

* Pin tokenizers version <0.13

* Style

* Add code samples for TF speech models (#16494)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* [FlaxSpeechEncoderDecoder] Fix dtype bug (#16581)

* [FlaxSpeechEncoderDecoder] Fix dtype bug

* more fixes

* Making the impossible to connect error actually report the right URL. (#16446)

* Fix flax import in __init__.py: modeling_xglm -> modeling_flax_xglm (#16556)

* Add utility to find model labels (#16526)

* Add utility to find model labels

* Use it in the Trainer

* Update src/transformers/utils/generic.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Quality

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Enable doc in Spanish (#16518)

* Reorganize doc for multilingual support

* Fix style

* Style

* Toc trees

* Adapt templates

* Add use_auth to load_datasets for private datasets to PT and TF examples (#16521)

* fix formatting and remove use_auth

* Add use_auth_token to Flax examples

* add a test checking the format of `convert_tokens_to_string`'s output (#16540)

* add new tests

* add comment to overridden tests

* TF: Finalize `unpack_inputs`-related changes (#16499)

* Add unpack_inputs to remaining models

* removed kwargs to `call()` in TF models

* fix TF T5 tests

* [SpeechEncoderDecoderModel] Correct Encoder Last Hidden State Output (#16586)

* initialize the default rank set on TrainerState (#16530)

* initialize the default rank set on TrainerState

* fix style

* Trigger doc build

* Fix CI: test_inference_for_pretraining in ViTMAEModelTest (#16591)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* add a template to add missing tokenization test (#16553)

* add a template to add missing tokenization test

* add cookiecutter setting

* improve doc

* Update templates/adding_a_missing_tokenization_test/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* made _load_pretrained_model_low_mem static + bug fix (#16548)

* handle torch_dtype in low cpu mem usage (#16580)

* [Doctests] Correct filenaming (#16599)

* [Doctests] Correct filenaming

* improve quicktour

* make style

* Adding new train_step logic to make things less confusing for users (#15994)

* Adding new train_step logic to make things less confusing for users

* DO NOT ASK WHY WE NEED THAT SUBCLASS

* Metrics now working, at least for single-output models with type annotations!

* Updates and TODOs for the new train_step

* Make fixup

* Temporary test workaround until T5 has types

* Temporary test workaround until T5 has types

* I think this actually works! Needs a lot of tests though

* MAke style/quality

* Revert changes to T5 tests

* Deleting the aforementioned unmentionable subclass

* Deleting the aforementioned unmentionable subclass

* Adding a Keras API test

* Style fixes

* Removing unneeded TODO and comments

* Update test_step too

* Stop trying to compute metrics with the dummy_loss, patch up test

* Make style

* make fixup

* Docstring cleanup

* make fixup

* make fixup

* Stop expanding 1D input tensors when using dummy loss

* Adjust T5 test given the new compile()

* make fixup

* Skipping test for convnext

* Removing old T5-specific Keras test now that we have a common one

* make fixup

* make fixup

* Only skip convnext test on CPU

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Avoiding TF import issues

* make fixup

* Update compile() to support TF 2.3

* Skipping model.fit() on template classes for now

* Skipping model.fit() on template class tests for now

* Replace ad-hoc solution with find_labels

* make fixup

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding missing type hints for BigBird model   (#16555)

* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

* type hints update

* making dependent modesls coherent

* Type hints for BigBird

* removing typos

Co-authored-by: matt <rocketknight1@gmail.com>

* [deepspeed] fix typo, adjust config name (#16597)

* 🖍 apply feedback

Co-authored-by: Cathy <815244047@qq.com>
Co-authored-by: Jim Rohrer <jrohrer1@gmail.com>
Co-authored-by: Ferdinand Schlatt <fschlatt@gmail.com>
Co-authored-by: Dahlbomii <101373053+Dahlbomii@users.noreply.github.com>
Co-authored-by: Gunjan Chhablani <chhablani.gunjan@gmail.com>
Co-authored-by: Rishav Chandra Varma <rishavchandra.v16@iiits.in>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Daniel Stancl <46073029+stancld@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Andres Codas <andrescodas@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-04-05 12:48:42 -05:00
Stas Bekman
23fc4cba0d
[benchmark tool] trainer-benchmark.py (#14934)
* [benchmark tool] trainer-benchmark.py

* improve

* massive rework/expansion

* fix

* mucho improved

* improved

* fix prefix

* fix

* fix diff calculation

* address suggestions
2022-04-05 10:27:29 -07:00
John Giorgi
b33ab4eb59
Add global_attention_mask to gen_kwargs (#16485)
If global_attention_mask is found in the models inputs (used by certain
models, like LED) in the prediction_step method of Seq2SeqTrainer,
it is added to the gen_kwargs, which are passed to model.decode().
This allows us to properly set the global attention when decoding.
2022-04-05 13:05:27 -04:00
Stas Bekman
9fd5e6bbe6
[deepspeed] fix typo, adjust config name (#16597) 2022-04-05 08:13:12 -07:00
Rishav Chandra Varma
367558b90d
Adding missing type hints for BigBird model (#16555)
* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

* type hints update

* making dependent modesls coherent

* Type hints for BigBird

* removing typos

Co-authored-by: matt <rocketknight1@gmail.com>
2022-04-05 14:50:45 +01:00
Matt
4354005291
Adding new train_step logic to make things less confusing for users (#15994)
* Adding new train_step logic to make things less confusing for users

* DO NOT ASK WHY WE NEED THAT SUBCLASS

* Metrics now working, at least for single-output models with type annotations!

* Updates and TODOs for the new train_step

* Make fixup

* Temporary test workaround until T5 has types

* Temporary test workaround until T5 has types

* I think this actually works! Needs a lot of tests though

* MAke style/quality

* Revert changes to T5 tests

* Deleting the aforementioned unmentionable subclass

* Deleting the aforementioned unmentionable subclass

* Adding a Keras API test

* Style fixes

* Removing unneeded TODO and comments

* Update test_step too

* Stop trying to compute metrics with the dummy_loss, patch up test

* Make style

* make fixup

* Docstring cleanup

* make fixup

* make fixup

* Stop expanding 1D input tensors when using dummy loss

* Adjust T5 test given the new compile()

* make fixup

* Skipping test for convnext

* Removing old T5-specific Keras test now that we have a common one

* make fixup

* make fixup

* Only skip convnext test on CPU

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Avoiding TF import issues

* make fixup

* Update compile() to support TF 2.3

* Skipping model.fit() on template classes for now

* Skipping model.fit() on template class tests for now

* Replace ad-hoc solution with find_labels

* make fixup

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-05 14:23:27 +01:00
Patrick von Platen
7ccacdf10f
[Doctests] Correct filenaming (#16599)
* [Doctests] Correct filenaming

* improve quicktour

* make style
2022-04-05 14:15:02 +02:00
Suraj Patil
21decb7731
handle torch_dtype in low cpu mem usage (#16580) 2022-04-05 12:26:03 +02:00
Francesco Saverio Zuppichini
8bf6d28c10
made _load_pretrained_model_low_mem static + bug fix (#16548) 2022-04-05 11:56:36 +02:00
SaulLu
02214cb3cc
add a template to add missing tokenization test (#16553)
* add a template to add missing tokenization test

* add cookiecutter setting

* improve doc

* Update templates/adding_a_missing_tokenization_test/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-05 10:50:22 +02:00
Yih-Dar
765bafb8e4
Fix CI: test_inference_for_pretraining in ViTMAEModelTest (#16591)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-05 10:00:03 +02:00
Sylvain Gugger
104c065277 Trigger doc build 2022-04-04 14:06:49 -04:00
Andres Codas
1cd2e21d1b
initialize the default rank set on TrainerState (#16530)
* initialize the default rank set on TrainerState

* fix style
2022-04-04 12:20:26 -04:00
Sanchit Gandhi
6f9d8dc156
[SpeechEncoderDecoderModel] Correct Encoder Last Hidden State Output (#16586) 2022-04-04 17:50:56 +02:00
Joao Gante
dad5ca83b2
TF: Finalize unpack_inputs-related changes (#16499)
* Add unpack_inputs to remaining models

* removed kwargs to `call()` in TF models

* fix TF T5 tests
2022-04-04 16:37:33 +01:00
SaulLu
be9474bd35
add a test checking the format of convert_tokens_to_string's output (#16540)
* add new tests

* add comment to overridden tests
2022-04-04 16:57:24 +02:00
Karim Foda
24a85cca61
Add use_auth to load_datasets for private datasets to PT and TF examples (#16521)
* fix formatting and remove use_auth

* Add use_auth_token to Flax examples
2022-04-04 10:27:45 -04:00
Sylvain Gugger
b9a768b3ff
Enable doc in Spanish (#16518)
* Reorganize doc for multilingual support

* Fix style

* Style

* Toc trees

* Adapt templates
2022-04-04 10:25:46 -04:00
Sylvain Gugger
3951b9f390
Add utility to find model labels (#16526)
* Add utility to find model labels

* Use it in the Trainer

* Update src/transformers/utils/generic.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Quality

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-04 10:06:57 -04:00
Daniel Stancl
ec4da72fe9
Fix flax import in __init__.py: modeling_xglm -> modeling_flax_xglm (#16556) 2022-04-04 14:54:25 +02:00
Nicolas Patry
013a7dbe3d
Making the impossible to connect error actually report the right URL. (#16446) 2022-04-04 14:26:23 +02:00
Patrick von Platen
ad0cba08ea
[FlaxSpeechEncoderDecoder] Fix dtype bug (#16581)
* [FlaxSpeechEncoderDecoder] Fix dtype bug

* more fixes
2022-04-04 13:53:54 +02:00
Yih-Dar
60d27b1f15
Add code samples for TF speech models (#16494)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-01 17:54:01 +02:00
Lysandre Debut
53a4d6b115
Pin tokenizers version <0.13 (#16539)
* Pin tokenizers version <0.13

* Style
2022-04-01 11:53:18 -04:00
NielsRogge
61ee26a892
Improve code example (#16450)
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
2022-04-01 17:19:36 +02:00
Yih-Dar
2199382dfd
Use random_attention_mask for TF tests (#16517)
* use random_attention_mask for TF tests

* Fix for TFCLIP test (for now).

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-01 16:53:07 +02:00
Gunjan Chhablani
823dbf8a41
Remove MBart subclass of XLMRoberta in tokenzier docs (#16546)
* Remove MBart subclass of XLMRoberta in tokenzier

* Fix style

* Copy docs from MBart50 tokenizer
2022-04-01 16:39:28 +02:00
Rishav Chandra Varma
5fe06b9bdd
Adding missing type hints for mBART model (PyTorch) (#16429)
* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

* type hints update

* making dependent modesls coherent

Co-authored-by: matt <rocketknight1@gmail.com>
2022-04-01 15:21:26 +01:00
Gunjan Chhablani
9947dd077c
Add VisualBert type hints (#16544) 2022-04-01 15:02:58 +01:00
Gunjan Chhablani
59a9c83e40
Fix Bart type hints (#16297)
* Add type hints to PLBart PyTorch

* Remove pending merge conflicts

* Fix PLBart Type Hints

* Add changes from review
2022-04-01 14:50:22 +01:00
Dahlbomii
afc5a1ea3a
Type hints added (#16529) 2022-04-01 14:27:41 +01:00
Ferdinand Schlatt
483a9450a0
call on_train_end when trial is pruned (#16536) 2022-04-01 08:50:47 -04:00
Jim Rohrer
9de70f213e
Add ONNX export for BeiT (#16498)
* Add beit onnx conversion support

* Updated docs

* Added cross reference to ViT ONNX config
2022-04-01 10:52:42 +02:00
Cathy
bfeff6cc6a
Fixed a typo in legacy seq2seq_trainer.py (#16531) 2022-04-01 09:17:31 +02:00
Anton Lozhkov
5807054bd3
[research] link to the XTREME-S paper (#16519)
* [research] link to the XTREME-S paper

* Update examples/research_projects/xtreme-s/README.md

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-03-31 23:26:50 +04:00
Sylvain Gugger
e4b234834a
Fix syntax error in generate docstrings (#16516) 2022-03-31 08:45:47 -04:00
Mowaninuola Osifeso
b808d8a596
added type hints to xglm pytorch (#16500)
* added type hints to xglm pytorch

* Update src/transformers/models/xglm/modeling_xglm.py

* Update src/transformers/models/xglm/modeling_xglm.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-03-31 13:43:04 +01:00
Bhadresh Savani
05b4c32908
fixed a typo (#16508) 2022-03-31 07:49:02 -04:00
Santiago Gómez
6a4dbba1a3
Translate accelerate.mdx from english to spanish (#16176)
* Translate accelerate.mdx from english to spanish

* Update docs/source_es/accelerate.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Apply suggestions from code review

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Apply suggestions from code review

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Fix nits and finish translation

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-03-31 07:45:18 -04:00
Liliana Badillo
c551addeb0
Translate installation.mdx to Spanish (#16229)
* Translate installation.mdx to Spanish

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/installation.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Fix nits and finish translation

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-03-31 07:44:47 -04:00
Juanjo do Olmo
98939e6aee
Spanish translation of the file multilingual.mdx (#16329)
* Duplication of the source eng file

* Spanish translation of the file multilingual.mdx

* Update docs/source_es/multilingual.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/multilingual.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/multilingual.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/multilingual.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/multilingual.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/multilingual.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Update docs/source_es/multilingual.mdx

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Fix nits and finish translation

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-03-31 07:43:31 -04:00
chenbohua3
99a01423b9
make tuple annotation more specific to avoid failures during symbolic_trace (#16490)
* make tuple annotation more specific to avoid failures during symbolic_trace

* make tuple annotation more specific to avoid failures during symbolic_trace
2022-03-31 12:39:46 +01:00
Francesco Saverio Zuppichini
a8b6443e06
Refactor Modeling Outputs (#16341)
* first proposal

* replace model outputs in various models

* conflicts

* docstring

* update poolformer

* minor change in docstring

* CI

* removed poolformer specific outputs from doc

* removed convnext specific outputs from doc

* CI

* weird char in segformer

* conversations

* reverted docstring for BaseModelOutputWithPooling

* update outputs

* changed docstring in BaseModelOutput

* updated docstring in modeling outputs

* typos :)

* fixed typo after copy & paste it all around

* CI

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* segformer

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-03-31 09:32:33 +02:00
Manuel R. Ciosici
857eb87cc4
Support reduce_bucket_size=auto for deepspeed stages <3 (#16496) 2022-03-30 14:12:29 -07:00
Lai Wei
81ac45f85c
update smddp api to v1.4.0 (#16371)
* update smddp api to v1.4.0

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address comments

* fix style

* remove unused import

* fix indent

* disable style check for import

* fix space

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-30 16:28:35 -04:00
Stas Bekman
a73281e3e4
[examples] max samples can't be bigger than the len of dataset (#16501)
* [examples] max samples can't be bigger than then len of dataset

* do tf and flax
2022-03-30 12:33:16 -07:00
Francesco Saverio Zuppichini
c4deb7b3ae
Feature Extractor accepts segmentation_maps (#15964)
* feature extractor accepts

* resolved conversations

* added examples in test for ADE20K

* num_classes -> num_labels

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolving conversations

* resolving conversations

* removed ADE

* CI

* minor changes in conversion script

* reduce_labels in feature extractor

* minor changes

* correct preprocess for instace segmentation maps

* minor changes

* minor changes

* CI

* debugging

* better padding

* going to update labels inside the model

* going to update labels inside the model

* minor changes

* tests

* removed changes in feature_extractor_utils

* conversation

* conversation

* example in feature extractor

* more docstring in modeling

* test

* make style

* doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-30 18:46:51 +02:00
Joao Gante
c2f8eaf6bc
TF: unpack inputs on Convbert, GPTJ, LED, and templates (#16491)
* Add unpack_inputs to remaining models

* remove stray use of inputs in the templates; fix tf.debugging of attn masks
2022-03-30 17:12:27 +01:00
tomerip
ae189ef991
Add support for exporting GPT-J to ONNX-TRT (#16492)
Add support for exporting GPT-J to ONNX-TRT

Co-authored-by: Tomer Stav <stavt@amazon.com>
2022-03-30 17:56:03 +02:00
dctelus
d04adc3521
Add length to PreTrainedTokenizer train_new_from_iterator (#16493) 2022-03-30 11:41:04 -04:00