NielsRogge
adc0ff2502
Add CvT ( #17299 )
...
* Adding cvt files
* Adding cvt files
* changes in init file
* Adding cvt files
* changes in init file
* Style fixes
* Address comments from code review
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Format lists in docstring
* Fix copies
* Apply suggestion from code review
Co-authored-by: AnugunjNaman <anugunjjha@gmail.com>
Co-authored-by: Ayushman Singh <singhayushman13@protonmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-18 17:47:18 +02:00
Sylvain Gugger
4710702837
Fix style
2022-05-18 10:46:40 -04:00
mraunak
5fdb54ece7
Add Information Gain Filtration algorithm ( #16953 )
...
* Add information gain filtration algorithm
* Complying with black requirements
* Added author
* Fixed import order
* flake8 corrections
Co-authored-by: Javier Turek <javier.turek@intel.com>
2022-05-18 10:39:02 -04:00
Kamal Raj
91ede485a7
Fix typo ( #17328 )
2022-05-18 10:29:53 -04:00
Yih-Dar
fe28eb9452
remove ( #17325 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-18 10:06:41 -04:00
Nicolas Patry
2cb2ea3fa1
Accepting real PyTorch device as arguments. ( #17318 )
...
* Accepting real PyTorch device as arguments.
* is_torch_available.
2022-05-18 10:06:24 -04:00
Nicolas Patry
1c9d1f4ca8
Updating the docs for max_seq_len in QA pipeline ( #17316 )
2022-05-18 15:46:12 +02:00
Patrick von Platen
60ad73448c
[T5] Fix init in TF and Flax for pretraining ( #17294 )
...
* fix init
* Apply suggestions from code review
* fix
* finish
* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-18 15:08:56 +02:00
Joaq
7ba1d4e51f
Add type hints for ProphetNet (PyTorch) ( #17223 )
...
* added type hints to prophetnet
* reformatted with black
* fix bc black misformatted some parts
* fix imports
* fix imports
* Update src/transformers/models/prophetnet/configuration_prophetnet.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* update OPTIONAL type hint and docstring
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-05-18 13:23:47 +01:00
Carl
d6b8e9cec7
Add trajectory transformer ( #17141 )
...
* Add trajectory transformer
Fix model init
Fix end of lines for .mdx files
Add trajectory transformer model to toctree
Add forward input docs
Fix docs, remove prints, simplify prediction test
Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update docs, more descriptive comments
Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update readme
Small comment update and add conversion script
Rebase and reformat
Fix copies
Fix rebase, remove duplicates
Fix rebase, remove duplicates
* Remove tapex
* Remove tapex
* Remove tapex
2022-05-17 19:07:43 -04:00
Patrick von Platen
c35264007b
fix ( #17310 )
2022-05-17 18:34:31 -04:00
Cesare Campagnano
d9050dc768
[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing ( #17112 )
...
* [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing
* LED docs clarification
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [LED] gradient_checkpointing=True should be passed to TrainingArguments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [LED] docs: remove wrong word
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [LED] docs fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-05-17 23:44:37 +02:00
Jean Vancoppenolle
bad358398a
Add support for pretraining recurring span selection to Splinter ( #17247 )
...
* Add SplinterForSpanSelection for pre-training recurring span selection.
* Formatting.
* Rename SplinterForSpanSelection to SplinterForPreTraining.
* Ensure repo consistency
* Fixup changes
* Address SplinterForPreTraining PR comments
* Incorporate feedback and derive multiple question tokens per example.
* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Jean Vancoppenole <jean.vancoppenolle@retresco.de>
Co-authored-by: Tobias Günther <tobias.guenther@retresco.de>
Co-authored-by: Tobias Günther <github@tobigue.de>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-05-17 23:42:14 +02:00
Yih-Dar
0511305549
Add PR author in CI report + merged by info ( #17298 )
...
* Add author info to CI report
* Add merged by info
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 12:56:58 -04:00
Sylvain Gugger
032d63b976
Fix dummy creation script ( #17304 )
2022-05-17 12:56:24 -04:00
Sylvain Gugger
986dd5c5bf
Fix style
2022-05-17 12:50:14 -04:00
Karim Foda
38ddab10da
Doctest longformer ( #16441 )
...
* Add initial docstring changes
* make fixup
* Add TF doc changes
* fix seq classifier output
* fix quality errors
* t
* switch head to random init
* Fix expected outputs
* Update src/transformers/models/longformer/modeling_longformer.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-05-17 18:32:12 +02:00
Patrick von Platen
10704e1209
[Test] Fix W2V-Conformer integration test ( #17303 )
...
* [Test] Fix W2V-Conformer integration test
* correct w2v2
* up
2022-05-17 18:20:36 +02:00
regisss
28a0811652
Improve mismatched sizes management when loading a pretrained model ( #17257 )
...
- Add --ignore_mismatched_sizes argument to classification examples
- Expand the error message when loading a model whose head dimensions are different from expected dimensions
2022-05-17 17:58:14 +02:00
Patrick von Platen
1f13ba818e
correct opt ( #17301 )
2022-05-17 15:48:23 +02:00
Matt
349f1c85d3
Rewrite TensorFlow train_step and test_step ( #17057 )
...
* Initial commit
* Better label renaming
* Remove breakpoint before pushing (this is your job)
* Test a lot more in the Keras fit() test
* make fixup
* Clarify the case where we flatten y dicts into tensors
* Clarify the case where we flatten y dicts into tensors
* Extract label name remapping to a method
2022-05-17 14:36:23 +01:00
Matt
651e48e1e5
Fix tests of mixed precision now that experimental is deprecated ( #17300 )
...
* Fix tests of mixed precision now that experimental is deprecated
* Fix mixed precision in training_args_tf.py too
2022-05-17 14:14:17 +01:00
SaulLu
6d211429ec
fix retribert's test_torch_encode_plus_sent_to_model ( #17231 )
2022-05-17 14:33:13 +02:00
NielsRogge
ec7f8af106
[ConvNeXT] Fix drop_path_rate ( #17280 )
...
* Fix drop_path_rate
* Fix TF's drop path rate
2022-05-17 07:37:48 -04:00
Yih-Dar
a26ab95e30
Fix wrong PT/TF categories in CI report ( #17272 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 09:32:47 +02:00
Yih-Dar
1ac2b8fa7f
Fix missing job action button in CI report ( #17270 )
...
* use matrix.machine_type
* fix job names used in job_link
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 08:31:06 +02:00
Patrick von Platen
5a9957358c
Add Wav2Vec2Conformer ( #16812 )
...
* save intermediate
* add wav2vec2 conformer
* add more code
* more
* first test passes
* make all checkpoints work
* update
* up
* more clean ups
* save clean-up
* save clean-up
* save more
* remove bogus
* finalize design conformer
* remove vision
* finish all tests
* more changes
* finish code
* add doc tests
* add slow tests
* fix autoconfig test
* up
* correct docstring
* up
* update
* fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update docs/source/en/model_doc/wav2vec2-conformer.mdx
* upload
* save copied from
* correct configs
* fix model outputs
* add to docs
* fix imports
* finish
* finish code
* correct copied from
* correct again
* correct make fix
* improve make fix copies
* save
* correct fix copy from
* correct init structure
* correct
* fix import
* apply suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2022-05-17 00:43:16 +02:00
Kyungmin Lee
f0395cf58e
Fix test_model_parallelization ( #17249 )
...
* Fix test_model_parallelization
* Modify
2022-05-16 23:30:49 +02:00
Patrick von Platen
e705e1267c
[Tests] Fix slow opt tests ( #17282 )
...
* fix opt tests
* remove unused tok
* make style
* make flake8 happy
* Update tests/models/opt/test_modeling_opt.py
2022-05-16 23:24:20 +02:00
amyeroberts
f6a6388972
Add TensorFlow Swin model ( #16988 )
...
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 22:19:53 +01:00
Kevin Zehnder
6cb7187324
docs(transformers): fix typo ( #17263 )
2022-05-16 17:04:30 -04:00
Sander Land
053a80c606
logging documentation update ( #17174 )
...
* logging documentation
* style
Co-authored-by: Sander Land <sander@chatdesk.com>
2022-05-16 16:47:28 -04:00
Yih-Dar
8600d770d4
Use the PR URL in CI report ( #17269 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-16 22:02:28 +02:00
Yih-Dar
3fb82f74fd
Fix FlavaForPreTrainingIntegrationTest CI test ( #17232 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-16 21:14:25 +02:00
Sylvain Gugger
9b0d2860eb
Better error in the Auto API when a dep is missing ( #17289 )
2022-05-16 14:55:46 -04:00
Yih-Dar
66b3e106a1
Make TrainerHyperParameterSigOptIntegrationTest slow test ( #17288 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-16 14:18:09 -04:00
Sylvain Gugger
ddb1a47ec8
Automatically sort auto mappings ( #17250 )
...
* Automatically sort auto mappings
* Better class extraction
* Some auto class magic
* Adapt test and underlying behavior
* Remove re-used config
* Quality
2022-05-16 13:24:20 -04:00
Nicolas Brousse
2f611f85e2
Mlflowcallback fix nonetype error ( #17171 )
...
* Fix edge cases TypeError: 'NoneType' object is not callable
* fix style
2022-05-16 12:18:30 -04:00
MichelBartels
95b6bef624
Align logits and labels in OPT ( #17237 )
2022-05-16 09:37:39 -04:00
lewtun
a5d1839679
Remove next sentence prediction from supported ONNX tasks ( #17276 )
2022-05-16 15:34:04 +02:00
Loubna Ben Allal
05a90579a8
CodeParrot data pretokenization ( #16932 )
...
* add pretokenization arguments
* add pretokenization script
* add support for pretokenized data
* reformat code
* fix run command for training
* fix model call from config
* remove a package
* add comments on pretokenization in the readme
* remove explicit parallelization
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme -remove username
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme -remove username
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* keep data parallelization
* reformat code
* reformat code
* update readme
* reformat code
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
2022-05-16 15:32:16 +02:00
Loubna Ben Allal
e730e12567
Update codeparrot data preprocessing ( #16944 )
...
* add new preprocessing arguments
* add new filters
* add new filters to readme
* fix config and test count, update function names and docstrings
* reformat code
* update readme
* Update readme
* rename config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename few_assignments filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename tokenizer in arguments
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename functions and add limit_line argument for config_test filter
* update threshold for config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
2022-05-16 14:43:25 +02:00
cavdard
518dd1277e
Updated checkpoint support for Sagemaker Model Parallel ( #17219 )
...
* adding partial checkpoint support for optimizer state
* formatted trainer.py
* Refactoring based on comments
* reformatting
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 08:17:25 -04:00
Kenneth Enevoldsen
71d18d0831
fixed bug in run_mlm_flax_stream.py ( #17203 )
...
* fixed bug run_mlm_flax_stream.py
Fixed bug caused by an update to tokenizer keys introduced in recent transformers versions (between `4.6.2` and `4.18.0`) where additional keys were introduced to the tokenizer output.
* Update run_mlm_flax_stream.py
* adding missing parenthesis
* formatted to black
* remove cols from dataset instead
* reformat to black
* moved rem. columns to map
* formatted to black
Co-authored-by: KennethEnevoldsen <kennethcenevolsen@gmail.com>
2022-05-16 13:40:27 +02:00
Stas Bekman
71abd3ade1
[WIP] [doc] performance/scalability revamp ( #15723 )
...
* [doc] performance/scalability revamp
* link the new docs
* no :
* mixed precision
* work on the first doc
* expand the main doc
* Trigger CI
* style
* revamp single GPU training section
* work on training performance
* remove files not used anymore or will be added later
* final touches
* fix rebase
* Add hardware section to toctree
* fix toctree again
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* remove `fast_tokenizers` entry that was copied in rebase
* add warning about DP vs DDP
* remove todo
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix missing closure of codeblock
* Update docs/source/en/perf_train_gpu_many.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sync with #16860
* update toc
Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 13:36:41 +02:00
Joao Gante
d3d87b451e
TF - Fix convnext classification example ( #17261 )
2022-05-16 12:24:01 +01:00
cloudhan
e86faecfd4
Fix obvious typos in flax decoder impl ( #17279 )
...
Change config.encoder_ffn_dim -> config.decoder_ffn_dim for decoder.
2022-05-16 13:08:04 +02:00
Ignacio Talavera
ee393c009a
Guide to create custom models in Spanish ( #17158 )
...
* file copied and toctree updated
* Intro and configuration translated
* model section translated
* enter hotfix
* Translation over, correction pending
* Typos and corrections
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-05-13 16:19:29 -04:00
Gerardo Huerta Robles
16be422912
Translated version of model_sharing.mdx doc to Spanish ( #16184 )
...
* Translated version of model_sharing to Spanish
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Update docs/source_es/model_sharing.mdx
* Adding model sharing to _toctree.yml
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-05-13 16:18:46 -04:00
Fellip Silva Alves
f9024814e1
[ fast_tokenizers.mdx ] - Added translation to Portuguese to tutorial ( #17076 )
...
* [ fast_tokenizers.mdx ] - Added translation to Portuguese to tutorial
* Delete docs/source/pt-br directory
* [ fast_tokenizers.mdx ] - Continuing work on file
* [ fast_tokenizers.mdx ] - Continuing work on file
* Add fast tokenizers to _toctree.yml
* Eliminated config and toctree.yml
* Nits in fast_tokenizers.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-05-13 16:18:14 -04:00