Jean Vancoppenolle
bad358398a
Add support for pretraining recurring span selection to Splinter ( #17247 )
...
* Add SplinterForSpanSelection for pre-training recurring span selection.
* Formatting.
* Rename SplinterForSpanSelection to SplinterForPreTraining.
* Ensure repo consistency
* Fixup changes
* Address SplinterForPreTraining PR comments
* Incorporate feedback and derive multiple question tokens per example.
* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Jean Vancoppenolle <jean.vancoppenolle@retresco.de>
Co-authored-by: Tobias Günther <tobias.guenther@retresco.de>
Co-authored-by: Tobias Günther <github@tobigue.de>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-05-17 23:42:14 +02:00
Yih-Dar
0511305549
Add PR author in CI report + merged by info ( #17298 )
...
* Add author info to CI report
* Add merged by info
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 12:56:58 -04:00
Sylvain Gugger
032d63b976
Fix dummy creation script ( #17304 )
2022-05-17 12:56:24 -04:00
Sylvain Gugger
986dd5c5bf
Fix style
2022-05-17 12:50:14 -04:00
Karim Foda
38ddab10da
Doctest longformer ( #16441 )
...
* Add initial docstring changes
* make fixup
* Add TF doc changes
* fix seq classifier output
* fix quality errors
* t
* switch head to random init
* Fix expected outputs
* Update src/transformers/models/longformer/modeling_longformer.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-05-17 18:32:12 +02:00
Patrick von Platen
10704e1209
[Test] Fix W2V-Conformer integration test ( #17303 )
...
* [Test] Fix W2V-Conformer integration test
* correct w2v2
* up
2022-05-17 18:20:36 +02:00
regisss
28a0811652
Improve mismatched sizes management when loading a pretrained model ( #17257 )
...
- Add --ignore_mismatched_sizes argument to classification examples
- Expand the error message when loading a model whose head dimensions are different from expected dimensions
2022-05-17 17:58:14 +02:00
Patrick von Platen
1f13ba818e
correct opt ( #17301 )
2022-05-17 15:48:23 +02:00
Matt
349f1c85d3
Rewrite TensorFlow train_step and test_step ( #17057 )
...
* Initial commit
* Better label renaming
* Remove breakpoint before pushing (this is your job)
* Test a lot more in the Keras fit() test
* make fixup
* Clarify the case where we flatten y dicts into tensors
* Clarify the case where we flatten y dicts into tensors
* Extract label name remapping to a method
2022-05-17 14:36:23 +01:00
Matt
651e48e1e5
Fix tests of mixed precision now that experimental is deprecated ( #17300 )
...
* Fix tests of mixed precision now that experimental is deprecated
* Fix mixed precision in training_args_tf.py too
2022-05-17 14:14:17 +01:00
SaulLu
6d211429ec
fix retribert's test_torch_encode_plus_sent_to_model ( #17231 )
2022-05-17 14:33:13 +02:00
NielsRogge
ec7f8af106
[ConvNeXT] Fix drop_path_rate ( #17280 )
...
* Fix drop_path_rate
* Fix TF's drop path rate
2022-05-17 07:37:48 -04:00
Yih-Dar
a26ab95e30
Fix wrong PT/TF categories in CI report ( #17272 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 09:32:47 +02:00
Yih-Dar
1ac2b8fa7f
Fix missing job action button in CI report ( #17270 )
...
* use matrix.machine_type
* fix job names used in job_link
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-17 08:31:06 +02:00
Patrick von Platen
5a9957358c
Add Wav2Vec2Conformer ( #16812 )
...
* save intermediate
* add wav2vec2 conformer
* add more code
* more
* first test passes
* make all checkpoints work
* update
* up
* more clean ups
* save clean-up
* save clean-up
* save more
* remove bogus
* finalize design conformer
* remove vision
* finish all tests
* more changes
* finish code
* add doc tests
* add slow tests
* fix autoconfig test
* up
* correct docstring
* up
* update
* fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update docs/source/en/model_doc/wav2vec2-conformer.mdx
* upload
* save copied from
* correct configs
* fix model outputs
* add to docs
* fix imports
* finish
* finish code
* correct copied from
* correct again
* correct make fix
* improve make fix copies
* save
* correct fix copy from
* correct init structure
* correct
* fix import
* apply suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2022-05-17 00:43:16 +02:00
Kyungmin Lee
f0395cf58e
Fix test_model_parallelization ( #17249 )
...
* Fix test_model_parallelization
* Modify
2022-05-16 23:30:49 +02:00
Patrick von Platen
e705e1267c
[Tests] Fix slow opt tests ( #17282 )
...
* fix opt tests
* remove unused tok
* make style
* make flake8 happy
* Update tests/models/opt/test_modeling_opt.py
2022-05-16 23:24:20 +02:00
amyeroberts
f6a6388972
Add Tensorflow Swin model ( #16988 )
...
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 22:19:53 +01:00
Kevin Zehnder
6cb7187324
docs(transformers): fix typo ( #17263 )
2022-05-16 17:04:30 -04:00
Sander Land
053a80c606
logging documentation update ( #17174 )
...
* logging documentation
* style
Co-authored-by: Sander Land <sander@chatdesk.com>
2022-05-16 16:47:28 -04:00
Yih-Dar
8600d770d4
Use the PR URL in CI report ( #17269 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-16 22:02:28 +02:00
Yih-Dar
3fb82f74fd
Fix FlavaForPreTrainingIntegrationTest CI test ( #17232 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-16 21:14:25 +02:00
Sylvain Gugger
9b0d2860eb
Better error in the Auto API when a dep is missing ( #17289 )
2022-05-16 14:55:46 -04:00
Yih-Dar
66b3e106a1
Make TrainerHyperParameterSigOptIntegrationTest slow test ( #17288 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-16 14:18:09 -04:00
Sylvain Gugger
ddb1a47ec8
Automatically sort auto mappings ( #17250 )
...
* Automatically sort auto mappings
* Better class extraction
* Some auto class magic
* Adapt test and underlying behavior
* Remove re-used config
* Quality
2022-05-16 13:24:20 -04:00
Nicolas Brousse
2f611f85e2
MLflowCallback: fix NoneType error ( #17171 )
...
* Fix edge cases TypeError: 'NoneType' object is not callable
* fix style
2022-05-16 12:18:30 -04:00
MichelBartels
95b6bef624
Align logits and labels in OPT ( #17237 )
2022-05-16 09:37:39 -04:00
lewtun
a5d1839679
Remove next sentence prediction from supported ONNX tasks ( #17276 )
2022-05-16 15:34:04 +02:00
Loubna Ben Allal
05a90579a8
CodeParrot data pretokenization ( #16932 )
...
* add pretokenization arguments
* add pretokenization script
* add support for pretokenized data
* reformat code
* fix run command for training
* fix model call from config
* remove a package
* add comments on pretokenization in the readme
* remove explicit parallelization
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme -remove username
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme -remove username
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* keep data parallelization
* reformat code
* reformat code
* update readme
* reformat code
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
2022-05-16 15:32:16 +02:00
Loubna Ben Allal
e730e12567
Update codeparrot data preprocessing ( #16944 )
...
* add new preprocessing arguments
* add new filters
* add new filters to readme
* fix config and test count, update function names and docstrings
* reformat code
* update readme
* Update readme
* rename config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename few_assignments filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename tokenizer in arguments
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename functions and add limit_line argument for config_test filter
* update threshold for config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
2022-05-16 14:43:25 +02:00
cavdard
518dd1277e
Updated checkpoint support for Sagemaker Model Parallel ( #17219 )
...
* adding partial checkpoint support for optimizer state
* formatted trainer.py
* Refactoring based on comments
* reformatting
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 08:17:25 -04:00
Kenneth Enevoldsen
71d18d0831
fixed bug in run_mlm_flax_stream.py ( #17203 )
...
* fixed bug in run_mlm_flax_stream.py
Fixed a bug caused by a change in recent transformers versions (between `4.6.2` and `4.18.0`) that added extra keys to the tokenizer output.
* Update run_mlm_flax_stream.py
* adding missing parenthesis
* formatted to black
* remove cols from dataset instead
* reformat to black
* moved rem. columns to map
* formatted to black
Co-authored-by: KennethEnevoldsen <kennethcenevolsen@gmail.com>
2022-05-16 13:40:27 +02:00
Stas Bekman
71abd3ade1
[WIP] [doc] performance/scalability revamp ( #15723 )
...
* [doc] performance/scalability revamp
* link the new docs
* no :
* mixed precision
* work on the first doc
* expand the main doc
* Trigger CI
* style
* revamp single GPU training section
* work on training performance
* remove files not used anymore or will be added later
* final touches
* fix rebase
* Add hardware section to toctree
* fix toctree again
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* remove `fast_tokenizers` entry that was copied in rebase
* add warning about DP vs DDP
* remove todo
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix missing closure of codeblock
* Update docs/source/en/perf_train_gpu_many.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sync with #16860
* update toc
Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 13:36:41 +02:00
Joao Gante
d3d87b451e
TF - Fix convnext classification example ( #17261 )
2022-05-16 12:24:01 +01:00
cloudhan
e86faecfd4
Fix obvious typos in flax decoder impl ( #17279 )
...
Change config.encoder_ffn_dim -> config.decoder_ffn_dim for decoder.
2022-05-16 13:08:04 +02:00
Ignacio Talavera
ee393c009a
Guide to create custom models in Spanish ( #17158 )
...
* file copied and toctree updated
* Intro and configuration translated
* model section translated
* enter hotfix
* Translation over, correction pending
* Typos and corrections
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
* Update docs/source/es/create_a_model.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-05-13 16:19:29 -04:00
Gerardo Huerta Robles
16be422912
Translated version of model_sharing.mdx doc to Spanish ( #16184 )
...
* Translated version of model_sharing to Spanish
* Update docs/source_es/model_sharing.mdx
* Adding model sharing to _toctree.yml
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-05-13 16:18:46 -04:00
Fellip Silva Alves
f9024814e1
[ fast_tokenizers.mdx ] - Added translation to Portuguese to tutorial ( #17076 )
...
* [ fast_tokenizers.mdx ] - Added translation to Portuguese to tutorial
* Delete docs/source/pt-br directory
* [ fast_tokenizers.mdx ] - Continuing work on file
* [ fast_tokenizers.mdx ] - Continuing work on file
* Add fast tokenizers to _toctree.yml
* Eliminated config and toctree.yml
* Nits in fast_tokenizers.mdx
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-05-13 16:18:14 -04:00
Yih-Dar
50d1867cf8
Add PR title to push CI report ( #17246 )
...
* add PR title to push CI report
* add link
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-13 21:50:40 +02:00
Yih-Dar
506899d147
Fix push CI channel ( #17242 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-13 20:59:56 +02:00
Yih-Dar
7198b63362
install dev. version of accelerate ( #17243 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-13 13:47:09 -04:00
Sylvain Gugger
b96cb1693f
Fix Trainer for Datasets that don't have dict items ( #17239 )
2022-05-13 11:49:23 -04:00
Sylvain Gugger
9c8fde8e19
Handle copyright in add-new-model-like ( #17218 )
2022-05-13 11:47:19 -04:00
Yih-Dar
993553b2f1
fix --gpus option for docker ( #17235 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-13 17:26:26 +02:00
Yih-Dar
38043d8453
Update self-push workflow ( #17177 )
...
* update push ci
* install git-python
* update comment
* update deepspeed jobs
* fix report
* skip 2 more tests that require fairscale
* Fix changes in test_fetcher.py (to handle the case where `setup.py` is changed)
* set RUN_PT_TF_CROSS_TESTS=1 and final clean-up
* remove SIGOPT_API_TOKEN
* remove echo "$matrix_folders"
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-13 16:28:00 +02:00
Patrick von Platen
18d6b356c5
OPT - fix docstring and improve tests slightly ( #17228 )
...
* correct some stuff
* fix doc tests
* make style
2022-05-13 15:14:50 +02:00
Younes Belkada
dfc76018c1
OPT-fix ( #17229 )
...
* try fixes
* Revert "try fixes"
This reverts commit a8ad75ef69.
* add correct shape
* add correct path
2022-05-13 15:14:23 +02:00
Rafael Zimmer
85fc455972
Added translation of installation.mdx to Portuguese Issue #16824 ( #16979 )
...
* Added translation of installation.mdx to Portuguese, as well as default templates of _toctree.yml and _config.py
* [ build_documentation.yml ] - Updated doc_builder to build documentation in Portuguese.
[ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx.
* [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder.
[ pipeline_tutorial.mdx ] - Grammar changes.
* [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial.
* [ multilingual.mdx ] - Added Portuguese translation for multilingual tutorial.
[ training.mdx ] - Added Portuguese translation for training tutorial.
* [ preprocessing.mdx ] - WIP
* Update _toctree.yml
* Adding Pré-processamento to _toctree.yml
* Update accelerate.mdx
* Nits and eliminate preprocessing file until it is ready
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-05-13 07:55:44 -04:00
fxmarty
3f936df662
Fix typo in bug report template ( #17178 )
...
* Fix typo
* Force rerun workflows
Co-authored-by: Felix Marty <felix@huggingface.co>
2022-05-12 16:31:12 -04:00
Sylvain Gugger
afe5d42d8d
Black preview ( #17217 )
...
* Black preview
* Fixup too!
* Fix check copies
* Use the same version as the CI
* Bump black
2022-05-12 16:25:55 -04:00