Sylvain Gugger
96881729ce
Remove assert on optional arg
2022-01-13 17:34:41 -05:00
Stas Bekman
1eb40338ac
[deepspeed tests] fix summarization ( #15149 )
2022-01-13 13:48:51 -08:00
Yanming Wang
6e058e84fd
Enable AMP for xla:gpu device in trainer class ( #15022 )
...
* Multiple fixes of trainer class with XLA GPU
* Make fp16 valid for xla:gpu
* Add mark_step in should_log to reduce compilation overhead
2022-01-13 15:21:00 -05:00
Carlos Aguayo
3fc221d077
Update model_sharing.mdx ( #15142 )
...
Fix typo
2022-01-13 12:26:02 -05:00
Manuel R. Ciosici
7b83feb50a
Deprecates AdamW and adds --optim ( #14744 )
...
* Add AdamW deprecation warning
* Add --optim to Trainer
* Update src/transformers/optimization.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/optimization.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/optimization.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/optimization.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args.py
* fix style
* fix
* Regroup adamws together
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Change --adafactor to --optim adafactor
* Use Enum for optimizer values
* fixup! Change --adafactor to --optim adafactor
* fixup! Change --adafactor to --optim adafactor
* fixup! Change --adafactor to --optim adafactor
* fixup! Use Enum for optimizer values
* Improved documentation for --adafactor
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Add mention of no_deprecation_warning
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename OptimizerOptions to OptimizerNames
* Use choices for --optim
* Move optimizer selection code to a function and add a unit test
* Change optimizer names
* Rename method
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename method
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Remove TODO comment
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename variable
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename variable
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename function
* Rename variable
* Parameterize the tests for supported optimizers
* Refactor
* Attempt to make tests pass on CircleCI
* Add a test with apex
* rework to add apex to parameterized; add actual train test
* fix import when torch is not available
* fix optim_test_params when torch is not available
* fix optim_test_params when torch is not available
* re-org
* small re-org
* fix test_fused_adam_no_apex
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Remove .value from OptimizerNames
* Rename optimizer strings s|--adam_|--adamw_|
* Also rename Enum options
* small fix
* Fix instantiation of OptimizerNames. Remove redundant test
* Use ExplicitEnum instead of Enum
* Add unit test with string optimizer
* Change optimizer default to string value
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-01-13 08:14:51 -08:00
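The commit above replaces ad-hoc optimizer flags with a string-valued `--optim` choice backed by an `ExplicitEnum`-style class, plus a silenceable AdamW deprecation warning. A minimal stdlib sketch of that pattern follows; the class name `OptimizerNames` matches the commit, but the member set and `warn_deprecated_adamw` helper are illustrative stand-ins, not the actual transformers source.

```python
import warnings
from enum import Enum

class OptimizerNames(str, Enum):
    """String-valued enum: each `--optim` choice maps to a member whose
    value is the CLI string, so argparse choices and lookup stay in sync."""
    ADAMW_HF = "adamw_hf"
    ADAMW_TORCH = "adamw_torch"
    ADAFACTOR = "adafactor"

    @classmethod
    def _missing_(cls, value):
        # Fail loudly with the valid choices, as ExplicitEnum does.
        raise ValueError(
            f"{value!r} is not a valid {cls.__name__}; "
            f"choose one of {[m.value for m in cls]}"
        )

def warn_deprecated_adamw(no_deprecation_warning: bool = False):
    """Sketch of the AdamW deprecation warning, silenced via the
    `no_deprecation_warning` flag the commit mentions."""
    if not no_deprecation_warning:
        warnings.warn(
            "This AdamW implementation is deprecated; prefer the framework "
            "optimizer or pass `--optim` explicitly.",
            FutureWarning,
        )

# The CLI string round-trips through the enum:
assert OptimizerNames("adafactor") is OptimizerNames.ADAFACTOR
```

The `str` mixin is what makes `OptimizerNames("adamw_hf")` resolve directly from the command-line string, which is the convenience the switch to `ExplicitEnum` buys.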
Stas Bekman
762416ffa8
[examples/flax/language-modeling] set loglevel ( #15129 )
2022-01-13 15:17:28 +01:00
Yih-Dar
74837171ab
fix doc example - AssertionError: has to be configured as a decoder. ( #15124 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-01-13 06:45:30 -05:00
Lysandre Debut
6950ccec1b
doc-builder -> doc-build ( #15134 )
...
* Updated script
* Commit everything
* Ready for review!
* Update .github/workflows/build_documentation.yml
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
2022-01-13 06:02:24 -05:00
Edoardo Federici
9a94bb8e21
mBART support for run_summarization.py ( #15125 )
...
* Update run_summarization.py
* Fixed languages and added missing code
* fixed obj, docs, removed source_lang and target_lang
* make style, run_summarization.py reformatted
2022-01-12 16:39:33 -05:00
Jake Tae
97f3beed36
Add with torch.no_grad() to DistilBERT integration test forward pass ( #14979 )
...
* refactor: wrap forward pass around no_grad context
* Update tests/test_modeling_distilbert.py
* fix: rm `no_grad` from non-integration tests
* chore: rm whitespace change
2022-01-12 10:42:39 -05:00
lewtun
021f2ea987
Add ONNX configuration classes to docs ( #15121 )
...
* Add ONNX classes to main package
* Remove permalinks from ONNX guide
* Fix ToC entry
* Revert "Add ONNX classes to main package"
This reverts commit eb794a5b00.
* Add ONNX classes to main doc
* Fix syntax highlighting in doc
* Fix text
* Add FeaturesManager to doc
* Use paths to reference ONNX classes
* Add FeaturesManager to init
* Add missing ONNX paths
2022-01-12 16:33:32 +01:00
Sylvain Gugger
c425d60bb9
Fix link to deepspeed config
2022-01-12 09:32:53 -05:00
Yih-Dar
6820904454
Fix #14357 ( #15001 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-01-12 14:29:09 +00:00
Leandro von Werra
aa0135f2e0
fix: switch from slow to generic tokenizer class ( #15122 )
2022-01-12 09:12:43 -05:00
Russell Klopfer
27b819b0e3
use block_size instead of max_seq_length in tf run_clm example ( #15036 )
...
* use block_size instead of max_seq_length
* fixup
* remove pad_to_block_size
Co-authored-by: Russell Klopfer <russell@kloper.us>
2022-01-12 08:57:00 -05:00
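The run_clm change above swaps padding to `max_seq_length` for the usual causal-LM preprocessing that `block_size` controls: concatenate all tokenized texts, then re-split into fixed-size blocks, dropping the tail remainder. A stdlib-only sketch of that grouping (function name is the conventional one, shown here as an illustration, not the example script's exact code):

```python
def group_texts(token_lists, block_size):
    """Concatenate tokenized examples and re-split into fixed-size blocks,
    dropping the remainder that doesn't fill a full block."""
    concatenated = [tok for toks in token_lists for tok in toks]
    total = (len(concatenated) // block_size) * block_size
    return [concatenated[i : i + block_size] for i in range(0, total, block_size)]

# 9 tokens with block_size=4 -> two full blocks; the 9th token is dropped.
blocks = group_texts([[1, 2, 3], [4, 5], [6, 7, 8, 9]], block_size=4)
assert blocks == [[1, 2, 3, 4], [5, 6, 7, 8]]
```

Because every block is exactly `block_size` long, no padding (and hence no `pad_to_block_size`) is needed, which is why the commit could remove it.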
Nicolas Patry
68cc4ccde2
Pipeline ASR with LM. ( #15071 )
...
* Pipeline ASR with LM.
* Revamped into `self.decoder`.
* Fixing.
* 2nd fix.
* Update src/transformers/pipelines/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fixing.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-01-12 09:28:19 +01:00
Sylvain Gugger
1a00863e95
Fix typo in doc template
2022-01-11 15:22:15 -05:00
Matt
44eaa2b303
Update TF test_step to match train_step ( #15111 )
...
* Update TF test_step to match train_step
* Update compile() warning to be clearer about what to pass
2022-01-11 19:05:39 +00:00
Vladimir Maryasin
57b980a613
Fix saving FlaubertTokenizer configs ( #14991 )
...
All tokenizer-specific config properties must be passed to the base
class (XLMTokenizer) in order to be saved. This was not the case for
the do_lowercase config, so it was not saved by the save_pretrained()
method, and saving then reloading the tokenizer changed its behaviour.
This commit fixes that.
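The bug described above, a subclass keeping a config value to itself instead of forwarding it to the base class whose kwargs get serialized, can be modeled with a toy example. The class names below are stand-ins for the real tokenizers, and `init_kwargs` stands in for what a `save_pretrained()`-style method would write to `tokenizer_config.json`:

```python
class BaseTokenizer:
    """Stand-in for XLMTokenizer: kwargs reaching this __init__ are what
    gets serialized on save."""
    def __init__(self, **kwargs):
        self.init_kwargs = kwargs  # persisted to tokenizer_config.json

class BuggyTokenizer(BaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        super().__init__(**kwargs)  # bug: do_lowercase never reaches the base
        self.do_lowercase = do_lowercase

class FixedTokenizer(BaseTokenizer):
    def __init__(self, do_lowercase=False, **kwargs):
        super().__init__(do_lowercase=do_lowercase, **kwargs)  # now saved
        self.do_lowercase = do_lowercase

# The buggy variant silently drops the setting from the saved config:
assert "do_lowercase" not in BuggyTokenizer(do_lowercase=True).init_kwargs
assert FixedTokenizer(do_lowercase=True).init_kwargs["do_lowercase"] is True
```

With the fix, a save/reload round trip preserves `do_lowercase`, so the reloaded tokenizer behaves the same as the original.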
2022-01-11 19:19:33 +01:00
lewtun
16f0b7d72c
Update ONNX docs ( #14904 )
...
* Remove docs for deprecated ONNX export
* Tidy up the CLI help messages
* Revamp ONNX docs
* Update auto-config table
* Use DistilBERT as example for consistency
* Wrap up first pass at ONNX docs
* Fix table check
* Add tweaks and introduction
* Add cross-ref
* Fix missing import
* Fix style
* Add permalinks to ONNX configs
* Clarify role of OrderedDict
* Update docs/source/serialization.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add doctest syntax to code blocks
* Remove permalinks
* Revert "Remove permalinks"
This reverts commit 099701daf0.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-11 18:06:05 +01:00
Sylvain Gugger
704d1feca1
Doc styler tip ( #15105 )
...
* Add new lines before/after tips
* Check end of lines
2022-01-11 11:45:39 -05:00
AK391
68d925195e
Merge branch 'master' into master
2022-01-11 11:11:29 -05:00
Lysandre Debut
7480ded658
Fix failing test ( #15104 )
2022-01-11 15:57:34 +01:00
novice
28e091430e
Add Nystromformer ( #14659 )
...
* Initial commit
* Config and modelling changes
Added Nystromformer-specific attributes to config and removed all decoder functionality from modelling.
* Modelling and test changes
Added Nystrom approximation and removed decoder tests.
* Code quality fixes
* Modeling changes and conversion script
Initial commits to conversion script, modeling changes.
* Minor modeling changes and conversion script
* Modeling changes
* Correct modeling, add tests and documentation
* Code refactor
* Remove tokenizers
* Code refactor
* Update __init__.py
* Fix bugs
* Update src/transformers/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/nystromformer/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/model_doc/nystromformer.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/nystromformer/configuration_nystromformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/nystromformer/configuration_nystromformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/nystromformer/configuration_nystromformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/nystromformer/configuration_nystromformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/nystromformer/convert_nystromformer_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/nystromformer/configuration_nystromformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update modeling and test_modeling
* Code refactor
* .rst to .mdx
* doc changes
* Doc changes
* Update modeling_nystromformer.py
* Doc changes
* Fix copies
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update configuration_nystromformer.py
* Fix copies
* Update tests/test_modeling_nystromformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update test_modeling_nystromformer.py
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Fix code style
* Update modeling_nystromformer.py
* Update modeling_nystromformer.py
* Fix code style
* Reformat modeling file
* Update modeling_nystromformer.py
* Modify NystromformerForMultipleChoice
* Fix code quality
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Code style changes and torch.no_grad()
* make style
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-11 14:25:49 +01:00
Lysandre Debut
444ea95a80
Print out durations of all scheduled tests ( #15102 )
2022-01-11 08:15:59 -05:00
JejuWayfarer
285131bfb4
change metric_key_prefix in seq2seq_trainer.py ( #15099 )
...
This fixes the issue where metric_key_prefix differed from the prefix used by Trainer.
2022-01-11 07:44:29 -05:00
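The seq2seq_trainer change above aligns its metric keys with Trainer's convention of prepending `f"{metric_key_prefix}_"` to every metric name (e.g. `eval_loss`). A small sketch of that convention, as an illustration rather than the library's actual code:

```python
def prefix_metrics(metrics, metric_key_prefix="eval"):
    """Prepend the prefix the way Trainer does, leaving keys that
    already carry it untouched."""
    return {
        k if k.startswith(f"{metric_key_prefix}_") else f"{metric_key_prefix}_{k}": v
        for k, v in metrics.items()
    }

assert prefix_metrics({"loss": 0.5, "rouge1": 41.2}) == {
    "eval_loss": 0.5,
    "eval_rouge1": 41.2,
}
```

Keeping the prefix consistent matters because downstream consumers (logging callbacks, best-model selection) look metrics up by these prefixed names.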
Virus
c4fa908fa9
Adds IBERT to models exportable with ONNX ( #14868 )
...
* Add IBertOnnxConfig and tests
* add all the supported features for IBERT and remove outputs in IbertOnnxConfig
* use OnnxConfig
* fix codestyle
* remove serialization.rst
* codestyle
2022-01-11 12:17:08 +01:00
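An ONNX config class like the `IBertOnnxConfig` added above primarily declares the model's inputs together with their dynamic axes; the `OrderedDict` keeps input order stable for the exporter. A hedged, stdlib-only sketch of what such an `inputs` mapping looks like (the axis names follow the usual convention and are assumed here, not taken from the actual class):

```python
from collections import OrderedDict

# Dynamic-axes declaration in the style of an OnnxConfig `inputs` property:
# each input maps axis index -> symbolic name, so the exported graph
# accepts variable batch and sequence lengths.
ibert_like_inputs = OrderedDict(
    [
        ("input_ids", {0: "batch", 1: "sequence"}),
        ("attention_mask", {0: "batch", 1: "sequence"}),
    ]
)

# Input order matters to the exporter, hence an OrderedDict rather than
# relying on plain-dict ordering across Python versions.
assert list(ibert_like_inputs) == ["input_ids", "attention_mask"]
```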
Patrick von Platen
efb35a4107
[Wav2Vec2ProcessorWithLM] improve decoder download ( #15040 )
2022-01-11 05:59:38 -05:00
NielsRogge
6ea6266625
Fix cookiecutter ( #15100 )
2022-01-11 05:57:26 -05:00
Yih-Dar
68810aa26c
fix doc example - TypeError: forward() got an unexpected keyword argument 'input_ids' ( #15092 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-01-11 04:04:23 -05:00
Sylvain Gugger
ca76618d6b
Take gradient accumulation into account when defining samplers ( #15095 )
...
* Take gradient accumulation into account when defining samplers
* style
2022-01-11 03:16:39 -05:00
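The sampler fix above matters because with gradient accumulation one optimizer update consumes `per_device_batch * accumulation_steps * num_devices` samples, so anything bucketing or counting by "step" should use that effective size. The arithmetic, in a self-contained sketch (function names are illustrative, not the Trainer's API):

```python
import math

def effective_step_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples consumed per optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices

def updates_per_epoch(num_samples, per_device_batch, grad_accum_steps, num_devices=1):
    """Optimizer updates in one pass over the data (last partial step counts)."""
    step = effective_step_size(per_device_batch, grad_accum_steps, num_devices)
    return math.ceil(num_samples / step)

# 1000 samples, batch 8, accumulate over 4 micro-batches -> 32 samples/update
assert effective_step_size(8, 4) == 32
assert updates_per_epoch(1000, 8, 4) == 32  # ceil(1000 / 32) = 32
```

Using the per-device batch size alone here would overcount updates by a factor of `grad_accum_steps`, which is the kind of mismatch the commit addresses.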
Sylvain Gugger
9dc8fb2fc7
Add test to check reported training loss ( #15096 )
...
* Add test
* Add tests for the reported train loss
2022-01-11 03:14:11 -05:00
AK391
5cd7086fdb
XLM-ProphetNet Spaces badge
2022-01-11 00:11:31 -05:00
AK391
4e3208662e
DPR Spaces badge
2022-01-10 13:50:40 -05:00
AK391
ac2c06d492
ProphetNet spaces badge
2022-01-10 13:43:34 -05:00
AK391
bf0201e184
MBART spaces badge
2022-01-10 13:37:17 -05:00
Yih-Dar
b67fd797be
Add TFVisionEncoderDecoderModel ( #14148 )
...
* Start the work on TFVisionEncoderDecoderModel
* Expose TFVisionEncoderDecoderModel
* fix import
* Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()
* reorder
* Apply the fix for checkpoint loading as in #14016
* remove attention_mask + fix VISION_DUMMY_INPUTS
* A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting
* fix wrong condition: shape_list(input_ids) == 2
* add tests
* use personal TFViTModel checkpoint (for now)
* Add equivalence tests + projection layer
* style
* make sure projection layer can run
* Add examples
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Clean comments (need to work on TODOs for PyTorch models)
* Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel
* fixes
* Revert changes in PT code.
* Update tests/test_modeling_tf_vision_encoder_decoder.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add test_inference_coco_en for TF test
* fix quality
* fix name
* build doc
* add main_input_name
* Fix ckpt name in test
* fix diff between master and this PR
* fix doc
* fix style and quality
* fix missing doc
* fix labels handling
* Delete auto.rst
* Add the changes done in #14016
* fix prefix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-01-10 13:30:14 -05:00
AK391
c9504b2f50
MT5 Spaces badge
2022-01-10 12:57:08 -05:00
AK391
daec528ca9
T5 Spaces badge
2022-01-10 12:51:39 -05:00
AK391
0554e4d5c5
MarianMT Spaces badge
2022-01-10 12:47:12 -05:00
AK391
7ec6aad23d
Pegasus Spaces badge
2022-01-10 12:39:22 -05:00
AK391
03f8b9c9e0
BART Spaces badge
2022-01-10 12:33:59 -05:00
Stas Bekman
37bc0b4e53
[performance doc] Power and Cooling ( #14935 )
...
* [performance doc] Power and Cooling
* more docs
* Update docs/source/performance.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* reword
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-10 09:21:04 -08:00
AK391
20f169b523
Longformer Spaces badge
2022-01-10 12:14:18 -05:00
Suraj Patil
3e9fdcf019
[DOC] fix doc examples for bart-like models ( #15093 )
...
* fix doc examples
* remove double colons
2022-01-10 18:13:28 +01:00
AK391
4fbc924d0a
Funnel Transformer spaces badge
2022-01-10 12:06:05 -05:00
Sylvain Gugger
61d18ae035
Happy New Year! ( #15094 )
2022-01-10 12:05:57 -05:00
AK391
222c09a635
ELECTRA Spaces badge
2022-01-10 11:53:23 -05:00
Stas Bekman
31838d3e11
[doc] normalize HF Transformers string ( #15023 )
2022-01-10 08:44:33 -08:00
AK391
84f360e862
FlauBERT spaces badge
2022-01-10 11:41:10 -05:00