Suraj Patil
f6e74a63ca
Add m2m100 ( #10236 )
...
* m2m_100
* no layernorm_embedding
* sinusoidal positional embeddings
* update pos embeddings
* add default config values
* tokenizer
* add conversion script
* fix config
* fix pos embed
* remove _float_tensor
* update tokenizer
* update lang codes
* handle lang codes
* fix pos embeds
* fix spm key
* put embedding weights on device
* remove qa and seq classification heads
* fix convert script
* lang codes on one line
* fix embeds
* fix tokenizer
* fix tokenizer
* add fast tokenizer
* style
* M2M100MT => M2M100
* fix copyright, style
* tokenizer converter
* vocab file
* remove fast tokenizer
* fix embeds
* fix tokenizer
* fix tests
* add tokenizer tests
* add integration test
* quality
* fix model name
* fix test
* doc
* doc
* fix doc
* add copied from statements
* fix tokenizer tests
* apply review suggestions
* fix urls
* fix shift_tokens_right
* apply review suggestions
* fix
* fix doc
* add lang code to id
* remove unused function
* update checkpoint names
* fix copy
* fix tokenizer
* fix checkpoint names
* fix merge issue
* style
2021-03-06 22:14:16 +05:30
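A minimal sketch of the translation API this PR introduces, using the facebook/m2m100_418M checkpoint targeted by the conversion script (per the checkpoint-name updates above):

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"  # source language code, handled by the new tokenizer
inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")

# force the decoder to start with the target-language token
generated = model.generate(
    **inputs, forced_bos_token_id=tokenizer.get_lang_id("fr")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```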
Stas Bekman
88a951e3cc
offline mode for firewalled envs ( #10407 )
...
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more strict check
* cleaner test
* pt-only test
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
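A sketch of the documented usage: with the new environment variable set, from_pretrained() resolves everything from the local cache and never attempts a network call.

```python
import os

# must be set before transformers is imported
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModel

# succeeds if bert-base-uncased is already in the local cache,
# raises instead of hanging on a firewalled connection otherwise
model = AutoModel.from_pretrained("bert-base-uncased")
```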
Lysandre Debut
6b58e15507
Fix torch 1.8.0 segmentation fault ( #10546 )
...
* Only run one test
* Patch segfault
* Fix summarization pipeline
* Ready for merge
2021-03-05 12:10:19 -05:00
Nicolas Patry
54e55b52d4
Fixing conversation test for torch 1.8 ( #10545 )
2021-03-05 09:24:14 -05:00
Patrick von Platen
c503a1c15e
[ProphetNet] Bart-like Refactor ( #10501 )
...
* first step to refactor
* make all fast tests pass
* make all slow tests pass
* save intermediate
* correct cache
* finish PR
* make fp16 work
2021-03-04 23:27:12 +03:00
Sylvain Gugger
6290169eb3
Rework TPU checkpointing in Trainer ( #10504 )
...
* Rework TPU checkpointing in Trainer
* Wraps the barrier in a dist test
* Address review comments
* Remove line
2021-03-04 11:46:11 -05:00
Mehrad Moradshahi
1750e62900
Generate can return cross-attention weights too ( #10493 )
2021-03-03 13:57:02 +05:30
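A hedged sketch of the new capability: when generate() is asked for a structured output with attentions, encoder-decoder models now also return cross-attention weights.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
out = model.generate(
    **inputs,
    output_attentions=True,
    return_dict_in_generate=True,
)
# one entry per generated token, each a per-layer tuple of attention tensors
print(len(out.cross_attentions))
```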
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
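A sketch of the fine-tuning entry point this PR enables; `speech` (a 16 kHz float array) and `text` (its transcription) are assumed inputs, the gradient_checkpointing flag mirrors the commit list above, and the as_target_processor usage follows the processor design introduced in #10324.

```python
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-base-960h", gradient_checkpointing=True
)

inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
with processor.as_target_processor():  # tokenize the transcription as labels
    labels = processor(text, return_tensors="pt").input_ids

loss = model(inputs.input_values, labels=labels).loss  # CTC loss
loss.backward()
```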
Tanmay Garg
256482ac92
Introduce save_strategy training argument ( #10286 )
...
* Introduce save_strategy training argument
* deprecate EvaluationStrategy
* collapse EvaluationStrategy and LoggingStrategy into a single
IntervalStrategy enum
* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
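The net effect, sketched: checkpointing now has its own strategy, and evaluation/logging/saving all share one IntervalStrategy enum with values "no", "steps", and "epoch".

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="steps",  # string values map onto IntervalStrategy
    logging_strategy="steps",
    save_strategy="epoch",        # new: save a checkpoint at each epoch end
)
```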
Kai Fricke
98569d4ba2
Add Ray Tune hyperparameter search integration test ( #10414 )
2021-02-26 10:18:33 -05:00
Julien Chaumond
83d2d55c94
[ci, flax] non-existing models are unlikely to pass tests ( #10409 )
...
😂
2021-02-26 12:35:36 +03:00
Sylvain Gugger
26f8b2cb10
Make Barthez tokenizer tests a bit faster ( #10399 )
...
* Make Barthez tokenizer tests a bit faster
* Quality
2021-02-25 11:42:25 -05:00
Sehoon Kim
63645b3b11
I-BERT model support ( #10153 )
...
* IBertConfig, IBertTokenizer added
* IBert model names modified
* tokenizer bugfix
* embedding -> QuantEmbedding
* quant utils added
* quant_mode added to configuration
* QuantAct added, Embedding layer + QuantAct addition
* QuantAct added
* unused path removed, QKV quantized
* self attention layer all quantized, except softmax
* temporary commit
* all linear layers quantized
* quant_utils bugfix
* bugfix: requantization missing
* IntGELU added
* IntSoftmax added
* LayerNorm implemented
* LayerNorm implemented all
* names changed: roberta->ibert
* config does not inherit from RoBERTa
* No support for CausalLM
* static quantization added, quantize_model.py removed
* import modules uncommented
* copyrights fixed
* minor bugfix
* quant_modules, quant_utils merged as one file
* import * fixed
* unused runfile removed
* make style run
* configuration.py docstring fixed
* refactoring: comments removed, function name fixed
* unused dependency removed
* typo fixed
* comments (Copied from), assertion string added
* refactoring: super(..) -> super(), etc.
* refactoring
* refactoring
* make style
* refactoring
* cuda -> to(x.device)
* weight initialization removed
* QuantLinear set_param removed
* QuantEmbedding set_param removed
* IntLayerNorm set_param removed
* assert string added
* assertion error message fixed
* is_decoder removed
* enc-dec arguments/functions removed
* Converter removed
* quant_modules docstring fixed
* convert_slow_tokenizer rolled back
* quant_utils docstring fixed
* unused arguments, e.g. use_cache, removed from config
* weight initialization condition fixed
* x_min, x_max initialized with small values to avoid div-zero exceptions
* testing code for ibert
* test emb, linear, gelu, softmax added
* test ln and act added
* style reformatted
* force_dequant added
* error tests overridden
* make style
* Style + Docs
* force dequant tests added
* Fix fast tokenizer in init
* Fix doc
* Remove space
* docstring, IBertConfig, chunk_size
* test_modeling_ibert refactoring
* quant_modules.py refactoring
* e2e integration test added
* tokenizers removed
* IBertConfig added to tokenizer_auto.py
* bugfix
* fix docs & test
* fix style num 2
* final fixes
Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
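A minimal sketch of the two modes the commits describe; kssteven/ibert-roberta-base is the checkpoint referenced in the I-BERT documentation.

```python
from transformers import IBertForSequenceClassification

# quant_mode=False: behaves like full-precision RoBERTa (use for fine-tuning);
# quant_mode=True: integer-only kernels (QuantLinear, IntGELU, IntSoftmax,
# IntLayerNorm) as listed in the commits above
model = IBertForSequenceClassification.from_pretrained(
    "kssteven/ibert-roberta-base", quant_mode=True
)
```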
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer ( #10324 )
...
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply Suraj's recommendations
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
2021-02-25 17:42:46 +03:00
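A sketch of the resulting API: the processor wraps the new feature extractor (raw-audio padding and normalization) and the tokenizer (decoding) behind one object; `waveforms` is an assumed list of 16 kHz float arrays.

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# batched inference with padding, then greedy CTC decoding
inputs = processor(waveforms, sampling_rate=16000, padding=True, return_tensors="pt")
logits = model(inputs.input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```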
abhishek thakur
2d458b2c7d
ConvBERT fix torch <> tf weights conversion ( #10314 )
...
* convbert conversion test
* fin
* fin
* fin
* clean up tf<->pt conversion
* remove from_pt
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-02-24 14:55:34 +03:00
Sylvain Gugger
9e147d31f6
Deprecate prepare_seq2seq_batch ( #10287 )
...
* Deprecate prepare_seq2seq_batch
* Fix last tests
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* More review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-22 12:36:16 -05:00
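The migration path, sketched for an MBart checkpoint: tokenize sources with a regular call and targets inside the as_target_tokenizer context manager that replaces the deprecated method.

```python
from transformers import MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-en-ro", src_lang="en_XX", tgt_lang="ro_RO"
)
src_texts = ["UN Chief Says There Is No Military Solution in Syria"]
tgt_texts = ["Şeful ONU declară că nu există o soluţie militară în Siria"]

# previously: tokenizer.prepare_seq2seq_batch(src_texts, tgt_texts=tgt_texts, ...)
inputs = tokenizer(src_texts, padding=True, return_tensors="pt")
with tokenizer.as_target_tokenizer():
    inputs["labels"] = tokenizer(tgt_texts, padding=True, return_tensors="pt").input_ids
```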
Julien Plu
19e737b93e
Making TF Longformer-like models compliant with AMP ( #10233 )
...
* AMP
* Add LED
* Apply style
* Fix longformer
2021-02-22 15:41:56 +01:00
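What "AMP compliant" means in practice, sketched with Keras mixed precision (on TF < 2.4 the policy lives under tf.keras.mixed_precision.experimental):

```python
import tensorflow as tf
from transformers import TFLongformerModel

# activations run in float16, variables stay float32
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = TFLongformerModel.from_pretrained("allenai/longformer-base-4096")
outputs = model(tf.constant([[0, 1, 2, 3]]))
```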
Pengcheng He
9a7e63729f
Integrate DeBERTa v2 (the 1.5B model surpassed human performance on SuperGLUE) ( #10018 )
...
* Integrate DeBERTa v2 (the 1.5B model surpassed human performance on SuperGLUE); add DeBERTa v2 900M and 1.5B models
* DeBERTa-v2
* Fix v2 model loading issue (#10129 )
* Doc members
* Update src/transformers/models/deberta/modeling_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Address Sylvain's comments
* Address Patrick's comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
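The new checkpoints, as published on the Hub: microsoft/deberta-v2-xlarge (900M) and microsoft/deberta-v2-xxlarge (1.5B, the SuperGLUE result).

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xxlarge")
model = AutoModel.from_pretrained("microsoft/deberta-v2-xxlarge")
```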
Julien Plu
34df26ec3a
Making TF OpenAI GPT model compliant with AMP and XLA ( #10261 )
...
* Fix AMP and XLA
* Remove useless var
2021-02-19 09:33:25 -05:00
Julien Plu
3e116ed331
Making TF TransfoXL model compliant with AMP ( #10264 )
...
* Fix AMP
* Apply style
* Remove unused import
2021-02-19 06:58:07 -05:00
Julien Plu
86caeb7636
Fix XLA and AMP ( #10262 )
2021-02-19 06:57:16 -05:00
Julien Plu
3d72d47f09
Making TF MPNet model compliant with XLA ( #10260 )
...
* Fix XLA
* Rework cast
* Apply style
2021-02-19 06:56:41 -05:00
Julien Plu
fb56bf2584
Making TF MobileBert model compliant with AMP ( #10259 )
...
* Fix AMP
* Trigger CI
* Rework cast
2021-02-19 06:55:25 -05:00
Julien Plu
2fc6284f04
Making TF Lxmert model compliant with AMP ( #10257 )
...
* Fix AMP
* Rework cast
* Apply style
2021-02-19 06:54:14 -05:00
Stas Bekman
4eddc459a9
[trainer] implement support for full fp16 in evaluation/predict ( #10268 )
...
* implement --fp16_full_eval
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* add test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 17:02:35 -08:00
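A sketch of the new flag: unlike --fp16 (mixed-precision training), it casts the model weights themselves to fp16 for evaluate()/predict(), roughly halving eval memory at a possible small metric cost.

```python
from transformers import TrainingArguments

# fp16_full_eval affects only evaluation/prediction, not the training loop
args = TrainingArguments(output_dir="out", fp16_full_eval=True)
```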
Stas Bekman
d9a81fc0c5
fix func signature ( #10271 )
2021-02-18 16:44:42 -08:00
Stas Bekman
97e688bc22
[Trainer] memory tracker metrics ( #10225 )
...
* memory tracker metrics
* go back to eval for somewhat consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rename method
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 09:27:32 -08:00
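Illustrative usage, assuming `trainer` is an already-configured transformers.Trainer; per the commits, the new keys consistently use the eval_ prefix (key names here should be treated as approximate).

```python
metrics = trainer.evaluate()
print(metrics.get("eval_mem_cpu_alloc_delta"))  # CPU memory delta during eval
print(metrics.get("eval_mem_gpu_alloc_delta"))  # present only when a GPU is used
```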
Julien Plu
2acae50a0c
Reduce the time spent for the TF slow tests ( #10152 )
...
* rework savedmodel slow test
* Improve savedmodel tests
* Remove useless content
2021-02-18 15:52:57 +01:00
Julien Plu
14ed3b978e
Fix AMP ( #10216 )
2021-02-18 06:29:43 -05:00
Julien Plu
bdf1669e3f
Making TF GPT2 compliant with XLA and AMP ( #10230 )
...
* Fix XLA and AMP
* Fix AMP and XLA
* Apply style
* Apply Patrick's comment
2021-02-18 09:36:01 +01:00
Julien Plu
7246785a67
Make TF CTRL compliant with XLA and AMP ( #10209 )
...
* Fix XLA and AMP
* Apply style
* Remove useless cast
2021-02-17 18:54:15 +01:00
Julien Plu
fdb2351ebb
Making TF XLM-like models XLA and AMP compliant ( #10211 )
...
* Fix Flaubert and XLM
* Remove useless cast
* Tiny fix
* Tiny fix
2021-02-17 18:02:48 +01:00
Julien Plu
83d803ba02
Making TF BART-like models XLA and AMP compliant ( #10191 )
...
* Update BART
* Update Blenderbot
* Update BlenderbotSmall
* Update Marian
* Update MBart
* Update MBart
* Update Pegasus
* Update template
* Fix Marian and Pegasus
* Apply style
* Default initializer
* Default initializer
* Default initializer
* Remove int32 casts
* Fix template
* Remove more cast
2021-02-17 17:48:56 +01:00
Daniel Stancl
8d79e5ca49
Fix head masking for TFT5 ( #9877 )
...
* Fix head_mask and decoder_head_mask in TFT5 models
* Enable test_headmasking for both the TFT5 tester and the TFT5EncoderOnly tester
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-02-17 19:00:09 +03:00
Sylvain Gugger
7169d1ea7b
Store FLOS as floats to avoid overflow. ( #10213 )
2021-02-16 11:15:15 -05:00
Julien Plu
5c2d66a2f5
Unlock XLA test for convbert ( #10207 )
2021-02-16 07:59:41 -05:00
Lysandre Debut
8cbd0bd137
Specify dataset dtype ( #10195 )
...
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2021-02-15 12:57:17 -05:00
Julien Plu
31b0560ab4
Add AMP for Albert ( #10141 )
2021-02-15 17:18:33 +01:00
Suraj Patil
6fc940ed09
Add mBART-50 ( #10154 )
...
* add tokenizer for mBART-50
* update tokenizers
* make src_lang and tgt_lang optional
* update tokenizer test
* add setter
* update docs
* update conversion script
* update docs
* update conversion script
* update tokenizer
* update test
* update docs
* doc
* address Sylvain's suggestions
* fix test
* fix formatting
* nits
2021-02-15 20:58:54 +05:30
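A minimal sketch of the resulting API, using the many-to-many checkpoint; src_lang/tgt_lang are the now-optional tokenizer arguments mentioned above.

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt", src_lang="en_XX"
)
model = MBartForConditionalGeneration.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt"
)

inputs = tokenizer("The head of the UN says there is no military solution.", return_tensors="pt")
# pick the target language by forcing its code as the first decoder token
generated = model.generate(
    **inputs, forced_bos_token_id=tokenizer.lang_code_to_id["ro_RO"]
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```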
Julien Plu
c8d3fa0dfd
Check TF ops for ONNX compliance ( #10025 )
...
* Add check-ops script
* Finish to implement check_tf_ops and start the test
* Make the test mandatory only for BERT
* Update tf_ops folder
* Remove useless classes
* Add the ONNX test for GPT2 and BART
* Add a onnxruntime slow test + better opset flexibility
* Fix test + apply style
* fix tests
* Switch min opset from 12 to 10
* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Fix GPT2
* Remove extra shape_list usage
* Fix GPT2
* Address Morgan's comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-15 07:55:10 -05:00
Nicolas Patry
900daec24e
Fixing NER pipeline for list inputs. ( #10184 )
...
Fixes #10168
2021-02-15 06:22:45 -05:00
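The fixed behaviour, sketched: a list of sentences in a single call returns one list of entity dicts per sentence.

```python
from transformers import pipeline

ner = pipeline("ner")
results = ner([
    "My name is Wolfgang and I live in Berlin.",
    "Hugging Face was founded in New York City.",
])
print(len(results))  # 2, one entity list per input
```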
Nicolas Patry
c9837a0d27
Conversion from slow to fast for BPE spm vocabs contained an error. ( #10120 )
...
* Conversion from slow to fast for BPE spm vocabs contained an error.
- There is currently only one test (tokenizers + slow) that used the modified
path, and it's Reformer, which does not contain any ids modification, so the
bug was silent until now.
- The real issue is that the vocab variable was overloaded by
SentencePieceExtractor, causing slow-tokenizer-specific vocab oddities to be
completely ignored.
- The bug was reported here https://github.com/huggingface/transformers/issues/9518
- Ran the complete tokenization test suite with slow without error
(`RUN_SLOW=1 pytest -sv tests/test_tokenization_*`)
* Remove rebase error.
* Adding the fixture.
2021-02-13 08:24:53 -05:00
Julien Chaumond
641f418e10
[hf_api] delete deprecated methods and tests (2)
2021-02-12 21:46:17 +01:00
Julien Chaumond
eed31db948
[hf_api] delete deprecated methods and tests ( #10159 )
...
* [hf_api] delete deprecated methods and tests
cc @lhoestq
* Update test_hf_api.py
2021-02-12 15:35:06 -05:00
Patrick von Platen
495c157d6f
[Wav2Vec2] Improve Tokenizer & Model for batched inference ( #10117 )
...
* save intermediate
* finish batch the same as fairseq
* add normalization
* fix batched input
* add better comment
* Update src/transformers/models/wav2vec2/modeling_wav2vec2.py
* add nice docstring
* add tokenizer tests
* make all slow tests pass
* finish PR
* correct import
2021-02-11 15:40:54 +03:00
Suraj Patil
c130e67dce
remove adjust_logits_during_generation method ( #10087 )
...
* add forced logits processors
* delete adjust_logits method
* add forced_eos_token_id argument in config
* add tests for forced logits processors
* update gen utils tests
* add forced option to tf generate
* remove adjust_logits method from tf models
* update adjust_logits for marian
* delete _force_token_id_to_be_generated method
* style
* import warnings
* pass max_length to _get_logits_processor
* set forced_eos_token_id to None
* set forced attributes in conf utils
* typo
* fix rag generate
* add forced_eos_token_id in rag config
* remove force_bos_token_to_be_generated from BartConfig
* remove _force_token_ids_generation from FSMT
* nit
* fix negative constant
* apply suggestions from code review
2021-02-10 22:39:09 +05:30
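The replacement mechanism, sketched: the per-model adjust_logits_during_generation hooks become explicit logits processors (ForcedBOSTokenLogitsProcessor / ForcedEOSTokenLogitsProcessor) driven by config or generate() arguments.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

inputs = tokenizer("UN Chief Says There Is No Military Solution", return_tensors="pt")
generated = model.generate(
    **inputs,
    max_length=20,
    # forces EOS as the last token when max_length is reached
    forced_eos_token_id=model.config.eos_token_id,
)
```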
Julien Plu
22a32cf485
Fix TF LED/Longformer attentions computation ( #10007 )
...
* Fix test
* Remove commented test
* Fix name
* Apply style
* Fix check copies
* Remove prints
* Restore boolean
* Fix reshape
2021-02-10 10:58:37 -05:00
Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF ( #10119 )
2021-02-10 10:50:00 -05:00
abhishek thakur
480a9d6ba0
Fix TFConvBertModelIntegrationTest::test_inference_masked_lm Test ( #10104 )
2021-02-09 20:22:54 +01:00
Daniel Stancl
e7381c4596
Add head_mask and decoder_head_mask to TF LED ( #9988 )
...
* Add head masking to TF LED
* Add head_mask to Longformer + one doc piece to LED
* Fix integration tests
2021-02-09 11:45:18 -05:00