transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Suraj Patil	f6e74a63ca	Add m2m100 (#10236 ) * m2m_100 * no layernorm_embedding * sinusoidal positional embeddings * update pos embeddings * add default config values * tokenizer * add conversion script * fix config * fix pos embed * remove _float_tensor * update tokenizer * update lang codes * handle lang codes * fix pos embeds * fix spm key * put embedding weights on device * remove qa and seq classification heads * fix convert script * lang codes pn one line * fix embeds * fix tokenizer * fix tokenizer * add fast tokenizer * style * M2M100MT => M2M100 * fix copyright, style * tokenizer converter * vocab file * remove fast tokenizer * fix embeds * fix tokenizer * fix tests * add tokenizer tests * add integration test * quality * fix model name * fix test * doc * doc * fix doc * add copied from statements * fix tokenizer tests * apply review suggestions * fix urls * fix shift_tokens_right * apply review suggestions * fix * fix doc * add lang code to id * remove unused function * update checkpoint names * fix copy * fix tokenizer * fix checkpoint names * fix merge issue * style	2021-03-06 22:14:16 +05:30
Stas Bekman	88a951e3cc	offline mode for firewalled envs (#10407 ) * offline mode start * add specific values * fix fallback * add test * better values check and range * test that actually works * document the offline mode * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more strict check * cleaner test * pt-only test * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-05 17:27:48 -08:00
Lysandre Debut	6b58e15507	Fix torch 1.8.0 segmentation fault (#10546 ) * Only run one test * Patch segfault * Fix summarization pipeline * Ready for merge	2021-03-05 12:10:19 -05:00
Nicolas Patry	54e55b52d4	Fixing conversation test for torch 1.8 (#10545 )	2021-03-05 09:24:14 -05:00
Patrick von Platen	c503a1c15e	[ProphetNet] Bart-like Refactor (#10501 ) * first step to refactor * make all fast tests pass * make all slow tests pass * save intermediate * correct cache * finish PR * make fp16 work	2021-03-04 23:27:12 +03:00
Sylvain Gugger	6290169eb3	Rework TPU checkpointing in Trainer (#10504 ) * Rework TPU checkpointing in Trainer * Wraps the barrier in a dist test * Address review comments * Remove line	2021-03-04 11:46:11 -05:00
Mehrad Moradshahi	1750e62900	Generate can return cross-attention weights too (#10493 )	2021-03-03 13:57:02 +05:30
Patrick von Platen	0234de8418	Add Fine-Tuning for Wav2Vec2 (#10145 ) * add encode labels function to tokenizer * start adding finetuning * init dropout * upload * correct convert script * apply changes * fix second typo * make first dummy training run * adapt convert script * push confg for comparison * remove conf * finish training * adapt data collator * add research folder * update according to fairseq feedback * some minor corrections * refactor masking indices a bit * some minor changes * clean tokenizer * finish clean-up * remove previous logic * update run script * correct training * finish changes * finish model * correct bug * fix training a bit more * add some tests * finish gradient checkpointing * finish example * correct gradient checkpointing * improve tokenization method * revert changes in tokenizer * revert general change * adapt fine-tuning * update * save intermediate test * Update README.md * finish finetuning * delete conversion script * Update src/transformers/models/wav2vec2/configuration_wav2vec2.py * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * finish wav2vec2 script * finish wav2vec2 fine-tuning * finalize test * correct test * adapt tests * finish * remove test file Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-03-01 12:13:17 +03:00
Tanmay Garg	256482ac92	Introduce save_strategy training argument (#10286 ) * Introduce save_strategy training argument * deprecate EvaluationStrategy * collapse EvaluationStrategy and LoggingStrategy into a single IntervalStrategy enum * modify tests to use modified enum	2021-02-27 19:34:22 -05:00
Kai Fricke	98569d4ba2	Add Ray Tune hyperparameter search integration test (#10414 )	2021-02-26 10:18:33 -05:00
Julien Chaumond	83d2d55c94	[ci, flax] non-existing models are unlikely to pass tests (#10409 ) 😂	2021-02-26 12:35:36 +03:00
Sylvain Gugger	26f8b2cb10	Make Barthez tokenizer tests a bit faster (#10399 ) * Make Barthez tokenizer tests a bit faster * Quality	2021-02-25 11:42:25 -05:00
Sehoon Kim	63645b3b11	I-BERT model support (#10153 ) * IBertConfig, IBertTokentizer added * IBert Model names moified * tokenizer bugfix * embedding -> QuantEmbedding * quant utils added * quant_mode added to configuration * QuantAct added, Embedding layer + QuantAct addition * QuantAct added * unused path removed, QKV quantized * self attention layer all quantized, except softmax * temporarl commit * all liner layers quantized * quant_utils bugfix * bugfix: requantization missing * IntGELU added * IntSoftmax added * LayerNorm implemented * LayerNorm implemented all * names changed: roberta->ibert * config not inherit from ROberta * No support for CausalLM * static quantization added, quantize_model.py removed * import modules uncommented * copyrights fixed * minor bugfix * quant_modules, quant_utils merged as one file * import * fixed * unused runfile removed * make style run * configutration.py docstring fixed * refactoring: comments removed, function name fixed * unused dependency removed * typo fixed * comments(Copied from), assertion string added * refactoring: super(..) -> super(), etc. * refactoring * refarctoring * make style * refactoring * cuda -> to(x.device) * weight initialization removed * QuantLinear set_param removed * QuantEmbedding set_param removed * IntLayerNorm set_param removed * assert string added * assertion error message fixed * is_decoder removed * enc-dec arguments/functions removed * Converter removed * quant_modules docstring fixed * conver_slow_tokenizer rolled back * quant_utils docstring fixed * unused aruments e.g. use_cache removed from config * weight initialization condition fixed * x_min, x_max initialized with small values to avoid div-zero exceptions * testing code for ibert * test emb, linear, gelu, softmax added * test ln and act added * style reformatted * force_dequant added * error tests overrided * make style * Style + Docs * force dequant tests added * Fix fast tokenizer in init * Fix doc * Remove space * docstring, IBertConfig, chunk_size * test_modeling_ibert refactoring * quant_modules.py refactoring * e2e integration test added * tokenizers removed * IBertConfig added to tokenizer_auto.py * bugfix * fix docs & test * fix style num 2 * final fixes Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-25 10:06:42 -05:00
Patrick von Platen	cb38ffcc5e	[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324 ) * push to show * small improvement * small improvement * Update src/transformers/feature_extraction_utils.py * Update src/transformers/feature_extraction_utils.py * implement base * add common tests * make all tests pass for wav2vec2 * make padding work & add more tests * finalize feature extractor utils * add call method to feature extraction * finalize feature processor * finish tokenizer * finish general processor design * finish tests * typo * remove bogus file * finish docstring * add docs * finish docs * small fix * correct docs * save intermediate * load changes * apply changes * apply changes to doc * change tests * apply surajs recommend * final changes * Apply suggestions from code review * fix typo * fix import * correct docstring	2021-02-25 17:42:46 +03:00
abhishek thakur	2d458b2c7d	ConvBERT fix torch <> tf weights conversion (#10314 ) * convbert conversion test * fin * fin * fin * clean up tf<->pt conversion * remove from_pt Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2021-02-24 14:55:34 +03:00
Sylvain Gugger	9e147d31f6	Deprecate prepare_seq2seq_batch (#10287 ) * Deprecate prepare_seq2seq_batch * Fix last tests * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com> * More review comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-02-22 12:36:16 -05:00
Julien Plu	19e737b93e	Making TF Longformer-like models compliant with AMP (#10233 ) * AMP * Add LED * Apply style * Fix longformer	2021-02-22 15:41:56 +01:00
Pengcheng He	9a7e63729f	Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018 ) * Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models; * DeBERTa-v2 * Fix v2 model loading issue (#10129) * Doc members * Update src/transformers/models/deberta/modeling_deberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Address Sylvain's comments * Address Patrick's comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Style Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-19 18:34:44 -05:00
Julien Plu	34df26ec3a	Making TF OpenAI GPT model compliant with AMP and XLA (#10261 ) * Fix AMP and XLA * Remove useless var	2021-02-19 09:33:25 -05:00
Julien Plu	3e116ed331	Making TF TransfoXL model compliant with AMP (#10264 ) * Fix AMP * Apply style * Remove unused import	2021-02-19 06:58:07 -05:00
Julien Plu	86caeb7636	Fix XLA and AMP (#10262 )	2021-02-19 06:57:16 -05:00
Julien Plu	3d72d47f09	Making TF MPNet model compliant with XLA (#10260 ) * Fix XLA * Rework cast * Apply style	2021-02-19 06:56:41 -05:00
Julien Plu	fb56bf2584	Making TF MobileBert model compliant with AMP (#10259 ) * Fix AMP * Trigger CI * Rework cast	2021-02-19 06:55:25 -05:00
Julien Plu	2fc6284f04	Making TF Lxmert model compliant with AMP (#10257 ) * Fix AMP * Rework cast * Apply style	2021-02-19 06:54:14 -05:00
Stas Bekman	4eddc459a9	[trainer] implement support for full fp16 in evaluation/predict (#10268 ) * implement --fp16_full_eval * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * add test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-18 17:02:35 -08:00
Stas Bekman	d9a81fc0c5	fix func signature (#10271 )	2021-02-18 16:44:42 -08:00
Stas Bekman	97e688bc22	[Trainer] memory tracker metrics (#10225 ) * memory tracker metrics * go back to eval for somewhat consistency * handle no-gpu case * deal with stackable eval calls * restore callback order * style * simplify the API * add test * docs * consistently use eval_ prefix * improve docs * Update src/transformers/trainer_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * rename method * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-18 09:27:32 -08:00
Julien Plu	2acae50a0c	Reduce the time spent for the TF slow tests (#10152 ) * rework savedmodel slow test * Improve savedmodel tests * Remove useless content	2021-02-18 15:52:57 +01:00
Julien Plu	14ed3b978e	Fix AMP (#10216 )	2021-02-18 06:29:43 -05:00
Julien Plu	bdf1669e3f	Making TF GPT2 compliant with XLA and AMP (#10230 ) * Fix XLA and AMP * Fix AMP and XLA * Apply style * Apply Patrick's comment	2021-02-18 09:36:01 +01:00
Julien Plu	7246785a67	Make TF CTRL compliant with XLA and AMP (#10209 ) * Fix XLA and AMP * Apply style * Remove useless cast	2021-02-17 18:54:15 +01:00
Julien Plu	fdb2351ebb	Making TF XLM-like models XLA and AMP compliant (#10211 ) * Fix Flaubert and XLM * Remove useless cast * Tiny fix * Tiny fix	2021-02-17 18:02:48 +01:00
Julien Plu	83d803ba02	Making TF BART-like models XLA and AMP compliant (#10191 ) * Update BART * Update Blenderbot * Update BlenderbotSmall * Update Marian * Update MBart * Update MBart * Update Pegasus * Update template * Fix Marian and Pegasus * Apply style * Default initializer * Default initializer * Default initializer * Remove int32 casts * Fix template * Remove more cast	2021-02-17 17:48:56 +01:00
Daniel Stancl	8d79e5ca49	Fix head masking for TFT5 (#9877 ) * Fix head_mask and decoder_head_mask in TFT5 models * Enable test_headmasking both fot TFT5 tester and TFT5EncoderOnly tester Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2021-02-17 19:00:09 +03:00
Sylvain Gugger	7169d1ea7b	Store FLOS as floats to avoid overflow. (#10213 )	2021-02-16 11:15:15 -05:00
Julien Plu	5c2d66a2f5	Unlock XLA test for convbert (#10207 )	2021-02-16 07:59:41 -05:00
Lysandre Debut	8cbd0bd137	Specify dataset dtype (#10195 ) Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com> Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	2021-02-15 12:57:17 -05:00
Julien Plu	31b0560ab4	Add AMP for Albert (#10141 )	2021-02-15 17:18:33 +01:00
Suraj Patil	6fc940ed09	Add mBART-50 (#10154 ) * add tokenizer for mBART-50 * update tokenizers * make src_lang and tgt_lang optional * update tokenizer test * add setter * update docs * update conversion script * update docs * update conversion script * update tokenizer * update test * update docs * doc * address Sylvain's suggestions * fix test * fix formatting * nits	2021-02-15 20:58:54 +05:30
Julien Plu	c8d3fa0dfd	Check TF ops for ONNX compliance (#10025 ) * Add check-ops script * Finish to implement check_tf_ops and start the test * Make the test mandatory only for BERT * Update tf_ops folder * Remove useless classes * Add the ONNX test for GPT2 and BART * Add a onnxruntime slow test + better opset flexibility * Fix test + apply style * fix tests * Switch min opset from 12 to 10 * Update src/transformers/file_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Fix GPT2 * Remove extra shape_list usage * Fix GPT2 * Address Morgan's comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-02-15 07:55:10 -05:00
Nicolas Patry	900daec24e	Fixing NER pipeline for list inputs. (#10184 ) Fixes #10168	2021-02-15 06:22:45 -05:00
Nicolas Patry	c9837a0d27	Conversion from slow to fast for BPE spm vocabs contained an error. (#10120 ) * Conversion from slow to fast for BPE spm vocabs contained an error. - There is only 1 test currently (tokenizers + slow) that used the modified path and it's reformer, which does not contain any ids modification so the bug was silent for now. - The real issue is that vocab variable was overloaded by SentencePieceExtractor, leading to Slow specific vocab oddities to be completely ignored - The bug was reported here https://github.com/huggingface/transformers/issues/9518 - Ran the complete tokenization test suite with slow without error (`RUN_SLOW=1 pytest -sv tests/test_tokenization_`) Remove rebase error. * Adding the fixture.	2021-02-13 08:24:53 -05:00
Julien Chaumond	641f418e10	[hf_api] delete deprecated methods and tests (2)	2021-02-12 21:46:17 +01:00
Julien Chaumond	eed31db948	[hf_api] delete deprecated methods and tests (#10159 ) * [hf_api] delete deprecated methods and tests cc @lhoestq * Update test_hf_api.py	2021-02-12 15:35:06 -05:00
Patrick von Platen	495c157d6f	[Wav2Vec2] Improve Tokenizer & Model for batched inference (#10117 ) * save intermediate * finish batch the same as fairseq * add normalization * fix batched input * add better comment * Update src/transformers/models/wav2vec2/modeling_wav2vec2.py * add nice docstring * add tokenizer tests * make all slow tests pass * finish PR * correct import	2021-02-11 15:40:54 +03:00
Suraj Patil	c130e67dce	remove adjust_logits_during_generation method (#10087 ) * add forced logits processors * delete adjust_logits method * add forced_eos_token_id argument in config * add tests for forced logits processors * update gen utils tests * add forced option to tf generate * remove adjust_logits method from tf models * update adjust_logits for marian * delete _force_token_id_to_be_generated method * style * import warnings * pass max_length to _get_logits_processor * set forced_eos_token_id to None * set forced attributes in conf utils * typo * fix rag generate * add forced_eos_token_id in rag config * remove force_bos_token_to_be_generated from BartConfig * remove _force_token_ids_generation from FSMT * nit * fix negative constant * apply suggestions from code review	2021-02-10 22:39:09 +05:30
Julien Plu	22a32cf485	Fix TF LED/Longformer attentions computation (#10007 ) * Fix test * Remove commented test * Fix name * Apply style * Fix check copies * Remove prints * Restore boolean * Fix reshape	2021-02-10 10:58:37 -05:00
Lysandre Debut	0d8e554d42	Line endings should be LF across repo and not CRLF (#10119 )	2021-02-10 10:50:00 -05:00
abhishek thakur	480a9d6ba0	Fix TFConvBertModelIntegrationTest::test_inference_masked_lm Test (#10104 )	2021-02-09 20:22:54 +01:00
Daniel Stancl	e7381c4596	Add head_mask and decoder_head_mask to TF LED (#9988 ) * Add head masking to TF LED * Add head_mask to Longformer + one doc piece to LED * Fix integration tests	2021-02-09 11:45:18 -05:00

875 Commits