transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	b29eb247d3	Split checkpoint from model_name_or_path in examples (#11492 ) * Split checkpoint from model_name_or_path in examples * Address review comments * Address review comments	2021-04-29 18:33:47 -04:00
Michael Benayoun	d6ec54ba36	solved coefficient issue for the TF version of gelu_fast (#11514 ) Co-authored-by: Michael Benayoun <michael@huggingface.co>	2021-04-29 21:47:26 +02:00
Sylvain Gugger	ad1f7bef13	Reformat to make code clearer in tokenizer call (#11497 ) * Reformat to make code clearer * Reformat to make code clearer	2021-04-29 07:51:09 -04:00
Patrick von Platen	f748bd4242	[Flax] Add docstrings & model outputs (#11498 ) * add attentions & hidden states * add model outputs + docs * finish docs * finish tests * finish impl * del @ * finish * finish * correct test * apply sylvains suggestions * Update src/transformers/models/bert/modeling_flax_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * simplify more Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-29 12:04:51 +02:00
Hamel Husain	3f6add8bab	fix #1149 (#11493 )	2021-04-28 11:16:41 -04:00
Hamel Husain	c0eb218a55	Update `PreTrainedTokenizerBase` to check/handle batch length for `text_pair` parameter (#11486 ) * Update tokenization_utils_base.py * add assertion * check batch len * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add error message Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-28 10:11:17 -04:00
Sylvain Gugger	2d27900b5d	Update min versions in README and add Flax (#11472 ) * Update min versions in README and add Flax * Adapt index	2021-04-28 09:10:06 -04:00
Suraj Patil	8d43c71a1c	fix docs for decoder_input_ids (#11466 ) * fix docs for decoder_input_ids * revert the changes for bart and mbart	2021-04-27 19:36:36 +05:30
Hamel Husain	7ceff67e1a	Finish Making Quick Tour respect the model object (#11467 ) * finish quicktour * fix import * fix print * explain config default better * Update docs/source/quicktour.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-27 10:04:12 -04:00
Hamel Husain	88ac60f7b5	update QuickTour docs to reflect model output object (#11462 ) * update docs to reflect model output object * run make style	2021-04-26 22:18:37 -04:00
Ashwin Geet D'Sa	741d48f5c7	Remove max length beam scorer (#11378 ) * removed max_len * removed max_length from BeamSearchScorer * correct max length * finish * del vim * finish & add test Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-27 00:28:40 +02:00
Stas Bekman	bc2571e61c	[Deepspeed] ZeRO-Infinity integration plus config revamp (#11418 ) * adding Z-inf * revamp config process * up version requirement * wip * massive rewrite * cleanup * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * consistent json commas * act on suggestions * leave this feature for 0.3.16 * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-26 10:40:32 -07:00
Jaimeen Ahn	0661abc545	Variable Correction for Consistency in Distillation Example (#11444 ) As the error comes from the inconsistency of variable meaning number of gpus in parser and its actual usage in the train.py script, 'gpus' and 'n_gpu' respectively, the correction makes the example work	2021-04-26 13:30:48 -04:00
Bhadresh Savani	1d30ec95c7	[Examples] Fixes inconsistency around eval vs val and predict vs test (#11380 ) * added changes for uniformity * modified files * corrected typo * fixed qa scripts * fix typos * fixed predict typo in qa no trainer * fixed test file * reverted trainer changes * reverted trainer changes in custom examples * updated readme * added changes in deepspeed test * added changes for predict and eval	2021-04-26 09:24:31 -07:00
Sylvain Gugger	7959d83599	Give each test a different repo name (#11453 )	2021-04-26 11:52:23 -04:00
Sylvain Gugger	b03b2a653d	Style	2021-04-26 11:45:04 -04:00
Stas Bekman	ce11318e7e	make sure to test against the local checkout (#11437 )	2021-04-26 08:42:43 -07:00
Stas Bekman	a753cafdc0	[docs] fix invalid class name (#11438 ) * fix invalid class name * proper ref * proper ref	2021-04-26 08:37:32 -07:00
Kostas Stathoulopoulos	6715e3b6a1	Clarify description of the is_split_into_words argument (#11449 ) * Improve documentation for is_split_into_words argument * Change description wording	2021-04-26 11:29:36 -04:00
Sylvain Gugger	ab2cabb964	Pass along seed to DistributedSampler (#11406 ) * Pass along seed to DistributedSampler * Add seed to DistributedLengthGroupedSampler	2021-04-26 10:26:52 -04:00
LSinev	b24ead87e1	fix some typos in docs, comments, logging/errors (#11432 )	2021-04-26 09:14:25 -04:00
Amine Abdaoui	e3e70f9551	docs(examples): fix link to TPU launcher script (#11427 )	2021-04-26 09:08:43 -04:00
Sylvain Gugger	d7633a4e46	Add basic support for FP16 in SageMaker model parallelism (#11407 ) * Add FP16 support for SageMaker MP * Add print debugs * Squeeze * Remove debug statements * Add defensive check * Typo	2021-04-26 08:55:14 -04:00
Daniel Stancl	38a716cd41	TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699 ) * Add cross_attn_head_mask to BART * Fix cross_attentions in TFBart-like models * This commit enables returning of `cross_attentions` for TFBart-like models * It also fixes attention head masking in cross-attention module * Update TF model templates * Fix missing , in TF model templates * Fix typo: congig -> config	2021-04-26 14:16:21 +02:00
Sylvain Gugger	4bd6b54fa4	Pin black to 21.4b0	2021-04-26 08:12:54 -04:00
Sylvain Gugger	c1625b3261	With style	2021-04-26 08:07:29 -04:00
Sylvain Gugger	4b72cfd958	Pin black to 20.8.b1	2021-04-26 08:06:50 -04:00
Patrick von Platen	32dbb2d954	make style (#11442 )	2021-04-26 13:50:34 +02:00
Vasudev Gupta	04ab2ca639	add pooling layer support (#11439 )	2021-04-26 09:05:53 +02:00
abiolaTresor	30f065890e	updating the checkpoint for GPT2ForSequence Classification to one with classification head (#11434 )	2021-04-26 10:28:51 +05:30
cronoik	35cd8eed88	EncoderDecoderConfigs should not create new objects (#11300 ) * removes the creation of separate config objects and uses the existing ones instead+overwrite resize_token_embeddings from parent class because it is not working for the EncoderDecoderModel * rollback to current version of the huggingface master branch * reworked version that ties the encoder and decoder config of the parent encoderdecoder instance * overwrite of resize_token_embeddings throws an error now * review comment suggestion Co-authored-by: Suraj Patil <surajp815@gmail.com> * implemented warning in case encoderdecoder is created with differing configs of encoderdecoderconfig and decoderconfig or encoderconfig * added test to avoid diverging configs of wrapper class and wrapped classes * Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py * make style Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-25 11:45:46 +02:00
Daniel Stancl	f45cb66bf6	Add head_mask, decoder_head_mask, cross_head_mask to ProphetNet (#9964 ) * Add head_mask & decoder_head_mask + some corrections * Fix head masking for N-grams * Enable test_headmasking for encoder and decod * Fix one typo in modeling_prophetnet.py * Enable test_headmasking for ProphetNetStandaloneDecoderModelTest and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py * make style * Fix cross_head_mask * Fix attention head mask naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Still need to merge #10605 to master to pass the tests	2021-04-25 11:06:16 +02:00
Sylvain Gugger	52166f672e	Style	2021-04-23 20:40:17 -04:00
cronoik	9cac4fab07	documentation linked to the parent class PreTrainedTokenizerFast but it should be the slow tokenizer (#11410 )	2021-04-23 20:19:15 -04:00
Sylvain Gugger	b7fc043fce	Merge branch 'master' of github.com:huggingface/transformers	2021-04-23 18:47:55 -04:00
Sylvain Gugger	81a6c7cd39	Use 3 workers for torch tests	2021-04-23 18:47:46 -04:00
Philip May	195bfd118a	Enable option for subword regularization in `XLMRobertaTokenizer` (#11149 ) * enable subword regularization. * fix tokenizer storage * fix docstring formatting * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Stefan Schweter <stefan@schweter.it> * fix docstring formatting * add test for subword regularization tokenizer * improve comments of test * add sp_model_kwargs * reformat docstring to match the style * add some more documentation * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve docstring * empty commit to trigger CI * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix docstring formatting for sphinx Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-23 17:52:31 -04:00
Sylvain Gugger	1ef152eb48	Default to accuracy metric (#11405 )	2021-04-23 14:49:59 -04:00
Daniel Stancl	e3ff165aa5	Fix cross-attention head mask for Torch encoder-decoder models (#10605 ) * Fix cross-attention head mask for Torch BART models * Fix head masking for cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus * Enable test_headmasking for M2M_100 model * Fix cross_head_mask for FSMT, LED and T5 * This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5 * It also contains some smaller changes in doc so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask` * Update template * Fix template for BartForCausalLM * Fix cross_head_mask for Speech2Text models * Fix cross_head_mask in templates * Fix args order in BartForCausalLM template * Fix doc in BART templates * Make more explicit naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Fix doc * make style quality * Fix speech2text docstring	2021-04-23 18:58:06 +02:00
Sylvain Gugger	ca6b80cadb	Wrong branch Sylvain...	2021-04-23 12:46:54 -04:00
Sylvain Gugger	3951fc55ee	Try to trigger failure more	2021-04-23 12:44:54 -04:00
Sylvain Gugger	bd41a0f74d	Style	2021-04-23 12:32:37 -04:00
Nicola De Cao	1811883e80	Fixing bug in generation (#11297 ) When passing `inputs_embeds` with `input_ids=None`, the generation function fails because `input_ids` is created by the function when it should not be.	2021-04-23 18:24:26 +02:00
Kiran R	5c00918681	added support for exporting of t5 to onnx with past_key_values (#10651 )	2021-04-23 18:14:20 +02:00
Patrick von Platen	50f4539b82	push (#11400 )	2021-04-23 15:36:27 +02:00
Sylvain Gugger	bf2e0cf70b	Trainer push to hub (#11328 ) * Initial support for upload to hub * push -> upload * Fixes + examples * Fix torchhub test * Torchhub test I hate you * push_model_to_hub -> push_to_hub * Apply mixin to other pretrained models * Remove ABC inheritance * Add tests * Typo * Run tests * Install git-lfs * Change approach * Add push_to_hub to all * Staging test suite * Typo * Maybe like this? * More deps * Cache * Adapt name * Quality * MOAR tests * Put it in testing_utils * Docs + torchhub last hope * Styling * Wrong method * Typos * Update src/transformers/file_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address review comments * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-23 09:17:37 -04:00
Teven	7bc86bea68	Fixed trainer total_flos reloading in distributed mode (#11383 ) * Fixed trainer total_flos reloading in distributed mode * logging flos at the end of training	2021-04-23 07:53:33 -04:00
Patrick von Platen	74e84f1fa6	make blenderbot test slow (#11395 )	2021-04-23 07:49:09 -04:00
Yoshitomo Matsubara	c3d6f33918	fixed typos (#11391 )	2021-04-23 07:48:42 -04:00
Max Del	a90d3f1862	Fix typo in text (#11396 )	2021-04-23 07:37:19 -04:00


7087 Commits