transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-24 06:48:58 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	d7633a4e46	Add basic support for FP16 in SageMaker model parallelism (#11407 ) * Add FP16 support for SageMaker MP * Add print debugs * Squeeze * Remove debug statements * Add defensive check * Typo	2021-04-26 08:55:14 -04:00
Daniel Stancl	38a716cd41	TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699 ) * Add cross_attn_head_mask to BART * Fix cross_attentions in TFBart-like models * This commit enables returning of `cross_attentions` for TFBart-like models * It also fixes attention head masking in cross-attention module * Update TF model templates * Fix missing , in TF model templates * Fix typo: congig -> config	2021-04-26 14:16:21 +02:00
Sylvain Gugger	4bd6b54fa4	Pin black to 21.4b0	2021-04-26 08:12:54 -04:00
Sylvain Gugger	c1625b3261	With style	2021-04-26 08:07:29 -04:00
Sylvain Gugger	4b72cfd958	Pin black to 20.8.b1	2021-04-26 08:06:50 -04:00
Patrick von Platen	32dbb2d954	make style (#11442 )	2021-04-26 13:50:34 +02:00
Vasudev Gupta	04ab2ca639	add pooling layer support (#11439 )	2021-04-26 09:05:53 +02:00
abiolaTresor	30f065890e	updating the checkpoint for GPT2ForSequenceClassification to one with classification head (#11434 )	2021-04-26 10:28:51 +05:30
cronoik	35cd8eed88	EncoderDecoderConfigs should not create new objects (#11300 ) * removes the creation of separate config objects and uses the existing ones instead+overwrite resize_token_embeddings from parent class because it is not working for the EncoderDecoderModel * rollback to current version of the huggingface master branch * reworked version that ties the encoder and decoder config of the parent encoderdecoder instance * overwrite of resize_token_embeddings throws an error now * review comment suggestion Co-authored-by: Suraj Patil <surajp815@gmail.com> * implemented warning in case encoderdecoder is created with differing configs of encoderdecoderconfig and decoderconfig or encoderconfig * added test to avoid diverging configs of wrapper class and wrapped classes * Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py * make style Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-25 11:45:46 +02:00
Daniel Stancl	f45cb66bf6	Add head_mask, decoder_head_mask, cross_head_mask to ProphetNet (#9964 ) * Add head_mask & decoder_head_mask + some corrections * Fix head masking for N-grams * Enable test_headmasking for encoder and decoder * Fix one typo in modeling_prophetnet.py * Enable test_headmasking for ProphetNetStandaloneDecoderModelTest and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py * make style * Fix cross_head_mask * Fix attention head mask naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Still need to merge #10605 to master to pass the tests	2021-04-25 11:06:16 +02:00
Sylvain Gugger	52166f672e	Style	2021-04-23 20:40:17 -04:00
cronoik	9cac4fab07	documentation linked to the parent class PreTrainedTokenizerFast but it should be the slow tokenizer (#11410 )	2021-04-23 20:19:15 -04:00
Sylvain Gugger	b7fc043fce	Merge branch 'master' of github.com:huggingface/transformers	2021-04-23 18:47:55 -04:00
Sylvain Gugger	81a6c7cd39	Use 3 workers for torch tests	2021-04-23 18:47:46 -04:00
Philip May	195bfd118a	Enable option for subword regularization in `XLMRobertaTokenizer` (#11149 ) * enable subword regularization. * fix tokenizer storage * fix docstring formatting * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Stefan Schweter <stefan@schweter.it> * fix docstring formatting * add test for subword regularization tokenizer * improve comments of test * add sp_model_kwargs * reformat docstring to match the style * add some more documentation * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve docstring * empty commit to trigger CI * Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix docstring formatting for sphinx Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-23 17:52:31 -04:00
Sylvain Gugger	1ef152eb48	Default to accuracy metric (#11405 )	2021-04-23 14:49:59 -04:00
Daniel Stancl	e3ff165aa5	Fix cross-attention head mask for Torch encoder-decoder models (#10605 ) * Fix cross-attention head mask for Torch BART models * Fix head masking for cross-attention module for the following models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart, Pegasus * Enable test_headmasking for M2M_100 model * Fix cross_head_mask for FSMT, LED and T5 * This commit fixes `head_mask` for cross-attention modules in the following models: FSMT, LED, T5 * It also contains some smaller changes in doc so that it is perfectly clear that the shape of `cross_head_mask` is the same as that of `decoder_head_mask` * Update template * Fix template for BartForCausalLM * Fix cross_head_mask for Speech2Text models * Fix cross_head_mask in templates * Fix args order in BartForCausalLM template * Fix doc in BART templates * Make more explicit naming * `cross_head_mask` -> `cross_attn_head_mask` * `cross_layer_head_mask` -> `cross_attn_layer_head_mask` * Fix doc * make style quality * Fix speech2text docstring	2021-04-23 18:58:06 +02:00
Sylvain Gugger	ca6b80cadb	Wrong branch Sylvain...	2021-04-23 12:46:54 -04:00
Sylvain Gugger	3951fc55ee	Try to trigger failure more	2021-04-23 12:44:54 -04:00
Sylvain Gugger	bd41a0f74d	Style	2021-04-23 12:32:37 -04:00
Nicola De Cao	1811883e80	Fixing bug in generation (#11297 ) When passing `inputs_embeds` and not `input_ids` (`input_ids=None`), the generation function fails because `input_ids` is created by the function when it should not be.	2021-04-23 18:24:26 +02:00
Kiran R	5c00918681	added support for exporting of t5 to onnx with past_key_values (#10651 )	2021-04-23 18:14:20 +02:00
Patrick von Platen	50f4539b82	push (#11400 )	2021-04-23 15:36:27 +02:00
Sylvain Gugger	bf2e0cf70b	Trainer push to hub (#11328 ) * Initial support for upload to hub * push -> upload * Fixes + examples * Fix torchhub test * Torchhub test I hate you * push_model_to_hub -> push_to_hub * Apply mixin to other pretrained models * Remove ABC inheritance * Add tests * Typo * Run tests * Install git-lfs * Change approach * Add push_to_hub to all * Staging test suite * Typo * Maybe like this? * More deps * Cache * Adapt name * Quality * MOAR tests * Put it in testing_utils * Docs + torchhub last hope * Styling * Wrong method * Typos * Update src/transformers/file_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address review comments * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-23 09:17:37 -04:00
Teven	7bc86bea68	Fixed trainer total_flos reloading in distributed mode (#11383 ) * Fixed trainer total_flos reloading in distributed mode * logging flos at the end of training	2021-04-23 07:53:33 -04:00
Patrick von Platen	74e84f1fa6	make blenderbot test slow (#11395 )	2021-04-23 07:49:09 -04:00
Yoshitomo Matsubara	c3d6f33918	fixed typos (#11391 )	2021-04-23 07:48:42 -04:00
Max Del	a90d3f1862	Fix typo in text (#11396 )	2021-04-23 07:37:19 -04:00
Patrick von Platen	2dc2d79ac7	correct conversion (#11394 )	2021-04-23 11:59:34 +02:00
Patrick von Platen	b48cf7124c	correct typo (#11393 )	2021-04-23 11:34:59 +02:00
Patrick von Platen	8c9b5fcbaf	[Flax] Big FlaxBert Refactor (#11364 ) * improve flax * refactor * typos * Update src/transformers/modeling_flax_utils.py * Apply suggestions from code review * Update src/transformers/modeling_flax_utils.py * fix typo * improve error tolerance * typo * correct nasty saving bug * fix from pretrained * correct tree map * add note * correct weight tying	2021-04-23 09:53:09 +02:00
Sylvain Gugger	3ed5e97ba0	Fix Trainer with remove_unused_columns=False (#11382 ) * Fix Trainer with remove_unused_columns=False * Typo	2021-04-22 11:16:24 -04:00
PenutChen	0f3ad1507e	Fix typo (#11369 )	2021-04-22 10:10:16 -04:00
Matt	2617396094	Correctly cast num_train_epochs to int (#11379 )	2021-04-22 13:49:59 +01:00
Takuya Makino	881945c0b5	Add space (#11373 )	2021-04-22 17:48:58 +05:30
johnson7788	5b5e4ca366	[run_translation.py] fix typo (#11372 ) fix typo Co-authored-by: johnson <johnson@github.com>	2021-04-22 17:47:11 +05:30
Patrick von Platen	58d8795d74	[Flax] Correct typo (#11374 ) * finish * fix copy	2021-04-22 13:11:44 +02:00
Patrick von Platen	880154d2e1	[Wav2Vec2] Fix special tokens for Wav2Vec2 tokenizer (#11349 ) * fix wav2vec2 tok * up	2021-04-22 12:23:08 +02:00
Sylvain Gugger	6f14eab50b	Add in torchhub	2021-04-21 19:17:29 -04:00
Sylvain Gugger	ff26f8ee3a	Add huggingface_hub dep for #11328	2021-04-21 19:12:58 -04:00
wlhgtc	5e04d70868	Fix token_type_ids error for big_bird model. (#11355 ) * MOD: fit chinese wwm to new datasets * MOD: move wwm to new folder * MOD: format code * Styling * MOD add param and recover trainer * MOD: add token_type_ids method for big bird * MOD: format code * MOD: format code Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2021-04-21 19:37:57 +02:00
Stas Bekman	5aaf5aac0b	[contributing doc] explain/link to good first issue (#11346 ) * explain/link to good first issue * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-21 10:10:11 -07:00
Matt	6fe79e57d7	Move old TF text classification script to legacy (#11361 ) And update README to explain the work-in-progress!	2021-04-21 17:36:18 +01:00
Patrick von Platen	50595a3336	Remove boiler plate code (#11340 ) * remove boiler plate code * adapt roberta * correct docs * finish refactor	2021-04-21 18:34:38 +02:00
Matt	ac588594e2	Merge new TF example script (#11360 ) First of the new and more idiomatic TF examples!	2021-04-21 17:04:55 +01:00
Stas Bekman	9f72e8f4e1	[testing doc] bring doc up to date (#11359 ) * bring doc up to date * fix	2021-04-21 08:51:00 -07:00
lewtun	41f3133a3a	Extract metric_key_prefix during NotebookProgressCallback.on_evaluate (#11347 ) * Pass metric_key_prefix as kwarg to on_evaluate * Replace eval_loss with metric_key_prefix_loss * Default to "eval" if metric_key_prefix not in kwargs * Add kwargs to CallbackHandler.on_evaluate signature * Revert "Add kwargs to CallbackHandler.on_evaluate signature" This reverts commit `8d4c85ed51`. * Revert "Pass metric_key_prefix as kwarg to on_evaluate" This reverts commit `7766bfe271`. * Extract metric_key_prefix from metrics	2021-04-21 11:12:09 -04:00
Sylvain Gugger	dabeb15292	Examples reorg (#11350 ) * Base move * Examples reorganization * Update references * Put back test data * Move conftest * More fixes * Move test data to test fixtures * Update path * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments and clean Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-21 11:11:20 -04:00
Stas Bekman	ca7ff64f5b	[deepspeed] fix resume from checkpoint (#11352 ) This PR fixes a bug that most likely somehow got exposed (not caused) by https://github.com/huggingface/transformers/pull/11318 - surprisingly the same test worked just fine before that other PR.	2021-04-21 07:48:15 -07:00
Sylvain Gugger	74712e22f3	Honor contributors to models (#11329 ) * Honor contributors to models * Fix typo * Address review comments * Add more authors	2021-04-21 09:47:27 -04:00

7165 Commits