transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Fan Zhang	7e73601f32	modify qa-trainer (#11872 ) * modify qa-trainer * fix flax model	2021-06-01 08:28:41 -04:00
Shamane Siri	9ec0f01b6c	RAG-2nd2end-revamp (#11893 ) * initial * code quality test * code quality * added test functions in test_modeling_rag.py and test_retrieval_rag.py to test end2end retreiver * minor change in test_modeling_rag * fixed tests * Update examples/research_projects/rag-end2end-retriever/README.md typo corrected as suggested by lhoestq Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> * Update examples/research_projects/rag-end2end-retriever/finetune_rag.py type change suggested by lhoestq Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> * Update src/transformers/models/rag/retrieval_rag.py Adding this change as mentioned by lhoestq. Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> * completed the minor changes suggested by the reviewers Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>	2021-06-01 07:32:26 +01:00
Suraj Patil	ad25fd62bd	Add FlaxCLIP (#11883 ) * add flax CLIP * default input_shape * add tests * fix test * fix name * fix docs * fix shapes * attend at least 1 token * flax conv to torch conv * return floats * fix equivalence tests * fix import * return attention_weights and update tests * fix dosctrings * address patricks comments * input_shape arg * add tests for get_image_features and get_text_features methods * fix tests	2021-06-01 09:44:31 +05:30
Philip May	cfca638acb	Add MT5ForConditionalGeneration as supported arch. to summarization README (#11961 ) * Add MT5ForConditionalGeneration as supported arch. * Update README.md	2021-05-31 21:24:33 +05:30
Nicholas Vadivelu	1ab147d648	Remove redundant `nn.log_softmax` in `run_flax_glue.py` (#11920 ) * Remove redundant `nn.log_softmax` in `run_flax_glue.py` `optax.softmax_cross_entropy` expects unnormalized logits, and so it already calls `nn.log_softmax`, so I believe it is not needed here. `nn.log_softmax` is idempotent so mathematically it shouldn't have made a difference. * Remove unused 'flax.linen' import	2021-05-31 15:29:04 +01:00
Philip May	fb60c309c6	fix assert (#11935 )	2021-05-31 04:02:10 -04:00
Lysandre	04a9709c27	Remove `datasets` submodule	2021-05-31 09:18:49 +02:00
Lysandre Debut	8d171628fe	Test optuna and ray (#11924 )	2021-05-28 07:52:01 -04:00
Jayendra	af1a10bff4	[Flax] Return Attention from BERT, ELECTRA, RoBERTa and GPT2 (#11918 ) * Added logic to return attention from flax-bert model and added test cases to check that * Added new line at the end of file to test_modeling_flax_common.py * fixing code style * Fixing Roberta and Elextra models too from cpoying bert * Added temporary hack to not run test_attention_outputs for FlaxGPT2 * Returning attention weights from GPT2 and changed the tests accordingly. * last fixes * bump flax dependency Co-authored-by: jayendra <jayendra@infocusp.in> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-28 16:16:56 +05:30
Bhadresh Savani	e1205e478a	Added Sequence Classification class in GPTNeo (#11906 ) * seq classification changes * fix tests	2021-05-28 06:27:02 -04:00
Nicolas Patry	80d712fac6	Adding new argument `max_new_tokens` for generate. (#11476 ) * Adding new argument `max_new_tokens` for generate. This is a proposal to add a new argument `max_new_tokens` to `generate`. This include a `MaxNewTokensCriteria` that enables callers that don't know about the token length ahead (like pipelines callers) to manage more easily the length of their generated output. * Adding a test for the user warning when both`max_length` and `max_new_tokens` are used together. * Removed redundant `no_grad`.	2021-05-27 14:22:58 +02:00
Josh Tanner	2dd6fb2585	Update deepspeed config to reflect hyperparameter search parameters (#11896 ) * rebuild deepspeed config for hyperparameter search * reformat code to fix style issues	2021-05-27 07:53:33 -04:00
Patrick von Platen	42fe0dc23e	Add Emotion Speech Noteboook (#11900 )	2021-05-27 10:46:10 +01:00
Patrick von Platen	996a315e76	Flax Generate (#11777 ) * fix_torch_device_generate_test * remove @ * add * indexing * correct a couple of tests * fix tests * add logits processor * finish top_k, top_p, temp * add docs * correct flax prng key default * improve generate * add generation docs * add docs * make style * revert model outputs change * make style * correct typo * fix tests * fix slow test * add raise * finish generation Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-27 00:18:17 +01:00
Avital Oliver	2df546918e	Link official Cloud TPU JAX docs (#11892 )	2021-05-26 15:44:40 -04:00
joerenner	1530384e5b	changing find_batch_size to work with tokenizer outputs (#11890 ) * changing find_batch_size to work with tokenizer outputs trainer_pt_utils.find_batch_size does not recognize the batch size of BatchEncoding objects. This can cause an error when a trainer relies on find_batch_size to report the number of observed examples in the evaluation loop. * Trigger CI Co-authored-by: jrenner <joseph.renner@inria.fr>	2021-05-26 11:59:06 -04:00
Patrick von Platen	d5a72b6e19	[Flax] Allow dataclasses to be jitted (#11886 ) * fix_torch_device_generate_test * remove @ * change dataclasses to flax ones * fix typo * fix jitted tests * fix bert & electra	2021-05-26 15:01:13 +01:00
talkhaldi	e6126e1932	Correcting comments in T5Stack to reflect correct tuple order (#11330 ) * Correcting comments to reflect correct tuple order In order to match the actual order (line 513 and 516, and as accessed in 968), I've changed the order mentioned in comments L962 and L966-967. * Update modeling_t5.py Updating another comment as well * Removing extra space * Fixing style and quality * style & quality * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-26 14:07:23 +01:00
Daniel Stancl	0b93358447	Fix usage of head masks by TF encoder-decoder models' `generate()` function (#11775 ) * Fix Bart * Fix Blenderbot{,_small} * Fix LED * Fix Marian * Fix MBart * Fix Pegasus * Fix T5 * Add test for generation with head_mask * Add a common TF test * Override a test for the LED model as head masking is not yet properly implemented * Remove all head_masks from input preparation for LED * Drop masking for T5 as it needs a bit of refactor	2021-05-26 14:02:44 +01:00
francescorubbo	0b0a598452	Ensure input tensor are on device. (#11874 ) The feature extractor does not create tensors on the appropriate device, so we call `ensure_tensor_on_device` before feeding the processed inputs to the model.	2021-05-26 04:19:37 -04:00
Ahmet Akkoç	a9c797f93d	[Wav2Vec2ForCTC] example typo fixed (#11878 )	2021-05-25 17:06:14 -04:00
Stas Bekman	1b6530104d	[Examples] create model with custom config on the fly (#11798 ) * create custom model on the flight * better wording * add update_from_string * cleanup * cleanup * Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more bool options * style * fix logger * add test * add the doc * assert on conflict of options Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-05-25 10:40:49 -07:00
Stas Bekman	6287c929c1	[lm examples] fix overflow in perplexity calc (#11855 ) * fix overflow in perplexity calc * use inf * fix	2021-05-25 08:11:26 -07:00
Patrick von Platen	7630c11f32	[Wav2Vec2] SpecAugment Fast (#11764 ) * first try * finish	2021-05-25 13:59:52 +01:00
Sylvain Gugger	f086652b16	Add option to log only once in multinode training (#11819 ) * Add option to long only once in multinode training * Use an alternate property	2021-05-25 08:03:43 -04:00
Wang Ran (汪然)	b8344a274f	typo (#11858 )	2021-05-25 04:23:46 -04:00
Shiro T	f9880f62ad	fixed a small typo in the doc (#11856 )	2021-05-25 04:18:55 -04:00
Lysandre Debut	6da129cb31	Enable memory metrics in tests that need it (#11859 )	2021-05-25 04:06:19 -04:00
Lysandre Debut	db0b2477cc	Add some tests to the slow suite #11860	2021-05-25 04:06:06 -04:00
Sylvain Gugger	afe479adb5	[Trainer] Report both steps and num samples per second (#11818 ) * [Trainer] Report both steps and num samples per second * Fix batch number * Update src/transformers/trainer_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Address review comments Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-05-24 19:51:42 -04:00
Nick Lane-Smith	eaab9397cd	Fix two typos in docs (#11852 ) * typo2 * fix typo	2021-05-24 14:26:02 -04:00
Teven	8a2a3a25af	Fix flos single node (#11844 ) * fixing flos bug/typo in non-distributed setting * storing flos every logging_interval	2021-05-24 20:15:52 +02:00
Sylvain Gugger	adb785b0fe	Switch mem metrics flag (#11851 ) * Switch mem metrics flag * Update src/transformers/training_args.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-05-24 13:30:39 -04:00
Sylvain Gugger	fcdb85e9d2	Fix reference to XLNet (#11846 )	2021-05-24 09:26:40 -04:00
Patrick von Platen	f580604157	[Flax] Fix PyTorch import error (#11839 ) * fix_torch_device_generate_test * remove @ * change pytorch import to flax import	2021-05-24 10:41:10 +01:00
Lysandre Debut	0cbddfb190	Replace double occurrences as the last step (#11367 )	2021-05-24 03:38:59 -04:00
ctheodoris	73fde1defe	Faster list concat for trainer_pt_utils.get_length_grouped_indices() (#11825 ) get_length_grouped_indices() in LengthGroupedSampler and DistributedLengthGroupedSampler is prohibitively slow for large number of megabatches (in test case takes hours for ~270k megabatches with 100 items each) due to slow list concatenation with sum(megabatches, []). Resolves: #11795 Co-authored-by: ctheodoris <cvtheodo@ds.dfci.harvard.edu>	2021-05-22 10:27:20 -04:00
Patrick von Platen	da22245ed9	Add flax text class colab (#11824 ) * fix_torch_device_generate_test * remove @ * add flax glue link	2021-05-21 23:11:58 +01:00
Stas Bekman	a26f4d6208	[Deepspeed] support `zero.Init` in `from_config` (#11805 ) * support zero.Init in from_config * no need for eval test	2021-05-21 09:07:46 -07:00
Patrick von Platen	82335185fe	[Flax] Small fixes in `run_flax_glue.py` (#11820 ) * fix_torch_device_generate_test * remove @ * correct best seed for flax fine-tuning Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-21 16:52:23 +01:00
Sylvain Gugger	b8697bc622	Avoid TensorFlow import in Trainer	2021-05-21 09:23:31 -04:00
yujun	e2c1dd0966	fix roformer config doc (#11813 )	2021-05-21 08:06:11 -04:00
Lysandre Debut	1b652295c5	Patch recursive import (#11812 )	2021-05-21 06:50:01 -04:00
Patrick von Platen	bd9871657b	[Flax] Align GLUE training script with mlm training script (#11778 ) * speed up flax glue * remove unnecessary line * remove folder * remove run in loop Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-21 09:36:56 +01:00
Keren Fuentes	223943872e	Fix failing test on Windows Platform (#11589 ) * add separator for windows * fixes test_is_copy_consistent on Windows * fixing writing encoding issue on extended test (for Windows) * resolving comments	2021-05-20 19:54:23 -04:00
Michael Benayoun	f4a0d6ff86	A cleaner and more scalable implementation of symbolic tracing (#11763 ) Cleaner and more scalable implementation of symbolic tracing with torch.fx, and provides support for new architectures: - ALBERT - DistilBERT - MobileBERT - MegatronBERT - GPT2 - GPT Neo Co-authored-by: Michael Benayoun <michael@huggingface.co>	2021-05-20 18:02:29 +02:00
Sylvain Gugger	469384a777	Fix regression in regression (#11785 ) * Fix regression in regression * Add test	2021-05-20 09:55:13 -04:00
Sylvain Gugger	5ad5cc7198	Fix pattern in conf.py (#11784 )	2021-05-20 09:30:31 -04:00
yujun	206f06f2dd	Add new model RoFormer (use rotary position embedding ) (#11684 ) * add roformer * Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com> * update * add TFRoFormerSinusoidalPositionalEmbedding and fix TFMarianSinusoidalPositionalEmbedding * update docs * make style and make quality * roback * unchanged * rm copies from , this is a error in TFMarianSinusoidalPositionalEmbedding * update Copyright year * move # Add modeling imports here to the correct position * max_position_embeddings can be set to 1536 * # Copied from transformers.models.bert.modeling_bert.BertOutput with Bert->RoFormer * # Copied from transformers.models.bert.modeling_bert.BertLayer.__init__ with Bert->RoFormer * update tokenization_roformer * make style * add staticmethod apply_rotary_position_embeddings * add TF staticmethod apply_rotary_position_embeddings * update torch apply_rotary_position_embeddings * fix tf apply_rotary_position_embeddings error * make style * add pytorch RoFormerSelfAttentionRotaryPositionEmbeddingTest * add TF rotary_position_embeddings test * update test_modeling_rofomer * Update docs/source/model_doc/roformer.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_tf_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refact roformer tokenizer * add RoFormerTokenizerFast * add RoFormerTokenizationTest * add require_jieba * update Copyright * update tokenizer & add copy from * add option rotary_value * use rust jieba * use rjieba * use rust jieba * fix test_alignement_methods * slice normalized_string is too slow * add config.embedding_size when embedding_size!=hidden_size * fix pickle tokenizer * Update docs/source/model_doc/roformer.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style and make quality Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-20 08:00:34 -04:00
Lysandre Debut	075fdab4fe	Deprecate commands from the transformers-cli that are in the hf-cli (#11779 )	2021-05-20 03:16:03 -04:00
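
Note on 1ab147d648 (#11920): the commit message argues that an extra `nn.log_softmax` before a cross-entropy that already normalizes its logits is redundant, because log-softmax is idempotent. The sketch below checks that claim in plain Python. It does not use optax or flax; `log_softmax` and `softmax_cross_entropy` here are hypothetical stand-ins written for illustration, not the library functions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats (illustrative stand-in)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def softmax_cross_entropy(logits, one_hot_labels):
    """Cross entropy that normalizes the logits itself (illustrative stand-in)."""
    log_probs = log_softmax(logits)
    return -sum(y * lp for y, lp in zip(one_hot_labels, log_probs))


logits = [2.0, -1.0, 0.5]
labels = [0.0, 1.0, 0.0]

# Applying log-softmax twice gives the same values as applying it once,
# because exp(log_softmax(x)) already sums to 1.
once = log_softmax(logits)
twice = log_softmax(once)

# So pre-normalizing the logits does not change the loss.
loss_raw = softmax_cross_entropy(logits, labels)
loss_pre = softmax_cross_entropy(once, labels)
```

The two losses agree up to floating-point rounding, which matches the message's remark that the change should make no mathematical difference.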

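Note on 6287c929c1 (#11855): the commit bullets mention an overflow in the perplexity calculation and "use inf". Perplexity is the exponential of the mean loss, and `math.exp` raises `OverflowError` for large inputs. A minimal sketch of one way to guard against that follows; the helper name `safe_perplexity` is hypothetical and the actual change in the example scripts may differ in detail.

```python
import math


def safe_perplexity(eval_loss):
    """Return exp(eval_loss), falling back to infinity when exp overflows."""
    try:
        return math.exp(eval_loss)
    except OverflowError:
        # A diverged loss is reported as infinite perplexity instead of crashing.
        return float("inf")


small = safe_perplexity(0.0)      # exp(0) is exactly 1.0
large = safe_perplexity(1000.0)   # exp(1000) overflows a Python float
```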

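Note on 73fde1defe (#11825): the commit message attributes the slowdown in `get_length_grouped_indices()` to flattening with `sum(megabatches, [])`. That idiom builds a new list on every addition, so the total copying grows quadratically with the number of megabatches. The sketch below contrasts it with a single-pass flatten; `itertools.chain.from_iterable` is one common linear-time alternative chosen here for illustration, not necessarily the replacement used in the commit.

```python
from itertools import chain


def flatten_quadratic(megabatches):
    """Flatten with sum(); each '+' copies the accumulated list so far."""
    return sum(megabatches, [])


def flatten_linear(megabatches):
    """Flatten in one pass; every element is copied once."""
    return list(chain.from_iterable(megabatches))


# Four small "megabatches" of three indices each (toy data).
megabatches = [[i * 3 + j for j in range(3)] for i in range(4)]

flat_a = flatten_quadratic(megabatches)
flat_b = flatten_linear(megabatches)
```

Both functions return the same list; only the cost differs, which becomes noticeable at the scale the message reports (about 270k megabatches of 100 items each).
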
7257 Commits