* add test_vocab_size for the sentencepiece tokenizers
* add test_get_vocab for the sentencepiece tokenizers
* add test_convert_token_and_id for the sentencepiece tokenizers
* add test_tokenize_and_convert_tokens_to_string for all tokenizers
* improve test_tokenize_and_convert_tokens_to_string for the sentencepiece tokenizers
* add common tokenizer integration tests
- for albert
- for barthez
* add tokenizer integration tests to bert generation
* add most tokenizer integration tests
* fix camembert tokenizer integration test
* add tokenizer integration test to marian
* add tokenizer integration test to reformer
* add typing and doc to tokenizer_integration_test_util
* fix tokenizer integration test of reformer
* improve test_sentencepiece_tokenize_and_convert_tokens_to_string (round-trip sketch after this PR's commits)
* empty commit to trigger CI
* fix tokenizer integration test of reformer
* remove code not needed anymore
* empty commit to trigger CI
* empty commit to trigger CI
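A minimal round-trip sketch of what these tokenizer checks exercise (illustrative only, assuming the albert-base-v2 checkpoint; not the actual test code):

```python
from transformers import AlbertTokenizer

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")

text = "This is a test."
tokens = tokenizer.tokenize(text)                      # sentencepiece pieces, e.g. '▁this'
ids = tokenizer.convert_tokens_to_ids(tokens)          # piece ids from the sp vocab
assert tokenizer.convert_ids_to_tokens(ids) == tokens  # token <-> id round trip

# convert_tokens_to_string should invert tokenize (modulo lowercasing).
assert tokenizer.convert_tokens_to_string(tokens) == "this is a test."
```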
* initial
* code quality test
* code quality
* added test functions in test_modeling_rag.py and test_retrieval_rag.py to test the end2end retriever
* minor change in test_modeling_rag
* fixed tests
* Update examples/research_projects/rag-end2end-retriever/README.md
typo corrected as suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update examples/research_projects/rag-end2end-retriever/finetune_rag.py
type change suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update src/transformers/models/rag/retrieval_rag.py
Adding this change as mentioned by lhoestq.
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* completed the minor changes suggested by the reviewers
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`
`optax.softmax_cross_entropy` expects unnormalized logits and already applies `nn.log_softmax` internally, so I believe the extra call is not needed here. Since `nn.log_softmax` is idempotent, it shouldn't have made a mathematical difference either way.
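A minimal sketch of why the removed call was redundant (illustrative values, not from the script):

```python
import jax
import jax.numpy as jnp
import optax

logits = jnp.array([[2.0, 0.5, -1.0]])  # unnormalized scores
labels = jnp.array([[1.0, 0.0, 0.0]])   # one-hot targets

# optax applies log_softmax internally, so raw logits are enough:
loss = optax.softmax_cross_entropy(logits, labels)

# log_softmax is idempotent, so pre-normalizing was a no-op:
same_loss = optax.softmax_cross_entropy(jax.nn.log_softmax(logits), labels)
assert jnp.allclose(loss, same_loss)
```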
* Remove unused 'flax.linen' import
* Added logic to return attention weights from the flax-bert model and added test cases to check it (usage sketch after this PR's commits)
* Added a trailing newline to test_modeling_flax_common.py
* fixing code style
* Fixing Roberta and Electra models too, since they copy from bert
* Added temporary hack to not run test_attention_outputs for FlaxGPT2
* Returning attention weights from GPT2 and changing the tests accordingly.
* last fixes
* bump flax dependency
Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
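A minimal usage sketch of the attention outputs added above (assuming the bert-base-uncased checkpoint; not the PR's test code):

```python
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = FlaxBertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello world", return_tensors="np")
outputs = model(**inputs, output_attentions=True)

# One attention tensor per layer, each shaped
# (batch, num_heads, seq_len, seq_len).
print(len(outputs.attentions), outputs.attentions[0].shape)
```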
* Adding new argument `max_new_tokens` for generate.
This is a proposal to add a new argument `max_new_tokens` to `generate`.
It includes a `MaxNewTokensCriteria` that lets callers who don't know the
input token length ahead of time (like pipeline callers) manage the length
of their generated output more easily.
* Adding a test for the user warning when both `max_length` and
`max_new_tokens` are used together.
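A minimal usage sketch of the new argument (assuming the gpt2 checkpoint):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")

# max_new_tokens bounds only the generated continuation, independent of
# the prompt length; max_length bounds prompt + continuation together.
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```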
* Removed redundant `no_grad`.
* changing find_batch_size to work with tokenizer outputs
trainer_pt_utils.find_batch_size does not recognize the batch size of BatchEncoding objects, which can cause an error when a trainer relies on find_batch_size to report the number of observed examples in the evaluation loop (sketch after this PR's commits).
* Trigger CI
Co-authored-by: jrenner <joseph.renner@inria.fr>
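A hedged sketch of the fix's idea (illustrative, not the exact upstream code): treat dict-like outputs such as BatchEncoding as mappings and recurse into their values.

```python
from collections.abc import Mapping

import torch

def find_batch_size(tensors):
    """Return the first batch dimension found in a nested structure."""
    if isinstance(tensors, Mapping):  # covers dict and BatchEncoding
        for value in tensors.values():
            result = find_batch_size(value)
            if result is not None:
                return result
    elif isinstance(tensors, (list, tuple)):
        for item in tensors:
            result = find_batch_size(item)
            if result is not None:
                return result
    elif isinstance(tensors, torch.Tensor):
        return tensors.shape[0] if tensors.ndim >= 1 else None
    return None
```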
* Correcting comments to reflect correct tuple order
In order to match the actual order (lines 513 and 516, and as accessed in line 968), I've changed the order mentioned in comments L962 and L966-967.
* Update modeling_t5.py
Updating another comment as well
* Removing extra space
* Fixing style and quality
* style & quality
* Update src/transformers/models/t5/modeling_t5.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix Bart
* Fix Blenderbot{,_small}
* Fix LED
* Fix Marian
* Fix MBart
* Fix Pegasus
* Fix T5
* Add test for generation with head_mask
* Add a common TF test
* Override a test for the LED model as head masking is not yet properly implemented
* Remove all head_masks from input preparation for LED
* Drop masking for T5 as it needs a bit of refactoring
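A hedged sketch of what the new test exercises (assuming facebook/bart-base and that `generate` forwards `head_mask` to the model, as these fixes intend):

```python
import tensorflow as tf
from transformers import BartTokenizer, TFBartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = TFBartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("Masking attention heads during generation.", return_tensors="tf")

# Shape (num_layers, num_heads); 1.0 keeps a head, 0.0 masks it out.
head_mask = tf.ones((model.config.encoder_layers, model.config.encoder_attention_heads))

output_ids = model.generate(inputs["input_ids"], head_mask=head_mask, max_length=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```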
The feature extractor does not create tensors on the appropriate device,
so we call `ensure_tensor_on_device` before feeding the processed inputs
to the model.
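A hedged sketch of what that helper does (the name follows the pipelines helper, but this is illustrative, not the exact upstream code):

```python
import torch

def ensure_tensor_on_device(device: torch.device, **inputs):
    """Move every tensor in the processed-inputs dict onto the target device."""
    return {
        name: value.to(device) if isinstance(value, torch.Tensor) else value
        for name, value in inputs.items()
    }
```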
* [Trainer] Report both steps and num samples per second
* Fix batch number
* Update src/transformers/trainer_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
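A hedged sketch of the metrics now reported (illustrative; the real helper lives in transformers.trainer_utils):

```python
import time

def speed_metrics(split, start_time, num_samples=None, num_steps=None):
    """Report runtime plus both samples/sec and steps/sec for a run."""
    runtime = time.time() - start_time
    result = {f"{split}_runtime": round(runtime, 4)}
    if num_samples is not None:
        result[f"{split}_samples_per_second"] = round(num_samples / runtime, 3)
    if num_steps is not None:
        result[f"{split}_steps_per_second"] = round(num_steps / runtime, 3)
    return result
```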