transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 21:30:07 +06:00

Author	SHA1	Message	Date
Patrick von Platen	a1bbcf3f6c	Refactoring the generate() function (#6949 ) * first draft * show design proposition for new generate method * up * make better readable * make first version * gpt2 tests pass * make beam search for gpt2 work * add first encoder-decoder code * delete typo * make t5 work * save indermediate * make bart work with beam search * finish beam search bart / t5 * add default kwargs * make more tests pass * fix no bad words sampler * some fixes and tests for all distribution processors * fix test * fix rag slow tests * merge to master * add nograd to generate * make all slow tests pass * speed up generate * fix edge case bug * small fix * correct typo * add type hints and docstrings * fix typos in tests * add beam search tests * add tests for beam scorer * fix test rag * finish beam search tests * move generation tests in seperate file * fix generation tests * more tests * add aggressive generation tests * fix tests * add gpt2 sample test * add more docstring * add more docs * finish doc strings * apply some more of sylvains and sams comments * fix some typos * make fix copies * apply lysandres and sylvains comments * final corrections on examples * small fix for reformer	2020-11-03 16:04:22 +01:00
Weizhen	2422cda01b	ProphetNet (#7157 ) * add new model prophetnet prophetnet modified modify codes as suggested v1 add prophetnet test files * still bugs, because of changed output formats of encoder and decoder * move prophetnet into the latest version * clean integration tests * clean tokenizers * add xlm config to init * correct typo in init * further refactoring * continue refactor * save parallel * add decoder_attention_mask * fix use_cache vs. past_key_values * fix common tests * change decoder output logits * fix xlm tests * make common tests pass * change model architecture * add tokenizer tests * finalize model structure * no weight mapping * correct n-gram stream attention mask as discussed with qweizhen * remove unused import * fix index.rst * fix tests * delete unnecessary code * add fast integration test * rename weights * final weight remapping * save intermediate * Descriptions for Prophetnet Config File * finish all models * finish new model outputs * delete unnecessary files * refactor encoder layer * add dummy docs * code quality * fix tests * add model pages to doctree * further refactor * more refactor, more tests * finish code refactor and tests * remove unnecessary files * further clean up * add docstring template * finish tokenizer doc * finish prophetnet * fix copies * fix typos * fix tf tests * fix fp16 * fix tf test 2nd try * fix code quality * add test for each model * merge new tests to branch * Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/modeling_prophetnet.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update utils/check_repo.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * apply sams and sylvains comments * make style * remove unnecessary code * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/configuration_prophetnet.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * implement lysandres comments * correct docs * fix isort * fix tokenizers * fix copies Co-authored-by: weizhen <weizhen@mail.ustc.edu.cn> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-19 17:36:09 +02:00
Thomas Wolf	ba8c4d0ac0	[Dependencies\|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 ) * splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * spliting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece 🎉 * and removed hard dependency on tokenizers 🎉 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update ad fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-18 20:51:24 +02:00
Patrick von Platen	62f5ae68ec	[Seq2Seq] Fix a couple of bugs and clean examples (#7474 ) * clean T5 * fix t5 tests * fix index typo * fix tf common test * fix examples * change positional ordering for Bart and FSTM * add signature test * clean docs and add tests * add docs to encoder decoder * clean docs * correct two doc strings * remove sig test for TF Elektra & Funnel * fix tf t5 slow tests * fix input_ids to inputs in tf * Update src/transformers/modeling_bart.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_bart.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * implement lysandre results * make style * fix encoder decoder typo * fix tf slow tests * fix slow tests * renaming * remove unused input Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-01 17:38:50 +02:00
Sam Shleifer	748425d47d	[T5] allow config.decoder_layers to control decoder size (#7409 ) * Working assymmetrical T5 * rename decoder_layers -> num_decoder_layers * Fix docstring * Allow creation of asymmetric t5 students	2020-09-28 03:08:04 -04:00
Ola Piktus	c754c41c61	RAG (#6813 ) * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * Formatting / renaming prior to actual work * First commit * improve comments * Retrieval evaluation scripts * refactor to include modeling outputs + MPI retriever * Fix rag-token model + refactor * Various fixes + finetuning logic * use_bos fix * Retrieval refactor * Finetuning refactoring and cleanup * Add documentation and cleanup * Remove set_up_rag_env.sh file * Fix retrieval wit HF index * Fix import errors * Fix quality errors * Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867 * fix quality * Fix RAG Sequence generation * minor cleanup plus initial tests * fix test * fix tests 2 * Comments fix * post-merge fixes * Improve readme + post-rebase refactor * Extra dependencied for tests * Fix tests * Fix tests 2 * Refactor test requirements * Fix tests 3 * Post-rebase refactor * rename nlp->datasets * RAG integration tests * add tokenizer to slow integration test and allow retriever to run on cpu * add tests; fix position ids warning * change structure * change structure * add from encoder generator * save working solution * make all integration tests pass * add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained * don't save paths * delete unnecessary imports * pass config to AutoTokenizer.from_pretrained for Rag tokenizers * init wiki_dpr only once * hardcode legacy index and passages paths (todo: add the right urls) * finalize config * finalize retriver api and config api * LegacyIndex index download refactor * add dpr to autotokenizer * make from pretrained more flexible * fix ragfortokengeneration * small name changes in tokenizer * add labels to models * change default index name * add retrieval tests * finish token generate * align test with previous version and make all tests pass * add tests * finalize tests * implement thoms suggestions * add first version of test * make first tests work * make retriever platform agnostic * naming * style * add legacy index URL * docstrings + simple retrieval test for distributed * clean model api * add doc_ids to retriever's outputs * fix retrieval tests * finish model outputs * finalize model api * fix generate problem for rag * fix generate for other modles * fix some tests * save intermediate * set generate to default * big refactor generate * delete rag_api * correct pip faiss install * fix auto tokenization test * fix faiss install * fix test * move the distributed logic to examples * model page * docs * finish tests * fix dependencies * fix import in __init__ * Refactor eval_rag and finetune scripts * start docstring * add psutil to test * fix tf test * move require torch to top * fix retrieval test * align naming * finish automodel * fix repo consistency * test ragtokenizer save/load * add rag model output docs * fix ragtokenizer save/load from pretrained * fix tokenizer dir * remove torch in retrieval * fix docs * fixe finetune scripts * finish model docs * finish docs * remove auto model for now * add require torch * remove solved todos * integrate sylvains suggestions * sams comments * correct mistake on purpose * improve README * Add generation test cases * fix rag token * clean token generate * fix test * add note to test * fix attention mask * add t5 test for rag * Fix handling prefix in finetune.py * don't overwrite index_name Co-authored-by: Patrick Lewis <plewis@fb.com> Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair> Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair> Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair> Co-authored-by: Your Name <you@example.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	2020-09-22 18:29:58 +02:00
Patrick von Platen	afc4ece462	[Generate] Facilitate PyTorch generate using `ModelOutputs` (#6735 ) * fix generate for GPT2 Double Head * fix gpt2 double head model * fix bart / t5 * also add for no beam search * fix no beam search * fix encoder decoder * simplify t5 * simplify t5 * fix t5 tests * fix BART * fix transfo-xl * fix conflict * integrating sylvains and sams comments * fix tf past_decoder_key_values * fix enc dec test	2020-09-01 12:38:25 +02:00
Lysandre	a75c64d80c	Black 20 release	2020-08-26 17:20:22 +02:00
Sam Shleifer	624495706c	T5Tokenizer adds EOS token if not already added (#5866 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-25 14:56:08 -04:00
Sylvain Gugger	a573777901	Update repo to isort v5 (#6686 ) * Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks	2020-08-24 11:03:01 -04:00
Patrick von Platen	fe0b85e77a	[EncoderDecoder] Add functionality to tie encoder decoder weights (#6538 ) * start adding tie encoder to decoder functionality * finish model tying * make style * Apply suggestions from code review * fix t5 list including cross attention * apply sams suggestions * Update src/transformers/modeling_encoder_decoder.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add max depth break point Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-19 14:23:45 +02:00
Lysandre Debut	f7cbc13db7	Test model outputs equivalence (#6445 ) * Test model outputs equivalence * Fix failing tests * From dict to kwargs * DistilBERT * Addressing @sgugger and @patrickvonplaten's comments	2020-08-13 11:59:35 -04:00
Stas Bekman	e983da0e7d	cleanup tf unittests: part 2 (#6260 ) * cleanup torch unittests: part 2 * remove trailing comma added by isort, and which breaks flake * one more comma * revert odd balls * part 3: odd cases * more ["key"] -> .key refactoring * .numpy() is not needed * more unncessary .numpy() removed * more simplification	2020-08-13 04:29:06 -04:00
Stas Bekman	5deed37f9f	cleanup torch unittests (#6196 ) * improve unit tests this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973 before I apply it to the rest * batch 1 * batch 2 * batch 3 * batch 4 * batch 5 * style * non-tf template * last deletion of check_loss_output	2020-08-04 02:42:56 -04:00
Sylvain Gugger	d951c14ae4	Model output test (#6155 ) * Use return_dict=True in all tests * Formatting	2020-07-31 09:44:37 -04:00
Sylvain Gugger	91cb95461e	Switch from return_tuple to return_dict (#6138 ) * Switch from return_tuple to return_dict * Fix test * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice * Rework TF trainer (#6038) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import * Switch from return_tuple to return_dict * Fix test * Add recent model Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Plu <plu.julien@gmail.com>	2020-07-30 09:17:00 -04:00
Sam Shleifer	1a647abf0b	[fix] check code quality (#5772 )	2020-07-15 14:59:38 -04:00
Funtowicz Morgan	d533c7e9b9	[fix] T5 ONNX test: model.to(torch_device) (#5769 ) Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-07-15 10:11:22 -04:00
Sam Shleifer	d0486c8bc2	[cleanup] T5 test, warnings (#5761 )	2020-07-15 08:23:22 -04:00
Sylvain Gugger	edfd82f5ff	Change model outputs types to self-document outputs (#5438 ) * [WIP] Proposal for model outputs * All Bert models * Make CI green maybe? * Fix ONNX test * Isolate ModelOutput from pt and tf * Formatting * Add Electra models * Auto-generate docstrings from outputs * Add TF outputs * Add some BERT models * Revert TF side * Remove last traces of TF changes * Fail with a clear error message * Add Albert and work through Bart * Add CTRL and DistilBert * Formatting * Progress on Bart * Renames and finish Bart * Formatting * Fix last test * Add DPR * Finish Electra and add FlauBERT * Add GPT2 * Add Longformer * Add MMBT * Add MobileBert * Add GPT * Formatting * Add Reformer * Add Roberta * Add T5 * Add Transformer XL * Fix test * Add XLM + fix XLMForTokenClassification * Style + XLMRoberta * Add XLNet * Formatting * Add doc of return_tuple arg	2020-07-10 11:36:53 -04:00
Abel	6912265711	Make T5 compatible with ONNX (#5518 ) * Default decoder inputs to encoder ones for T5 if neither are specified. * Fixing typo, now all tests are passing. * Changing einsum to operations supported by onnx * Adding a test to ensure T5 can be exported to onnx op>9 * Modified test for onnx export to make it faster * Styling changes. * Styling changes. * Changing notation for matrix multiplication Co-authored-by: Abel Riboulot <tkai@protomail.com>	2020-07-07 11:32:29 +02:00
Sam Shleifer	13deb95a40	Move tests/utils.py -> transformers/testing_utils.py (#5350 )	2020-07-01 10:31:17 -04:00
Thomas Wolf	601d4d699c	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308 ) * remove references to old API in docstring - update data processors * style * fix tests - better type checking error messages * better type checking * include awesome fix by @LysandreJik for #5310 * updated doc and examples	2020-06-26 19:48:14 +02:00
Patrick von Platen	c2a26ec8a6	[Use cache] Align logic of `use_cache` with output_attentions and output_hidden_states (#5194 ) * fix use cache * add bart use cache * fix bart * finish bart	2020-06-24 16:09:17 +02:00
Amil Khare	c852036b4a	[cleanup] Hoist ModelTester objects to top level (#4939 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-06-16 08:03:43 -04:00
Sylvain Gugger	f1fe18465d	Use labels to remove deprecation warnings (#4807 )	2020-06-05 16:41:46 -04:00
Julien Chaumond	d4c2cb402d	Kill model archive maps (#4636 ) * Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI	2020-06-02 09:39:33 -04:00
Patrick von Platen	aa925a52fa	[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468 ) * fix gpu slow tests in pytorch * change model to device syntax	2020-05-19 21:35:04 +02:00
Patrick von Platen	026a5d0888	[T5 fp16] Fix fp16 in T5 (#4436 ) * fix fp16 in t5 * make style * refactor invert_attention_mask fn * fix typo	2020-05-18 17:25:58 +02:00
Julien Chaumond	f54dc3f4d5	[ci] Load pretrained models into the default (long-lived) cache There's an inconsistency right now where: - we load some models into CACHE_DIR - and some models in the default cache - and often, in both for the same models When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth. I'd rather always use the default cache	2020-04-30 22:30:15 -04:00
Patrick von Platen	d13eca11e2	Higher tolerance for past testing in T5 (#3843 )	2020-04-17 11:25:14 -04:00
Patrick von Platen	38f7461df3	[TFT5, Cache] Add cache to TFT5 (#3772 ) * correct gpt2 test inputs * make style * delete modeling_gpt2 change in test file * translate from pytorch * correct tests * fix conflicts * fix conflicts * fix conflicts * fix conflicts * make tensorflow t5 caching work * make style * clean reorder cache * remove unnecessary spaces * fix test	2020-04-16 16:14:52 +02:00
Patrick von Platen	01c37dcdb5	[Config, Caching] Remove `output_past` everywhere and replace by `use_cache` argument (#3734 ) * remove output_past from pt * make style * add optional input length for gpt2 * add use cache to prepare input * save memory in gpt2 * correct gpt2 test inputs * make past input optional for gpt2 * finish use_cache for all models * make style * delete modeling_gpt2 change in test file * correct docstring * correct is true statements for gpt2	2020-04-14 14:40:28 -04:00
Patrick von Platen	ce2298fb5f	[T5, generation] Add decoder caching for T5 (#3682 ) * initial commit to add decoder caching for T5 * better naming for caching * finish T5 decoder caching * correct test * added extensive past testing for T5 * clean files * make tests cleaner * improve docstring * improve docstring * better reorder cache * make style * Update src/transformers/modeling_t5.py Co-Authored-By: Yacine Jernite <yjernite@users.noreply.github.com> * make set output past work for all layers * improve docstring * improve docstring Co-authored-by: Yacine Jernite <yjernite@users.noreply.github.com>	2020-04-10 01:02:50 +02:00
Patrick von Platen	b815edf69f	[T5, Testst] Add extensive hard-coded integration tests and make sure PT and TF give equal results (#3550 ) * add some t5 integration tests * finish summarization and translation integration tests for T5 - results loook good * add tf test * fix == vs is bug * fix tf beam search error and make tf t5 tests pass	2020-04-01 18:01:33 +02:00
Patrick von Platen	75ec6c9e3a	[T5] make decoder input ids optional for t5 training (#3521 ) * make decoder input ids optional for t5 training * lm_lables should not be shifted in t5 * add tests * finish shift right functionality for PT T5 * move shift right to correct class * cleaner code * replace -100 values with pad token id * add assert statement * remove unnecessary for loop * make style	2020-03-30 13:45:26 +02:00
Patrick von Platen	bbf26c4e61	Support T5 Generation (#3228 ) * fix conflicts * update bart max length test * correct spelling mistakes * implemented model specific encode function * fix merge conflicts * better naming * save intermediate state -> need to rethink strucuture a bit * leave tf problem as it is for now * current version * add layers.pop * remove ipdb * make style * clean return cut decoding * remove ipdbs * Fix restoring layers in the decoders that doesnt exists. * push good intermediate solution for now * fix conflicts * always good to refuse to merge conflicts when rebasing * fix small bug * improve function calls * remove unused file * add correct scope behavior for t5_generate Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>	2020-03-19 23:18:23 +01:00
Julien Chaumond	9cda3620b6	Fix (non-slow) tests on GPU (torch) (#3024 ) * Fix tests on GPU (torch) * Fix bart slow tests Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-02-26 11:59:25 -05:00
Sam Shleifer	53ce3854a1	New BartModel (#2745 ) * Results same as fairseq * Wrote a ton of tests * Struggled with api signatures * added some docs	2020-02-20 18:11:13 -05:00
alberduris	81d6841b4b	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
alberduris	dd4df80f0b	Moved the encoded_prompts to correct device	2020-01-06 15:11:12 +01:00
Aymeric Augustin	c824d15aa1	Remove __future__ imports.	2019-12-22 17:47:54 +01:00
Aymeric Augustin	345c23a60f	Replace (TF)CommonTestCases for modeling with a mixin. I suspect the wrapper classes were created in order to prevent the abstract base class (TF)CommonModelTester from being included in test discovery and running, because that would fail. I solved this by replacing the abstract base class with a mixin. Code changes are just de-indenting and automatic reformattings performed by black to use the extra line space.	2019-12-22 15:35:18 +01:00
Aymeric Augustin	7e98e211f0	Remove unittest.main() in test modules. This construct isn't used anymore these days. Running python tests/test_foo.py puts the tests/ directory on PYTHONPATH, which isn't representative of how we run tests. Use python -m unittest tests/test_foo.py instead.	2019-12-22 14:42:03 +01:00
Aymeric Augustin	ced0a94204	Switch test files to the standard test_*.py scheme.	2019-12-22 14:15:13 +01:00

45 Commits