transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 01:32:23 +06:00

Author	SHA1	Message	Date
Sam Shleifer	3487be75ef	[Marian] documentation and AutoModel support (#4152 ) - MarianSentencepieceTokenizer - > MarianTokenizer - Start using unk token. - add docs page - add better generation params to MarianConfig - more conversion utilities	2020-05-10 13:54:57 -04:00
Patrick von Platen	cf08830c28	[Pipeline, Generation] tf generation pipeline bug (#4217 ) * fix PR * move tests to correct place	2020-05-08 08:30:05 -04:00
Jared T Nielsen	8bf7312654	Add AlbertForPreTraining and TFAlbertForPreTraining models. (#4057 ) * Add AlbertForPreTraining and TFAlbertForPreTraining models. * PyTorch conversion * TensorFlow conversion * style Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-05-07 19:44:51 -04:00
Julien Chaumond	0ae96ff8a7	BIG Reorganize examples (#4213 ) * Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around	2020-05-07 13:48:44 -04:00
Funtowicz Morgan	0a6cbea0a5	Rewritten batch support in pipelines. (#4154 ) * Rewritten batch support in pipelines. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix imports sorting 🔧 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Set pad_to_max_length=True by default on Pipeline. * Set pad_to_max_length=False for generation pipelines. Most of generation models doesn't have padding token. * Address @joeddav review comment: Uniformized args. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> Address @joeddav review comment: Uniformized *args (second). Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-05-07 09:52:40 -04:00
Patrick von Platen	dca34695d0	Reformer (#3351 ) * first copy & past commit from Bert and morgans LSH code * add easy way to compare to trax original code * translate most of function * make trax lsh self attention deterministic with numpy seed + copy paste code * add same config * add same config * make layer init work * implemented hash_vectors function for lsh attention * continue reformer translation * hf LSHSelfAttentionLayer gives same output as trax layer * refactor code * refactor code * refactor code * refactor * refactor + add reformer config * delete bogus file * split reformer attention layer into two layers * save intermediate step * save intermediate step * make test work * add complete reformer block layer * finish reformer layer * implement causal and self mask * clean reformer test and refactor code * fix merge conflicts * fix merge conflicts * update init * fix device for GPU * fix chunk length init for tests * include morgans optimization * improve memory a bit * improve comment * factorize num_buckets * better testing parameters * make whole model work * make lm model work * add t5 copy paste tokenizer * add chunking feed forward * clean config * add improved assert statements * make tokenizer work * improve test * correct typo * extend config * add complexer test * add new axial position embeddings * add local block attention layer * clean tests * refactor * better testing * save intermediate progress * clean test file * make shorter input length work for model * allow variable input length * refactor * make forward pass for pretrained model work * add generation possibility * finish dropout and init * make style * refactor * add first version of RevNet Layers * make forward pass work and add convert file * make uploaded model forward pass work * make uploaded model forward pass work * refactor code * add namedtuples and cache buckets * correct head masks * refactor * made reformer more flexible * make style * remove set max length * add attention masks * fix up tests * fix lsh attention mask * make random seed optional for the moment * improve memory in reformer * add tests * make style * make sure masks work correctly * detach gradients * save intermediate * correct backprob through gather * make style * change back num hashes * rename to labels * fix rotation shape * fix detach * update * fix trainer * fix backward dropout * make reformer more flexible * fix conflict * fix * fix * add tests for fixed seed in reformer layer * fix trainer typo * fix typo in activations * add fp16 tests * add fp16 training * support fp16 * correct gradient bug in reformer * add fast gelu * re-add dropout for embedding dropout * better naming * better naming * renaming * finalize test branch * finalize tests * add more tests * finish tests * fix * fix type trainer * fix fp16 tests * fix tests * fix tests * fix tests * fix issue with dropout * fix dropout seeds * correct random seed on gpu * finalize random seed for dropout * finalize random seed for dropout * remove duplicate line * correct half precision bug * make style * refactor * refactor * docstring * remove sinusoidal position encodings for reformer * move chunking to modeling_utils * make style * clean config * make style * fix tests * fix auto tests * pretrained models * fix docstring * update conversion file * Update pretrained_models.rst * fix rst * fix rst * update copyright * fix test path * fix test path * fix small issue in test * include reformer in generation tests * add docs for axial position encoding * finish docs * Update convert_reformer_trax_checkpoint_to_pytorch.py * remove isort * include sams comments * remove wrong comment in utils * correct typos * fix typo * Update reformer.rst * applied morgans optimization * make style * make gpu compatible * remove bogus file * big test refactor * add example for chunking * fix typo * add to README	2020-05-07 10:17:01 +02:00
Julien Plu	aad50151f3	TF version of the trainer (#4017 ) * First commit to add a TF version of the trainer. * Make the TF trainer closer to what looks the PT trainer * Refactoring common code between the PT and TF trainer into an util file. * Some bugfix + better similarity with the PT trainer * Add missing class in transformers init * Bugfix over prediction + use classification report instead of simple metrics * Fix name error * Fix optimization tests + style * Apply style * Several bugfix for multi-gpu training * Apply style * Apply style * Add glue example for the TF trainer * Several bugix + address the reviews * Fix on the TF training args file * Add a debug mode * Bugfix in utils_ner.py when segment_ids is None * Apply style * Apply style * Add TPU strategy * Fix selection strategy	2020-05-06 12:56:52 -04:00
Lysandre Debut	79b1c6966b	Pytorch 1.5.0 (#3973 ) * Standard deviation can no longer be set to 0 * Remove torch pinned version * 9th instead of 10th, silly me	2020-05-05 10:23:01 -04:00
Patrick von Platen	8e67573a64	[EncoderDecoder Tests] Improve tests (#4046 ) * Hoist bert model tester for patric * indent * make tests work * Update tests/test_modeling_bert.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: sshleifer <sshleifer@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-04 02:18:36 +02:00
Sam Shleifer	18db92dd9a	[testing] add timeout_decorator (#3543 )	2020-05-01 09:05:47 -04:00
Julien Chaumond	f39217a5ec	[tests] Light cleanup of tempfile in tests/	2020-04-30 22:30:15 -04:00
Julien Chaumond	f54dc3f4d5	[ci] Load pretrained models into the default (long-lived) cache There's an inconsistency right now where: - we load some models into CACHE_DIR - and some models in the default cache - and often, in both for the same models When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth. I'd rather always use the default cache	2020-04-30 22:30:15 -04:00
Julien Chaumond	ab90353f1a	[cli] {login, upload, s3} display more helpful error messages	2020-04-30 12:51:06 -04:00
Julien Chaumond	452dd0e4d9	[ci] Align test_hf_api.py with API change	2020-04-30 12:06:01 -04:00
Sam Shleifer	2c77842887	[Fix common tests on GPU] send model, ids to torch_device (#4014 )	2020-04-29 09:47:20 -04:00
Sam Shleifer	847e7f3379	MarianMTModel.from_pretrained('Helsinki-NLP/opus-marian-en-de') (#3908 ) Co-Authored-By: Stefan Schweter <stefan@schweter.it>	2020-04-28 18:22:37 -04:00
Patrick von Platen	fa49b9afea	Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility (#3383 ) * change encoder decoder style to bart & t5 style * make encoder decoder generation dummy work for bert * make style * clean init config in encoder decoder * add tests for encoder decoder models * refactor and add last tests * refactor and add last tests * fix attn masks for bert encoder decoder * make style * refactor prepare inputs for Bert * refactor * finish encoder decoder * correct typo * add docstring to config * finish * add tests * better naming * make style * fix flake8 * clean docstring * make style * rename	2020-04-28 15:11:09 +02:00
Lorenzo Ampil	f16540fcba	Pipeline for Text Generation: GenerationPipeline (#3758 ) * Add GenerationPipeline * Fix parameter names * Correct parameter __call__ parameters * Add model type attribute and correct function calls for prepare_input * Take out trailing commas from init attributes * Remove unnecessary tokenization line * Implement support for multiple text inputs * Apply generation support for multiple input text prompts * Take out tensor coersion * Take out batch index * Add text prompt to return sequence * Squeeze token tensore before decoding * Return only a single list of sequences if only one prompt was used * Correct results variable name * Add GenerationPipeline to SUPPORTED_TASKS with the alias , initalized w GPT2 * Registedred AutoModelWithLMHead for both pt and t * Update docstring for GenerationPipeline * Add kwargs parameter to mode.generate * Take out kwargs parameter after all * Add generation pipeline example in pipeline docstring * Fix max length by squeezing tokens tensor * Apply ensure_tensor_on_device to pytorch tensor * Include generation step in torch.no_grad * Take out input from prepare_xlm_input and set 'en' as default xlm_language * Apply framework specific encoding during prepare_input * Format w make style * Move GenerationPipeline import to follow proper import sorting * Take out training comma from generation dict * Apply requested changes * Change name to TextGenerationPipeline * Apply TextGenerationPipeline rename to __init___ * Changing alias to * Set input mapping as input to ensure_tensor_on_device * Fix assertion placement * Add test_text_generation * Add TextGenerationPipeline to PipelineCommonTests * Take out whitespace * Format __init__ w black * Fix __init__ style * Forman __init___ * Add line to end of __init__ * Correct model tokenizer set for test_text_generation * Ensure to return list of list, not list of string (to pass test) * Limit test models to only 3 to limit runtime to address circleCI timeout error * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Remove argument docstring, __init__, add additional __call__ arguments, and reformat results to list of dict * Fix blank result list * Add TextGenerationPipeline to pipelines.rst * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Fix typos from adding PADDING_TEXT_TOKEN_LENGTH * Fix incorrectly moved result list * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Add back generation line and make style * Take out blank whitespace * Apply new alis, text-generation, to test_pipelines * Fix text generation alias in test * Update src/transformers/pipelines.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-04-22 09:37:03 -04:00
Julien Chaumond	dd9d483d03	Trainer (#3800 ) * doc * [tests] Add sample files for a regression task * [HUGE] Trainer * Feedback from @sshleifer * Feedback from @thomwolf + logging tweak * [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes * [glue] Use default max_seq_length of 128 like before * [glue] move DataTrainingArguments around * [ner] Change interface of InputExample, and align run_{tf,pl} * Re-align the pl scripts a little bit * ner * [ner] Add integration test * Fix language_modeling with API tweak * [ci] Tweak loss target * Don't break console output * amp.initialize: model must be on right device before * [multiple-choice] update for Trainer * Re-align to `827d6d6ef0`	2020-04-21 20:11:56 -04:00
Thomas Wolf	827d6d6ef0	Cleanup fast tokenizers integration (#3706 ) * First pass on utility classes and python tokenizers * finishing cleanup pass * style and quality * Fix tests * Updating following @mfuntowicz comment * style and quality * Fix Roberta * fix batch_size/seq_length inBatchEncoding * add alignement methods + tests * Fix OpenAI and Transfo-XL tokenizers * adding trim_offsets=True default for GPT2 et RoBERTa * style and quality * fix tests * add_prefix_space in roberta * bump up tokenizers to rc7 * style * unfortunately tensorfow does like these - removing shape/seq_len for now * Update src/transformers/tokenization_utils.py Co-Authored-By: Stefan Schweter <stefan@schweter.it> * Adding doc and docstrings * making flake8 happy Co-authored-by: Stefan Schweter <stefan@schweter.it>	2020-04-18 13:43:57 +02:00
Lysandre Debut	8b63a01d95	XLM tokenizer should encode with bos token (#3791 ) * XLM tokenizer should encode with bos token * Update tests	2020-04-17 11:28:55 -04:00
Patrick von Platen	1d4a35b396	Higher tolerance for past testing in TF T5 (#3844 )	2020-04-17 11:26:16 -04:00
Patrick von Platen	d13eca11e2	Higher tolerance for past testing in T5 (#3843 )	2020-04-17 11:25:14 -04:00
Pierric Cistac	6d00033e97	Question Answering support for Albert and Roberta in TF (#3812 ) * Add TFAlbertForQuestionAnswering * Add TFRobertaForQuestionAnswering * Update TFAutoModel with Roberta/Albert for QA * Clean `super` TF Albert calls	2020-04-17 10:45:30 -04:00
Patrick von Platen	baca8fa8e6	clean pipelines (#3795 )	2020-04-16 10:21:34 -04:00
Patrick von Platen	38f7461df3	[TFT5, Cache] Add cache to TFT5 (#3772 ) * correct gpt2 test inputs * make style * delete modeling_gpt2 change in test file * translate from pytorch * correct tests * fix conflicts * fix conflicts * fix conflicts * fix conflicts * make tensorflow t5 caching work * make style * clean reorder cache * remove unnecessary spaces * fix test	2020-04-16 16:14:52 +02:00
Patrick von Platen	01c37dcdb5	[Config, Caching] Remove `output_past` everywhere and replace by `use_cache` argument (#3734 ) * remove output_past from pt * make style * add optional input length for gpt2 * add use cache to prepare input * save memory in gpt2 * correct gpt2 test inputs * make past input optional for gpt2 * finish use_cache for all models * make style * delete modeling_gpt2 change in test file * correct docstring * correct is true statements for gpt2	2020-04-14 14:40:28 -04:00
Teven	352d5472b0	Shift labels internally within TransfoXLLMHeadModel when called with labels (#3716 ) * Shifting labels inside TransfoXLLMHead * Changed doc to reflect change * Updated pytorch test * removed IDE whitespace changes * black reformat Co-authored-by: TevenLeScao <teven.lescao@gmail.com>	2020-04-13 18:11:23 +02:00
Julien Chaumond	b169ac9c2b	[examples] Generate argparsers from type hints on dataclasses (#3669 ) * [examples] Generate argparsers from type hints on dataclasses * [HfArgumentParser] way simpler API * Restore run_language_modeling.py for easier diff * [HfArgumentParser] final tweaks from code review	2020-04-10 12:21:58 -04:00
Sam Shleifer	7a7fdf71f8	Multilingual BART - (#3602 ) - support mbart-en-ro weights - add MBartTokenizer	2020-04-10 11:25:39 -04:00
Patrick von Platen	ce2298fb5f	[T5, generation] Add decoder caching for T5 (#3682 ) * initial commit to add decoder caching for T5 * better naming for caching * finish T5 decoder caching * correct test * added extensive past testing for T5 * clean files * make tests cleaner * improve docstring * improve docstring * better reorder cache * make style * Update src/transformers/modeling_t5.py Co-Authored-By: Yacine Jernite <yjernite@users.noreply.github.com> * make set output past work for all layers * improve docstring * improve docstring Co-authored-by: Yacine Jernite <yjernite@users.noreply.github.com>	2020-04-10 01:02:50 +02:00
LysandreJik	31baeed614	Update quotes cc @julien-c	2020-04-09 09:09:00 -04:00
Lysandre Debut	6435b9f908	Updating the TensorFlow models to work as expected with tokenizers v3.0.0 (#3684 ) * Updating modeling tf files; adding tests * Merge `encode_plus` and `batch_encode_plus`	2020-04-08 16:22:44 -04:00
Sam Shleifer	715aa5b135	[Bart] Replace config.output_past with use_cache kwarg (#3632 )	2020-04-07 19:08:26 -04:00
Sam Shleifer	0a4b1068e1	Speedup torch summarization tests (#3663 )	2020-04-07 14:01:30 -04:00
Funtowicz Morgan	96ab75b8dd	Tokenizers v3.0.0 (#3185 ) * Renamed num_added_tokens to num_special_tokens_to_add Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Cherry-Pick: Partially fix space only input without special tokens added to the output #3091 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added property is_fast on PretrainedTokenizer and PretrainedTokenizerFast Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Make fast tokenizers unittests work on Windows. * Entirely refactored unittest for tokenizers fast. * Remove ABC class for CommonFastTokenizerTest * Added embeded_special_tokens tests from allenai @dirkgr * Make embeded_special_tokens tests from allenai more generic * Uniformize vocab_size as a property for both Fast and normal tokenizers * Move special tokens handling out of PretrainedTokenizer (SpecialTokensMixin) * Ensure providing None input raise the same ValueError than Python tokenizer + tests. * Fix invalid input for assert_padding when testing batch_encode_plus * Move add_special_tokens from constructor to tokenize/encode/[batch_]encode_plus methods parameter. * Ensure tokenize() correctly forward add_special_tokens to rust. * Adding None checking on top on encode / encode_batch for TransfoXLTokenizerFast. Avoid stripping on None values. * unittests ensure tokenize() also throws a ValueError if provided None * Added add_special_tokens unittest for all supported models. * Style * Make sure TransfoXL test run only if PyTorch is provided. * Split up tokenizers tests for each model type. * Fix invalid unittest with new tokenizers API. * Filter out Roberta openai detector models from unittests. * Introduce BatchEncoding on fast tokenizers path. This new structure exposes all the mappings retrieved from Rust. It also keeps the current behavior with model forward. * Introduce BatchEncoding on slow tokenizers path. Backward compatibility. * Improve error message on BatchEncoding for slow path * Make add_prefix_space True by default on Roberta fast to match Python in majority of cases. * Style and format. * Added typing on all methods for PretrainedTokenizerFast * Style and format * Added path for feeding pretokenized (List[str]) input to PretrainedTokenizerFast. * Style and format * encode_plus now supports pretokenized inputs. * Remove user warning about add_special_tokens when working on pretokenized inputs. * Always go through the post processor. * Added support for pretokenized input pairs on encode_plus * Added is_pretokenized flag on encode_plus for clarity and improved error message on input TypeError. * Added pretokenized inputs support on batch_encode_plus * Update BatchEncoding methods name to match Encoding. * Bump setup.py tokenizers dependency to 0.7.0rc1 * Remove unused parameters in BertTokenizerFast * Make sure Roberta returns token_type_ids for unittests. * Added missing typings * Update add_tokens prototype to match tokenizers side and allow AddedToken * Bumping tokenizers to 0.7.0rc2 * Added documentation for BatchEncoding * Added (unused) is_pretokenized parameter on PreTrainedTokenizer encode_plus/batch_encode_plus methods. * Added higher-level typing for tokenize / encode_plus / batch_encode_plus. * Fix unittests failing because add_special_tokens was defined as a constructor parameter on Rust Tokenizers. * Fix text-classification pipeline using the wrong tokenizer * Make pipelines works with BatchEncoding * Turn off add_special_tokens on tokenize by default. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Remove add_prefix_space from tokenize call in unittest. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Style and quality Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Correct message for batch_encode_plus none input exception. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix invalid list comprehension for offset_mapping overriding content every iteration. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * TransfoXL uses Strip normalizer. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Bump tokenizers dependency to 0.7.0rc3 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Support AddedTokens for special_tokens and use left stripping on mask for Roberta. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * SpecilaTokenMixin can use slots to faster access to underlying attributes. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Remove update_special_tokens from fast tokenizers. * Ensure TransfoXL unittests are run only when torch is available. * Style. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Style * Style 🙏🙏 * Remove slots on SpecialTokensMixin, need deep dive into pickle protocol. * Remove Roberta warning on __init__. * Move documentation to Google style. Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-04-07 00:29:15 +02:00
Patrick von Platen	2ee410560e	[Generate, Test] Split generate test function into beam search, no beam search (#3601 ) * split beam search and no beam search test * fix test * clean generate tests	2020-04-06 10:37:05 +02:00
Lysandre Debut	d5d7d88612	ELECTRA (#3257 ) * Electra wip * helpers * Electra wip * Electra v1 * ELECTRA may be saved/loaded * Generator & Discriminator * Embedding size instead of halving the hidden size * ELECTRA Tokenizer * Revert BERT helpers * ELECTRA Conversion script * Archive maps * PyTorch tests * Start fixing tests * Tests pass * Same configuration for both models * Compatible with base + large * Simplification + weight tying * Archives * Auto + Renaming to standard names * ELECTRA is uncased * Tests * Slight API changes * Update tests * wip * ElectraForTokenClassification * temp * Simpler arch + tests Removed ElectraForPreTraining which will be in a script * Conversion script * Auto model * Update links to S3 * Split ElectraForPreTraining and ElectraForTokenClassification * Actually test PreTraining model * Remove num_labels from configuration * wip * wip * From discriminator and generator to electra * Slight API changes * Better naming * TensorFlow ELECTRA tests * Accurate conversion script * Added to conversion script * Fast ELECTRA tokenizer * Style * Add ELECTRA to README * Modeling Pytorch Doc + Real style * TF Docs * Docs * Correct links * Correct model intialized * random fixes * style * Addressing Patrick's and Sam's comments * Correct links in docs	2020-04-03 14:10:54 -04:00
Yohei Tamura	8594dd80dd	BertJapaneseTokenizer accept options for mecab (#3566 ) * BertJapaneseTokenizer accept options for mecab * black * fix mecab_option to Option[str]	2020-04-03 11:12:19 -04:00
Patrick von Platen	a4ee4da18a	[T5, TF 2.2] change tf t5 argument naming (#3547 ) * change tf t5 argument naming for TF 2.2 * correct bug in testing	2020-04-01 22:04:20 +02:00
Patrick von Platen	b815edf69f	[T5, Testst] Add extensive hard-coded integration tests and make sure PT and TF give equal results (#3550 ) * add some t5 integration tests * finish summarization and translation integration tests for T5 - results loook good * add tf test * fix == vs is bug * fix tf beam search error and make tf t5 tests pass	2020-04-01 18:01:33 +02:00
Patrick von Platen	b38d552a92	[Generate] Add bad words list argument to the generate function (#3367 ) * add bad words list * make style * add bad_words_tokens * make style * better naming * make style * fix typo	2020-03-31 18:42:31 +02:00
Sam Shleifer	8deff3acf2	[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… (#3488 )	2020-03-30 12:28:27 -04:00
Patrick von Platen	75ec6c9e3a	[T5] make decoder input ids optional for t5 training (#3521 ) * make decoder input ids optional for t5 training * lm_lables should not be shifted in t5 * add tests * finish shift right functionality for PT T5 * move shift right to correct class * cleaner code * replace -100 values with pad token id * add assert statement * remove unnecessary for loop * make style	2020-03-30 13:45:26 +02:00
Sam Shleifer	f6a23d1911	[BART] add bart-large-xsum weights (#3422 )	2020-03-29 10:51:13 -04:00
Sam Shleifer	3ee431dd4c	[Bart/Memory] Two separate, smaller decoder attention masks (#3371 )	2020-03-26 21:34:15 -04:00
Sam Shleifer	39371ee454	[Bart/Memory] don't create lm_head (#3323 ) * delete lm_head, skips weight tying * Fixed s3	2020-03-26 18:40:39 -04:00
sakares saengkaew	1a6c546c6f	Add missing token classification for XLM (#3277 ) * Add the missing token classification for XLM * fix styling * Add XLMForTokenClassification to AutoModelForTokenClassification class * Fix docstring typo for non-existing class * Add the missing token classification for XLM * fix styling * fix styling * Add XLMForTokenClassification to AutoModelForTokenClassification class * Fix docstring typo for non-existing class * Add missing description for AlbertForTokenClassification * fix styling * Add missing docstring for AlBert * Slow tests should be slow Co-authored-by: Sakares Saengkaew <s.sakares@gmail.com> Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-03-26 10:22:13 -04:00
Patrick von Platen	022e8fab97	Adds translation pipeline (#3419 ) * fix merge conflicts * add t5 summarization example * change parameters for t5 summarization * make style * add first code snippet for translation * only add prefixes * add prefix patterns * make style * renaming * fix conflicts * remove unused patterns * solve conflicts * fix merge conflicts * remove translation example * remove summarization example * make sure tensors are in numpy for float comparsion * re-add t5 config * fix t5 import config typo * make style * remove unused numpy statements * update doctstring * import translation pipeline	2020-03-26 13:50:58 +01:00
Patrick von Platen	9c683ef01e	Add t5 to pipeline(task='summarization') (#3413 ) * solve conflicts * move warnings below * incorporate changes * add pad_to_max_length to pipelines * add bug fix for T5 beam search * add prefix patterns * make style * fix conflicts * adapt pipelines for task specific parameters * improve docstring * remove unused patterns	2020-03-26 11:03:13 +01:00

1 2 3 4 5 ...

288 Commits