Commit Graph

634 Commits

Stas Bekman
999a1c957a
skip failing FSMT CUDA tests until investigated (#7220) 2020-09-17 16:53:14 -04:00
Stas Bekman
1eeb206bef
[ported model] FSMT (FairSeq MachineTranslation) (#6940)
* ready for PR

* cleanup

* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST

* fix

* perfectionism

* revert change from another PR

* odd, already committed this one

* non-interactive upload workaround

* backup the failed experiment

* store langs in config

* workaround for localizing model path

* doc clean up as in https://github.com/huggingface/transformers/pull/6956

* style

* back out debug mode

* document: run_eval.py --num_beams 10

* remove unneeded constant

* typo

* re-use bart's Attention

* re-use EncoderLayer, DecoderLayer from bart

* refactor

* send to cuda and fp16

* cleanup

* revert (moved to another PR)

* better error message

* document run_eval --num_beams

* solve the problem of tokenizer finding the right files when model is local

* polish, remove hardcoded config

* add a note that the file is autogenerated to avoid losing changes

* prep for org change, remove unneeded code

* switch to model4.pt, update scores

* s/python/bash/

* missing init (but doesn't impact the finetuned model)

* cleanup

* major refactor (reuse-bart)

* new model, new expected weights

* cleanup

* cleanup

* full link

* fix model type

* merge porting notes

* style

* cleanup

* have to create a DecoderConfig object to handle vocab_size properly

* doc fix

* add note (not a public class)

* parametrize

* add bleu scores integration tests

* skip test if sacrebleu is not installed

* cache heavy models/tokenizers

* some tweaks

* remove tokens that aren't used

* more purging

* simplify code

* switch to using decoder_start_token_id

* add doc

* Revert "major refactor (reuse-bart)"

This reverts commit 226dad15ca.

* decouple from bart

* remove unused code #1

* remove unused code #2

* remove unused code #3

* update instructions

* clean up

* move bleu eval to examples

* check import only once

* move data+gen script into files

* reuse via import

* take less space

* add prepare_seq2seq_batch (auto-tested)

* cleanup

* recode test to use json instead of yaml

* ignore keys not needed

* use the new -y in transformers-cli upload -y

* [xlm tok] config dict: fix str into int to match definition (#7034)

* [s2s] --eval_max_generate_length (#7018)

* Fix CI with change of name of nlp (#7054)

* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last

* extending to support allen_nlp wmt models

- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models

* sync with changes

* s/fsmt-wmt/wmt/ in model names

* s/fsmt-wmt/wmt/ in model names (p2)

* s/fsmt-wmt/wmt/ in model names (p3)

* switch to a better checkpoint

* typo

* make non-optional args as such - adjust tests where possible or skip when there is no other choice

* consistency

* style

* adjust header

* cards moved (model rename)

* use best custom hparams

* update info

* remove old cards

* cleanup

* s/stas/facebook/

* update scores

* s/allen_nlp/allenai/

* url maps aren't needed

* typo

* move all the doc / build /eval generators to their own scripts

* cleanup

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix indent

* duplicated line

* style

* use the correct add_start_docstrings

* oops

* resizing can't be done with the core approach, due to 2 dicts

* check that the arg is a list

* style

* style

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-17 11:31:29 -04:00
Sylvain Gugger
492bb6aa48
Trainer multi label (#7191)
* Trainer accepts multiple labels

* Missing import

* Fix docstrings
2020-09-17 08:15:37 -04:00
Julien Plu
af8425b749
Refactoring the TF activation functions (#7150)
* Refactoring the activation functions into a common file

* Apply style

* remove unused import

* fix tests

* Fix tests.
2020-09-16 07:03:47 -04:00
Yih-Dar
4c62c6021a
fix ZeroDivisionError and epoch counting (#7125)
* fix ZeroDivisionError and epoch counting

* Add test for num_train_epochs calculation in trainer.py

* Remove @require_non_multigpu for test_num_train_epochs_in_training
2020-09-15 11:51:50 -04:00
Sylvain Gugger
7186ca6240
Multi predictions trainer (#7126)
* Allow multiple outputs

* Formatting

* Move the unwrapping before metrics

* Fix typo

* Add test for non-supported config options
2020-09-15 10:27:24 -04:00
Sylvain Gugger
2bf70e2150
Fix reproducible tests in Trainer (#7119)
* Fix reproducible tests in Trainer

* Deal with multiple GPUs
2020-09-15 03:32:44 -04:00
Sam Shleifer
9e89390ce1
[QOL] add signature for prepare_seq2seq_batch (#7108) 2020-09-14 20:33:08 -04:00
Stas Bekman
4d39148419
fix deprecation warnings (#7033)
* fix deprecation warnings

* remove tests/test_tokenization_common.py's test_padding_to_max_length

* revert test_padding_to_max_length
2020-09-14 07:51:19 -04:00
Stas Bekman
576eec98e0
ignore FutureWarning in tests (#7079) 2020-09-14 07:50:51 -04:00
Lysandre Debut
bb3106f741
Temporarily skip failing tests due to dependency change (#7118)
* Temporarily skip failing tests due to dependency change

* Remove trace
2020-09-14 07:42:13 -04:00
Suraj Patil
0a8c17d53c
[T5Tokenizer] remove prefix_tokens (#7078) 2020-09-11 14:18:45 -04:00
Sylvain Gugger
ae736163d0
Add tests and fix various bugs in ModelOutput (#7073)
* Add tests and fix various bugs in ModelOutput

* Update tests/test_model_output.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-09-11 12:01:33 -04:00
Patrick von Platen
221d4c63a3
clean naming (#7068) 2020-09-11 09:57:53 +02:00
Stas Bekman
8fcbe486e1
these tests require non-multigpu env (#7059)
* these tests require non-multigpu env

* cleanup

* clarify
2020-09-10 18:52:55 -04:00
Sylvain Gugger
514486739c
Fix CI with change of name of nlp (#7054)
* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last
2020-09-10 14:51:08 -04:00
Sylvain Gugger
15a189049e
Add TF Funnel Transformer (#7029)
* Add TF Funnel Transformer

* Proper dummy input

* Formatting

* Update src/transformers/modeling_tf_funnel.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* One review comment forgotten

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-10 10:41:56 -04:00
Patrick von Platen
7fd1febf38
Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. (#6594)
* add conversion script

* improve conversion script

* make style

* add tryout files

* fix

* update

* add causal bert

* better names

* add tokenizer file as well

* finish causal_bert

* fix small bugs

* improve generate

* change naming

* renaming

* renaming

* renaming

* remove leftover files

* clean files

* add fix tokenizer

* finalize

* correct slow test

* update docs

* small fixes

* fix link

* adapt check repo

* apply sams and sylvains recommendations

* fix import

* implement Lysandres recommendations

* fix logger warn
2020-09-10 16:40:51 +02:00
Yu Liu
762cba3bda
Albert pretrain datasets/ datacollator (#6168)
* add dataset for albert pretrain

* datacollator for albert pretrain

* naming, comprehension, file reading change

* data cleaning is not needed after this modification

* delete prints

* fix a bug

* file structure change

* add tests for albert datacollator

* remove random seed

* add back len and get item function

* sample file for testing and test code added

* format change for black

* more format change

* Style

* var assignment issue resolved

* add back wrongly deleted DataCollatorWithPadding in init file

* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-09-10 07:56:29 -04:00
Lysandre Debut
15478c1287
Batch encode plus and overflowing tokens fail when there are no overflowing tokens for a sequence (#6677)
* Patch and test

* Fix tests
2020-09-09 06:55:17 -04:00
Julien Chaumond
ed71c21d6a
[from_pretrained] Allow tokenizer_type ≠ model_type (#6995) 2020-09-09 04:22:59 -04:00
Stas Bekman
d0963486c1
adding TRANSFORMERS_VERBOSITY env var (#6961)
* introduce TRANSFORMERS_VERBOSITY env var + test + test helpers

* cleanup

* remove helper function
2020-09-09 04:08:01 -04:00
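
A minimal usage sketch of the TRANSFORMERS_VERBOSITY variable from #6961 (a sketch, assuming the level names mirror the standard logging levels; the variable must be set before transformers is imported):

```python
import os

# Assumed accepted values: debug, info, warning, error, critical.
os.environ["TRANSFORMERS_VERBOSITY"] = "error"

from transformers.utils import logging  # noqa: E402

print(logging.get_verbosity())  # 40, i.e. ERROR
```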
Sylvain Gugger
d155b38d6e
Funnel transformer (#6908)
* Initial model

* Fix upsampling

* Add special cls token id and test

* Formatting

* Test and first FunnelTokenizerFast

* Common tests

* Fix the check_repo script and document Funnel

* Doc fixes

* Add all models

* Write doc

* Fix test

* Fix copyright

* Forgot some layers can be repeated

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/modeling_funnel.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Update src/transformers/modeling_funnel.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments

* Update src/transformers/modeling_funnel.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Slow integration test

* Make small integration test

* Formatting

* Add checkpoint and separate classification head

* Formatting

* Expand list, fix link and add in pretrained models

* Styling

* Add the model in all summaries

* Typo fixes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-08 08:08:08 -04:00
Boris Dayma
995a958dd1
feat: allow prefix for any generative model (#5885)
* feat: allow padding_text for any generative model

* docs(pipelines.py): correct typo

* Update src/transformers/pipelines.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* feat: rename padding_text to prefix

* fix: cannot tokenize empty text

* fix: pass prefix arg to pipeline

* test: add prefix to text-generation pipeline

* style: fix style

* style: clean code and make variable names more explicit

* set arg docstring to optional

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-07 03:03:45 -04:00
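
A short usage sketch of the renamed argument from #5885; `prefix` (formerly `padding_text`) is prepended to every prompt before generation. The prompt and prefix text here are arbitrary:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

out = generator(
    "the meaning of life is",
    prefix="In a recent interview, the scientist explained that ",
)
print(out[0]["generated_text"])
```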
Patrick von Platen
e3990d137a
fix (#6946) 2020-09-04 16:08:54 +02:00
Antonio V Mendoza
ea2c6f1afc
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793)
* added template files for LXMERT and completed the configuration_lxmert.py

* added modeling, tokenization, testing, and finishing touches for lxmert [yet to be tested]

* added model card for lxmert

* cleaning up lxmert code

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* tested torch lxmert, changed documentation, updated outputs, and other small fixes

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* renaming, other small issues, did not change TF code in this commit

* added lxmert question answering model in pytorch

* added capability to edit number of qa labels for lxmert

* made answer optional for lxmert question answering

* add option to return hidden_states for lxmert

* changed default qa labels for lxmert

* changed config archive path

* squashing 3 commits: merged UI + testing improvements + more UI and testing

* changed some variable names for lxmert

* TF LXMERT

* Various fixes to LXMERT

* Final touches to LXMERT

* AutoTokenizer order

* Add LXMERT to index.rst and README.md

* Merge commit test fixes + Style update

* TensorFlow 2.3.0 sequential model changes variable names

Remove inherited test

* Update src/transformers/modeling_tf_pytorch_utils.py

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* added suggestions

* Fixes

* Final fixes for TF model

* Fix docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-03 04:02:25 -04:00
Puneetha Pai
4ebb52afdb
test_tf_common: remove unused mixin class parameters (#6866) 2020-09-02 10:54:40 -04:00
Stas Bekman
e71f32c0ef
[testing] fix ambiguous test (#6898)
Since `generate()` does:
```
        num_beams = num_beams if num_beams is not None else self.config.num_beams
```
This test fails if `model.config.num_beams > 1` (which is the case in the model I'm porting).

This fix makes the test setup unambiguous by passing an explicit `num_beams=1` to `generate()`.

Thanks.
2020-09-02 16:18:17 +02:00
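
A minimal sketch of the disambiguation from #6898: pass `num_beams=1` explicitly so the assertion no longer depends on `model.config.num_beams`. The t5-small checkpoint is only a stand-in:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

input_ids = tok("translate English to German: Hello", return_tensors="pt").input_ids

# Explicit num_beams=1 -> greedy search regardless of the config default.
output_ids = model.generate(input_ids, num_beams=1, do_sample=False)
print(tok.decode(output_ids[0], skip_special_tokens=True))
```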
Suraj Patil
4230d30f77
[pipelines] Text2TextGenerationPipeline (#6744)
* add Text2TextGenerationPipeline

* remove max length warning

* remove comments

* remove input_length

* fix typo

* add tests

* use TFAutoModelForSeq2SeqLM

* doc

* typo

* add the doc below TextGenerationPipeline

* doc nit

* style

* delete comment
2020-09-02 07:34:35 -04:00
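
A brief usage sketch of the new pipeline from #6744 (the model choice here is an assumption; any seq2seq LM works):

```python
from transformers import pipeline

text2text = pipeline("text2text-generation", model="t5-small")

result = text2text("translate English to French: How old are you?")
print(result[0]["generated_text"])
```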
Patrick von Platen
afc4ece462
[Generate] Facilitate PyTorch generate using ModelOutputs (#6735)
* fix generate for GPT2 Double Head

* fix gpt2 double head model

* fix  bart / t5

* also add for no beam search

* fix no beam search

* fix encoder decoder

* simplify t5

* simplify t5

* fix t5 tests

* fix BART

* fix transfo-xl

* fix conflict

* integrating sylvains and sams comments

* fix tf past_decoder_key_values

* fix enc dec test
2020-09-01 12:38:25 +02:00
Sam Shleifer
8af1970e45
Fix marian slow test (#6854) 2020-08-31 16:10:43 -04:00
Huang Lianzhe
2de7ee0385
Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)
* add datacollator and dataset for next sentence prediction task

* bug fix (numbers of special tokens & truncate sequences)

* bug fix (+ dict inputs support for data collator)

* add padding for nsp data collator; renamed cached files to avoid conflict.

* add test for nsp data collator

* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-31 08:25:00 -04:00
Stas Bekman
563485bf95
[tests] fix typos in inputs (#6818) 2020-08-30 18:19:57 +08:00
Sam Shleifer
0f58903bb6
Pegasus finetune script: add --adafactor (#6811) 2020-08-29 17:43:32 -04:00
Sam Shleifer
3cac867fac
t5 model should make decoder_attention_mask (#6800) 2020-08-28 15:22:33 -04:00
Sam Shleifer
20f7786453
Fix style (#6803) 2020-08-28 15:02:25 -04:00
Sam Shleifer
9336086ab5
prepare_seq2seq_batch makes labels; decoder_input_ids made later. (#6654)
* broken test

* batch parity

* tests pass

* boom boom

* boom boom

* split out bart tokenizer tests

* fix tests

* boom boom

* Fixed dataset bug

* Fix marian

* Undo extra

* Get marian working

* Fix t5 tok tests

* Test passing

* Cleanup

* better assert msg

* require torch

* Fix mbart tests

* undo extra decoder_attn_mask change

* Fix import

* pegasus tokenizer can ignore src_lang kwargs

* unused kwarg test cov

* boom boom

* add todo for pegasus issue

* cover one word translation edge case

* Cleanup

* doc
2020-08-28 11:15:17 -04:00
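
A sketch of the behavior after #6654, assuming a Marian checkpoint: `prepare_seq2seq_batch` returns `labels` for the target side, and `decoder_input_ids` are built later by the model:

```python
from transformers import MarianTokenizer

tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")

batch = tok.prepare_seq2seq_batch(
    src_texts=["I love translation."],
    tgt_texts=["Ich liebe Übersetzung."],
    return_tensors="pt",
)
# After this PR: input_ids, attention_mask, labels -- no decoder_input_ids.
print(batch.keys())
```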
RafaelWO
cb276b41de
Transformer-XL: Improved tokenization with sacremoses (#6322)
* Improved tokenization with sacremoses

 * The TransfoXLTokenizer is now using sacremoses for tokenization
 * Added tokenization of comma-separated and floating point numbers.
 * Removed prepare_for_tokenization() from tokenization_transfo_xl.py because punctuation is handled by sacremoses
 * Added corresponding tests
 * Removed test comparing TransfoXLTokenizer and TransfoXLTokenizerFast
 * Added deprecation warning to TransfoXLTokenizerFast

* isort change

Co-authored-by: Teven <teven.lescao@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-08-28 09:56:17 -04:00
Stas Bekman
92ac2fa7d1
[transformers-cli] fix logger getter (#6777) 2020-08-27 20:01:17 -04:00
Lysandre
42fddacd1c Format 2020-08-27 18:31:51 +02:00
Stas Bekman
dbfe34f2f5
[test schedulers] adjust to test the first step's reading (#6429)
* [test schedulers] small improvement

* cleanup
2020-08-27 12:23:28 -04:00
Stas Bekman
e6b811f0a7
[testing] replace hardcoded paths to allow running tests from anywhere (#6523)
* [testing] replace hardcoded paths to allow running tests from anywhere

* fix the merge conflict
2020-08-27 12:22:18 -04:00
Nikolai Yakovenko
971d1802d0
Add AdaFactor optimizer from fairseq (#6722)
* AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM.

* update PR fixes, add basic test

* bug -- incorrect params in test

* bugfix -- import Adafactor into test

* bugfix -- removed accidental T5 include

* resetting T5 to master

* bugfix -- include Adafactor in __init__

* longer loop for adafactor test

* remove double error class declare

* lint

* black

* isort

* Update src/transformers/optimization.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* single docstring

* Cleanup docstring

Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-27 04:58:13 -04:00
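
A minimal usage sketch of the ported optimizer from #6722; with an explicit learning rate, `relative_step` must be disabled (the linear layer stands in for a real model):

```python
import torch
from transformers import Adafactor

model = torch.nn.Linear(10, 2)  # stand-in for a real model

optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,              # constant external lr ...
    scale_parameter=False,
    relative_step=False,  # ... so fairseq's time-dependent schedule is off
    warmup_init=False,
)
```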
Julien Chaumond
3242e4d942 [model_cards] Fix tiny typos 2020-08-26 23:16:06 +02:00
Patrick von Platen
858b7d5873
[TF Longformer] Improve Speed for TF Longformer (#6447)
* add tf graph compile tests

* fix conflict

* remove more tf transpose statements

* fix conflicts

* fix comment typos

* move function to class function

* fix black

* fix black

* make style
2020-08-26 14:55:41 -04:00
Lysandre
a75c64d80c Black 20 release 2020-08-26 17:20:22 +02:00
Lysandre Debut
77abd1e79f
Centralize logging (#6434)
* Logging

* Style

* hf_logging > utils.logging

* Address @thomwolf's comments

* Update test

* Update src/transformers/benchmark/benchmark_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Revert bad change

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-26 11:10:36 -04:00
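
A short sketch of the centralized API from #6434 (`hf_logging` became `transformers.utils.logging`); one call sets the verbosity for every module logger in the library:

```python
from transformers.utils import logging

logging.set_verbosity_warning()        # library-wide verbosity
logger = logging.get_logger(__name__)  # module logger under the shared root
logger.warning("routed through the shared transformers handler")
```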
Sam Shleifer
624495706c
T5Tokenizer adds EOS token if not already added (#5866)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-25 14:56:08 -04:00
Sam Shleifer
e11d923bfc
Fix pegasus-xsum integration test (#6726) 2020-08-25 14:06:28 -04:00
Sylvain Gugger
abc0202194
More tests to Trainer (#6699)
* More tests to Trainer

* Add warning in the doc
2020-08-25 07:07:36 -04:00
Sylvain Gugger
a573777901
Update repo to isort v5 (#6686)
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Sam Shleifer
5bf4465e6c
Regression test for pegasus bugfix (#6606) 2020-08-20 15:34:43 -04:00
sgugger
86c07e634f One last threshold to raise 2020-08-20 14:23:09 -04:00
Sylvain Gugger
e8af90c052
Move threshold up for flaky test with Electra (#6622)
* Move threshold up for flaky test with Electra

* Update above as well
2020-08-20 13:59:40 -04:00
Patrick von Platen
505f2d749e
[Tests] fix attention masks in Tests (#6621)
* fix distilbert

* fix typo
2020-08-20 13:23:47 -04:00
Denisa Roberts
c9454507cf
Add tests for Reformer tokenizer (#6485) 2020-08-20 18:58:44 +02:00
Sylvain Gugger
573bdb0a5d
Add tests to Trainer (#6605)
* Add tests to Trainer

* Test if removing long breaks everything

* Remove ugly hack

* Fix distributed test

* Use float for number of epochs
2020-08-20 11:13:50 -04:00
Suraj Patil
7581884dee
[BartTokenizerFast] add prepare_seq2seq_batch (#6543) 2020-08-19 10:37:48 -04:00
Patrick von Platen
8bcceaceff
fix model outputs test (#6593) 2020-08-19 16:18:51 +02:00
Pradhy729
2a7402cbd3
Feed forward chunking others (#6365)
* Feed forward chunking for Distilbert & Albert

* Added ff chunking for many other models

* Change model signature

* Added chunking for XLM

* Cleaned up by removing some variables.

* remove test_chunking flag

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-08-19 14:31:10 +02:00
Patrick von Platen
fe0b85e77a
[EncoderDecoder] Add functionality to tie encoder decoder weights (#6538)
* start adding tie encoder to decoder functionality

* finish model tying

* make style

* Apply suggestions from code review

* fix t5 list including cross attention

* apply sams suggestions

* Update src/transformers/modeling_encoder_decoder.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add max depth break point

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-19 14:23:45 +02:00
Sam Shleifer
ab42d74850
Fix bart base test (#6587) 2020-08-18 21:28:10 -04:00
Sam Shleifer
1529bf9680
add BartConfig.force_bos_token_to_be_generated (#6526)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-18 19:15:50 -04:00
Sam Shleifer
12d7624199
[marian] converter supports models from new Tatoeba project (#6342) 2020-08-17 23:55:42 -04:00
Suraj Patil
407da12ef1
[T5Tokenizer] add prepare_seq2seq_batch method (#6122)
* tests
2020-08-17 13:57:19 -04:00
Suraj Patil
2a77813d53
[BartTokenizer] add prepare s2s batch (#6212)
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-08-17 11:44:46 -04:00
Funtowicz Morgan
b41cc0b86a
Fix flaky ONNX tests (#6531) 2020-08-17 09:04:35 -04:00
Kevin Canwen Xu
37709b5909
Remove deprecated assertEquals (#6532)
`assertEquals` is deprecated: https://stackoverflow.com/questions/930995/assertequals-vs-assertequal-in-python/931011
This PR replaces these deprecated methods.
2020-08-17 17:13:58 +08:00
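
For illustration, the deprecated alias and its replacement side by side:

```python
import unittest

class ExampleTest(unittest.TestCase):
    def test_equality(self):
        self.assertEqual(2 + 2, 4)     # preferred
        # self.assertEquals(2 + 2, 4)  # deprecated alias; emits a DeprecationWarning
```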
Masatoshi Suzuki
48c6c6139f
Support additional dictionaries for BERT Japanese tokenizers (#6515)
* Update BERT Japanese tokenizers

* Update CircleCI config to download unidic

* Specify to use the latest dictionary packages
2020-08-17 12:00:23 +08:00
Patrick von Platen
1d6e71e116
[EncoderDecoder] Add Cross Attention for GPT2 (#6415)
* add cross attention layers for gpt2

* make gpt2 cross attention work

* finish bert2gpt2

* add explicit comments

* remove attention mask since not yet supported

* revert attn mask in pipeline

* Update src/transformers/modeling_gpt2.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_encoder_decoder.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-14 09:43:29 +02:00
Suraj Patil
680f1337c3
MBartForConditionalGeneration (#6441)
* add MBartForConditionalGeneration

* style

* rebase and fixes

* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS

* fix docs

* don't ignore mbart

* doc

* fix mbart fairseq link

* put mbart before bart

* apply doc suggestions
2020-08-14 03:21:16 -04:00
Lysandre Debut
f7cbc13db7
Test model outputs equivalence (#6445)
* Test model outputs equivalence

* Fix failing tests

* From dict to kwargs

* DistilBERT

* Addressing @sgugger and @patrickvonplaten's comments
2020-08-13 11:59:35 -04:00
Stas Bekman
e983da0e7d
cleanup tf unittests: part 2 (#6260)
* cleanup torch unittests: part 2

* remove trailing comma added by isort, which breaks flake8

* one more comma

* revert odd balls

* part 3: odd cases

* more ["key"] -> .key refactoring

* .numpy() is not needed

* more unnecessary .numpy() removed

* more simplification
2020-08-13 04:29:06 -04:00
Joe Davison
bc820476a5
add targets arg to fill-mask pipeline (#6239)
* add targets arg to fill-mask pipeline

* add tests and more error handling

* quality

* update docstring
2020-08-12 12:48:29 -04:00
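
A usage sketch of the new `targets` argument from #6239, which scores only a candidate set instead of ranking the whole vocabulary (the model choice is an assumption; targets should be single tokens in its vocab):

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

outputs = fill_mask(
    "The capital of France is [MASK].",
    targets=["Paris", "London"],
)
for o in outputs:
    print(o["sequence"], o["score"])
```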
Patrick von Platen
0735def8e1
[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411)
* add encoder-decoder for roberta

* fix headmask

* apply Sylvains suggestions

* fix typo

* Apply suggestions from code review
2020-08-12 18:23:30 +02:00
Sylvain Gugger
e9c3031463
Fixes to make life easier with the nlp library (#6423)
* allow using tokenizer.pad as a collate_fn in pytorch

* allow using tokenizer.pad as a collate_fn in pytorch

* Add documentation and tests

* Make attention mask the right shape

* Better test

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-08-12 08:00:56 -04:00
Stas Bekman
ece0903e11
lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361)
* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* [model_cards] electra-base-turkish-cased-ner (#6350)

* for electra-base-turkish-cased-ner

* Add metadata

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Temporarily de-activate TPU CI

* Update modeling_tf_utils.py (#6372)

fix typo: ckeckpoint->checkpoint

* the test now works again (#6371)

* correct pl link in readme (#6364)

* refactor almost identical tests (#6339)

* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt

* Small docfile fixes (#6328)

* Patch models (#6326)

* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info



* Update check_repo

* Ci GitHub caching (#6382)

* Cache Github Actions CI

* Remove useless file

* Colab button (#6389)

* Add colab button

* Add colab link for tutorials

* Fix links for open in colab (#6391)

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove dup (leftover from merge)

* convert the test into the new refactored format

* stick to using the current_step as is, without ++

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-08-11 17:56:41 -04:00
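
A usage sketch of the scheduler added in #6361; `lr_end=1e-7` is the default settled on in the review thread, and `power=1.0` reduces the decay to linear:

```python
import torch
from transformers import get_polynomial_decay_schedule_with_warmup

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=1000,
    lr_end=1e-7,
    power=2.0,  # quadratic decay; power=1.0 would be linear
)

for _ in range(1000):
    # ... forward / backward / optimizer.step() would go here ...
    scheduler.step()
```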
Sam Shleifer
be1520d3a3
rename prepare_translation_batch -> prepare_seq2seq_batch (#6103) 2020-08-11 15:57:07 -04:00
Sam Shleifer
66fa8ceaea
PegasusForConditionalGeneration (torch version) (#6340)
Co-authored-by: Jingqing  Zhang <jingqing.zhang15@imperial.ac.uk>
2020-08-11 14:31:23 -04:00
Junyuan Zheng
cdf1f7edb2
Fix tokenizer saving and loading error (#6026)
* fix tokenizer saving and loading bugs when adding AddedToken to additional special tokens

* Add tokenizer test

* Style

* Style 2

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-11 04:49:16 -04:00
Pradhy729
b25cec13c5
Feed forward chunking (#6024)
* Chunked feed forward for Bert

This is an initial implementation to test applying feed forward chunking for BERT.
Will need additional modifications based on output and benchmark results.

* Black and cleanup

* Feed forward chunking in BertLayer class.

* Isort

* add chunking for all models

* fix docs

* Fix typo

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-08-11 03:12:45 -04:00
Patrick von Platen
00bb0b25ed
TF Longformer (#5764)
* improve names and tests longformer

* more and better tests for longformer

* add first tf test

* finalize tf basic op functions

* fix merge

* tf shape test passes

* narrow down discrepancies

* make longformer local attn tf work

* correct tf longformer

* add first global attn function

* add more global longformer func

* advance tf longformer

* finish global attn

* upload big model

* finish all tests

* correct false any statement

* fix common tests

* make all tests pass except keras save load

* fix some tests

* fix torch test import

* finish tests

* fix test

* fix torch tf tests

* add docs

* finish docs

* Update src/transformers/modeling_longformer.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_longformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply Lysandres suggestions

* revert to assert statement because function will fail otherwise

* applying sylvains recommendations

* Update src/transformers/modeling_longformer.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update src/transformers/modeling_tf_longformer.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-10 23:25:06 +02:00
Patrick von Platen
3425936643
[EncoderDecoderModel] add a add_cross_attention boolean to config (#6377)
* correct encoder decoder model

* Apply suggestions from code review

* apply sylvains suggestions
2020-08-10 19:46:48 +02:00
Lysandre Debut
b99098abc7
Patch models (#6326)
* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info



* Update check_repo
2020-08-10 10:39:17 -04:00
Stas Bekman
1429b920d4
refactor almost identical tests (#6339)
* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt
2020-08-10 05:31:20 -04:00
Julien Plu
0e36e51515
Fix the tests for Electra (#6284)
* Fix the tests for Electra

* Apply style
2020-08-07 09:30:57 -04:00
Sylvain Gugger
6ba540b747
Add a script to check all models are tested and documented (#6298)
* Add a script to check all models are tested and documented

* Apply suggestions from code review

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>

* Address comments

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2020-08-07 09:18:37 -04:00
Philip May
d5bc32ce92
Add strip_accents to basic BertTokenizer. (#6280)
* Add strip_accents to basic tokenizer

* Add tests for strip_accents.

* fix style with black

* Fix strip_accents test

* empty commit to trigger CI

* Improved strip_accents check

* Add code quality with is not False
2020-08-06 18:52:28 +08:00
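
A usage sketch of the option added in #6280; `strip_accents=None` (the default) keeps the historical behavior of following `do_lower_case`:

```python
from transformers import BertTokenizer

# Keep accents even though the uncased model lowercases:
tok = BertTokenizer.from_pretrained("bert-base-uncased", strip_accents=False)
print(tok.tokenize("héros"))
```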
Sylvain Gugger
c67d1a0259
Tf model outputs (#6247)
* TF outputs and test on BERT

* Albert to DistilBert

* All remaining TF models except T5

* Documentation

* One file forgotten

* TF outputs and test on BERT

* Albert to DistilBert

* All remaining TF models except T5

* Documentation

* One file forgotten

* Add new models and fix issues

* Quality improvements

* Add T5

* A bit of cleanup

* Fix for slow tests

* Style
2020-08-05 11:34:39 -04:00
Julien Plu
33966811bd
Add SequenceClassification and MultipleChoice TF models to Electra (#6227)
* Add SequenceClassification and MultipleChoice TF models to Electra

* Apply style

* Add summary_proj_to_labels to Electra config

* Finally mirroring the PT version of these models

* Apply style

* Fix Electra test
2020-08-05 09:04:27 -04:00
Patrick von Platen
7f65daa2e1
fix reformer fp16 (#6237) 2020-08-04 13:02:25 +02:00
Sam Shleifer
6730ecdd3c
Remove redundant coverage (#6224) 2020-08-04 02:59:21 -04:00
Stas Bekman
5deed37f9f
cleanup torch unittests (#6196)
* improve unit tests

this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973
before I apply it to the rest

* batch 1

* batch 2

* batch 3

* batch 4

* batch 5

* style

* non-tf template

* last deletion of check_loss_output
2020-08-04 02:42:56 -04:00
Julien Plu
9996f697e3
Fix saved model creation (#5468)
* Fix TF Serving when output_hidden_states and output_attentions are True

* Add tests for saved model creation + bug fix for multiple choices models

* remove unused import

* Fix the input for several layers

* Fix test

* Fix conflict printing

* Apply style

* Fix XLM and Flaubert for TensorFlow

* Apply style

* Fix TF check version

* Apply style

* Trigger CI
2020-08-03 08:10:40 -04:00
Sylvain Gugger
d951c14ae4
Model output test (#6155)
* Use return_dict=True in all tests

* Formatting
2020-07-31 09:44:37 -04:00
Suraj Patil
838dc06ff5
parse arguments from dict (#4869)
* add parse_dict to parse arguments from dict

* add unit test for parse_dict
2020-07-31 04:44:23 -04:00
Stas Bekman
f250beb8aa
enable easy checkout switch (#5645)
* enable easy checkout switch

allow having multiple repository checkouts and not needing to remember to rerun 'pip install -e .[dev]' when switching between checkouts and running tests.

* make isort happy

* examples needs one too
2020-07-31 04:34:46 -04:00
Stas Bekman
a2f6d521c1
typos (#6162)
* 2 small typos

* more typos

* correct path
2020-07-30 17:18:27 -04:00
guillaume-be
e642c78908
Addition of a DialoguePipeline (#5516)
* initial commit for pipeline implementation

Addition of input processing and history concatenation

* Conversation pipeline tested and working for single & multiple conversation inputs

* Added docstrings for dialogue pipeline

* Addition of dialogue pipeline integration tests

* Delete test_t5.py

* Fixed max code length

* Updated styling

* Fixed test broken by formatting tools

* Removed unused import

* Added unit test for DialoguePipeline

* Fixed Tensorflow compatibility

* Fixed multi-framework support using framework flag

* - Fixed docstring
- Added `min_length_for_response` as an initialization parameter
- Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]`
- Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input

* - renamed pipeline name from dialogue to conversational
- removed hardcoded default value of 1000 and use config.max_length instead
- added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation
- fixed bug in history truncation method

* - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised)

* - Simplified input tensor conversion

* - Updated attention_mask value for Tensorflow compatibility

* - Updated last dialogue reference to conversational & fixed integration tests

* Fixed conflict with master

* Updates following review comments

* Updated formatting

* Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs

* Update src/transformers/pipelines.py

Updated docsting following review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-30 14:11:39 -04:00
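
A usage sketch of the pipeline added in #5516 and its `Conversation` container (the default model is whatever the `conversational` task registers, a DialoGPT-style checkpoint at the time):

```python
from transformers import Conversation, pipeline

chatbot = pipeline("conversational")

conversation = Conversation("Which movie should I watch tonight?")
conversation = chatbot(conversation)

print(conversation.generated_responses[-1])
```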
Sylvain Gugger
91cb95461e
Switch from return_tuple to return_dict (#6138)
* Switch from return_tuple to return_dict

* Fix test

* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests (#5614)

* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests

* AutoModels


Tiny tweaks

* Style

* Final changes before merge

* Re-order for simpler review

* Final fixes

* Addressing @sgugger's comments

* Test MultipleChoice

* Rework TF trainer (#6038)

* Fully rework training/prediction loops

* fix method name

* Fix variable name

* Fix property name

* Fix scope

* Fix method name

* Fix tuple index

* Fix tuple index

* Fix indentation

* Fix variable name

* fix eval before log

* Add drop remainder for test dataset

* Fix step number + fix logging datetime

* fix eval loss value

* use global step instead of step + fix logging at step 0

* Fix logging datetime

* Fix global_step usage

* Fix breaking loop + logging datetime

* Fix step in prediction loop

* Fix step breaking

* Fix train/test loops

* Force TF at least 2.2 for the trainer

* Use assert_cardinality to facilitate the dataset size computation

* Log steps per epoch

* Make tfds compliant with TPU

* Make tfds compliant with TPU

* Use TF dataset enumerate instead of the Python one

* revert previous commit

* Fix data_dir

* Apply style

* rebase on master

* Address Sylvain's comments

* Address Sylvain's and Lysandre comments

* Trigger CI

* Remove unused import

* Switch from return_tuple to return_dict

* Fix test

* Add recent model

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
2020-07-30 09:17:00 -04:00
Lysandre Debut
3f94170a10
[WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests

* AutoModels


Tiny tweaks

* Style

* Final changes before merge

* Re-order for simpler review

* Final fixes

* Addressing @sgugger's comments

* Test MultipleChoice
2020-07-29 14:26:26 -04:00
Funtowicz Morgan
6c002853a6
Added capability to quantize a model while exporting through ONNX. (#6089)
* Added capability to quantize a model while exporting through ONNX.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

We do not support multiple extensions

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Reformat files

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* More quality

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Ensure test_generate_identified_name compares the same object types

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added documentation everywhere on ONNX exporter

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use pathlib.Path instead of plain-old string

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use f-string everywhere

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use the correct parameters for black formatting

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use Python 3 super() style.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use packaging.version to ensure installed onnxruntime version match requirements

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fixing imports sorting order.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Missing raise(s)

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added quantization documentation

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix some spelling.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix bad list header format

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-07-29 13:21:29 +02:00
Sam Shleifer
c49cd927f7
[Fix] position_ids tests again (#6100) 2020-07-28 18:29:35 -04:00
Sam Shleifer
5abe50381a
Fix #6096: MBartTokenizer's mask token (#6098) 2020-07-28 18:27:58 -04:00
Sam Shleifer
3c7fbf35a6
MBART: support summarization tasks where max_src_len > max_tgt_len (#6003)
* MBART: support summarization tasks

* fix test

* Style

* add tokenizer test
2020-07-28 08:18:11 -04:00
Joe Davison
3deffc1d67
Zero shot classification pipeline (#5760)
* add initial zero-shot pipeline

* change default args

* update default template

* add label string splitting

* add str labels support, remove nli from name

* style

* add input validation and working tf defaults

* tests

* quality check

* add docstring to __call__

* add slow tests

* Change truncation to only_first

also lower precision on tests for readability

* style
2020-07-27 09:42:58 -04:00
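
A usage sketch of the pipeline added in #5760, which scores candidate labels with an NLI model and a hypothesis template:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification")  # NLI model under the hood

result = classifier(
    "The team scored twice in the second half.",
    candidate_labels=["sports", "politics", "finance"],
)
print(list(zip(result["labels"], result["scores"])))
```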
Sylvain Gugger
f5b5c5bd7e
Avoid unnecessary warnings when loading pretrained model (#5922)
* Avoid unnecessary warnings when loading pretrained model

* Fix test

* Add other keys to ignore

* keys_to_ignore_at_load -> authorized_missing_keys
2020-07-23 18:13:36 -04:00
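
A sketch of the mechanism named in the last bullet (the attribute name is from the PR; the subclass and pattern are illustrative): model classes declare regexes for weights that may legitimately be absent, and `from_pretrained` stops warning about them:

```python
from transformers import BertPreTrainedModel

class MyBertModel(BertPreTrainedModel):
    # Missing weights matching these regexes won't trigger warnings
    # in from_pretrained; this pattern is illustrative only.
    authorized_missing_keys = [r"position_ids"]
```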
Sam Shleifer
9827d666eb
MbartTokenizer: do not hardcode vocab size (#5998) 2020-07-23 15:41:14 -04:00
Stas Bekman
35cb101eae
DataParallel fixes (#5733)
* DataParallel fixes:

1. switched to a more precise check
-        if self.args.n_gpu > 1:
+        if isinstance(model, nn.DataParallel):

2. fix tests - require the same fixup under DataParallel as the training module

* another fix
2020-07-20 09:29:12 -04:00
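
A sketch of the more precise check from the diff above; the `unwrap_model` helper name is hypothetical:

```python
import torch.nn as nn

def unwrap_model(model: nn.Module) -> nn.Module:
    # isinstance is more reliable than inferring wrapping from
    # args.n_gpu > 1, since the two can disagree.
    if isinstance(model, nn.DataParallel):
        return model.module
    return model
```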
Pradhy729
290b6e18ac
Trainer support for iterabledataset (#5834)
* Don't pass sampler for iterable dataset

* Added check for test and eval dataloaders.

* Formatting

* Don't pass sampler for iterable dataset

* Added check for test and eval dataloaders.

* Formatting

* Cleaner if nesting.

* Added test for trainer and iterable dataset

* Formatting for test

* Fixed import when torch is available only.

* Added require torch decorator to helper class

* Moved dataset class inside unittest

* Removed nested if and changed model in test

* Checking torch availability for IterableDataset
2020-07-20 09:07:37 -04:00
Teven
4b506a37e3
Xlnet outputs (#5883)
Slightly breaking change, changes functionality for `use_cache` in XLNet: if use_cache is True and mem_len is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time `use_cache` is overridden and always True.
2020-07-18 17:33:13 +02:00
Teven
a55809241f
Revert "Xlnet outputs (#5881)" (#5882)
This reverts commit 13be487212.
2020-07-18 17:15:40 +02:00
Teven
13be487212
Xlnet outputs (#5881)
Slightly breaking change, changes functionality for `use_cache` in XLNet: if use_cache is True and mem_len is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time `use_cache` is overridden and always True.
2020-07-18 16:53:29 +02:00
Teven
615be03f9d
Revert "XLNet use_cache refactor (#5770)" (#5854)
This reverts commit 0b2da0e592.
2020-07-17 20:33:44 +02:00
Teven
0b2da0e592
XLNet use_cache refactor (#5770)
Slightly breaking change, changes functionality for `use_cache` in XLNet: if use_cache is True and mem_len is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time `use_cache` is overridden and always True.
2020-07-17 20:24:16 +02:00
Patrick von Platen
9d37c56bab
[Reformer] - Cache hidden states and buckets to speed up inference (#5578)
* fix merge rebase

* add intermediate reformer code

* save intermediate caching results

* save intermediate

* save intermediate results

* save intermediate

* upload next step

* fix generate tests

* make tests work

* add named tuple output

* Apply suggestions from code review

* fix use_cache for False case

* fix tensor to gpu

* fix tensor to gpu

* refactor

* refactor and make style
2020-07-17 16:17:42 +02:00
Patrick von Platen
89a78be51f
fix benchmark for longformer (#5808) 2020-07-16 15:15:10 +02:00
Sam Shleifer
1a647abf0b
[fix] check code quality (#5772) 2020-07-15 14:59:38 -04:00
Funtowicz Morgan
d533c7e9b9
[fix] T5 ONNX test: model.to(torch_device) (#5769)
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-07-15 10:11:22 -04:00
Sam Shleifer
d0486c8bc2
[cleanup] T5 test, warnings (#5761) 2020-07-15 08:23:22 -04:00
Sam Shleifer
838950ee44
[fix] mbart_en_ro_generate test now identical to fairseq (#5731) 2020-07-14 06:12:24 -04:00
as-stevens
f867000f56
[Reformer classification head] Implement the reformer model classification head for text classification (#5198)
* Reformer model head classification implementation for text classification

* Reformat the reformer model classification code

* PR review comments, and test case implementation for reformer for classification head changes

* CI/CD reformer for classification head test import error fix

* CI/CD test case implementation: added ReformerForSequenceClassification to all_model_classes

* Code formatting- fixed

* Normal test cases added for reformer classification head

* Fix test cases implementation for the reformer classification head

* removed token_type_id parameter from the reformer classification head

* fixed the test case for reformer classification head

* merge conflict with master fixed

* merge conflict, changed reformer classification to accept the choice_label parameter added in latest code

* refactored the the reformer classification head test code

* reformer classification head, common transformer test cases fixed

* final set of review comments: rearranged the reformer classes and added a docstring to the classification forward method

* fixed the compilation error and test case for reformer classification head

* Apply suggestions from code review

Remove unnecessary dup

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-07-14 09:16:22 +02:00
Stas Bekman
45addfe96d
FlaubertForTokenClassification (#5644)
* implement FlaubertForTokenClassification as a subclass of XLMForTokenClassification

* fix mapping order

* add the doc

* add common tests
2020-07-13 14:59:53 -04:00
Stas Bekman
443b0cad96
rename the function to match the rest of the test convention (#5692) 2020-07-13 18:09:49 +08:00
Sylvain Gugger
edfd82f5ff
Change model outputs types to self-document outputs (#5438)
* [WIP] Proposal for model outputs

* All Bert models

* Make CI green maybe?

* Fix ONNX test

* Isolate ModelOutput from pt and tf

* Formatting

* Add Electra models

* Auto-generate docstrings from outputs

* Add TF outputs

* Add some BERT models

* Revert TF side

* Remove last traces of TF changes

* Fail with a clear error message

* Add Albert and work through Bart

* Add CTRL and DistilBert

* Formatting

* Progress on Bart

* Renames and finish Bart

* Formatting

* Fix last test

* Add DPR

* Finish Electra and add FlauBERT

* Add GPT2

* Add Longformer

* Add MMBT

* Add MobileBert

* Add GPT

* Formatting

* Add Reformer

* Add Roberta

* Add T5

* Add Transformer XL

* Fix test

* Add XLM + fix XLMForTokenClassification

* Style + XLMRoberta

* Add XLNet

* Formatting

* Add doc of return_tuple arg
2020-07-10 11:36:53 -04:00
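
A usage sketch of the self-documenting outputs introduced here (attribute access replaces positional tuple indexing; the checkpoint is a stand-in):

```python
import torch
from transformers import BertModel, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tok("Hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # named field instead of outputs[0]
```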
Lorenzo Ampil
0cc4eae0e6
Fix Inconsistent NER Grouping (Pipeline) (#4987)
* Add B I handling to grouping

* Add fix to include separate entity as last token

* move last_idx definition outside loop

* Use first entity in entity group as reference for entity type

* Add test cases

* Take out extra class accidentally added

* Return tf ner grouped test to original

* Take out redundant last entity

* Get last_idx safely

Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com>

* Fix first entity comment

* Create separate functions for group_sub_entities and group_entities (splitting call method to testable functions)

* Take out unnecessary last_idx

* Remove additional forward pass test

* Move token classification basic tests to separate class

* Move token classification basic tests back to monocolumninputtestcase

* Move base ner tests to nerpipelinetests

* Take out unused kwargs

* Add back mandatory_keys argument

* Add unitary tests for group_entities in _test_ner_pipeline

* Fix last entity handling

* Fix grouping function used

* Add typing to group_sub_entities and group_entities

Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com>
2020-07-08 16:18:17 -04:00
Patrick von Platen
f82a2a5e8e
[Benchmark] Add benchmarks for TF Training (#5594)
* tf_train

* adapt timing for tpu

* fix timing

* fix timing

* fix timing

* fix timing

* update notebook

* add tests
2020-07-08 12:11:09 +02:00
Sam Shleifer
353b8f1e7a
Add mbart-large-cc25, support translation finetuning (#5129)
improve unittests for finetuning, especially w.r.t testing frozen parameters
fix freeze_embeds for T5
add streamlit setup.cfg
2020-07-07 13:23:01 -04:00
Patrick von Platen
4dc65591b5
[Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile (#5395)
* add first version of clm tf

* make style

* add more tests for bert

* update tf clm loss

* fix tests

* correct tf ner script

* add mlm loss

* delete bogus file

* clean tf auto model + add tests

* finish adding clm loss everywhere

* fix training in distilbert

* fix flake8

* save intermediate

* fix tf t5 naming

* remove prints

* finish up

* up

* fix tf gpt2

* fix new test utils import

* fix flake8

* keep backward compatibility

* Update src/transformers/modeling_tf_albert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_roberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_mobilebert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_distilbert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply sylvains suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-07 18:15:53 +02:00
Quentin Lhoest
4fedc1256c
Fix tests imports dpr (#5576)
* fix test imports

* fix max_length

* style

* fix tests
2020-07-07 16:35:12 +02:00
Sam Shleifer
d4886173b2
[Bart] enable test_torchscript, update test_tie_weights (#5457)
* Passing all but one torchscript test

* Style

* move comment

* remove unneeded assert
2020-07-07 10:06:48 -04:00
Quentin Lhoest
fbd8792195
Add DPR model (#5279)
* beginning of dpr modeling

* wip

* implement forward

* remove biencoder + better init weights

* export dpr model to embed model for nlp lib

* add new api

* remove old code

* make style

* fix dumb typo

* don't load bert weights

* docs

* docs

* style

* move the `k` parameter

* fix init_weights

* add pretrained configs

* minor

* update config names

* style

* better config

* style

* clean code based on PR comments

* change Dpr to DPR

* fix config

* switch encoder config to a dict

* style

* inheritance -> composition

* add messages in assert statements

* add dpr reader tokenizer

* one tokenizer per model

* fix base_model_prefix

* fix imports

* typo

* add convert script

* docs

* change tokenizers conf names

* style

* change tokenizers conf names

* minor

* minor

* fix wrong names

* minor

* remove unused convert functions

* rename convert script

* use return_tensors in tokenizers

* remove n_questions dim

* move generate logic to tokenizer

* style

* add docs

* docs

* quality

* docs

* add tests

* style

* add tokenization tests

* DPR full tests

* Stay true to the attention mask building

* update docs

* missing param in bert input docs

* docs

* style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-07-07 08:56:12 -04:00
Abel
6912265711
Make T5 compatible with ONNX (#5518)
* Default decoder inputs to encoder ones for T5 if neither are specified.

* Fixing typo, now all tests are passing.

* Changing einsum to operations supported by onnx

* Adding a test to ensure T5 can be exported to ONNX with opset > 9

* Modified test for onnx export to make it faster

* Styling changes.

* Styling changes.

* Changing notation for matrix multiplication

Co-authored-by: Abel Riboulot <tkai@protomail.com>
2020-07-07 11:32:29 +02:00
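
An illustrative rewrite of the kind this PR describes: an attention-score `einsum` replaced by `matmul`/`transpose`, which older ONNX opsets can export (shapes are arbitrary):

```python
import torch

q = torch.randn(2, 8, 16, 64)  # (batch, heads, q_len, head_dim)
k = torch.randn(2, 8, 16, 64)  # (batch, heads, k_len, head_dim)

scores_einsum = torch.einsum("bnqd,bnkd->bnqk", q, k)
scores_matmul = torch.matmul(q, k.transpose(3, 2))

assert torch.allclose(scores_einsum, scores_matmul, atol=1e-5)
```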
Patrick von Platen
989ae326b5
[Reformer] Adapt Reformer MaskedLM Attn mask (#5560)
* fix attention mask

* fix slow test

* refactor attn masks

* fix fp16 generate test
2020-07-07 10:48:06 +02:00
Shashank Gupta
3dcb748e31
Added data collator for permutation (XLNet) language modeling and related calls (#5522)
* Added data collator for XLNet language modeling and related calls

Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.

Resolves: #4739, #2008 (partially)

* Changed name to `DataCollatorForPermutationLanguageModeling`

Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is taken care of
similarly to `mems` for XLNet).
Changed calls and imports appropriately.

* Added detailed comments, changed variable names

Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain working. Also cleaned up variable names and made them more informative.

* Added tests for new data collator

Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.

* Fixed styling issues
2020-07-07 10:17:37 +02:00
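
A usage sketch of the new collator with XLNet's tokenizer; the values shown are the signature's defaults:

```python
from transformers import (
    DataCollatorForPermutationLanguageModeling,
    XLNetTokenizer,
)

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")

collator = DataCollatorForPermutationLanguageModeling(
    tokenizer=tokenizer,
    plm_probability=1 / 6,  # ratio governing how much of each span is masked
    max_span_length=5,      # upper bound on a masked span's length
)
# Calling collator on a batch of token-id tensors yields input_ids,
# perm_mask, target_mapping and labels for XLNetLMHeadModel.
```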
Anthony MOI
5787e4c159
Various tokenizers fixes (#5558)
* BertTokenizerFast - Do not specify strip_accents by default

* Bump tokenizers to new version

* Add test for AddedToken serialization
2020-07-06 18:27:53 -04:00
Sam Shleifer
58cca47c16
[cleanup] TF T5 tests only init t5-base once. (#5410) 2020-07-03 14:27:49 -04:00
Lysandre Debut
17ade127b9
Exposing prepare_for_model for both slow & fast tokenizers (#5479)
* Exposing prepare_for_model for both slow & fast tokenizers

* Update method signature

* The traditional style commit

* Hide the warnings behind the verbose flag

* update default truncation strategy and prepare_for_model

* fix tests and prepare_for_models methods

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-07-03 16:51:21 +02:00
Teven
6726416e4a
Changed expected_output_ids in TransfoXL generation test (#5462)
* Changed expected_output_ids in TransfoXL generation test to match #4826 generation PR.

* making black happy

* making isort happy
2020-07-02 11:56:44 +02:00
Patrick von Platen
d16e36c7e5
[Reformer] Add Masked LM Reformer (#5426)
* fix conflicts

* fix

* happy rebasing
2020-07-01 22:43:18 +02:00
Joe Davison
35befd9ce3
Fix tensor label type inference in default collator (#5250)
* allow tensor label inputs to default collator

* replace try/except with type check
2020-07-01 10:40:14 -06:00
Patrick von Platen
fe81f7d12c
finish reformer qa head (#5433) 2020-07-01 12:27:14 -04:00
Patrick von Platen
d697b6ca75
[Longformer] Major Refactor (#5219)
* refactor naming

* add small slow test

* refactor

* refactor naming

* rename selected to extra

* big global attention refactor

* make style

* refactor naming

* save intermed

* refactor functions

* finish function refactor

* fix tests

* fix longformer

* fix longformer

* fix longformer

* fix all tests but one

* finish longformer

* address sams and izs comments

* fix transpose
2020-07-01 17:43:32 +02:00
Sam Shleifer
e0d58ddb65
[fix] Marian tests import (#5442) 2020-07-01 11:42:22 -04:00
Funtowicz Morgan
608d5a7c44
Raises PipelineException on FillMaskPipeline when there are != 1 mask_token in the input (#5389)
* Added PipelineException

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fill-mask pipeline raises exception when more than one mask_token detected.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Put everything in a function.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added tests on pipeline fill-mask when input has != 1 mask_token

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix numel() computation for TF

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Addressing PR comments.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove function typing to avoid import on specific framework.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Quality.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Retry typing with @julien-c tip.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Quality².

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Simplify fill-mask mask_token checking.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Trigger CI
2020-07-01 17:27:47 +02:00
Sam Shleifer
43cb03a93d
MarianTokenizer.prepare_translation_batch uses new tokenizer API (#5182) 2020-07-01 10:32:50 -04:00
Sam Shleifer
13deb95a40
Move tests/utils.py -> transformers/testing_utils.py (#5350) 2020-07-01 10:31:17 -04:00
Sam Shleifer
32d2031458
[fix] slow fill_mask test failure (#5406) 2020-06-30 15:28:15 -04:00
Patrick von Platen
4bcc35cd69
[Docs] Benchmark docs (#5360)
* first doc version

* add benchmark docs

* fix typos

* improve README

* Update docs/source/benchmarks.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix naming and docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-29 16:08:57 +02:00
Sam Shleifer
28a690a80e
[mBART] skip broken forward pass test, stronger integration test (#5327) 2020-06-28 15:08:28 -04:00