* ready for PR
* cleanup
* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
* fix
* perfectionism
* revert change from another PR
* odd, already committed this one
* non-interactive upload workaround
* backup the failed experiment
* store langs in config
* workaround for localizing model path
* doc clean up as in https://github.com/huggingface/transformers/pull/6956
* style
* back out debug mode
* document: run_eval.py --num_beams 10
* remove unneeded constant
* typo
* re-use bart's Attention
* re-use EncoderLayer, DecoderLayer from bart
* refactor
* send to cuda and fp16
* cleanup
* revert (moved to another PR)
* better error message
* document run_eval --num_beams
* solve the problem of tokenizer finding the right files when model is local
* polish, remove hardcoded config
* add a note that the file is autogenerated to avoid losing changes
* prep for org change, remove unneeded code
* switch to model4.pt, update scores
* s/python/bash/
* missing init (but doesn't impact the finetuned model)
* cleanup
* major refactor (reuse-bart)
* new model, new expected weights
* cleanup
* cleanup
* full link
* fix model type
* merge porting notes
* style
* cleanup
* have to create a DecoderConfig object to handle vocab_size properly
* doc fix
* add note (not a public class)
* parametrize
* - add bleu scores integration tests
* skip test if sacrebleu is not installed
* cache heavy models/tokenizers
* some tweaks
* remove tokens that aren't used
* more purging
* simplify code
* switch to using decoder_start_token_id
* add doc
* Revert "major refactor (reuse-bart)"
This reverts commit 226dad15ca.
* decouple from bart
* remove unused code #1
* remove unused code #2
* remove unused code #3
* update instructions
* clean up
* move bleu eval to examples
* check import only once
* move data+gen script into files
* reuse via import
* take less space
* add prepare_seq2seq_batch (auto-tested)
* cleanup
* recode test to use json instead of yaml
* ignore keys not needed
* use the new -y flag in transformers-cli upload
* [xlm tok] config dict: fix str into int to match definition (#7034)
* [s2s] --eval_max_generate_length (#7018)
* Fix CI with change of name of nlp (#7054)
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
* extending to support allen_nlp wmt models
- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models
* sync with changes
* s/fsmt-wmt/wmt/ in model names
* s/fsmt-wmt/wmt/ in model names (p2)
* s/fsmt-wmt/wmt/ in model names (p3)
* switch to a better checkpoint
* typo
* make non-optional args required - adjust tests where possible or skip when there is no other choice
* consistency
* style
* adjust header
* cards moved (model rename)
* use best custom hparams
* update info
* remove old cards
* cleanup
* s/stas/facebook/
* update scores
* s/allen_nlp/allenai/
* url maps aren't needed
* typo
* move all the doc / build /eval generators to their own scripts
* cleanup
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix indent
* duplicated line
* style
* use the correct add_start_docstrings
* oops
* resizing embeddings can't be done with the core approach, due to the 2 separate vocab dicts
* check that the arg is a list
* style
* style
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix ZeroDivisionError and epoch counting
* Add test for num_train_epochs calculation in trainer.py
* Remove @require_non_multigpu for test_num_train_epochs_in_training
* Add tests and fix various bugs in ModelOutput
* Update tests/test_model_output.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add dataset for albert pretrain
* datacollator for albert pretrain
* naming, comprehension, file reading change
* data cleaning is not needed after this modification
* delete prints
* fix a bug
* file structure change
* add tests for albert datacollator
* remove random seed
* add back len and get item function
* sample file for testing and test code added
* format change for black
* more format change
* Style
* resolve var assignment issue
* add back wrongly deleted DataCollatorWithPadding in init file
* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and first FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Fix copyright
* Forgot some layers can be repeated
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/modeling_funnel.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Slow integration test
* Make small integration test
* Formatting
* Add checkpoint and separate classification head
* Formatting
* Expand list, fix link and add in pretrained models
* Styling
* Add the model in all summaries
* Typo fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Since `generate()` does:
```python
num_beams = num_beams if num_beams is not None else self.config.num_beams
```
This test fails if `model.config.num_beams > 1` (which is the case in the model I'm porting).
This fix makes the test setup unambiguous by passing an explicit `num_beams=1` to `generate()`.
Thanks.
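Below is a minimal sketch of the resulting test setup (the FSMT checkpoint name and input sentence are only illustrative); the key point is the explicit `num_beams=1`:
```python
# Illustrative sketch: force greedy search regardless of what the checkpoint
# stores in config.num_beams (FSMT checkpoints set it to > 1).
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "facebook/wmt19-ru-en"  # assumed checkpoint for the example
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

inputs = tokenizer(["Машинное обучение - это здорово."], return_tensors="pt")
generated = model.generate(**inputs, num_beams=1)  # explicit, so the test is unambiguous
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```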
* add datacollator and dataset for next sentence prediction task
* bug fix (number of special tokens & sequence truncation)
* bug fix (+ dict inputs support for data collator)
* add padding for nsp data collator; renamed cached files to avoid conflict.
* add test for nsp data collator
* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Improved tokenization with sacremoses
* The TransfoXLTokenizer is now using sacremoses for tokenization
* Added tokenization of comma-separated and floating point numbers.
* Removed prepare_for_tokenization() from tokenization_transfo_xl.py because punctuation is handled by sacremoses
* Added corresponding tests
* Removed test comparing TransfoXLTokenizer and TransfoXLTokenizerFast
* Added deprecation warning to TransfoXLTokenizerFast
* isort change
Co-authored-by: Teven <teven.lescao@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
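A quick sketch of the improved tokenization (the checkpoint name is the standard wt103 one; the exact output tokens are not asserted here):
```python
# The slow TransfoXLTokenizer now delegates to sacremoses, which also handles
# punctuation and splits comma-separated / floating point numbers.
from transformers import TransfoXLTokenizer

tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
print(tokenizer.tokenize("Hello (bracket) and side-scrolled, $5,000 for 3.5 km."))
```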
* Adafactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to Adam.
* update PR fixes, add basic test
* bug -- incorrect params in test
* bugfix -- import Adafactor into test
* bugfix -- removed accidental T5 include
* resetting T5 to master
* bugfix -- include Adafactor in __init__
* longer loop for adafactor test
* remove double error class declare
* lint
* black
* isort
* Update src/transformers/optimization.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* single docstring
* Cleanup docstring
Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
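A minimal usage sketch of the ported optimizer (the keyword arguments shown are assumptions about the port's defaults, mirroring fairseq's relative-step mode):
```python
import torch
from transformers import Adafactor

model = torch.nn.Linear(10, 10)  # stand-in for a real T5 model
optimizer = Adafactor(
    model.parameters(),
    lr=None,               # with relative_step=True the learning rate is derived internally
    scale_parameter=True,
    relative_step=True,
    warmup_init=False,
)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```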
* add tf graph compile tests
* fix conflict
* remove more tf transpose statements
* fix conflicts
* fix comment typos
* move function to class function
* fix black
* fix black
* make style
* Feed forward chunking for Distilbert & Albert
* Added ff chunking for many other models
* Change model signature
* Added chunking for XLM
* Cleaned up by removing some variables.
* remove test_chunking flag
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* cleanup torch unittests: part 2
* remove trailing comma added by isort, which breaks flake8
* one more comma
* revert odd balls
* part 3: odd cases
* more ["key"] -> .key refactoring
* .numpy() is not needed
* more unnecessary .numpy() calls removed
* more simplification
* allow using tokenizer.pad as a collate_fn in pytorch
* allow using tokenizer.pad as a collate_fn in pytorch
* Add documentation and tests
* Make attention mask the right shape
* Better test
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
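A minimal sketch of the new collation pattern (tokenizer checkpoint and sentences are illustrative):
```python
from torch.utils.data import DataLoader
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# pre-tokenized examples of different lengths
examples = [tokenizer("a short sentence"), tokenizer("a noticeably longer example sentence")]

def collate_fn(batch):
    # tokenizer.pad pads input_ids / attention_mask up to the longest item in the batch
    return tokenizer.pad(batch, return_tensors="pt")

loader = DataLoader(examples, batch_size=2, collate_fn=collate_fn)
batch = next(iter(loader))
print(batch["input_ids"].shape, batch["attention_mask"].shape)
```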
* [wip] add get_polynomial_decay_schedule_with_warmup
* style
* add assert
* change lr_end to a much smaller default number
* check for exact equality
* [model_cards] electra-base-turkish-cased-ner (#6350)
* for electra-base-turkish-cased-ner
* Add metadata
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Temporarily de-activate TPU CI
* Update modeling_tf_utils.py (#6372)
fix typo: ckeckpoint->checkpoint
* the test now works again (#6371)
* correct pl link in readme (#6364)
* refactor almost identical tests (#6339)
* refactor almost identical tests
* important to add a clear assert error message
* make the assert error even more descriptive than the original
* Small docfile fixes (#6328)
* Patch models (#6326)
* TFAlbertFor{TokenClassification, MultipleChoice}
* Patch models
* BERT and TF BERT info
* Update check_repo
* CI GitHub caching (#6382)
* Cache Github Actions CI
* Remove useless file
* Colab button (#6389)
* Add colab button
* Add colab link for tutorials
* Fix links for open in colab (#6391)
* Update src/transformers/optimization.py
consistently use lr_end=1e-7 default
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* remove dup (leftover from merge)
* convert the test into the new refactored format
* stick to using the current_step as is, without ++
Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
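A usage sketch of the new schedule (the signature is an assumption based on the commit messages above, with `lr_end=1e-7` as the "much smaller default"):
```python
import torch
from transformers import get_polynomial_decay_schedule_with_warmup

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=1000,
    lr_end=1e-7,   # decays from 5e-5 down to 1e-7 over training
    power=1.0,     # power=1.0 makes the decay linear
)

for _ in range(1000):
    # ... forward / backward would go here ...
    optimizer.step()
    scheduler.step()
```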
* Chunked feed forward for Bert
This is an initial implementation to test applying feed forward chunking for BERT.
Will need additional modifications based on output and benchmark results.
* Black and cleanup
* Feed forward chunking in BertLayer class.
* Isort
* add chunking for all models
* fix docs
* Fix typo
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
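For reference, a conceptual sketch of what feed forward chunking does (this is not the library's exact helper, just the idea): the position-wise FFN is applied to slices of the sequence dimension and the results are concatenated, lowering peak activation memory at a small speed cost.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChunkedFeedForward(nn.Module):
    def __init__(self, hidden_size=768, intermediate_size=3072, chunk_size=128):
        super().__init__()
        self.dense_in = nn.Linear(hidden_size, intermediate_size)
        self.dense_out = nn.Linear(intermediate_size, hidden_size)
        self.chunk_size = chunk_size

    def forward(self, hidden_states):  # (batch, seq_len, hidden)
        if self.chunk_size == 0:
            return self.dense_out(F.gelu(self.dense_in(hidden_states)))
        # apply the FFN chunk by chunk along the sequence dimension
        chunks = hidden_states.split(self.chunk_size, dim=1)
        out = [self.dense_out(F.gelu(self.dense_in(c))) for c in chunks]
        return torch.cat(out, dim=1)

x = torch.randn(2, 512, 768)
print(ChunkedFeedForward()(x).shape)  # torch.Size([2, 512, 768])
```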
* improve names and tests longformer
* more and better tests for longformer
* add first tf test
* finalize tf basic op functions
* fix merge
* tf shape test passes
* narrow down discrepancies
* make longformer local attn tf work
* correct tf longformer
* add first global attn function
* add more global longformer func
* advance tf longformer
* finish global attn
* upload big model
* finish all tests
* correct false any statement
* fix common tests
* make all tests pass except keras save load
* fix some tests
* fix torch test import
* finish tests
* fix test
* fix torch tf tests
* add docs
* finish docs
* Update src/transformers/modeling_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply Lysandre's suggestions
* revert to assert statement because the function will fail otherwise
* applying Sylvain's recommendations
* Update src/transformers/modeling_longformer.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Add a script to check all models are tested and documented
* Apply suggestions from code review
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Address comments
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Add strip_accents to basic tokenizer
* Add tests for strip_accents.
* fix style with black
* Fix strip_accents test
* empty commit to trigger CI
* Improved strip_accents check
* Add code quality with is not False
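A sketch of the new option (the parameter name follows the commit messages; the checkpoint is only illustrative):
```python
from transformers import BertTokenizer

# strip_accents can now be forced on or off instead of being tied to lowercasing
keep = BertTokenizer.from_pretrained("bert-base-german-cased", strip_accents=False)
strip = BertTokenizer.from_pretrained("bert-base-german-cased", strip_accents=True)

print(keep.tokenize("Äpfel"))
print(strip.tokenize("Äpfel"))
```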
* TF outputs and test on BERT
* Albert to DistilBert
* All remaining TF models except T5
* Documentation
* One file forgotten
* Add new models and fix issues
* Quality improvements
* Add T5
* A bit of cleanup
* Fix for slow tests
* Style
* Add SequenceClassification and MultipleChoice TF models to Electra
* Apply style
* Add summary_proj_to_labels to Electra config
* Finally mirroring the PT version of these models
* Apply style
* Fix Electra test
* improve unit tests
this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973
before I apply it to the rest
* batch 1
* batch 2
* batch 3
* batch 4
* batch 5
* style
* non-tf template
* last deletion of check_loss_output
* Fix TF Serving when output_hidden_states and output_attentions are True
* Add tests for saved model creation + bug fix for multiple choices models
* remove unused import
* Fix the input for several layers
* Fix test
* Fix conflict printing
* Apply style
* Fix XLM and Flaubert for TensorFlow
* Apply style
* Fix TF check version
* Apply style
* Trigger CI
* enable easy checkout switch
allow having multiple repository checkouts without needing to remember to rerun 'pip install -e .[dev]' when switching between checkouts and running tests.
* make isort happy
* examples needs one too
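A sketch of one way this can be done (assumed approach: a `conftest.py` that puts the current checkout's `src/` ahead of any previously installed copy on `sys.path`):
```python
# conftest.py (sketch) - make tests import transformers from this checkout,
# even if another checkout was the one installed with `pip install -e .[dev]`.
import sys
from os.path import abspath, dirname, join

git_repo_path = abspath(join(dirname(__file__), "src"))
sys.path.insert(1, git_repo_path)
```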
* initial commit for pipeline implementation
Addition of input processing and history concatenation
* Conversation pipeline tested and working for single & multiple conversation inputs
* Added docstrings for dialogue pipeline
* Addition of dialogue pipeline integration tests
* Delete test_t5.py
* Fixed max code length
* Updated styling
* Fixed test broken by formatting tools
* Removed unused import
* Added unit test for DialoguePipeline
* Fixed Tensorflow compatibility
* Fixed multi-framework support using framework flag
* - Fixed docstring
- Added `min_length_for_response` as an initialization parameter
- Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]`
- Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input
* - renamed pipeline name from dialogue to conversational
- removed hardcoded default value of 1000 and use config.max_length instead
- added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation
- fixed bug in history truncation method
* - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised)
* - Simplified input tensor conversion
* - Updated attention_mask value for Tensorflow compatibility
* - Updated last dialogue reference to conversational & fixed integration tests
* Fixed conflict with master
* Updates following review comments
* Updated formatting
* Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs
* Update src/transformers/pipelines.py
Updated docstring following review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
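A usage sketch of the finished pipeline (the task name and Conversation API follow the commit messages above; the default model is whatever the pipeline resolves to):
```python
from transformers import Conversation, pipeline

conversational = pipeline("conversational")

conversation = Conversation("Is it a good idea to learn Rust this year?")
conversation = conversational(conversation)
print(conversation.generated_responses[-1])

# follow-up turn reuses the accumulated history
conversation.add_user_input("Why do you think so?")
conversation = conversational(conversation)
print(conversation.generated_responses[-1])
```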
* Switch from return_tuple to return_dict
* Fix test
* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
* AutoModels
Tiny tweaks
* Style
* Final changes before merge
* Re-order for simpler review
* Final fixes
* Addressing @sgugger's comments
* Test MultipleChoice
* Rework TF trainer (#6038)
* Fully rework training/prediction loops
* fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* fix eval loss value
* use global step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use TF dataset enumerate instead of the Python one
* revert previous commit
* Fix data_dir
* Apply style
* rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre comments
* Trigger CI
* Remove unused import
* Add recent model
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
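A minimal before/after sketch of the `return_dict` switch (the model checkpoint is illustrative):
```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

inputs = tokenizer("hello world", return_tensors="pt")

with torch.no_grad():
    # old behaviour: a plain tuple, indexed positionally
    logits_tuple = model(**inputs, return_dict=False)[0]
    # new behaviour: a ModelOutput, addressable by attribute
    outputs = model(**inputs, return_dict=True)

print(logits_tuple.shape, outputs.logits.shape)
```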