transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-27 00:09:00 +06:00

Author	SHA1	Message	Date
Patrick von Platen	1d6e71e116	[EncoderDecoder] Add Cross Attention for GPT2 (#6415 ) * add cross attention layers for gpt2 * make gpt2 cross attention work * finish bert2gpt2 * add explicit comments * remove attention mask since not yet supported * revert attn mask in pipeline * Update src/transformers/modeling_gpt2.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_encoder_decoder.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-14 09:43:29 +02:00
Suraj Patil	680f1337c3	MBartForConditionalGeneration (#6441 ) * add MBartForConditionalGeneration * style * rebase and fixes * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS * fix docs * don't ignore mbart * doc * fix mbart fairseq link * put mbart before bart * apply doc suggestions	2020-08-14 03:21:16 -04:00
Lysandre Debut	f7cbc13db7	Test model outputs equivalence (#6445 ) * Test model outputs equivalence * Fix failing tests * From dict to kwargs * DistilBERT * Addressing @sgugger and @patrickvonplaten's comments	2020-08-13 11:59:35 -04:00
Stas Bekman	e983da0e7d	cleanup tf unittests: part 2 (#6260 ) * cleanup torch unittests: part 2 * remove trailing comma added by isort, and which breaks flake * one more comma * revert odd balls * part 3: odd cases * more ["key"] -> .key refactoring * .numpy() is not needed * more unnecessary .numpy() removed * more simplification	2020-08-13 04:29:06 -04:00
Joe Davison	bc820476a5	add targets arg to fill-mask pipeline (#6239 ) * add targets arg to fill-mask pipeline * add tests and more error handling * quality * update docstring	2020-08-12 12:48:29 -04:00
Patrick von Platen	0735def8e1	[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411 ) * add encoder-decoder for roberta * fix headmask * apply Sylvains suggestions * fix typo * Apply suggestions from code review	2020-08-12 18:23:30 +02:00
Sylvain Gugger	e9c3031463	Fixes to make life easier with the nlp library (#6423 ) * allow using tokenizer.pad as a collate_fn in pytorch * allow using tokenizer.pad as a collate_fn in pytorch * Add documentation and tests * Make attention mask the right shape * Better test Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-08-12 08:00:56 -04:00
Stas Bekman	ece0903e11	lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361 ) * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * [model_cards] electra-base-turkish-cased-ner (#6350) * for electra-base-turkish-cased-ner * Add metadata Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Temporarily de-activate TPU CI * Update modeling_tf_utils.py (#6372) fix typo: ckeckpoint->checkpoint * the test now works again (#6371) * correct pl link in readme (#6364) * refactor almost identical tests (#6339) * refactor almost identical tests * important to add a clear assert error message * make the assert error even more descriptive than the original bt * Small docfile fixes (#6328) * Patch models (#6326) * TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo * Ci GitHub caching (#6382) * Cache Github Actions CI * Remove useless file * Colab button (#6389) * Add colab button * Add colab link for tutorials * Fix links for open in colab (#6391) * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove dup (leftover from merge) * convert the test into the new refactored format * stick to using the current_step as is, without ++ Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Alexander Measure <ameasure@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-11 17:56:41 -04:00
Sam Shleifer	be1520d3a3	rename prepare_translation_batch -> prepare_seq2seq_batch (#6103 )	2020-08-11 15:57:07 -04:00
Sam Shleifer	66fa8ceaea	PegasusForConditionalGeneration (torch version) (#6340 ) Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>	2020-08-11 14:31:23 -04:00
Junyuan Zheng	cdf1f7edb2	Fix tokenizer saving and loading error (#6026 ) * fix tokenizer saving and loading bugs when adding AddedToken to additional special tokens * Add tokenizer test * Style * Style 2 Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-11 04:49:16 -04:00
Pradhy729	b25cec13c5	Feed forward chunking (#6024 ) * Chunked feed forward for Bert This is an initial implementation to test applying feed forward chunking for BERT. Will need additional modifications based on output and benchmark results. * Black and cleanup * Feed forward chunking in BertLayer class. * Isort * add chunking for all models * fix docs * Fix typo Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-08-11 03:12:45 -04:00
Patrick von Platen	00bb0b25ed	TF Longformer (#5764 ) * improve names and tests longformer * more and better tests for longformer * add first tf test * finalize tf basic op functions * fix merge * tf shape test passes * narrow down discrepancies * make longformer local attn tf work * correct tf longformer * add first global attn function * add more global longformer func * advance tf longformer * finish global attn * upload big model * finish all tests * correct false any statement * fix common tests * make all tests pass except keras save load * fix some tests * fix torch test import * finish tests * fix test * fix torch tf tests * add docs * finish docs * Update src/transformers/modeling_longformer.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_longformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply Lysandres suggestions * reverse to assert statement because function will fail otherwise * applying sylvains recommendations * Update src/transformers/modeling_longformer.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/modeling_tf_longformer.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-10 23:25:06 +02:00
Patrick von Platen	3425936643	[EncoderDecoderModel] add a `add_cross_attention` boolean to config (#6377 ) * correct encoder decoder model * Apply suggestions from code review * apply sylvains suggestions	2020-08-10 19:46:48 +02:00
Lysandre Debut	b99098abc7	Patch models (#6326 ) * TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo	2020-08-10 10:39:17 -04:00
Stas Bekman	1429b920d4	refactor almost identical tests (#6339 ) * refactor almost identical tests * important to add a clear assert error message * make the assert error even more descriptive than the original bt	2020-08-10 05:31:20 -04:00
Julien Plu	0e36e51515	Fix the tests for Electra (#6284 ) * Fix the tests for Electra * Apply style	2020-08-07 09:30:57 -04:00
Sylvain Gugger	6ba540b747	Add a script to check all models are tested and documented (#6298 ) * Add a script to check all models are tested and documented * Apply suggestions from code review Co-authored-by: Kevin Canwen Xu <canwenxu@126.com> * Address comments Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>	2020-08-07 09:18:37 -04:00
Philip May	d5bc32ce92	Add strip_accents to basic BertTokenizer. (#6280 ) * Add strip_accents to basic tokenizer * Add tests for strip_accents. * fix style with black * Fix strip_accents test * empty commit to trigger CI * Improved strip_accents check * Add code quality with is not False	2020-08-06 18:52:28 +08:00
Sylvain Gugger	c67d1a0259	Tf model outputs (#6247 ) * TF outputs and test on BERT * Albert to DistilBert * All remaining TF models except T5 * Documentation * One file forgotten * TF outputs and test on BERT * Albert to DistilBert * All remaining TF models except T5 * Documentation * One file forgotten * Add new models and fix issues * Quality improvements * Add T5 * A bit of cleanup * Fix for slow tests * Style	2020-08-05 11:34:39 -04:00
Julien Plu	33966811bd	Add SequenceClassification and MultipleChoice TF models to Electra (#6227 ) * Add SequenceClassification and MultipleChoice TF models to Electra * Apply style * Add summary_proj_to_labels to Electra config * Finally mirroring the PT version of these models * Apply style * Fix Electra test	2020-08-05 09:04:27 -04:00
Patrick von Platen	7f65daa2e1	fix reformer fp16 (#6237 )	2020-08-04 13:02:25 +02:00
Sam Shleifer	6730ecdd3c	Remove redundant coverage (#6224 )	2020-08-04 02:59:21 -04:00
Stas Bekman	5deed37f9f	cleanup torch unittests (#6196 ) * improve unit tests this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973 before I apply it to the rest * batch 1 * batch 2 * batch 3 * batch 4 * batch 5 * style * non-tf template * last deletion of check_loss_output	2020-08-04 02:42:56 -04:00
Julien Plu	9996f697e3	Fix saved model creation (#5468 ) * Fix TF Serving when output_hidden_states and output_attentions are True * Add tests for saved model creation + bug fix for multiple choices models * remove unused import * Fix the input for several layers * Fix test * Fix conflict printing * Apply style * Fix XLM and Flaubert for TensorFlow * Apply style * Fix TF check version * Apply style * Trigger CI	2020-08-03 08:10:40 -04:00
Sylvain Gugger	d951c14ae4	Model output test (#6155 ) * Use return_dict=True in all tests * Formatting	2020-07-31 09:44:37 -04:00
Suraj Patil	838dc06ff5	parse arguments from dict (#4869 ) * add parse_dict to parse arguments from dict * add unit test for parse_dict	2020-07-31 04:44:23 -04:00
Stas Bekman	f250beb8aa	enable easy checkout switch (#5645 ) * enable easy checkout switch allow having multiple repository checkouts and not needing to remember to rerun 'pip install -e .[dev]' when switching between checkouts and running tests. * make isort happy * examples needs one too	2020-07-31 04:34:46 -04:00
Stas Bekman	a2f6d521c1	typos (#6162 ) * 2 small typos * more typos * correct path	2020-07-30 17:18:27 -04:00
guillaume-be	e642c78908	Addition of a DialoguePipeline (#5516 ) * initial commit for pipeline implementation Addition of input processing and history concatenation * Conversation pipeline tested and working for single & multiple conversation inputs * Added docstrings for dialogue pipeline * Addition of dialogue pipeline integration tests * Delete test_t5.py * Fixed max code length * Updated styling * Fixed test broken by formatting tools * Removed unused import * Added unit test for DialoguePipeline * Fixed Tensorflow compatibility * Fixed multi-framework support using framework flag * - Fixed docstring - Added `min_length_for_response` as an initialization parameter - Renamed `args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]` - Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input - renamed pipeline name from dialogue to conversational - removed hardcoded default value of 1000 and use config.max_length instead - added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation - fixed bug in history truncation method * - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised) * - Simplified input tensor conversion * - Updated attention_mask value for Tensorflow compatibility * - Updated last dialogue reference to conversational & fixed integration tests * Fixed conflict with master * Updates following review comments * Updated formatting * Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs * Update src/transformers/pipelines.py Updated docstring following review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-30 14:11:39 -04:00
Sylvain Gugger	91cb95461e	Switch from return_tuple to return_dict (#6138 ) * Switch from return_tuple to return_dict * Fix test * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice * Rework TF trainer (#6038) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import * Switch from return_tuple to return_dict * Fix test * Add recent model Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Plu <plu.julien@gmail.com>	2020-07-30 09:17:00 -04:00
Lysandre Debut	3f94170a10	[WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614 ) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice	2020-07-29 14:26:26 -04:00
Funtowicz Morgan	6c002853a6	Added capability to quantize a model while exporting through ONNX. (#6089 ) * Added capability to quantize a model while exporting through ONNX. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> We do not support multiple extensions Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Reformat files Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * More quality Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Ensure test_generate_identified_name compares the same object types Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added documentation everywhere on ONNX exporter Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use pathlib.Path instead of plain-old string Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use f-string everywhere Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use the correct parameters for black formatting Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use Python 3 super() style. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use packaging.version to ensure installed onnxruntime version match requirements Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fixing imports sorting order. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Missing raise(s) Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added quantization documentation Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix some spelling. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix bad list header format Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-07-29 13:21:29 +02:00
Sam Shleifer	c49cd927f7	[Fix] position_ids tests again (#6100 )	2020-07-28 18:29:35 -04:00
Sam Shleifer	5abe50381a	Fix #6096 : MBartTokenizer's mask token (#6098 )	2020-07-28 18:27:58 -04:00
Sam Shleifer	3c7fbf35a6	MBART: support summarization tasks where max_src_len > max_tgt_len (#6003 ) * MBART: support summarization tasks * fix test * Style * add tokenizer test	2020-07-28 08:18:11 -04:00
Joe Davison	3deffc1d67	Zero shot classification pipeline (#5760 ) * add initial zero-shot pipeline * change default args * update default template * add label string splitting * add str labels support, remove nli from name * style * add input validation and working tf defaults * tests * quality check * add docstring to __call__ * add slow tests * Change truncation to only_first also lower precision on tests for readability * style	2020-07-27 09:42:58 -04:00
Sylvain Gugger	f5b5c5bd7e	Avoid unnecessary warnings when loading pretrained model (#5922 ) * Avoid unnecessary warnings when loading pretrained model * Fix test * Add other keys to ignore * keys_to_ignore_at_load -> authorized_missing_keys	2020-07-23 18:13:36 -04:00
Sam Shleifer	9827d666eb	MbartTokenizer: do not hardcode vocab size (#5998 )	2020-07-23 15:41:14 -04:00
Stas Bekman	35cb101eae	DataParallel fixes (#5733 ) * DataParallel fixes: 1. switched to a more precise check - if self.args.n_gpu > 1: + if isinstance(model, nn.DataParallel): 2. fix tests - require the same fixup under DataParallel as the training module * another fix	2020-07-20 09:29:12 -04:00
Pradhy729	290b6e18ac	Trainer support for iterabledataset (#5834 ) * Don't pass sampler for iterable dataset * Added check for test and eval dataloaders. * Formatting * Don't pass sampler for iterable dataset * Added check for test and eval dataloaders. * Formatting * Cleaner if nesting. * Added test for trainer and iterable dataset * Formatting for test * Fixed import when torch is available only. * Added require torch decorator to helper class * Moved dataset class inside unittest * Removed nested if and changed model in test * Checking torch availability for IterableDataset	2020-07-20 09:07:37 -04:00
Teven	4b506a37e3	Xlnet outputs (#5883 ) Slightly breaking change, changes functionality for `use_cache` in XLNet: if use_cache is True and mem_len is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time `use_cache` is overridden and always True.	2020-07-18 17:33:13 +02:00
Teven	a55809241f	Revert "Xlnet outputs (#5881 )" (#5882 ) This reverts commit `13be487212`.	2020-07-18 17:15:40 +02:00
Teven	13be487212	Xlnet outputs (#5881 ) Slightly breaking change, changes functionality for `use_cache` in XLNet: if use_cache is True and mem_len is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time `use_cache` is overridden and always True.	2020-07-18 16:53:29 +02:00
Teven	615be03f9d	Revert "XLNet `use_cache` refactor (#5770 )" (#5854 ) This reverts commit `0b2da0e592`.	2020-07-17 20:33:44 +02:00
Teven	0b2da0e592	XLNet `use_cache` refactor (#5770 ) Slightly breaking change, changes functionality for `use_cache` in XLNet: if use_cache is True and mem_len is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time `use_cache` is overridden and always True.	2020-07-17 20:24:16 +02:00
Patrick von Platen	9d37c56bab	[Reformer] - Cache hidden states and buckets to speed up inference (#5578 ) * fix merge rebase * add intermediate reformer code * save intermediate caching results * save intermediate * save intermediate results * save intermediate * upload next step * fix generate tests * make tests work * add named tuple output * Apply suggestions from code review * fix use_cache for False case * fix tensor to gpu * fix tensor to gpu * refactor * refactor and make style	2020-07-17 16:17:42 +02:00
Patrick von Platen	89a78be51f	fix benchmark for longformer (#5808 )	2020-07-16 15:15:10 +02:00
Sam Shleifer	1a647abf0b	[fix] check code quality (#5772 )	2020-07-15 14:59:38 -04:00
Funtowicz Morgan	d533c7e9b9	[fix] T5 ONNX test: model.to(torch_device) (#5769 ) Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-07-15 10:11:22 -04:00
Sam Shleifer	d0486c8bc2	[cleanup] T5 test, warnings (#5761 )	2020-07-15 08:23:22 -04:00
Sam Shleifer	838950ee44	[fix] mbart_en_ro_generate test now identical to fairseq (#5731 )	2020-07-14 06:12:24 -04:00
as-stevens	f867000f56	[Reformer classification head] Implement the reformer model classification head for text classification (#5198 ) * Reformer model head classification implementation for text classification * Reformat the reformer model classification code * PR review comments, and test case implementation for reformer for classification head changes * CI/CD reformer for classification head test import error fix * CI/CD test case implementation added ReformerForSequenceClassification to all_model_classes * Code formatting- fixed * Normal test cases added for reformer classification head * Fix test cases implementation for the reformer classification head * removed token_type_id parameter from the reformer classification head * fixed the test case for reformer classification head * merge conflict with master fixed * merge conflict, changed reformer classification to accept the choice_label parameter added in latest code * refactored the reformer classification head test code * reformer classification head, common transform test cases fixed * final set of the review comment, rearranging the reformer classes and docstring add to classification forward method * fixed the compilation error and text case fix for reformer classification head * Apply suggestions from code review Remove unnecessary dup Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-07-14 09:16:22 +02:00
Stas Bekman	45addfe96d	FlaubertForTokenClassification (#5644 ) * implement FlaubertForTokenClassification as a subclass of XLMForTokenClassification * fix mapping order * add the doc * add common tests	2020-07-13 14:59:53 -04:00
Stas Bekman	443b0cad96	rename the function to match the rest of the test convention (#5692 )	2020-07-13 18:09:49 +08:00
Sylvain Gugger	edfd82f5ff	Change model outputs types to self-document outputs (#5438 ) * [WIP] Proposal for model outputs * All Bert models * Make CI green maybe? * Fix ONNX test * Isolate ModelOutput from pt and tf * Formatting * Add Electra models * Auto-generate docstrings from outputs * Add TF outputs * Add some BERT models * Revert TF side * Remove last traces of TF changes * Fail with a clear error message * Add Albert and work through Bart * Add CTRL and DistilBert * Formatting * Progress on Bart * Renames and finish Bart * Formatting * Fix last test * Add DPR * Finish Electra and add FlauBERT * Add GPT2 * Add Longformer * Add MMBT * Add MobileBert * Add GPT * Formatting * Add Reformer * Add Roberta * Add T5 * Add Transformer XL * Fix test * Add XLM + fix XLMForTokenClassification * Style + XLMRoberta * Add XLNet * Formatting * Add doc of return_tuple arg	2020-07-10 11:36:53 -04:00
Lorenzo Ampil	0cc4eae0e6	Fix Inconsistent NER Grouping (Pipeline) (#4987 ) * Add B I handling to grouping * Add fix to include separate entity as last token * move last_idx definition outside loop * Use first entity in entity group as reference for entity type * Add test cases * Take out extra class accidentally added * Return tf ner grouped test to original * Take out redundant last entity * Get last_idx safely Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com> * Fix first entity comment * Create separate functions for group_sub_entities and group_entities (splitting call method to testable functions) * Take out unnecessary last_idx * Remove additional forward pass test * Move token classification basic tests to separate class * Move token classification basic tests back to monocolumninputtestcase * Move base ner tests to nerpipelinetests * Take out unused kwargs * Add back mandatory_keys argument * Add unitary tests for group_entities in _test_ner_pipeline * Fix last entity handling * Fix grouping function used * Add typing to group_sub_entities and group_entities Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com>	2020-07-08 16:18:17 -04:00
Patrick von Platen	f82a2a5e8e	[Benchmark] Add benchmarks for TF Training (#5594 ) * tf_train * adapt timing for tpu * fix timing * fix timing * fix timing * fix timing * update notebook * add tests	2020-07-08 12:11:09 +02:00
Sam Shleifer	353b8f1e7a	Add mbart-large-cc25, support translation finetuning (#5129 ) improve unittests for finetuning, especially w.r.t testing frozen parameters fix freeze_embeds for T5 add streamlit setup.cfg	2020-07-07 13:23:01 -04:00
Patrick von Platen	4dc65591b5	[Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile (#5395 ) * add first version of clm tf * make style * add more tests for bert * update tf clm loss * fix tests * correct tf ner script * add mlm loss * delete bogus file * clean tf auto model + add tests * finish adding clm loss everywhere * fix training in distilbert * fix flake8 * save intermediate * fix tf t5 naming * remove prints * finish up * up * fix tf gpt2 * fix new test utils import * fix flake8 * keep backward compatibility * Update src/transformers/modeling_tf_albert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_electra.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_mobilebert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_distilbert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply sylvains suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-07 18:15:53 +02:00
Quentin Lhoest	4fedc1256c	Fix tests imports dpr (#5576 ) * fix test imports * fix max_length * style * fix tests	2020-07-07 16:35:12 +02:00
Sam Shleifer	d4886173b2	[Bart] enable test_torchscript, update test_tie_weights (#5457 ) * Passing all but one torchscript test * Style * move comment * remove unneeded assert	2020-07-07 10:06:48 -04:00
Quentin Lhoest	fbd8792195	Add DPR model (#5279 ) * beginning of dpr modeling * wip * implement forward * remove biencoder + better init weights * export dpr model to embed model for nlp lib * add new api * remove old code * make style * fix dumb typo * don't load bert weights * docs * docs * style * move the `k` parameter * fix init_weights * add pretrained configs * minor * update config names * style * better config * style * clean code based on PR comments * change Dpr to DPR * fix config * switch encoder config to a dict * style * inheritance -> composition * add messages in assert statements * add dpr reader tokenizer * one tokenizer per model * fix base_model_prefix * fix imports * typo * add convert script * docs * change tokenizers conf names * style * change tokenizers conf names * minor * minor * fix wrong names * minor * remove unused convert functions * rename convert script * use return_tensors in tokenizers * remove n_questions dim * move generate logic to tokenizer * style * add docs * docs * quality * docs * add tests * style * add tokenization tests * DPR full tests * Stay true to the attention mask building * update docs * missing param in bert input docs * docs * style Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-07-07 08:56:12 -04:00
Abel	6912265711	Make T5 compatible with ONNX (#5518 ) * Default decoder inputs to encoder ones for T5 if neither are specified. * Fixing typo, now all tests are passing. * Changing einsum to operations supported by onnx * Adding a test to ensure T5 can be exported to onnx op>9 * Modified test for onnx export to make it faster * Styling changes. * Styling changes. * Changing notation for matrix multiplication Co-authored-by: Abel Riboulot <tkai@protomail.com>	2020-07-07 11:32:29 +02:00
Patrick von Platen	989ae326b5	[Reformer] Adapt Reformer MaskedLM Attn mask (#5560 ) * fix attention mask * fix slow test * refactor attn masks * fix fp16 generate test	2020-07-07 10:48:06 +02:00
Shashank Gupta	3dcb748e31	Added data collator for permutation (XLNet) language modeling and related calls (#5522 ) * Added data collator for XLNet language modeling and related calls Added DataCollatorForXLNetLanguageModeling in data/data_collator.py to generate necessary inputs for language modeling training with XLNetLMHeadModel. Also added related arguments, logic and calls in examples/language-modeling/run_language_modeling.py. Resolves: #4739, #2008 (partially) * Changed name to `DataCollatorForPermutationLanguageModeling` Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModelling`. Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use. CTRL uses a CLM loss just like GPT and GPT-2, so should work out of the box with this script (provided `past` is taken care of similar to `mems` for XLNet). Changed calls and imports appropriately. * Added detailed comments, changed variable names Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain working. Also cleaned up variable names and made them more informative. * Added tests for new data collator Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences. * Fixed styling issues	2020-07-07 10:17:37 +02:00
Anthony MOI	5787e4c159	Various tokenizers fixes (#5558 ) * BertTokenizerFast - Do not specify strip_accents by default * Bump tokenizers to new version * Add test for AddedToken serialization	2020-07-06 18:27:53 -04:00
Sam Shleifer	58cca47c16	[cleanup] TF T5 tests only init t5-base once. (#5410 )	2020-07-03 14:27:49 -04:00
Lysandre Debut	17ade127b9	Exposing prepare_for_model for both slow & fast tokenizers (#5479 ) * Exposing prepare_for_model for both slow & fast tokenizers * Update method signature * The traditional style commit * Hide the warnings behind the verbose flag * update default truncation strategy and prepare_for_model * fix tests and prepare_for_models methods Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-07-03 16:51:21 +02:00
Teven	6726416e4a	Changed expected_output_ids in TransfoXL generation test (#5462 ) * Changed expected_output_ids in TransfoXL generation test to match #4826 generation PR. * making black happy * making isort happy	2020-07-02 11:56:44 +02:00
Patrick von Platen	d16e36c7e5	[Reformer] Add Masked LM Reformer (#5426 ) * fix conflicts * fix * happy rebasing	2020-07-01 22:43:18 +02:00
Joe Davison	35befd9ce3	Fix tensor label type inference in default collator (#5250 ) * allow tensor label inputs to default collator * replace try/except with type check	2020-07-01 10:40:14 -06:00
Patrick von Platen	fe81f7d12c	finish reformer qa head (#5433 )	2020-07-01 12:27:14 -04:00
Patrick von Platen	d697b6ca75	[Longformer] Major Refactor (#5219 ) * refactor naming * add small slow test * refactor * refactor naming * rename selected to extra * big global attention refactor * make style * refactor naming * save intermed * refactor functions * finish function refactor * fix tests * fix longformer * fix longformer * fix longformer * fix all tests but one * finish longformer * address sams and izs comments * fix transpose	2020-07-01 17:43:32 +02:00
Sam Shleifer	e0d58ddb65	[fix] Marian tests import (#5442 )	2020-07-01 11:42:22 -04:00
Funtowicz Morgan	608d5a7c44	Raises PipelineException on FillMaskPipeline when there are != 1 mask_token in the input (#5389 ) * Added PipelineException Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fill-mask pipeline raises exception when more than one mask_token detected. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Put everything in a function. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added tests on pipeline fill-mask when input has != 1 mask_token Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix numel() computation for TF Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Addressing PR comments. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Remove function typing to avoid import on specific framework. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Quality. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Retry typing with @julien-c tip. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Quality². Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Simplify fill-mask mask_token checking. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Trigger CI	2020-07-01 17:27:47 +02:00
Sam Shleifer	43cb03a93d	MarianTokenizer.prepare_translation_batch uses new tokenizer API (#5182 )	2020-07-01 10:32:50 -04:00
Sam Shleifer	13deb95a40	Move tests/utils.py -> transformers/testing_utils.py (#5350 )	2020-07-01 10:31:17 -04:00
Sam Shleifer	32d2031458	[fix] slow fill_mask test failure (#5406 )	2020-06-30 15:28:15 -04:00
Patrick von Platen	4bcc35cd69	[Docs] Benchmark docs (#5360 ) * first doc version * add benchmark docs * fix typos * improve README * Update docs/source/benchmarks.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix naming and docs Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-29 16:08:57 +02:00
Sam Shleifer	28a690a80e	[mBART] skip broken forward pass test, stronger integration test (#5327 )	2020-06-28 15:08:28 -04:00
Sam Shleifer	393b8dc09a	examples/seq2seq/run_eval.py fixes and docs (#5322 )	2020-06-26 19:20:43 -04:00
Thomas Wolf	601d4d699c	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308 ) * remove references to old API in docstring - update data processors * style * fix tests - better type checking error messages * better type checking * include awesome fix by @LysandreJik for #5310 * updated doc and examples	2020-06-26 19:48:14 +02:00
Sam Shleifer	798dbff6a7	[pipelines] Change summarization default to distilbart-cnn-12-6 (#5289 )	2020-06-26 11:43:23 -04:00
Funtowicz Morgan	135791e8ef	Add pad_to_multiple_of on tokenizers (reimport) (#5054 ) * Add new parameter `pad_to_multiple_of` on tokenizers. * unittest for pad_to_multiple_of * Add .name when logging enum. * Fix missing .items() on dict in tests. * Add special check + warning if the tokenizer doesn't have proper pad_token. * Use the correct logger format specifier. * Ensure tokenizer with no pad_token do not modify the underlying padding strategy. * Skip test if tokenizer doesn't have pad_token * Fix RobertaTokenizer on empty input * Format. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fix and updating to simpler API Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-06-26 11:55:57 +02:00
Lysandre Debut	364a5ae1f0	Refactor Code samples; Test code samples (#5036 ) * Refactor code samples * Test docstrings * Style * Tokenization examples * Run rust of tests * First step to testing source docs * Style and BART comment * Test the remainder of the code samples * Style * let to const * Formatting fixes * Ready for merge * Fix fixture + Style * Fix last tests * Update docs/source/quicktour.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Addressing @sgugger's comments + Fix MobileBERT in TF Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-06-25 16:46:00 -04:00
Thomas Wolf	315f464b0a	[tokenizers] Several small improvements and bug fixes (#5287 ) * avoid recursion in id checks for fast tokenizers * better typings and fix #5232 * align slow and fast tokenizers behaviors for Roberta and GPT2 * style and quality * fix tests - improve typings	2020-06-25 22:17:14 +02:00
Thomas Wolf	27cf1d97f0	[Tokenization] Fix #5181 - make #5155 more explicit - move back the default logging level in tests to WARNING (#5252 ) * fix-5181 Padding to max sequence length while truncation to another length was wrong on slow tokenizers * clean up and fix #5155 * fix XLM test * Fix tests for Transfo-XL * logging only above WARNING in tests * switch slow tokenizers tests in @slow * fix Marian truncation tokenization test * style and quality * make the test a lot faster by limiting the sequence length used in tests	2020-06-25 17:24:28 +02:00
Thomas Wolf	7ac9110711	Add more tests on tokenizers serialization - fix bugs (#5056 ) * update tests for fast tokenizers + fix small bug in saving/loading * better tests on serialization * fixing serialization * comment cleanup	2020-06-24 21:53:08 +02:00
Lysandre Debut	cf10d4cfdd	Cleaning TensorFlow models (#5229 ) * Cleaning TensorFlow models Update all classes stylr * Don't average loss	2020-06-24 11:37:20 -04:00
Patrick von Platen	c2a26ec8a6	[Use cache] Align logic of `use_cache` with output_attentions and output_hidden_states (#5194 ) * fix use cache * add bart use cache * fix bart * finish bart	2020-06-24 16:09:17 +02:00
Patrick von Platen	9fe09cec76	[Benchmark] Extend Benchmark to all model type extensions (#5241 ) * add benchmark for all kinds of models * improved import * delete bogus files * make style	2020-06-24 15:11:42 +02:00
Sam Shleifer	58918c76f4	[bart] add config.extra_pos_embeddings to facilitate reuse (#5190 )	2020-06-23 11:35:42 -04:00
Thomas Wolf	11fdde0271	Tokenizers API developments (#5103 ) * Add return lengths * make pad a bit more flexible so it can be used as collate_fn * check all kwargs sent to encoding method are known * fixing kwargs in encodings * New AddedToken class in python This class let you specify specifique tokenization behaviors for some special tokens. Used in particular for GPT2 and Roberta, to control how white spaces are stripped around special tokens. * style and quality * switched to hugginface tokenizers library for AddedTokens * up to tokenizer 0.8.0-rc3 - update API to use AddedToken state * style and quality * do not raise an error on additional or unused kwargs for tokenize() but only a warning * transfo-xl pretrained model requires torch * Update src/transformers/tokenization_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-23 13:36:57 +02:00
Sam Shleifer	5144104070	[fix] remove unused import (#5206 )	2020-06-22 23:39:04 -04:00
Sam Shleifer	0d158e38c9	[fix] mobilebert had wrong path, causing slow test failure (#5205 )	2020-06-22 23:31:36 -04:00
Thomas Wolf	ebc36108dc	[tokenizers] Fix #5081 and improve backward compatibility (#5125 ) * fix #5081 and improve backward compatibility (slightly) * add nlp to setup.cfg - style and quality * align default to previous default * remove test that doesn't generalize	2020-06-22 17:25:43 +02:00
Joseph Liu	f4e1f02210	Output hidden states (#4978 ) * Configure all models to use output_hidden_states as argument passed to foward() * Pass all tests * Remove cast_bool_to_primitive in TF Flaubert model * correct tf xlnet * add pytorch test * add tf test * Fix broken tests * Configure all models to use output_hidden_states as argument passed to foward() * Pass all tests * Remove cast_bool_to_primitive in TF Flaubert model * correct tf xlnet * add pytorch test * add tf test * Fix broken tests * Refactor output_hidden_states for mobilebert * Reset and remerge to master Co-authored-by: Joseph Liu <joseph.liu@coinflex.com> Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-06-22 10:10:45 -04:00
RafaelWO	b99ad457f4	Added feature to move added tokens in vocabulary for Transformer-XL (#4953 ) * Fixed resize_token_embeddings for transfo_xl model * Fixed resize_token_embeddings for transfo_xl. Added custom methods to TransfoXLPreTrainedModel for resizing layers of the AdaptiveEmbedding. * Updated docstring * Fixed resizinhg cutoffs; added check for new size of embedding layer. * Added test for resize_token_embeddings * Fixed code quality * Fixed unchanged cutoffs in model.config * Added feature to move added tokens in tokenizer. * Fixed code quality * Added feature to move added tokens in tokenizer. * Fixed code quality * Fixed docstring, renamed sym to oken. Co-authored-by: Rafael Weingartner <rweingartner.its-b2015@fh-salzburg.ac.at>	2020-06-22 15:40:52 +02:00
Patrick von Platen	fa0be6d761	Benchmarks (#4912 ) * finish benchmark * fix isort * fix setup cfg * retab * fix time measuring of tf graph mode * fix tf cuda * clean code * better error message	2020-06-22 12:06:56 +02:00

515 Commits