transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	8d4bb02056	Refactor FLAX tests (#9034)	2020-12-10 15:57:39 -05:00
Sylvain Gugger	1310e1a758	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
Sylvain Gugger	51e81e5895	MPNet copyright files (#9015)	2020-12-10 09:29:38 -05:00
Sylvain Gugger	35bffd70e2	Fix documention of book in LayoutLM (#9017)	2020-12-10 09:28:49 -05:00
Cola	c95de29e31	✏️ Fix typo (#9020)	2020-12-10 08:22:52 +01:00
Stas Bekman	5e637e6c69	[wip] [ci] doc-job-skip take #4 dry-run (#8980 ) * ci-doc-job-skip-take-4 * wip * wip * wip * wip * skip yaml * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * ready to test * yet another way * trying with HEAD * trying with head.sha * trying with head.sha fix * trying with head.sha fix wip * undo * try to switch to sha * current branch * current branch * PR number check * joy ride * joy ride * joy ride * joy ride * joy ride * joy ride * joy ride * joy ride * joy ride * joy ride * joy ride * joy ride	2020-12-09 15:36:36 -05:00
Patrick von Platen	06971ac4f9	[Bart] Refactor - fix issues, consistency with the library, naming (#8900 ) * remove make on the fly linear embedding * start refactor * big first refactor * save intermediate * save intermediat * correct mask issue * save tests * refactor padding masks * make all tests pass * further refactor * make pegasus test pass * fix bool if * fix leftover tests * continue * bart renaming * delete torchscript test hack * fix imports in tests * correct shift * fix docs and repo cons * re-add fix for FSTM * typo in test * fix typo * fix another typo * continue * hot fix 2 for tf * small fixes * refactor types linting * continue * finish refactor * fix import in tests * better bart names * further refactor and add test * delete hack * apply sylvains and lysandres commens * small perf improv * further perf improv * improv perf * fix typo * make style * small perf improv	2020-12-09 20:55:24 +01:00
Funtowicz Morgan	75627148ee	Flax Masked Language Modeling training example (#8728) * Remove "Model" suffix from Flax models to look more 🤗 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Initial working (forward + backward) for Flax MLM training example. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Simply code Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Addressing comments, using module and moving to LM task. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Restore parameter name "module" wrongly renamed model. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Restore correct output ordering... Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Actually commit the example 😅 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Add FlaxBertModelForMaskedLM after rebasing. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make it possible to initialize the training from scratch Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Reuse flax linen example of cross entropy loss Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added specific data collator for flax Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Remove todo for data collator Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added evaluation step Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added ability to provide dtype to support bfloat16 on TPU Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable flax tensorboard output Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable jax.pmap support. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure batches are correctly sized to be dispatched with jax.pmap Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable bfloat16 with --fp16 cmdline args Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Correctly export metrics to tensorboard Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added dropout and ability to use it. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Effectively enable & disable during training and evaluation steps. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Oops. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable specifying kernel initializer scale Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added warmup step to the learning rate scheduler. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix typo. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Print training loss Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make style Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fix linter issue (flake8) Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix model matching Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix dummies Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix non default dtype on Flax models Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same create_position_ids_from_input_ids for FlaxRoberta Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta attention as Bert Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fix copy Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Wording. Co-authored-by: Marc van Zee <marcvanzee@gmail.com> Co-authored-by: Marc van Zee <marcvanzee@gmail.com>	2020-12-09 17:13:56 +01:00
StillKeepTry	df2af6d8b8	Add MP Net 2 (#9004)	2020-12-09 10:32:43 -05:00
cronoik	8729109855	fixes #8968 (#9009)	2020-12-09 16:21:41 +01:00
Simon Brandeis	e977ed2142	Add the code_search_net dataset tag to CodeBERTa model cards (#9005)	2020-12-09 15:43:19 +01:00
Patrick von Platen	da37a21c89	push (#9008)	2020-12-09 15:14:33 +01:00
Sylvain Gugger	61abd50b98	Remove use of deprected method in Trainer HP search (#8996)	2020-12-09 09:13:41 -05:00
Sylvain Gugger	7e1d709e2a	Fix link to stable version in the doc navbar (#9007)	2020-12-09 09:11:39 -05:00
Patrick von Platen	02d0e0355c	Diverse beam search 2 (#9006 ) * diverse beam search * bug fixes * bug fixes * bug fix * separate out diverse_beam_search function * separate out diverse_beam_search function * bug fix * improve code quality * bug fix * bug fix * separate out diverse beam search scorer * code format * code format * code format * code format * add test * code format * documentation changes * code quality * add slow integration tests * more general name * refactor into logits processor * add test * avoid too much copy paste * refactor * add to docs * fix-copies * bug fix * Revert "bug fix" This reverts commit `c99eb5a8dc`. * improve comment * implement sylvains feedback Co-authored-by: Ayush Jain <a.jain@sprinklr.com> Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>	2020-12-09 15:00:37 +01:00
Lysandre Debut	67ff1c314a	Templates overhaul 1 (#8993)	2020-12-08 18:00:07 -05:00
Sylvain Gugger	447808c85f	New squad example (#8992 ) * Add new SQUAD example * Same with a task-specific Trainer * Address review comment. * Small fixes * Initial work for XLNet * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Final clean up and working XLNet script * Test and debug * Final working version * Add new SQUAD example * Same with a task-specific Trainer * Address review comment. * Small fixes * Initial work for XLNet * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Final clean up and working XLNet script * Test and debug * Final working version * Add tick * Update README * Address review comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-08 14:39:29 -05:00
guillaume-be	7809eb82ae	Removed unused `encoder_hidden_states` and `encoder_attention_mask` (#8972) * Removed unused `encoder_hidden_states` and `encoder_attention_mask` from MobileBert * Removed decoder tests for MobileBert * Removed now unnecessary import	2020-12-08 12:04:34 -05:00
Lysandre Debut	b7cdd00f15	Fix interaction of return_token_type_ids and add_special_tokens (#8854)	2020-12-08 12:04:01 -05:00
Sylvain Gugger	04c446f764	Make `ModelOutput` pickle-able (#8989)	2020-12-08 11:59:40 -05:00
Julien Chaumond	0d9e6ca9ed	[model_card] remove bogus testing changes	2020-12-08 09:58:45 -05:00
Julien Plu	bf7f79cd57	Optional layers (#8961 ) * Apply on BERT and ALBERT * Update TF Bart * Add input processing to TF BART * Add input processing for TF CTRL * Add input processing to TF Distilbert * Add input processing to TF DPR * Add input processing to TF Electra * Add deprecated arguments * Add input processing to TF XLM * remove unused imports * Add input processing to TF Funnel * Add input processing to TF GPT2 * Add input processing to TF Longformer * Add input processing to TF Lxmert * Apply style * Add input processing to TF Mobilebert * Add input processing to TF GPT * Add input processing to TF Roberta * Add input processing to TF T5 * Add input processing to TF TransfoXL * Apply style * Rebase on master * Fix wrong model name * Fix BART * Apply style * Put the deprecated warnings in the input processing function * Remove the unused imports * Raise an error when len(kwargs)>0 * test ModelOutput instead of TFBaseModelOutput * Address Patrick's comments * Address Patrick's comments * Add boolean processing for the inputs * Take into account the optional layers * Add missing/unexpected weights in the other models * Apply style * rename parameters * Apply style * Remove useless * Remove useless * Remove useless * Update num parameters * Fix tests * Address Patrick's comment * Remove useless attribute	2020-12-08 09:14:09 -05:00
Stas Bekman	9d7d0005b0	[training] SAVE_STATE_WARNING was removed in pytorch (#8979) * [training] SAVE_STATE_WARNING was removed in pytorch FYI `SAVE_STATE_WARNING` has been removed 3 days ago: pytorch/pytorch#46813 Fixes: #8232 @sgugger * style, but add () to prevent autoformatters from botching it * switch to try/except * cleanup	2020-12-07 21:59:55 -08:00
Lysandre Debut	2ae7388eee	Check table as independent script (#8976)	2020-12-07 19:55:12 -05:00
Sylvain Gugger	00aa9dbca2	Copyright (#8970) * Add copyright everywhere missing * Style	2020-12-07 18:36:34 -05:00
Navjot	c108d0b5a4	add max_length to showcase the use of truncation (#8975)	2020-12-07 18:35:39 -05:00
Sylvain Gugger	62d30e0583	Small fix to the run clm script (#8973)	2020-12-07 17:32:09 -05:00
Julien Chaumond	28fa014a1f	transformers-cli: LFS multipart uploads (> 5GB) (#8663 ) * initial commit * [cli] lfs commands * Fix FileSlice * Tweak to FileSlice * [hf_api] Backport filetype arg from `datasets` cc @lhoestq * Silm down the CI while i'm working * Ok let's try this in CI * Update config.yml * Do not try this at home * one more try * Update lfs.py * Revert "Tweak to FileSlice" This reverts commit `d7e32c4b35`. * Update test_hf_api.py * Update test_hf_api.py * Update test_hf_api.py * CI still green? * make CI green again? * Update test_hf_api.py * make CI red again? * Update test_hf_api.py * add CI style back * Fix CI? * oh my * doc + switch back to real staging endpoint * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> * Fix docblock + f-strings Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>	2020-12-07 16:38:39 -05:00
Wietse de Vries	0bce7c5508	Create README.md (#8964)	2020-12-07 16:04:14 -05:00
Nguyen Van Nha	7ccd973ea1	Update README.txt (#8957)	2020-12-07 16:01:49 -05:00
Stas Bekman	37f4c24f10	> 30 files leads to hanging on --More-- cancel debug printing for now. As it can be seen lead to a failing test here: https://app.circleci.com/pipelines/github/huggingface/transformers/16894/workflows/cc86f7a9-4020-45af-8ab3-c22f79b427cf/jobs/131924	2020-12-07 12:18:05 -08:00
Sylvain Gugger	7f9ccffc5b	Use word_ids to get labels in run_ner (#8962) * Use word_ids to get labels in run_ner * Add sanity check	2020-12-07 14:26:36 -05:00
Clement	de6befd41f	Remove sourcerer (#8965)	2020-12-07 11:15:29 -05:00
sandip	483e13273f	Add TFGPT2ForSequenceClassification based on DialogRPT (#8714 ) * Add TFGPT2ForSequenceClassification based on DialogRPT * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * code refactor for latest other TF PR * code refactor * code refactor * Update modeling_tf_gpt2.py	2020-12-07 16:58:37 +01:00
Sylvain Gugger	28c77ddf3b	Fix QA pipeline on Windows (#8947)	2020-12-07 09:50:32 -05:00
Philip Tamimi-Sarnikowski	72d6c9c68b	Add model card (#8948) * add model card * lowercase identifier Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-12-06 11:16:32 -05:00
Machel Reid	ef93a25427	Fix typo for `modeling_bert` import resulting in ImportError (#8931) Self-explanatory ;) - Hope it helps!	2020-12-05 09:57:37 -05:00
Ethan Perez	8dfc8c7221	Don't pass in token_type_ids to BART for GLUE (#8929) Without this fix, training a `BARTForSequenceClassification` model with `run_pl_glue.py` gives `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not have token_type_ids. I've solved this issue in the same way as it's solved for the "distilbert" model, and I can train BART models on SNLI without errors now.	2020-12-05 09:52:16 -05:00
Stas Bekman	df311a5ccf	[seq2seq] document the caveat of leaky native amp (#8930) * document the caveat of leaky native amp * Update examples/seq2seq/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-04 15:43:35 -08:00
Stas Bekman	73c51f7fcd	[ci] skip doc jobs - circleCI is not reliable - disable skip for now (#8926) * disable skipping, but leave logging for the future	2020-12-04 10:13:42 -08:00
Lysandre Debut	71688a8889	Fix TF T5 only encoder model with booleans (#8925)	2020-12-04 12:28:47 -05:00
Julien Plu	dcd3046f98	Better booleans handling in the TF models (#8777 ) * Apply on BERT and ALBERT * Update TF Bart * Add input processing to TF BART * Add input processing for TF CTRL * Add input processing to TF Distilbert * Add input processing to TF DPR * Add input processing to TF Electra * Add deprecated arguments * Add input processing to TF XLM * Add input processing to TF Funnel * Add input processing to TF GPT2 * Add input processing to TF Longformer * Add input processing to TF Lxmert * Apply style * Add input processing to TF Mobilebert * Add input processing to TF GPT * Add input processing to TF Roberta * Add input processing to TF T5 * Add input processing to TF TransfoXL * Apply style * Rebase on master * Bug fix * Retry to bugfix * Retry bug fix * Fix wrong model name * Try another fix * Fix BART * Fix input precessing * Apply style * Put the deprecated warnings in the input processing function * Remove the unused imports * Raise an error when len(kwargs)>0 * test ModelOutput instead of TFBaseModelOutput * Bug fix * Address Patrick's comments * Address Patrick's comments * Address Sylvain's comments * Add boolean processing for the inputs * Apply style * Missing optional * Fix missing some input proc * Update the template * Fix missing inputs * Missing input * Fix args parameter * Trigger CI * Trigger CI * Trigger CI * Address Patrick's and Sylvain's comments * Replace warn by warning * Trigger CI * Fix XLNET * Fix detection	2020-12-04 09:08:29 -05:00
Stas Bekman	4c3d98dddc	[s2s finetune_trainer] add instructions for distributed training (#8884)	2020-12-03 16:05:55 -08:00
Lysandre Debut	aa60b230ec	Patch model parallel test (#8920) * Patch model parallel test * Remove line * Remove `ci_*` from scheduled branches	2020-12-03 17:15:47 -05:00
Lysandre Debut	0c5615af66	Put Transformers on Conda (#8918) * conda * Guide * correct tag * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/installation.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Sylvain's comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-03 14:28:49 -05:00
Julien Chaumond	9ad6194318	Tweak wording + Add badge w/ number of models on the hub (#8914) * Add badge w/ number of models on the hub * try to apease @sgugger 😇 * not sure what this `c` was about [ci skip] * Fix script and move stuff around * Fix doc styling error Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2020-12-03 10:56:55 -05:00
Sylvain Gugger	6ed7e32f7c	Fix move when the two cache folders exist (#8917)	2020-12-03 10:50:13 -05:00
Sylvain Gugger	8453201cfe	Avoid erasing the attention mask when double padding (#8915)	2020-12-03 10:45:07 -05:00
Skye Wanderman-Milne	0deece9c53	Don't warn that models aren't available if Flax is available. (#8841)	2020-12-03 10:33:12 -05:00
Julien Chaumond	2b7fc9a0fd	[model_cards] lm-head was deprecated (and wasn't needed here anyways as it was added automatically)	2020-12-03 15:05:01 +01:00

6079 Commits