* Add TFGPT2ForSequenceClassification based on DialogRPT
* TFGPT2ForSequenceClassification based on DialogRPT: refactored code, implemented review comments, and added input processing
* code refactor to align with the latest related TF PR
* code refactor
* Update modeling_tf_gpt2.py
Without this fix, training a `BartForSequenceClassification` model with `run_pl_glue.py` raises `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not use token_type_ids. I've solved this the same way it's handled for the "distilbert" model, and I can now train BART models on SNLI without errors.
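A minimal sketch of the idea, assuming a hypothetical `build_inputs` helper and the `model_type` strings used in the script; the actual `run_pl_glue.py` change may differ:
```python
# Hypothetical helper illustrating the fix: only pass token_type_ids to models
# whose forward() accepts them (BART, like distilbert, does not).
MODELS_WITHOUT_TOKEN_TYPE_IDS = {"distilbert", "bart"}  # illustrative set

def build_inputs(batch, model_type):
    inputs = {"input_ids": batch[0], "attention_mask": batch[1], "labels": batch[3]}
    if model_type not in MODELS_WITHOUT_TOKEN_TYPE_IDS:
        inputs["token_type_ids"] = batch[2]
    return inputs
```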
* Add badge w/ number of models on the hub
* try to appease @sgugger 😇
* not sure what this `c` was about [ci skip]
* Fix script and move stuff around
* Fix doc styling error
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* [trainer] improve code
This PR:
- removes redundant code:
```
self.model = model if model is not None else None
```
and
```
self.model = model
```
are equivalent, since both assign `None` when `model` is `None`.
* separate attribute assignment from code logic - which simplifies things further.
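A minimal sketch of the pattern, with illustrative names rather than the actual Trainer code:
```python
class Example:
    def __init__(self, model=None):
        # plain attribute assignment (the `if model is not None else None` form is redundant)
        self.model = model
        # any logic that depends on the attribute lives separately, after the assignments
        self.requires_model_init = self.model is None
```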
* whitespace
* Warning about too long input for fast tokenizers too
If truncation is not set in the tokenizer but the tokenized input is longer
than the model's `model_max_length`, we used to trigger a warning that the
input would probably fail (which it most likely will).
This PR re-enables that warning for fast tokenizers too and uses common
code for the trigger so the behavior is consistent across tokenizers.
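A hedged sketch of the kind of shared check described above (the helper name is illustrative, not the actual library code):
```python
import logging

logger = logging.getLogger(__name__)

def _warn_if_too_long(ids, model_max_length, truncation_enabled):
    # Warn when no truncation is requested and the encoded sequence exceeds
    # what the model can handle; running it through the model will most likely fail.
    if not truncation_enabled and model_max_length is not None and len(ids) > model_max_length:
        logger.warning(
            f"Token indices sequence length ({len(ids)}) is longer than the model's "
            f"maximum length ({model_max_length}). Running this sequence through the "
            "model will result in indexing errors."
        )
```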
* Checking for pair of inputs too.
* Making the function private and adding its doc.
* Remove formatting ?? in odd place.
* Missed uppercase.
* restore skip
* Revert "Remove deprecated `evalutate_during_training` (#8852)"
This reverts commit 5530299096.
* check that pipeline.git.base_revision is defined before proceeding
* Revert "Revert "Remove deprecated `evalutate_during_training` (#8852)""
This reverts commit dfec84db3f.
* check that pipeline.git.base_revision is defined before proceeding
* doc only
* doc + code
* restore
* typo
* fix DP case on multi-gpu
* make executable
* test all 3 modes
* use the correct check for distributed
* dp doesn't need a special case
* restore original name
* cleanup
* NerPipeline (TokenClassification) now outputs offsets of words
- Currently the offsets are missing, which forces the user to pattern-match
the "word" against their input, which is not always feasible.
For instance, if a sentence contains the same word twice, there is
no way to know which occurrence is which.
- This PR proposes to fix that by adding 2 new keys to this
pipeline's outputs, "start" and "end", which correspond to the string
offsets of the word. That means we should always have the
invariant:
```python
input[entity["start"]: entity["end"]] == entity["word"]
# ("word" is present in both the grouped and ungrouped outputs)
```
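A small usage sketch (the default model and the `grouped_entities` flag are assumptions about the pipeline call):
```python
from transformers import pipeline

ner = pipeline("ner", grouped_entities=True)
text = "My name is Wolfgang and I live in Berlin."
for entity in ner(text):
    # "start"/"end" are character offsets into the original string,
    # so this slice should line up with the reported word.
    print(entity["word"], "->", text[entity["start"]: entity["end"]])
```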
* Fixing doc style
* Slightly increase tolerance between pytorch and flax output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* test_multiple_sentences doesn't require torch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Simplify parameterization on "jit" to use boolean rather than str
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use `require_torch` on `test_multiple_sentences` because we pull the weights from the hub.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Rename "jit" parameter to "use_jit" for (hopefully) making it self-documenting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove pytest.mark.parametrize which seems to fail in some circumstances
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Give default parameters values for traced model.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Review comment: Change sentences to sequences
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Use model.from_pretrained for DataParallel also
When training on multiple GPUs, the code wraps a model with torch.nn.DataParallel. However if the model has custom from_pretrained logic, it does not get applied during load_best_model_at_end.
This commit uses the underlying model during load_best_model_at_end, and re-wraps the loaded model with DataParallel.
If you choose to reject this change, then could you please move this logic to a function, e.g. def load_best_model_checkpoint(best_model_checkpoint) or something, so that it can be overridden?
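A hedged sketch of the suggested helper (the function name comes from the comment above; the surrounding Trainer details are assumptions, not the actual implementation):
```python
import torch

def load_best_model_checkpoint(trainer, best_model_checkpoint):
    model = trainer.model
    if isinstance(model, torch.nn.DataParallel):
        # go through the underlying model so any custom from_pretrained logic applies,
        # then re-wrap the reloaded model with DataParallel
        reloaded = model.module.from_pretrained(best_model_checkpoint)
        trainer.model = torch.nn.DataParallel(reloaded)
    else:
        trainer.model = model.from_pretrained(best_model_checkpoint)
```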
* Fix silly bug
* Address review comments
Thanks for the feedback. I made the change you proposed, but I also think we should update L811 to check whether `self.model` is an instance of `PreTrainedModel`, otherwise we would still not get into that `if` section, right?
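For illustration, a minimal sketch of that check, unwrapping DataParallel first so the isinstance test sees the underlying model (names are illustrative):
```python
import torch
from transformers import PreTrainedModel

def can_reload_with_from_pretrained(model) -> bool:
    # unwrap DataParallel before checking the model's class
    if isinstance(model, torch.nn.DataParallel):
        model = model.module
    return isinstance(model, PreTrainedModel)
```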