Commit Graph

6034 Commits

Author SHA1 Message Date
Julien Chaumond
9ad6194318
Tweak wording + Add badge w/ number of models on the hub (#8914)
* Add badge w/ number of models on the hub

* try to appease @sgugger 😇

* not sure what this `c` was about [ci skip]

* Fix script and move stuff around

* Fix doc styling error

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2020-12-03 10:56:55 -05:00
Sylvain Gugger
6ed7e32f7c
Fix move when the two cache folders exist (#8917) 2020-12-03 10:50:13 -05:00
Sylvain Gugger
8453201cfe
Avoid erasing the attention mask when double padding (#8915) 2020-12-03 10:45:07 -05:00
Skye Wanderman-Milne
0deece9c53
Don't warn that models aren't available if Flax is available. (#8841) 2020-12-03 10:33:12 -05:00
Julien Chaumond
2b7fc9a0fd [model_cards] lm-head was deprecated
(and wasn't needed here anyway, as it was added automatically)
2020-12-03 15:05:01 +01:00
Patrick von Platen
443f67e887
[PyTorch] Refactor Resize Token Embeddings (#8880)
* fix resize tokens

* correct mobile_bert

* move embedding fix into modeling_utils.py

* refactor

* fix lm head resize

* refactor

* break lines to make sylvain happy

* add new tests

* fix typo

* improve test

* skip bart-like for now

* check if base_model = get(...) is necessary

* clean files

* improve test

* fix tests

* revert style templates

* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
2020-12-02 19:19:50 +01:00
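The public API touched by this refactor, as a minimal sketch (the checkpoint name is only an illustration): after new tokens are added to a tokenizer, the model's input embeddings (and any tied LM head) must be resized to the new vocabulary size.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Grow the vocab, then resize the embedding matrix (and tied LM head) to match.
tokenizer.add_tokens(["<new_tok_1>", "<new_tok_2>"])
model.resize_token_embeddings(len(tokenizer))

assert model.get_input_embeddings().weight.shape[0] == len(tokenizer)
```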
Devangi Purkayastha
e52f9c0ade
Update README.md (#8906) 2020-12-02 09:28:44 -08:00
ryota-mo
801b2cb36f
Fix typo in docstring (#8905) 2020-12-02 12:08:31 -05:00
Stas Bekman
7e1cb00c37
[trainer] improve code readability (#8903)
* [trainer] improve code

This PR:
- removes redundant code, since
```
self.model = model if model is not None else None
```
and
```
self.model = model
```
are the same.

* separate attribute assignment from code logic - which simplifies things further.

* whitespace
2020-12-02 09:07:42 -08:00
Nicolas Patry
a8c3f9aa76
Warning about too long input for fast tokenizers too (#8799)
* Warning about too long input for fast tokenizers too

If truncation is not set in tokenizers, but the tokenized input is too long
for the model (`model_max_length`), we used to trigger a warning that the
input would probably fail (which it most likely will).

This PR re-enables that warning for fast tokenizers too and uses common
code for the trigger to make sure it's consistent across tokenizer implementations.

* Checking for pairs of inputs too.

* Making the function private and adding its doc.

* Remove formatting ?? in odd place.

* Missed uppercase.
2020-12-02 10:18:28 -05:00
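A short usage sketch of the case the warning covers, assuming a standard checkpoint: with no truncation requested, a fast tokenizer now also warns when the encoded input exceeds `model_max_length`.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)

long_text = "hello " * 1000           # far beyond the 512-token model_max_length
encoded = tokenizer(long_text)        # truncation not set -> too-long warning is emitted

print(len(encoded["input_ids"]), ">", tokenizer.model_max_length)
```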
sandip
f6b44e6190
Transfoxl seq classification (#8868)
* Transfoxl sequence classification

* Transfoxl sequence classification
2020-12-02 10:08:32 -05:00
Stas Bekman
24f0c2fe33
[ci] skip doc jobs take #3 (#8885)
* check that we get any match first

* docs only

* 2 docs only

* add code

* restore
2020-12-02 10:06:45 -05:00
Stas Bekman
693ac3594b
disable job skip - need more work
reference: https://github.com/huggingface/transformers/pull/8853#issuecomment-736779863
2020-12-01 12:03:29 -08:00
Stas Bekman
379005c9d2
start using training_args.parallel_mode (#8882) 2020-12-01 11:40:36 -08:00
Sylvain Gugger
b08843cf4d
Add a parallel_mode property to TrainingArguments (#8877)
* Add a `distributed_env` property to TrainingArguments

* Change name

* Address comment
2020-12-01 13:46:09 -05:00
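A hedged sketch of consuming the new property; the `ParallelMode` members shown match the enum added around this release, but treat the exact names as an assumption.

```python
from transformers import TrainingArguments
from transformers.training_args import ParallelMode

args = TrainingArguments(output_dir="out")

if args.parallel_mode == ParallelMode.NOT_PARALLEL:
    print("single device")
elif args.parallel_mode == ParallelMode.NOT_DISTRIBUTED:
    print("multi-GPU via torch.nn.DataParallel")
elif args.parallel_mode == ParallelMode.DISTRIBUTED:
    print("distributed training (one process per device)")
```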
Sylvain Gugger
7c10dd22ae
Better support for resuming training (#8878) 2020-12-01 13:45:21 -05:00
Stas Bekman
21db560df3
[CI] skip docs-only jobs take #2 (#8853)
* restore skip

* Revert "Remove deprecated `evalutate_during_training` (#8852)"

This reverts commit 5530299096.

* check that pipeline.git.base_revision is defined before proceeding

* Revert "Revert "Remove deprecated `evalutate_during_training` (#8852)""

This reverts commit dfec84db3f.

* check that pipeline.git.base_revision is defined before proceeding

* doc only

* doc + code

* restore

* restore

* typo
2020-12-01 13:15:25 -05:00
Lysandre Debut
a947386cee
Better warning when loading a tokenizer with AutoTokenizer w/o SentencePiece (#8881) 2020-12-01 13:13:11 -05:00
Adam Pocock
9c18f15685
Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860)
Update src/transformers/tokenization_utils_base.py with review fix

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-12-01 13:01:52 -05:00
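A hedged sketch of the behaviour the fix protects (checkpoint name is illustrative): device moves should reach every tensor in a `BatchEncoding`, while dtype casts of the batch itself are exactly the #6582 failure mode.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer("A short sentence", return_tensors="pt")

batch = batch.to("cpu")                 # device move, forwarded to every contained tensor
print(batch["input_ids"].dtype)         # still torch.int64

# A blind cast such as batch.to(torch.float16) would also cast input_ids,
# breaking the integer embedding lookup -- the case the fix guards against.
```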
Sylvain Gugger
c0df963ee1
Make the big table creation/check platform independent (#8856) 2020-12-01 11:45:57 -05:00
Ratthachat (Jung)
d366228df1
2 typos in modeling_rag.py (#8676)
* 2 typos - from_question_encoder_generator_configs

fix 2 typos
from_encoder_generator_configs --> from_question_encoder_generator_configs

* apply make style
2020-12-01 16:16:48 +01:00
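The corrected classmethod in context, as a hedged sketch (default configs are placeholders, not a working RAG setup):

```python
from transformers import BartConfig, DPRConfig, RagConfig

question_encoder_config = DPRConfig()
generator_config = BartConfig()

# The fixed docstrings now point at the real classmethod name:
rag_config = RagConfig.from_question_encoder_generator_configs(
    question_encoder_config, generator_config
)
```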
Rodolfo Quispe
814b9550d7
Fix doc for language code (#8848) 2020-12-01 10:44:37 +01:00
elk-cloner
4a9e502a36
Ctrl for sequence classification (#8812)
* add CTRLForSequenceClassification

* pass local test

* merge with master

* fix modeling test for sequence classification

* fix deco

* fix assert
2020-12-01 09:49:27 +01:00
Stas Bekman
7f34d75780
[s2s trainer] fix DP mode (#8823)
* fix DP case on multi-gpu

* make executable

* test all 3 modes

* use the correct check for distributed

* dp doesn't need a special case

* restore original name

* cleanup
2020-11-30 12:55:56 -08:00
Nicolas Patry
d8fc26e919
NerPipeline (TokenClassification) now outputs offsets of words (#8781)
* NerPipeline (TokenClassification) now outputs offsets of words

- It happens that the offsets are missing, which forces the user to pattern-match
the "word" against the input, and that is not always feasible.
For instance, if a sentence contains the same word twice, there is
no way to know which occurrence is which.
- This PR proposes to fix that by outputting 2 new keys in this
pipeline's outputs, "start" and "end", which correspond to the string
offsets of the word. That means that we should always have the
invariant:

```python
input[entity["start"]: entity["end"]] == entity["word"]
# (the label itself is entity["entity_group"], or entity["entity"] if not grouped)
```

* Fixing doc style
2020-11-30 14:05:08 -05:00
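A hedged usage sketch of the new keys (the default pipeline model is downloaded; labels are just examples): `start`/`end` let you slice the original string instead of pattern-matching the word.

```python
from transformers import pipeline

ner = pipeline("ner", grouped_entities=True)
text = "Hugging Face is based in New York City"

for entity in ner(text):
    span = text[entity["start"]: entity["end"]]
    print(entity["entity_group"], repr(span))
```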
LysandreJik
5fd3d81ec9 fix pypi complaint on version naming 2020-11-30 13:54:52 -05:00
Funtowicz Morgan
51b071313b
Attempt to fix Flax CI error(s) (#8829)
* Slightly increase tolerance between pytorch and flax output

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* test_multiple_sentences doesn't require torch

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Simplify parameterization on "jit" to use boolean rather than str

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use `require_torch` on `test_multiple_sentences` because we pull the weights from the hub.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Rename "jit" parameter to "use_jit" for (hopefully) making it self-documenting.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove pytest.mark.parametrize which seems to fail in some circumstances

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix unused imports.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix style.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Give default parameters values for traced model.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Review comment: Change sentences to sequences

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-30 13:43:17 -05:00
LysandreJik
9995a341c9 Update docs 2020-11-30 12:07:52 -05:00
LysandreJik
22b0ff757a Release: v4.0.0 2020-11-30 12:07:43 -05:00
Sylvain Gugger
5530299096
Remove deprecated evaluate_during_training (#8852)
* Remove deprecated `evaluate_during_training`

* Update src/transformers/training_args_tf.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-30 11:12:15 -05:00
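For completeness, a hedged sketch of the v4 replacement for the removed flag (the `evaluation_strategy` argument is the documented successor; values here are illustrative):

```python
from transformers import TrainingArguments

# Pre-v4.0.0 (now removed):
# args = TrainingArguments(output_dir="out", evaluate_during_training=True)

# v4.0.0 and later:
args = TrainingArguments(output_dir="out", evaluation_strategy="steps", eval_steps=500)
```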
Shai Erera
773849415a
Use model.from_pretrained for DataParallel also (#8795)
* Use model.from_pretrained for DataParallel also

When training on multiple GPUs, the code wraps a model with torch.nn.DataParallel. However, if the model has custom from_pretrained logic, that logic does not get applied during load_best_model_at_end.

This commit uses the underlying model during load_best_model_at_end, and re-wraps the loaded model with DataParallel.

If you choose to reject this change, then could you please move this logic to a function, e.g. def load_best_model_checkpoint(best_model_checkpoint) or something, so that it can be overridden?

* Fix silly bug

* Address review comments

Thanks for the feedback. I made the change that you proposed, but I also think we should update L811 to check whether `self.model` is an instance of `PreTrainedModel`, otherwise we would still not get into that `if` section, right?
2020-11-30 11:11:10 -05:00
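A hedged sketch of the unwrap-then-rewrap pattern described above, using the helper name the author suggests (this is not the Trainer's literal code):

```python
from torch.nn import DataParallel

def load_best_model_checkpoint(model, best_model_checkpoint):
    """Reload the best checkpoint so custom from_pretrained logic is honoured (sketch)."""
    wrapped = isinstance(model, DataParallel)
    base_model = model.module if wrapped else model

    reloaded = base_model.from_pretrained(best_model_checkpoint)
    return DataParallel(reloaded) if wrapped else reloaded
```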
Sylvain Gugger
4062c75e44 Merge remote-tracking branch 'origin/master' 2020-11-30 10:51:35 -05:00
Sylvain Gugger
08e707633c Comment the skip job on doc line 2020-11-30 10:51:25 -05:00
Sylvain Gugger
75f8100fc7
Add a direct link to the big table (#8850) 2020-11-30 10:29:23 -05:00
Fraser Greenlee
cc983cd9cd
Correct docstring. (#8845)
Related issue: https://github.com/huggingface/transformers/issues/8837
2020-11-30 09:33:30 -05:00
Stefan Schweter
19fa01ce2a
token-classification: use is_world_process_zero instead of deprecated is_world_master() (#8828) 2020-11-30 09:21:56 -05:00
Ahmed Elnaggar
40ecaf0c2b
Add T5 Encoder for Feature Extraction (#8717)
* Add T5 Encoder class for feature extraction

* fix T5 encoder add_start_docstrings indent

* update init with T5 encoder

* update init with TFT5ModelEncoder

* remove TFT5ModelEncoder

* change T5ModelEncoder order in init

* add T5ModelEncoder to transformers init

* clean T5ModelEncoder

* update init with TFT5ModelEncoder

* add TFModelEncoder for Tensorflow

* update init with TFT5ModelEncoder

* Update src/transformers/models/t5/modeling_t5.py

change output from Seq2SeqModelOutput to BaseModelOutput

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove encoder_outputs

1. remove encoder_outputs from the function call.
2. remove the encoder_outputs If statement.
3. remove isinstance from return_dict.

* Authorize missing decoder keys

* remove unnecessary input parameters

remove past_key_values and use_cache

* remove use_cache

remove use_cache from the forward method

* add docstring for T5 encoder

add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING

* change return_dict to dot access

* add T5_ENCODER_INPUTS_DOCSTRING for TF T5

* change TFT5Encoder output type to BaseModelOutput

* remove unnecessary parameters for TFT5Encoder

* remove unnecessary if statement

* add import BaseModelOutput

* fix BaseModelOutput typo to TFBaseModelOutput

* update T5 doc with T5ModelEncoder

* add T5ModelEncoder to tests

* finish pytorch

* finish docs and mt5

* add mtf to init

* fix init

* remove n_positions

* finish PR

* Update src/transformers/models/mt5/modeling_mt5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_tf_t5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/mt5/modeling_tf_mt5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-30 08:34:40 +01:00
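A minimal feature-extraction sketch with the encoder-only class (exposed as `T5EncoderModel` in released versions; the bullets above refer to it as `T5ModelEncoder`). The checkpoint name is only an illustration.

```python
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer("Studies have shown that owning a dog is good for you", return_tensors="pt")
outputs = model(**inputs)                 # BaseModelOutput, not a Seq2SeqModelOutput
features = outputs.last_hidden_state      # (batch, seq_len, d_model)
```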
Lysandre Debut
610cb106a2
Migration guide from v3.x to v4.x (#8763)
* Migration guide from v3.x to v4.x

* Better wording

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Sylvain's comments

* Better wording.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-29 20:13:07 -05:00
Stas Bekman
c239dcda83
[CI] implement job skipping for doc-only PRs (#8826)
* implement job skipping for doc-only PRs

* silent grep is crucial

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* let's add doc

* let's add code

* revert test commits

* restore

* Better name

* Better name

* Better name

* some more testing

* some more testing

* some more testing

* finish testing
2020-11-29 11:31:30 -05:00
Guy Rosin
3a08cc1ce7
Minor docs typo fixes (#8797)
* Fix minor typos

* Additional typos

* Style fix

Co-authored-by: guyrosin <guyrosin@assist-561.cs.technion.ac.il>
2020-11-29 11:27:00 -05:00
Patrick von Platen
5ced23dc84
[Pegasus] Refactor Tokenizer (#8731)
* refactor

* further refactor

* fix the rest tomorrow

* save intermediate

* finish slow tokenizer

* make more tests pass

* finish refactor

* fix comment

* clean further

* fix name

* fix naming

* Update src/transformers/models/reformer/tokenization_reformer.py

* Apply suggestions from code review

* Apply suggestions from code review

* refactor

* fix init tokenizers

* refactor

* improve convert

* refactor

* correct convert slow tokenizer

* final fix for Pegasus Tok

* remove ipdb

* improve links
2020-11-29 16:57:43 +01:00
Patrick von Platen
36b60ce9e8
fix mt5 config (#8832) 2020-11-28 19:50:49 +01:00
Lysandre Debut
18c32eeb21
Model parallel tests should return, not pass in non model parallel settings. (#8825) 2020-11-27 16:41:29 -05:00
LysandreJik
edbff1fd00 Temporarily deactivate model generation 2020-11-27 16:15:00 -05:00
Stas Bekman
00ea45659f
suggest a numerical limit of 50MB for determining @slow (#8824) 2020-11-27 16:04:54 -05:00
Max Del
0a921b6459
BART & FSMT: fix decoder not returning hidden states from the last layer (#8597)
* Fix decoder not returning hidden states from the last layer

* Resolve conflict

* Change the way to gather hidden states

* Add decoder hidden states test

* Make pytest and black happy

* Remove redundant line

* remove new line

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2020-11-27 18:35:34 +01:00
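A hedged check of the fixed behaviour (model name illustrative): with `output_hidden_states=True`, the decoder's hidden-state tuple contains the embedding output plus one entry per layer, the last of which is now the final layer.

```python
from transformers import BartModel, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model(**inputs, output_hidden_states=True)

# embedding output + one entry per decoder layer; the last entry is the final layer
assert len(outputs.decoder_hidden_states) == model.config.decoder_layers + 1
```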
Moussa Kamal Eddine
81fe0bf085
Add barthez model (#8393)
* Add init barthez

* Add barthez model, tokenizer and docs

BARThez is a pre-trained French seq2seq model that uses the BART objective.

* Apply suggestions from code review docs typos

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add license

* Change URLs scheme

* Remove barthez model keep tokenizer

* Fix style

* Fix quality

* Update tokenizer

* Add fast tokenizer

* Add fast tokenizer test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-27 12:31:42 -05:00
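A minimal tokenizer sketch; `moussaKam/barthez` is assumed to be the hub identifier for the checkpoint described above.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("moussaKam/barthez")

ids = tokenizer("Paris est la capitale de la France.")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))
```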
Julien Plu
b0f2dbc594
Fix setup.py (#8798)
enforce Unix newline encoding regardless of the OS creating the file
2020-11-27 09:25:20 -08:00
Manuel Romero
03bddc375b
Create README.md (#8729)
* Create README.md

* Fix model path
2020-11-27 18:19:15 +01:00
Giovanni Compagnoni
f9a2a9e32b
Extend typing to path-like objects in PretrainedConfig and PreTrainedModel (#8770)
* update configuration_utils.py typing to allow pathlike objects when sensible

* update modeling_utils.py typing to allow pathlike objects when sensible

* black

* update tokenization_utils_base.py typing to allow pathlike objects when sensible

* update tokenization_utils_fast.py typing to allow pathlike objects when sensible

* update configuration_auto.py typing to allow pathlike objects when sensible

* update configuration_auto.py docstring to allow pathlike objects when sensible

* update tokenization_auto.py docstring to allow pathlike objects when sensible

* black
2020-11-27 10:52:58 -05:00
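A short sketch of the widened typing (the local directory name is illustrative): `pathlib.Path` objects are now accepted wherever a string path was.

```python
from pathlib import Path
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

save_dir = Path("./bert-local")            # a Path, not a str
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)

reloaded = AutoModel.from_pretrained(save_dir)
```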