transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-15 10:38:23 +06:00

Author	SHA1	Message	Date
Julien Chaumond	55bc0c599a	[model_cards] Switch to a more explicit domain for the media bucket	2020-10-27 18:08:05 +01:00
Harutaka Kawamura	7bff0af0a4	Fix a bug for `CallbackHandler.callback_list` (#8052 ) * Fix callback_list * Add test Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> * Fix test Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>	2020-10-27 10:37:04 -04:00
Harutaka Kawamura	8e28c327fc	Fix assertion error message for MLflowCallback (#8091 )	2020-10-27 10:34:51 -04:00
Sylvain Gugger	3220f21f14	Styling fix	2020-10-27 10:09:51 -04:00
Jonathan Chang	286dc19a4f	Fix IterableDataset with __len__ in Trainer (#8095 )	2020-10-27 09:52:35 -04:00
Sam Shleifer	d93acd6f13	Move style_doc to extra_quality_checks (#8081 )	2020-10-27 09:42:07 -04:00
Stas Bekman	bfd5e370a7	[CI] generate separate report files as artifacts (#7995 ) * better reports * a whole bunch of reports in their own files * clean up * improvements * github artifacts experiment * style * complete the report generator with multiple improvements/fixes * fix * save all reports under one dir to easy upload * can remove temp failing tests * doc fix * some cleanup	2020-10-27 09:25:07 -04:00
Lysandre Debut	33f6ef733a	Fix DeBERTa docs (#8092 ) * Fix DeBERTa docs * Tokenizer and config	2020-10-27 09:07:41 -04:00
Sylvain Gugger	c42596bc07	Doc styling fixes (#8074 ) * Fix a few docstrings * More fixes * Styling	2020-10-27 07:54:50 -04:00
Doug Blank	1496931b49	Fix comet_ml import and add ensure availability (#7933 ) * Fix comet_ml import and add ensure availability * Make isort happy * Make flake8 happy * Don't show comet_ml warn if COMET_MODE=DISABLED * Make isort happy	2020-10-27 07:31:07 -04:00
Chengxi Guo	985bba9096	fix doc bug (#8082 ) Signed-off-by: mymusise <mymusise1@gmail.com>	2020-10-27 07:29:25 -04:00
Sylvain Gugger	08f534d2da	Doc styling (#8067 ) * Important files * Styling them all * Revert "Styling them all" This reverts commit `7d029395fd`. * Syling them for realsies * Fix syntax error * Fix benchmark_utils * More fixes * Fix modeling auto and script * Remove new line * Fixes * More fixes * Fix more files * Style * Add FSMT * More fixes * More fixes * More fixes * More fixes * Fixes * More fixes * More fixes * Last fixes * Make sphinx happy	2020-10-26 18:26:02 -04:00
Sylvain Gugger	04a17f8550	Doc fixes in preparation for the docstyle PR (#8061 ) * Fixes in preparation for doc styling * More fixes * Better syntax * Fixes * Style * More fixes * More fixes	2020-10-26 15:01:09 -04:00
Philip May	8bbb74f211	[Model Card] new cross lingual sentence model for German and English (#8026 ) * mc for new cross lingual sentence model * fat text * url spelling fix * more url spelling fixes * slight thanks change * small improvements in text * multilingual word xchange * change colab link * xval fold number * add model links * line break in model names * Update README.md * Update README.md * new examples link * new examples link * add evaluation dataset name * add more about multi lingual * typo fix * typo * typos * hyperparameter typos * hyperparameter typo * add metadata * add metadata * Update README.md * typo fix * Small improvement	2020-10-26 14:48:26 -04:00
Lysandre Debut	3a10764574	Fix TF training arguments instantiation (#8063 )	2020-10-26 14:39:25 -04:00
Sam Shleifer	bc9332b545	[TF] from_pt should respect authorized_unexpected_keys (#8056 )	2020-10-26 13:53:27 -04:00
Stas Bekman	7ff7c4934b	fixing crash (#8057 )	2020-10-26 13:19:10 -04:00
Lysandre Debut	cbad90d86d	Fix + Test (#8049 )	2020-10-26 12:32:27 -04:00
Patrick von Platen	664c7ec453	[Seq2Seq Trainer] Make sure padding is implemented for models without pad_token (#8043 ) * make sure padding is implemented for non-padding tokens models as well * add better error message * add better warning * remove results files * Update examples/seq2seq/seq2seq_trainer.py * remove unnecessary copy line * correct usage of labels * delete test files	2020-10-26 17:28:16 +01:00
mohammadreza-Banaei73	098ddc2244	Update README.md (#8050 ) --wwm cant be used as an argument given run_language_modeling.py and should be changed to --whole_word_mask	2020-10-26 12:00:18 -04:00
Joe Davison	fbcddb8544	add mutliclass field to default zero shot example	2020-10-26 11:07:51 -04:00
Yusuke Mori	a9ac1db276	Minor error fix of 'bart-large-cnn' details in the pretrained_models doc (#8053 )	2020-10-26 11:05:16 -04:00
Samuel	fc2d6eac3c	Minor typo fixes to the preprocessing tutorial in the docs (#8046 ) * Fix minor typos Fix minor typos in the docs. * Update docs/source/preprocessing.rst Clearer data structure description. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-26 10:22:29 -04:00
Joe Davison	b0a907615a	minor model card description updates (#8051 )	2020-10-26 10:04:20 -04:00
noise-field	c48b16b8da	Mlflow integration callback (#8016 ) * Add MLflow integration class Add integration code for MLflow in integrations.py along with the code that checks that MLflow is installed. * Add MLflowCallback import Add import of MLflowCallback in trainer.py * Handle model argument Allow the callback to handle model argument and store model config items as hyperparameters. * Log parameters to MLflow in batches MLflow cannot log more than a hundred parameters at once. Code added to split the parameters into batches of 100 items and log the batches one by one. * Fix style * Add docs on MLflow callback * Fix issue with unfinished runs The "fluent" api used in MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed off and a new one is created, but the training is not finished, it will refuse to log the results when the next trainer is created.	2020-10-26 09:41:58 -04:00
Lysandre Debut	8be9cb0aef	Tiny TF Bart fixes (#8023 )	2020-10-26 09:29:56 -04:00
Sylvain Gugger	077478637d	Fix label name in DataCollatorForNextSentencePrediction test (#8048 )	2020-10-26 09:23:12 -04:00
Sam Shleifer	8bbe8247f1	Cleanup pytorch tests (#8033 )	2020-10-26 08:59:06 -04:00
suliuzh	20a0894d1a	update version for scipy (#7998 )	2020-10-26 08:56:56 -04:00
Sam Shleifer	f20aec1de5	fsmt slow test uses lists (#8031 )	2020-10-26 08:32:36 -04:00
Stas Bekman	101186bc1f	[docs] [testing] distributed training (#7993 ) * distributed training * fix * fix formatting * wording	2020-10-26 08:15:05 -04:00
luyug	c153bcc5c8	Add mixed precision evaluation (#8036 ) * Add mixed precision evaluation * use original flag	2020-10-26 08:12:31 -04:00
Samuel	9aa2826687	Minor typo fixes to the tokenizer summary (#8045 ) Minor typo fixes to the tokenizer summary	2020-10-26 08:08:33 -04:00
Lysandre	829b9f8cc3	Remove codecov.yml	2020-10-26 08:05:02 -04:00
Thomas Wolf	79eb391586	[tokenizers] Fixing #8001 - Adding tests on tokenizers serialization (#8006 ) * fixing #8001 * make T5 tokenizer serialization more robust - style	2020-10-26 10:27:48 +01:00
Julien Chaumond	7087d9b1c0	[model_cards] bert-base-danish Fixup #8030	2020-10-26 09:38:21 +01:00
Julien Chaumond	efc4a21ffa	Fixup #8025 Close #8030	2020-10-26 09:32:07 +01:00
Sam Longenbach	5148f43309	[Model Card] DJSammy/bert-base-danish-uncased_BotXO,ai (#8025 ) * Create README.md * Update README.md	2020-10-25 15:20:46 +08:00
Suraj Patil	38f6739cd6	[doc prepare_seq2seq_batch] fix docs (#8013 )	2020-10-24 15:33:47 -04:00
Yixin Nie	00602f7840	Create model card for pre-trained NLI models. (#7864 ) * Create README.md * Update model_cards/ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Add Meta information for dataset identifier. Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-10-24 03:16:07 -04:00
Patrick von Platen	3c682ea15c	[Examples] Allow EncoderDecoderModels to be trained with Seq2Seq (#7809 ) * Make Seq2Seq Trainer more similar to Trainer * fix typo * fix seq2seq trainer * remove from tests * remove lock * remove train files * delete test files * correct typo * check at init * make sure trainer is not slowed down on TPU * correct isort * remove use cache * fix use cache * add last use chache = false	2020-10-23 23:05:51 +02:00
Sacha Arbonel	59b5953d89	Create model card for bert-italian-cased-finetuned-pos (#8003 ) * Create README.md * Update model_cards/sachaarbonel/bert-italian-cased-finetuned-pos/README.md * Apply suggestions from code review Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-10-23 10:58:05 -04:00
Zhiqi Huang	6e07c1f446	Add model cards for DynaBERT (#7999 )	2020-10-23 10:53:53 -04:00
Zhiqi Huang	43fdafef89	Create README.md (#7997 )	2020-10-23 10:53:37 -04:00
Blaise Cruz	627e813734	Added model cards for Tagalog ELECTRA models (#7996 ) Co-authored-by: Jan Christian Blaise Cruz <jcblaise@Blaises-MacBook-Pro.local>	2020-10-23 10:52:21 -04:00
Philip May	9865e1fe52	model card for German Sentence Embeddings V2 (#7952 ) * model card German Sentence Embeddings V2 - for German RoBERTa for Sentence Embeddings V2 - marked old as outdated * small correction * small improvement in description * small spelling fix * spelling fix * add evaluation results * spearman explanation * add number of trials	2020-10-23 10:45:54 -04:00
Ethan Perez	d39da5a2ab	Handling longformer model_type (#7990 ) Updating the run_squad training script to handle the "longformer" `model_type`. The longformer is trained in the same was as RoBERTa, so I've added the "longformer" `model_type` (that's the right hugginface name for the LongFormer model, right?) everywhere there was a "roberta" `model_type` reference. The longformer (like RoBERTa) doesn't use `token_type_ids` (as I understand from looking at the [longformer notebook](https://github.com/patil-suraj/Notebooks/blob/master/longformer_qa_training.ipynb), which is what gets updated after this change. This fix might be related to [this issue](https://github.com/huggingface/transformers/issues/7249) with SQuAD training when using run_squad.py	2020-10-23 10:34:06 -04:00
Anthony MOI	5e323017a4	Fix BatchEncoding.word_to_tokens for removed tokens (#7939 )	2020-10-23 10:29:37 -04:00
Patrick von Platen	4acfd1a8dc	[Reformer] remove reformer pad_token_id (#7991 ) * remove reformer pad_token_id * fix pegasus	2020-10-23 10:29:15 -04:00
Thomas Wolf	3a40cdf58d	[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970 ) * WIP refactoring pipeline tests - switching to fast tokenizers * fix dialog pipeline and fill-mask * refactoring pipeline tests backbone * make large tests slow * fix tests (tf Bart inactive for now) * fix doc... * clean up for merge * fixing tests - remove bart from summarization until there is TF * fix quality and RAG * Add new translation pipeline tests - fix JAX tests * only slow for dialog * Fixing the missing TF-BART imports in modeling_tf_auto * spin out pipeline tests in separate CI job * adding pipeline test to CI YAML * add slow pipeline tests * speed up tf and pt join test to avoid redoing all the standalone pt and tf tests * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/pipelines.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add require_torch and require_tf in is_pt_tf_cross_test Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-23 15:58:19 +02:00

5759 Commits