transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
LysandreJik	e1b7e10d5f	Update TF BERT test	2020-11-23 18:19:12 -05:00
Colin Brochtrup	8ffc01a76a	Add early stopping callback to pytorch trainer (#8581 ) * Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer * Add early stopping test * Set patience counter to 0 if best metric not defined yet * Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on. * Run make style * make funciton name sensible * Improve new argument docstring wording and hope that flakey CI test passes. * Use on_evaluation callback instead of custom. Remove some debug printing * Move early stopping arguments and state into early stopping callback * Run make style * Remove old code * Fix docs formatting. make style went rogue on me. * Remove copied attributes and fix variable * Add assertions on training arguments instead of mutating them. Move comment out of public docs. * Make separate test for early stopping callback. Add test of invalid arguments. * Run make style... I remembered before CI this time! * appease flake8 * Add EarlyStoppingCallback to callback docs * Make docstring EarlyStoppingCallabck match other callbacks. * Fix typo in docs	2020-11-23 17:25:35 -05:00
Sylvain Gugger	367f497dec	Fix max length in run_plm script (#8738 )	2020-11-23 16:02:31 -05:00
Stas Bekman	e84786aaa6	consistent ignore keys + make private (#8737 ) * consistent ignore keys + make private * style * - authorized_missing_keys => _keys_to_ignore_on_load_missing - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected * move public doc of private attributes to private comment	2020-11-23 12:33:13 -08:00
Sylvain Gugger	49759c0cda	Document new training argument	2020-11-23 15:02:59 -05:00
alexorona	1cd9be2aeb	gpt2 and t5 parallel modeling (#8696 ) * gpt2 and t5 parallel modeling * model_parallel utils update * adding missing model_parallel_utils Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5 * training_args reformat Reformatted training_args * style formatting Style formatting doc string length on training_args and model_parallel_utils * style changes make style && make quality for training_args and model_parallel_utils. * adding tests * minor change in trainer reverts loss calculation * Update training_args.py * Update training_args.py added back docstring language for adam_beta1 and adam_beta2 * Update trainer.py * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix style & rebase Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-11-23 14:41:23 -05:00
Stas Bekman	1e45bef0a7	[trainer] make generate work with multigpu (#8716 ) * make generate work with multigpu * better fix - thanks @sgugger	2020-11-23 10:57:27 -08:00
Sylvain Gugger	900024273b	Change default cache path (#8734 ) * Change default cache path * Document changes * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-23 13:56:45 -05:00
Julien Chaumond	0cc5ab1333	Improve bert-japanese tokenizer handling (#8659 ) * Make ci fail * Try to make tests actually run? * CI finally failing? * Fix CI * Revert "Fix CI" This reverts commit `ca7923be73`. * Ooops wrong one * one more try * Ok ok let's move this elsewhere * Alternative to globals() (#8667) * Alternative to globals() * Error is raised later so return None * Sentencepiece not installed make some tokenizers None * Apply Lysandre wisdom * Slightly clearer comment? cc @sgugger Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-23 11:15:02 -05:00
Amine Abdaoui	eec76615f6	[model_cards]: control input examples of Geotrend models (#8727 ) * [model_cards]: control arabic model examples * [model_cards]: control input examples of Geotrend models * [model_cards]: add link to generatation script	2020-11-23 11:09:50 -05:00
Jessica Yung	143b564e59	Add pip install update to resolve import error in transformers notebook (#8616 ) * Add pip install update to resolve import error Add pip install upgrade tensorflow-gpu to remove error below: ``` --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-2-094fadb93f3f> in <module>() 1 import torch ----> 2 from transformers import AutoModel, AutoTokenizer, BertTokenizer 3 4 torch.set_grad_enabled(False) 4 frames /usr/local/lib/python3.6/dist-packages/transformers/__init__.py in <module>() 133 134 # Pipelines --> 135 from .pipelines import ( 136 Conversation, 137 ConversationalPipeline, /usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <module>() 46 import tensorflow as tf 47 ---> 48 from .modeling_tf_auto import ( 49 TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING, 50 TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING, /usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py in <module>() 49 from .configuration_utils import PretrainedConfig 50 from .file_utils import add_start_docstrings ---> 51 from .modeling_tf_albert import ( 52 TFAlbertForMaskedLM, 53 TFAlbertForMultipleChoice, /usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_albert.py in <module>() 22 import tensorflow as tf 23 ---> 24 from .activations_tf import get_tf_activation 25 from .configuration_albert import AlbertConfig 26 from .file_utils import ( /usr/local/lib/python3.6/dist-packages/transformers/activations_tf.py in <module>() 52 "gelu": tf.keras.layers.Activation(gelu), 53 "relu": tf.keras.activations.relu, ---> 54 "swish": tf.keras.activations.swish, 55 "silu": tf.keras.activations.swish, 56 "gelu_new": tf.keras.layers.Activation(gelu_new), AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.activations' has no attribute 'swish' ``` I have tried running the colab after this change and it seems to work fine (all the cells run with no errors). * Update notebooks/02-transformers.ipynb only need to upgrade tensorflow, not tensorflow-gpu. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-23 09:58:52 -05:00
Yossi Synett	18c8cf000b	Fix bug in x-attentions output for roberta and harden test to catch it (#8660 )	2020-11-23 13:28:29 +01:00
Tony	48cc224703	[model_cards] Add card for gpt2-rnm (#8673 )	2020-11-23 05:52:29 -05:00
Nguyen Van Nha	52585e40af	create README.md (#8682 ) * create README.md * Apply suggestions from code review Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-23 05:51:54 -05:00
Sagor Sarker	b5187e317f	added bangla-bert-sentiment model card (#8687 )	2020-11-23 05:51:16 -05:00
moniquebm	b6d864e2f0	Create README.md (#8630 ) * Create README.md * correct metrics id cc @lhoestq Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-23 04:48:10 -05:00
Santiago Castro	e1f3156b21	Fix many typos (#8708 )	2020-11-21 22:58:10 -05:00
Patrick von Platen	9c0afdaf7b	fix flaky ci (#8694 )	2020-11-20 22:07:21 +01:00
Binoy Dalal	29bdb88368	Vectorize RepetitionPenaltyLogitsProcessor to improve performance (#8598 ) * refactored exisiting nested loops to vectorized implementation * replaced explicit indexing with torch.where * modifying score for previous input_ids only	2020-11-20 19:59:06 +01:00
Roman Kalyakin	2594bd8b73	moved temperature wrapper before topP/topK (#8686 )	2020-11-20 19:33:54 +01:00
Quentin Lhoest	8062fa63c5	Fix rag finetuning + add finetuning test (#8585 ) * replace init_ddp_connection for index init * style * add finetune test * add test data * move generate tensors to device * add test on EM metric * style * allow multi process test * keep gloo process group for retrieval * add multi-gpu test * use custom accelerator * clean test finetune * minor * style * style * typo * use python call instead of imported main fumction * return_dict fix in modeling_rag * use float32 in retrieval * store as float32 as well in the custom knowledge dataset example * style * rename to finetune_rag * style * update readme * rename utils and callbacks to utils_rag and callbacks_rag * fix test * patrick's comments * generate dummy data in the finetue test script * remove dummy data files * style	2020-11-20 19:05:03 +01:00
Sylvain Gugger	63e91f5fde	Document adam betas TrainingArguments (#8688 )	2020-11-20 09:27:25 -05:00
Kevin Canwen Xu	94caaa93c2	Update the bibtex with EMNLP demo (#8678 ) * Update the bibtex with EMNLP demo * Update README.md * Update README.md	2020-11-20 13:26:33 +08:00
Sylvain Gugger	6494910f27	Add sentencepiece to the CI and fix tests (#8672 ) * Fix the CI and tests * Fix quality * Remove that m form nowhere	2020-11-19 16:44:20 -05:00
Stas Bekman	0ad45e108d	[examples/seq2seq] fix PL deprecation warning (#8577 ) * fix deprecation warning * fix	2020-11-19 21:46:04 +01:00
Arindum Roy	0e19a4c2d6	Update bert-base-multilingual-cased-README.md (#8668 ) The heading was originally uncased, which did not reflect the contents of this README. Changed it to cased.	2020-11-19 15:45:06 -05:00
Stas Bekman	06518404cb	revert	2020-11-19 12:12:46 -08:00
Stas Bekman	297a29382f	Please fix your software not to ping master You may be unaware but you're running some software that meddles with every commit on https://github.com/huggingface/transformers/ Something is wrong with the software you're using. It adds a reference to almost every PR in the master tree. Which is very wrong. Please check your software and please don't do it again. Example: see the bottom of this PR and most other PRs: https://github.com/huggingface/transformers/pull/8639	2020-11-19 12:11:35 -08:00
Stas Bekman	42111f1d56	[tokenizers] convert_to_tensors: don't reconvert when the type is already right (#8283 ) * don't reconvert when the type is already right * better name * adjust logic as suggested * merge	2020-11-19 12:06:01 -08:00
Sylvain Gugger	20b658607e	Fix run_ner script (#8664 ) * Fix run_ner script * Pin datasets	2020-11-19 13:59:30 -05:00
Zhylko Dima	ca0109bd68	`disable_ngram_loss` fix for prophetnet (#8554 ) * `disable_ngram_loss` fix for prophetnet * add changes documentation * fix _compute_loss to use mean reduction and -100 to masked tokens & remove unnecessary arguments * mean label smoothing loss * small refactor * fix test Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-11-19 19:18:07 +01:00
Sylvain Gugger	0603564e93	Merge remote-tracking branch 'origin/master'	2020-11-19 12:18:57 -05:00
Sylvain Gugger	1e08af383a	Forgot to save...	2020-11-19 12:18:50 -05:00
LysandreJik	d86b5ffc6f	Release: v4.0.0-rc-1	2020-11-19 12:00:07 -05:00
Sylvain Gugger	cb3e5c33f7	Fix a few last paths for the new repo org (#8666 )	2020-11-19 11:56:42 -05:00
Matthias	a79a96ddaa	fix small typo (#8644 ) Fixed a small typo on the XLNet and permutation language modelling section	2020-11-19 11:24:11 -05:00
Sylvain Gugger	4208f496ee	Better filtering of the model outputs in Trainer (#8633 ) * Better filtering of the model outputs in Trainer * Fix examples tests * Add test for Lysandre	2020-11-19 10:43:15 -05:00
Lysandre Debut	f2e07e7272	Fix a bunch of slow tests (#8634 ) * CI should install `sentencepiece` * Requiring TF * Fixing some TFDPR bugs * remove return_dict=False/True hack Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-11-19 10:41:41 -05:00
elk-cloner	5362bb8a6b	Tf longformer for sequence classification (#8231 ) * working on LongformerForSequenceClassification * add TFLongformerForMultipleChoice * add TFLongformerForTokenClassification * use add_start_docstrings_to_model_forward * test TFLongformerForSequenceClassification * test TFLongformerForMultipleChoice * test TFLongformerForTokenClassification * remove test from repo * add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice * add requested classes to modeling_tf_auto.py update dummy_tf_objects fix tests fix bugs in requested classes * pass all tests except test_inputs_embeds * sync with master * pass all tests except test_inputs_embeds * pass all tests * pass all tests * work on test_inputs_embeds * fix style and quality * make multi choice work * fix TFLongformerForTokenClassification signature * fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature * fix mult choice * fix mc hint * fix input embeds * fix input embeds * refactor input embeds * fix copy issue * apply sylvains changes and clean more Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-19 10:37:27 -05:00
Quentin Lhoest	62cd9ce9f8	fix missing return dict (#8653 )	2020-11-19 15:17:18 +01:00
Amine Abdaoui	0c2677f529	[model card] : fix bert-base-15lang-cased (#8655 ) the table was badly formatted because of a single line break	2020-11-19 05:41:02 -05:00
Amine Abdaoui	0a80959bdd	Add cards for all Geotrend models (#8617 ) * docs(bert-base-15lang-cased): add model card * add cards for all Geotrend models * [model cards] fix language tag for all Geotrend models	2020-11-19 04:47:24 -05:00
cronoik	dcc9c64299	Updated the Extractive Question Answering code snippets (#8636 ) * Updated the Extractive Question Answering code snippets The Extractive Question Answering code snippets do not work anymore since the models return task-specific output objects. This commit fixes the pytorch and tensorflow examples but adding `.values()` to the model call. * Update task_summary.rst	2020-11-18 18:56:47 -05:00
Tim Isbister	28d16e7ac5	Update README.md (#8635 )	2020-11-18 18:35:23 -05:00
cronoik	b290195ac7	grammar (#8639 )	2020-11-18 18:04:25 -05:00
Stas Bekman	d86d57faa3	[s2s] distillation apex breaks return_dict obj (#8631 ) * apex breaks return_dict obj * style	2020-11-18 12:51:29 -08:00
Perez Ogayo	bf3611b2ab	Created ModelCard for Hel-ach-en MT model (#8496 ) * Updated ModelCard * Apply suggestions from code review Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-18 14:42:13 -05:00
Yifan Peng	c95b26a719	Create README.md (#8362 )	2020-11-18 13:37:14 -05:00
Manuel Romero	fdbbb6c17a	Model card: T5-base fine-tuned on QuaRTz (#8369 ) * Model card: T5-base fine-tuned on QuaRTz * Update model_cards/mrm8488/t5-base-finetuned-quartz/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-18 13:34:27 -05:00
Yifan Peng	6e6d24c5d8	Create README.md (#8363 )	2020-11-18 13:33:04 -05:00

5958 Commits