transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Stas Bekman	00ea45659f	suggest a numerical limit of 50MB for determining @slow (#8824 )	2020-11-27 16:04:54 -05:00
Max Del	0a921b6459	BART & FSMT: fix decoder not returning hidden states from the last layer (#8597 ) * Fix decoder not returning hidden states from the last layer * Resolve conflict * Change the way to gather hidden states * Add decoder hidden states test * Make pytest and black happy * Remove redundant line * remove new line Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2020-11-27 18:35:34 +01:00
Moussa Kamal Eddine	81fe0bf085	Add barthez model (#8393 ) * Add init barthez * Add barthez model, tokenizer and docs BARThez is a pre-trained french seq2seq model that uses BART objective. * Apply suggestions from code review docs typos Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add license * Change URLs scheme * Remove barthez model keep tokenizer * Fix style * Fix quality * Update tokenizer * Add fast tokenizer * Add fast tokenizer test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-27 12:31:42 -05:00
Julien Plu	b0f2dbc594	Fix setup.py (#8798 ) enforce unix newline encoding regardless of OS creating the file	2020-11-27 09:25:20 -08:00
Manuel Romero	03bddc375b	Create README.md (#8729 ) * Create README.md * Fix model path	2020-11-27 18:19:15 +01:00
Giovanni Compagnoni	f9a2a9e32b	Extend typing to path-like objects in `PretrainedConfig` and `PreTrainedModel` (#8770 ) * update configuration_utils.py typing to allow pathlike objects when sensible * update modeling_utils.py typing to allow pathlike objects when sensible * black * update tokenization_utils_base.py typing to allow pathlike objects when sensible * update tokenization_utils_fast.py typing to allow pathlike objects when sensible * update configuration_auto.py typing to allow pathlike objects when sensible * update configuration_auto.py docstring to allow pathlike objects when sensible * update tokenization_auto.py docstring to allow pathlike objects when sensible * black	2020-11-27 10:52:58 -05:00
Patrick von Platen	a7d46a0609	Fix dpr<>bart config for RAG (#8808 ) * correct dpr test and bert pos fault * fix dpr bert config problem * fix layoutlm * add config to dpr as well	2020-11-27 16:26:45 +01:00
Patrick von Platen	a2cf37595e	[Flax test] Add require pytorch to flix flax test (#8816 ) * try flax fix * same for roberta	2020-11-27 14:40:42 +01:00
mdermentzi	e3ef62bce1	Update README.md (#8815 ) The tokenizer called at the input_ids of example 2 is currently encoding text_1. I think this should be changed to text_2.	2020-11-27 08:34:57 -05:00
Kristian Holsheimer	f8eda599bd	[FlaxBert] Fix non-broadcastable attention mask for batched forward-passes (#8791 ) * [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes * [FlaxRoberta] Fix non-broadcastable attention mask * Use jax.numpy instead of ordinary numpy (otherwise not jit-able) * Partially revert "Use jax.numpy ..." * Add tests for batched forward passes * Avoid unnecessary OOMs due to preallocation of GPU memory by XLA * Auto-fix style * Re-enable GPU memory preallocation but with mem fraction < 1/paralleism	2020-11-27 13:21:19 +01:00
Stas Bekman	cb7602b38d	typo (#8810 )	2020-11-26 14:47:36 -08:00
Stas Bekman	ddf3c64654	potpurri of small fixes (#8807 )	2020-11-26 14:06:27 -08:00
chutaklee	52708d2637	Fix PPLM (#8779 ) * Fix pplm * fix style * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-26 22:23:36 +01:00
Patrick von Platen	8f07f5c44b	Revert "finetune.py: specifying generation min_length (#8478 )" (#8805 ) This reverts commit `5aa361f3e5`.	2020-11-26 20:12:01 +01:00
Manuel Romero	66e9608bae	Create README.md (#8760 )	2020-11-26 12:43:43 -05:00
Daniel Khashabi	5aa361f3e5	finetune.py: specifying generation min_length (#8478 )	2020-11-26 12:33:02 +05:30
joangines	30e7f7e5da	Create README.md (#8752 )	2020-11-25 17:38:21 -05:00
Patrick von Platen	2a6fbe6a40	[XLNet] Fix mems behavior (#8567 ) * fix mems in xlnet * fix use_mems * fix use_mem_len * fix use mems * clean docs * fix tf typo * make xlnet tf for generation work * fix tf test * refactor use cache * add use cache for missing models * correct use_cache in generate * correct use cache in tf generate * fix tf * correct getattr typo * make sylvain happy * change in docs as well * do not apply to cookie cutter statements * fix tf test * make pytorch model fully backward compatible	2020-11-25 16:54:59 -05:00
Joe Davison	369f1d77b4	Return correct Bart hidden state tensors (#8747 ) * bart output hidden states upstream * same w/ decoder * add tests * fix prophetnet * fix gpt2 and ctrl * fix fstm and skip test for reformer and longformer * fix all models Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-25 22:06:04 +01:00
Lysandre Debut	138f45c184	Fix QA argument handler (#8765 ) * Fix QA argument handler * Attempt to get a better fix for QA (#8768) Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2020-11-25 14:02:15 -05:00
Sylvain Gugger	4821ea5aeb	Big model table (#8774 ) * First draft * Styling * With all changes staged * Update docs/source/index.rst Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Styling Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-25 12:02:15 -05:00
Manuel Romero	90d5ab3bfe	Create README.md (#8761 )	2020-11-24 17:51:24 -05:00
Julien Plu	29d4992453	New TF model inputs (#8602 ) * Apply on BERT and ALBERT * Update TF Bart * Add input processing to TF BART * Add input processing for TF CTRL * Add input processing to TF Distilbert * Add input processing to TF DPR * Add input processing to TF Electra * Add input processing for TF Flaubert * Add deprecated arguments * Add input processing to TF XLM * remove unused imports * Add input processing to TF Funnel * Add input processing to TF GPT2 * Add input processing to TF Longformer * Add input processing to TF Lxmert * Apply style * Add input processing to TF Mobilebert * Add input processing to TF GPT * Add input processing to TF Roberta * Add input processing to TF T5 * Add input processing to TF TransfoXL * Apply style * Rebase on master * Bug fix * Retry to bugfix * Retry bug fix * Fix wrong model name * Try another fix * Fix BART * Fix input precessing * Apply style * Put the deprecated warnings in the input processing function * Remove the unused imports * Raise an error when len(kwargs)>0 * test ModelOutput instead of TFBaseModelOutput * Bug fix * Address Patrick's comments * Address Patrick's comments * Address Sylvain's comments * Add the new inputs in new Longformer models * Update the template with the new input processing * Remove useless assert * Apply style * Trigger CI	2020-11-24 13:55:00 -05:00
Stas Bekman	82d443a7fd	[core] implement support for run-time dependency version checking (#8645 ) * implement support for run-time dependency version checking * try not escaping ! * use findall that works on py36 * small tweaks * autoformatter worship * simplify * shorter names * add support for non-versioned checks * add deps * revert * tokenizers not required, check version only if installed * make a proper distutils cmd and add make target * tqdm must be checked before tokenizers * workaround the DistributionNotFound peculiar setup * handle the rest of packages in setup.py * fully sync setup.py's install_requires - to check them all * nit * make install_requires more readable * typo * Update setup.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * restyle * add types * simplify * simplify2 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-24 13:22:25 -05:00
Quentin Lhoest	a7d73cfdd4	fix rag index names in eval_rag.py example (#8730 )	2020-11-24 17:04:47 +01:00
Binoy Dalal	8d4ed7e953	added instructions for syncing upstream master with forked master via PR (#8745 ) * added instructions for syncing upstream master with forked master via PR * expand to add a note to why this is requested Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2020-11-24 10:11:46 -05:00
Lysandre Debut	e09e54fd9d	MT5 should have an autotokenizer (#8743 ) * MT5 should have an autotokenizer * Different configurations should be able to point to same tokenizers	2020-11-24 09:50:25 -05:00
Lysandre Debut	6fdd0bb231	Fix slow tests v2 (#8746 ) * Fix BART test * Fix MBART tests * Remove erroneous line from yaml * Update tests/test_modeling_bart.py * Quality	2020-11-24 09:35:12 -05:00
zhiheng-huang	2c83b3c38d	Support various BERT relative position embeddings (2nd) (#8276 ) * Support BERT relative position embeddings * Fix typo in README.md * Address review comment * Fix failing tests * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py * make fix copies * fix configs of electra and albert and fix longformer * remove copy statement from longformer * fix albert * fix electra * Add bert variants forward tests for various position embeddings * [tiny] Fix style for test_modeling_bert.py * improve docstring * [tiny] improve docstring and remove unnecessary dependency * [tiny] Remove unused import * re-add to ALBERT * make embeddings work for ALBERT * add test for albert Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-24 14:40:53 +01:00
Julien Chaumond	9e71aa2f8f	[EsperBERTo] Fix URLs to assets	2020-11-24 14:15:30 +01:00
Lysandre Debut	02f48b9bfc	Model parallel documentation (#8741 ) * Add parallelize methods to the .rst files * Correct format	2020-11-23 20:14:48 -05:00
LysandreJik	7f2c00913a	TF BERT test update	2020-11-23 18:20:19 -05:00
LysandreJik	e1b7e10d5f	Update TF BERT test	2020-11-23 18:19:12 -05:00
Colin Brochtrup	8ffc01a76a	Add early stopping callback to pytorch trainer (#8581 ) * Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer * Add early stopping test * Set patience counter to 0 if best metric not defined yet * Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on. * Run make style * make funciton name sensible * Improve new argument docstring wording and hope that flakey CI test passes. * Use on_evaluation callback instead of custom. Remove some debug printing * Move early stopping arguments and state into early stopping callback * Run make style * Remove old code * Fix docs formatting. make style went rogue on me. * Remove copied attributes and fix variable * Add assertions on training arguments instead of mutating them. Move comment out of public docs. * Make separate test for early stopping callback. Add test of invalid arguments. * Run make style... I remembered before CI this time! * appease flake8 * Add EarlyStoppingCallback to callback docs * Make docstring EarlyStoppingCallabck match other callbacks. * Fix typo in docs	2020-11-23 17:25:35 -05:00
Sylvain Gugger	367f497dec	Fix max length in run_plm script (#8738 )	2020-11-23 16:02:31 -05:00
Stas Bekman	e84786aaa6	consistent ignore keys + make private (#8737 ) * consistent ignore keys + make private * style * - authorized_missing_keys => _keys_to_ignore_on_load_missing - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected * move public doc of private attributes to private comment	2020-11-23 12:33:13 -08:00
Sylvain Gugger	49759c0cda	Document new training argument	2020-11-23 15:02:59 -05:00
alexorona	1cd9be2aeb	gpt2 and t5 parallel modeling (#8696 ) * gpt2 and t5 parallel modeling * model_parallel utils update * adding missing model_parallel_utils Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5 * training_args reformat Reformatted training_args * style formatting Style formatting doc string length on training_args and model_parallel_utils * style changes make style && make quality for training_args and model_parallel_utils. * adding tests * minor change in trainer reverts loss calculation * Update training_args.py * Update training_args.py added back docstring language for adam_beta1 and adam_beta2 * Update trainer.py * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix style & rebase Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-11-23 14:41:23 -05:00
Stas Bekman	1e45bef0a7	[trainer] make generate work with multigpu (#8716 ) * make generate work with multigpu * better fix - thanks @sgugger	2020-11-23 10:57:27 -08:00
Sylvain Gugger	900024273b	Change default cache path (#8734 ) * Change default cache path * Document changes * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-23 13:56:45 -05:00
Julien Chaumond	0cc5ab1333	Improve bert-japanese tokenizer handling (#8659 ) * Make ci fail * Try to make tests actually run? * CI finally failing? * Fix CI * Revert "Fix CI" This reverts commit `ca7923be73`. * Ooops wrong one * one more try * Ok ok let's move this elsewhere * Alternative to globals() (#8667) * Alternative to globals() * Error is raised later so return None * Sentencepiece not installed make some tokenizers None * Apply Lysandre wisdom * Slightly clearer comment? cc @sgugger Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-23 11:15:02 -05:00
Amine Abdaoui	eec76615f6	[model_cards]: control input examples of Geotrend models (#8727 ) * [model_cards]: control arabic model examples * [model_cards]: control input examples of Geotrend models * [model_cards]: add link to generatation script	2020-11-23 11:09:50 -05:00
Jessica Yung	143b564e59	Add pip install update to resolve import error in transformers notebook (#8616 ) * Add pip install update to resolve import error Add pip install upgrade tensorflow-gpu to remove error below: ``` --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-2-094fadb93f3f> in <module>() 1 import torch ----> 2 from transformers import AutoModel, AutoTokenizer, BertTokenizer 3 4 torch.set_grad_enabled(False) 4 frames /usr/local/lib/python3.6/dist-packages/transformers/__init__.py in <module>() 133 134 # Pipelines --> 135 from .pipelines import ( 136 Conversation, 137 ConversationalPipeline, /usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <module>() 46 import tensorflow as tf 47 ---> 48 from .modeling_tf_auto import ( 49 TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING, 50 TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING, /usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py in <module>() 49 from .configuration_utils import PretrainedConfig 50 from .file_utils import add_start_docstrings ---> 51 from .modeling_tf_albert import ( 52 TFAlbertForMaskedLM, 53 TFAlbertForMultipleChoice, /usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_albert.py in <module>() 22 import tensorflow as tf 23 ---> 24 from .activations_tf import get_tf_activation 25 from .configuration_albert import AlbertConfig 26 from .file_utils import ( /usr/local/lib/python3.6/dist-packages/transformers/activations_tf.py in <module>() 52 "gelu": tf.keras.layers.Activation(gelu), 53 "relu": tf.keras.activations.relu, ---> 54 "swish": tf.keras.activations.swish, 55 "silu": tf.keras.activations.swish, 56 "gelu_new": tf.keras.layers.Activation(gelu_new), AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.activations' has no attribute 'swish' ``` I have tried running the colab after this change and it seems to work fine (all the cells run with no errors). * Update notebooks/02-transformers.ipynb only need to upgrade tensorflow, not tensorflow-gpu. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-23 09:58:52 -05:00
Yossi Synett	18c8cf000b	Fix bug in x-attentions output for roberta and harden test to catch it (#8660 )	2020-11-23 13:28:29 +01:00
Tony	48cc224703	[model_cards] Add card for gpt2-rnm (#8673 )	2020-11-23 05:52:29 -05:00
Nguyen Van Nha	52585e40af	create README.md (#8682 ) * create README.md * Apply suggestions from code review Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-23 05:51:54 -05:00
Sagor Sarker	b5187e317f	added bangla-bert-sentiment model card (#8687 )	2020-11-23 05:51:16 -05:00
moniquebm	b6d864e2f0	Create README.md (#8630 ) * Create README.md * correct metrics id cc @lhoestq Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-23 04:48:10 -05:00
Santiago Castro	e1f3156b21	Fix many typos (#8708 )	2020-11-21 22:58:10 -05:00
Patrick von Platen	9c0afdaf7b	fix flaky ci (#8694 )	2020-11-20 22:07:21 +01:00

5990 Commits