5990 Commits

Author SHA1 Message Date
Stas Bekman
00ea45659f
suggest a numerical limit of 50MB for determining @slow (#8824) 2020-11-27 16:04:54 -05:00
Max Del
0a921b6459
BART & FSMT: fix decoder not returning hidden states from the last layer (#8597)
* Fix decoder not returning hidden states from the last layer

* Resolve conflict

* Change the way to gather hidden states

* Add decoder hidden states test

* Make pytest and black happy

* Remove redundant line

* remove new line

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2020-11-27 18:35:34 +01:00
Moussa Kamal Eddine
81fe0bf085
Add barthez model (#8393)
* Add init barthez

* Add barthez model, tokenizer and docs

BARThez is a pre-trained French seq2seq model that uses the BART objective.

* Apply suggestions from code review docs typos

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add license

* Change URLs scheme

* Remove barthez model keep tokenizer

* Fix style

* Fix quality

* Update tokenizer

* Add fast tokenizer

* Add fast tokenizer test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-27 12:31:42 -05:00
Julien Plu
b0f2dbc594
Fix setup.py (#8798)
enforce unix newline encoding regardless of OS creating the file
2020-11-27 09:25:20 -08:00
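The commit body above describes forcing Unix newlines regardless of the OS that writes the file. The log does not show the actual change, so the following is only a minimal sketch of the idea, assuming the `newline` parameter of Python's built-in `open()` is the mechanism; `write_unix_text` and the file name are hypothetical.

```python
import os
import tempfile


def write_unix_text(path, lines):
    """Write each line with an LF ending, whatever the host OS default is."""
    # newline="\n" turns off translation of "\n" into os.linesep
    # (which would be "\r\n" on Windows).
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for line in lines:
            f.write(line + "\n")


# Hypothetical file name, used only for illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "generated_file.py")
write_unix_text(demo_path, ["a = 1", "b = 2"])

# Read back in binary mode so no newline translation hides the result.
with open(demo_path, "rb") as f:
    raw = f.read()
```

Reading the file back in binary mode is the simplest way to confirm that no carriage returns were written.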
Manuel Romero
03bddc375b
Create README.md (#8729)
* Create README.md

* Fix model path
2020-11-27 18:19:15 +01:00
Giovanni Compagnoni
f9a2a9e32b
Extend typing to path-like objects in PretrainedConfig and PreTrainedModel (#8770)
* update configuration_utils.py typing to allow pathlike objects when sensible

* update modeling_utils.py typing to allow pathlike objects when sensible

* black

* update tokenization_utils_base.py typing to allow pathlike objects when sensible

* update tokenization_utils_fast.py typing to allow pathlike objects when sensible

* update configuration_auto.py typing to allow pathlike objects when sensible

* update configuration_auto.py docstring to allow pathlike objects when sensible

* update tokenization_auto.py docstring to allow pathlike objects when sensible

* black
2020-11-27 10:52:58 -05:00
Patrick von Platen
a7d46a0609
Fix dpr<>bart config for RAG (#8808)
* correct dpr test and bert pos fault

* fix dpr bert config problem

* fix layoutlm

* add config to dpr as well
2020-11-27 16:26:45 +01:00
Patrick von Platen
a2cf37595e
[Flax test] Add require pytorch to fix flax test (#8816)
* try flax fix

* same for roberta
2020-11-27 14:40:42 +01:00
mdermentzi
e3ef62bce1
Update README.md (#8815)
The tokenizer called at the input_ids of example 2 is currently encoding text_1. I think this should be changed to text_2.
2020-11-27 08:34:57 -05:00
Kristian Holsheimer
f8eda599bd
[FlaxBert] Fix non-broadcastable attention mask for batched forward-passes (#8791)
* [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes

* [FlaxRoberta] Fix non-broadcastable attention mask

* Use jax.numpy instead of ordinary numpy (otherwise not jit-able)

* Partially revert "Use jax.numpy ..."

* Add tests for batched forward passes

* Avoid unnecessary OOMs due to preallocation of GPU memory by XLA

* Auto-fix style

* Re-enable GPU memory preallocation but with mem fraction < 1/parallelism
2020-11-27 13:21:19 +01:00
Stas Bekman
cb7602b38d
typo (#8810) 2020-11-26 14:47:36 -08:00
Stas Bekman
ddf3c64654
potpourri of small fixes (#8807) 2020-11-26 14:06:27 -08:00
chutaklee
52708d2637
Fix PPLM (#8779)
* Fix pplm

* fix style

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-26 22:23:36 +01:00
Patrick von Platen
8f07f5c44b
Revert "finetune.py: specifying generation min_length (#8478)" (#8805)
This reverts commit 5aa361f3e5.
2020-11-26 20:12:01 +01:00
Manuel Romero
66e9608bae
Create README.md (#8760) 2020-11-26 12:43:43 -05:00
Daniel Khashabi
5aa361f3e5
finetune.py: specifying generation min_length (#8478) 2020-11-26 12:33:02 +05:30
joangines
30e7f7e5da
Create README.md (#8752) 2020-11-25 17:38:21 -05:00
Patrick von Platen
2a6fbe6a40
[XLNet] Fix mems behavior (#8567)
* fix mems in xlnet

* fix use_mems

* fix use_mem_len

* fix use mems

* clean docs

* fix tf typo

* make xlnet tf for generation work

* fix tf test

* refactor use cache

* add use cache for missing models

* correct use_cache in generate

* correct use cache in tf generate

* fix tf

* correct getattr typo

* make sylvain happy

* change in docs as well

* do not apply to cookie cutter statements

* fix tf test

* make pytorch model fully backward compatible
2020-11-25 16:54:59 -05:00
Joe Davison
369f1d77b4
Return correct Bart hidden state tensors (#8747)
* bart output hidden states upstream

* same w/ decoder

* add tests

* fix prophetnet

* fix gpt2 and ctrl

* fix fsmt and skip test for reformer and longformer

* fix all models

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-25 22:06:04 +01:00
Lysandre Debut
138f45c184
Fix QA argument handler (#8765)
* Fix QA argument handler

* Attempt to get a better fix for QA (#8768)

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2020-11-25 14:02:15 -05:00
Sylvain Gugger
4821ea5aeb
Big model table (#8774)
* First draft

* Styling

* With all changes staged

* Update docs/source/index.rst

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Styling

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-25 12:02:15 -05:00
Manuel Romero
90d5ab3bfe
Create README.md (#8761) 2020-11-24 17:51:24 -05:00
Julien Plu
29d4992453
New TF model inputs (#8602)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add input processing for TF Flaubert

* Add deprecated arguments

* Add input processing to TF XLM

* remove unused imports

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input processing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add the new inputs in new Longformer models

* Update the template with the new input processing

* Remove useless assert

* Apply style

* Trigger CI
2020-11-24 13:55:00 -05:00
Stas Bekman
82d443a7fd
[core] implement support for run-time dependency version checking (#8645)
* implement support for run-time dependency version checking

* try not escaping !

* use findall that works on py36

* small tweaks

* autoformatter worship

* simplify

* shorter names

* add support for non-versioned checks

* add deps

* revert

* tokenizers not required, check version only if installed

* make a proper distutils cmd and add make target

* tqdm must be checked before tokenizers

* workaround the DistributionNotFound peculiar setup

* handle the rest of packages in setup.py

* fully sync setup.py's install_requires - to check them all

* nit

* make install_requires more readable

* typo

* Update setup.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* restyle

* add types

* simplify

* simplify2

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-24 13:22:25 -05:00
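The bullets above outline a run-time check that compares installed package versions against requirement strings, including non-versioned checks. The sketch below illustrates that idea only and is not the code from the commit. The bullet mentioning `DistributionNotFound` suggests the original relied on `pkg_resources`; this sketch swaps in the standard-library `importlib.metadata` (Python 3.8+). The names `check_requirement`, `_parse` and `fake_lookup` are hypothetical, and only plain dotted numeric versions are handled.

```python
import operator
import re
from importlib import metadata

# Comparison operators supported by this sketch.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}

# "name", optionally followed by an operator and a dotted numeric version.
_REQ_RE = re.compile(
    r"^\s*([A-Za-z0-9_.\-]+)\s*(?:(==|!=|>=|<=|>|<)\s*([0-9]+(?:\.[0-9]+)*))?\s*$"
)


def _parse(version):
    """Turn the leading dotted-number part of a version into a tuple of ints."""
    m = re.match(r"\d+(?:\.\d+)*", version)
    if m is None:
        raise ValueError(f"cannot parse version {version!r}")
    return tuple(int(part) for part in m.group(0).split("."))


def check_requirement(requirement, get_version=metadata.version):
    """Return the installed version, or raise ImportError if the check fails."""
    m = _REQ_RE.match(requirement)
    if m is None:
        raise ValueError(f"unsupported requirement string: {requirement!r}")
    name, op, wanted = m.groups()
    try:
        found = get_version(name)
    except metadata.PackageNotFoundError:
        raise ImportError(f"{requirement} is required but {name} is not installed") from None
    if op is None:
        return found  # non-versioned check: being installed is enough
    a, b = _parse(found), _parse(wanted)
    width = max(len(a), len(b))  # pad so that 1.0 compares equal to 1.0.0
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    if not _OPS[op](a, b):
        raise ImportError(f"{requirement} is required, but found {name}=={found}")
    return found


# Stub lookup so the example does not depend on what is installed locally.
_fake_installed = {"tqdm": "4.50.2", "tokenizers": "0.9.4"}


def fake_lookup(name):
    if name not in _fake_installed:
        raise metadata.PackageNotFoundError(name)
    return _fake_installed[name]


ok_version = check_requirement("tqdm>=4.27", fake_lookup)
```

Passing the lookup function as a parameter keeps the comparison logic testable without touching the real environment; calling `check_requirement("tqdm>=4.27")` with the default would query the installed distribution instead.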
Quentin Lhoest
a7d73cfdd4
fix rag index names in eval_rag.py example (#8730) 2020-11-24 17:04:47 +01:00
Binoy Dalal
8d4ed7e953
added instructions for syncing upstream master with forked master via PR (#8745)
* added instructions for syncing upstream master with forked master via PR

* expand to add a note to why this is requested

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2020-11-24 10:11:46 -05:00
Lysandre Debut
e09e54fd9d
MT5 should have an autotokenizer (#8743)
* MT5 should have an autotokenizer

* Different configurations should be able to point to same tokenizers
2020-11-24 09:50:25 -05:00
Lysandre Debut
6fdd0bb231
Fix slow tests v2 (#8746)
* Fix BART test

* Fix MBART tests

* Remove erroneous line from yaml

* Update tests/test_modeling_bart.py

* Quality
2020-11-24 09:35:12 -05:00
zhiheng-huang
2c83b3c38d
Support various BERT relative position embeddings (2nd) (#8276)
* Support BERT relative position embeddings

* Fix typo in README.md

* Address review comment

* Fix failing tests

* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py

* make fix copies

* fix configs of electra and albert and fix longformer

* remove copy statement from longformer

* fix albert

* fix electra

* Add bert variants forward tests for various position embeddings

* [tiny] Fix style for test_modeling_bert.py

* improve docstring

* [tiny] improve docstring and remove unnecessary dependency

* [tiny] Remove unused import

* re-add to ALBERT

* make embeddings work for ALBERT

* add test for albert

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-24 14:40:53 +01:00
Julien Chaumond
9e71aa2f8f
[EsperBERTo] Fix URLs to assets 2020-11-24 14:15:30 +01:00
Lysandre Debut
02f48b9bfc
Model parallel documentation (#8741)
* Add parallelize methods to the .rst files

* Correct format
2020-11-23 20:14:48 -05:00
LysandreJik
7f2c00913a
TF BERT test update 2020-11-23 18:20:19 -05:00
LysandreJik
e1b7e10d5f
Update TF BERT test 2020-11-23 18:19:12 -05:00
Colin Brochtrup
8ffc01a76a
Add early stopping callback to pytorch trainer (#8581)
* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer

* Add early stopping test

* Set patience counter to 0 if best metric not defined yet

* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.

* Run make style

* make function name sensible

* Improve new argument docstring wording and hope that flakey CI test passes.

* Use on_evaluation callback instead of custom. Remove some debug printing

* Move early stopping arguments and state into early stopping callback

* Run make style

* Remove old code

* Fix docs formatting. make style went rogue on me.

* Remove copied attributes and fix variable

* Add assertions on training arguments instead of mutating them. Move comment out of public docs.

* Make separate test for early stopping callback. Add test of invalid arguments.

* Run make style... I remembered before CI this time!

* appease flake8

* Add EarlyStoppingCallback to callback docs

* Make the EarlyStoppingCallback docstring match other callbacks.

* Fix typo in docs
2020-11-23 17:25:35 -05:00
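The bullets above describe early stopping in terms of a patience counter and a minimum threshold by which the tracked metric must improve. As a rough illustration of that logic only, and not the library's `EarlyStoppingCallback`, here is a framework-free sketch. `PatienceTracker` and all of its parameter names are hypothetical.

```python
class PatienceTracker:
    """Hypothetical helper: stop after `patience` evaluations without improvement."""

    def __init__(self, patience=1, threshold=0.0, greater_is_better=True):
        self.patience = patience
        self.threshold = threshold
        self.greater_is_better = greater_is_better
        self.best = None      # best metric seen so far
        self.bad_evals = 0    # consecutive evaluations without improvement

    def update(self, metric):
        """Record one evaluation result; return True when training should stop."""
        if self.best is None:
            improved = True   # first evaluation always sets the baseline
        elif self.greater_is_better:
            improved = metric > self.best + self.threshold
        else:
            improved = metric < self.best - self.threshold
        if improved:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


tracker = PatienceTracker(patience=2, threshold=0.01)
history = [0.70, 0.75, 0.755, 0.752, 0.80]  # made-up metric values

stopped_at = None
for step, value in enumerate(history):
    if tracker.update(value):
        stopped_at = step
        break
```

With these made-up values, the third and fourth evaluations fail to beat the best score by more than the threshold, so the loop stops before the final value is seen.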
Sylvain Gugger
367f497dec
Fix max length in run_plm script (#8738) 2020-11-23 16:02:31 -05:00
Stas Bekman
e84786aaa6
consistent ignore keys + make private (#8737)
* consistent ignore keys + make private

* style

* - authorized_missing_keys    => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected

* move public doc of private attributes to private comment
2020-11-23 12:33:13 -08:00
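The rename above introduces `_keys_to_ignore_on_load_missing` and `_keys_to_ignore_on_load_unexpected`. The log does not say how these attributes are consumed, so the sketch below rests on an assumption: that each one holds a list of regular expressions, and that checkpoint key names matching any pattern are left out of the missing/unexpected report. `drop_ignored`, `DemoModel` and the key names are hypothetical.

```python
import re


def drop_ignored(keys, ignore_patterns):
    """Return the keys that match none of the regex patterns."""
    for pattern in ignore_patterns or []:
        keys = [k for k in keys if re.search(pattern, k) is None]
    return keys


class DemoModel:
    # Hypothetical model class; only the attribute names come from the commit message.
    _keys_to_ignore_on_load_missing = [r"position_ids"]
    _keys_to_ignore_on_load_unexpected = [r"pooler"]


# Made-up key names standing in for a checkpoint comparison result.
missing = ["embeddings.position_ids", "classifier.weight"]
unexpected = ["pooler.dense.weight", "pooler.dense.bias"]

reported_missing = drop_ignored(missing, DemoModel._keys_to_ignore_on_load_missing)
reported_unexpected = drop_ignored(unexpected, DemoModel._keys_to_ignore_on_load_unexpected)
```

Under that assumption, only `classifier.weight` would still be reported as missing, and nothing would be reported as unexpected.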
Sylvain Gugger
49759c0cda
Document new training argument 2020-11-23 15:02:59 -05:00
alexorona
1cd9be2aeb
gpt2 and t5 parallel modeling (#8696)
* gpt2 and t5 parallel modeling

* model_parallel utils update

* adding missing model_parallel_utils

Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5

* training_args reformat

Reformatted training_args

* style formatting

Style formatting doc string length on training_args and model_parallel_utils

* style changes

make style && make quality for training_args and model_parallel_utils.

* adding tests

* minor change in trainer

reverts loss calculation

* Update training_args.py

* Update training_args.py

added back docstring language for adam_beta1 and adam_beta2

* Update trainer.py

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style & rebase

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-23 14:41:23 -05:00
Stas Bekman
1e45bef0a7
[trainer] make generate work with multigpu (#8716)
* make generate work with multigpu

* better fix - thanks @sgugger
2020-11-23 10:57:27 -08:00
Sylvain Gugger
900024273b
Change default cache path (#8734)
* Change default cache path

* Document changes

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-23 13:56:45 -05:00
Julien Chaumond
0cc5ab1333
Improve bert-japanese tokenizer handling (#8659)
* Make ci fail

* Try to make tests actually run?

* CI finally failing?

* Fix CI

* Revert "Fix CI"

This reverts commit ca7923be73.

* Ooops wrong one

* one more try

* Ok ok let's move this elsewhere

* Alternative to globals() (#8667)

* Alternative to globals()

* Error is raised later so return None

* Sentencepiece not installed make some tokenizers None

* Apply Lysandre wisdom

* Slightly clearer comment?

cc @sgugger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-23 11:15:02 -05:00
Amine Abdaoui
eec76615f6
[model_cards]: control input examples of Geotrend models (#8727)
* [model_cards]: control arabic model examples

* [model_cards]: control input examples of Geotrend models

* [model_cards]: add link to generation script
2020-11-23 11:09:50 -05:00
Jessica Yung
143b564e59
Add pip install update to resolve import error in transformers notebook (#8616)
* Add pip install update to resolve import error

Add pip install upgrade tensorflow-gpu to remove error below:
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-094fadb93f3f> in <module>()
      1 import torch
----> 2 from transformers import AutoModel, AutoTokenizer, BertTokenizer
      3 
      4 torch.set_grad_enabled(False)

4 frames
/usr/local/lib/python3.6/dist-packages/transformers/__init__.py in <module>()
    133 
    134 # Pipelines
--> 135 from .pipelines import (
    136     Conversation,
    137     ConversationalPipeline,

/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <module>()
     46     import tensorflow as tf
     47 
---> 48     from .modeling_tf_auto import (
     49         TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING,
     50         TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py in <module>()
     49 from .configuration_utils import PretrainedConfig
     50 from .file_utils import add_start_docstrings
---> 51 from .modeling_tf_albert import (
     52     TFAlbertForMaskedLM,
     53     TFAlbertForMultipleChoice,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_albert.py in <module>()
     22 import tensorflow as tf
     23 
---> 24 from .activations_tf import get_tf_activation
     25 from .configuration_albert import AlbertConfig
     26 from .file_utils import (

/usr/local/lib/python3.6/dist-packages/transformers/activations_tf.py in <module>()
     52     "gelu": tf.keras.layers.Activation(gelu),
     53     "relu": tf.keras.activations.relu,
---> 54     "swish": tf.keras.activations.swish,
     55     "silu": tf.keras.activations.swish,
     56     "gelu_new": tf.keras.layers.Activation(gelu_new),

AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.activations' has no attribute 'swish'
```
I have tried running the colab after this change and it seems to work fine (all the cells run with no errors).

* Update notebooks/02-transformers.ipynb

only need to upgrade tensorflow, not tensorflow-gpu.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-23 09:58:52 -05:00
Yossi Synett
18c8cf000b
Fix bug in x-attentions output for roberta and harden test to catch it (#8660) 2020-11-23 13:28:29 +01:00
Tony
48cc224703
[model_cards] Add card for gpt2-rnm (#8673) 2020-11-23 05:52:29 -05:00
Nguyen Van Nha
52585e40af
create README.md (#8682)
* create README.md

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-23 05:51:54 -05:00
Sagor Sarker
b5187e317f
added bangla-bert-sentiment model card (#8687) 2020-11-23 05:51:16 -05:00
moniquebm
b6d864e2f0
Create README.md (#8630)
* Create README.md

* correct metrics id

cc @lhoestq

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-23 04:48:10 -05:00
Santiago Castro
e1f3156b21
Fix many typos (#8708) 2020-11-21 22:58:10 -05:00
Patrick von Platen
9c0afdaf7b
fix flaky ci (#8694) 2020-11-20 22:07:21 +01:00