Commit Graph

5958 Commits

Author SHA1 Message Date
LysandreJik
e1b7e10d5f Update TF BERT test 2020-11-23 18:19:12 -05:00
Colin Brochtrup
8ffc01a76a
Add early stopping callback to pytorch trainer (#8581)
* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer

* Add early stopping test

* Set patience counter to 0 if best metric not defined yet

* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.

* Run make style

* make funciton name sensible

* Improve new argument docstring wording and hope that flakey CI test passes.

* Use on_evaluation callback instead of custom. Remove some debug printing

* Move early stopping arguments and state into early stopping callback

* Run make style

* Remove old code

* Fix docs formatting. make style went rogue on me.

* Remove copied attributes and fix variable

* Add assertions on training arguments instead of mutating them. Move comment out of public docs.

* Make separate test for early stopping callback. Add test of invalid arguments.

* Run make style... I remembered before CI this time!

* appease flake8

* Add EarlyStoppingCallback to callback docs

* Make docstring EarlyStoppingCallabck match other callbacks.

* Fix typo in docs
2020-11-23 17:25:35 -05:00
Sylvain Gugger
367f497dec
Fix max length in run_plm script (#8738) 2020-11-23 16:02:31 -05:00
Stas Bekman
e84786aaa6
consistent ignore keys + make private (#8737)
* consistent ignore keys + make private

* style

* - authorized_missing_keys    => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected

* move public doc of private attributes to private comment
2020-11-23 12:33:13 -08:00
Sylvain Gugger
49759c0cda Document new training argument 2020-11-23 15:02:59 -05:00
alexorona
1cd9be2aeb
gpt2 and t5 parallel modeling (#8696)
* gpt2 and t5 parallel modeling

* model_parallel utils update

* adding missing model_parallel_utils

Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5

* training_args reformat

Reformatted training_args

* style formatting

Style formatting doc string length on training_args and model_parallel_utils

* style changes

make style && make quality for training_args and model_parallel_utils.

* adding tests

* minor change in trainer

reverts loss calculation

* Update training_args.py

* Update training_args.py

added back docstring language for adam_beta1 and adam_beta2

* Update trainer.py

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style & rebase

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-23 14:41:23 -05:00
Stas Bekman
1e45bef0a7
[trainer] make generate work with multigpu (#8716)
* make generate work with multigpu

* better fix - thanks @sgugger
2020-11-23 10:57:27 -08:00
Sylvain Gugger
900024273b
Change default cache path (#8734)
* Change default cache path

* Document changes

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-23 13:56:45 -05:00
Julien Chaumond
0cc5ab1333
Improve bert-japanese tokenizer handling (#8659)
* Make ci fail

* Try to make tests actually run?

* CI finally failing?

* Fix CI

* Revert "Fix CI"

This reverts commit ca7923be73.

* Ooops wrong one

* one more try

* Ok ok let's move this elsewhere

* Alternative to globals() (#8667)

* Alternative to globals()

* Error is raised later so return None

* Sentencepiece not installed make some tokenizers None

* Apply Lysandre wisdom

* Slightly clearer comment?

cc @sgugger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-23 11:15:02 -05:00
Amine Abdaoui
eec76615f6
[model_cards]: control input examples of Geotrend models (#8727)
* [model_cards]: control arabic model examples

* [model_cards]: control input examples of Geotrend models

* [model_cards]: add link to generatation script
2020-11-23 11:09:50 -05:00
Jessica Yung
143b564e59
Add pip install update to resolve import error in transformers notebook (#8616)
* Add pip install update to resolve import error

Add pip install upgrade tensorflow-gpu to remove error below:
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-094fadb93f3f> in <module>()
      1 import torch
----> 2 from transformers import AutoModel, AutoTokenizer, BertTokenizer
      3 
      4 torch.set_grad_enabled(False)

4 frames
/usr/local/lib/python3.6/dist-packages/transformers/__init__.py in <module>()
    133 
    134 # Pipelines
--> 135 from .pipelines import (
    136     Conversation,
    137     ConversationalPipeline,

/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <module>()
     46     import tensorflow as tf
     47 
---> 48     from .modeling_tf_auto import (
     49         TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING,
     50         TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py in <module>()
     49 from .configuration_utils import PretrainedConfig
     50 from .file_utils import add_start_docstrings
---> 51 from .modeling_tf_albert import (
     52     TFAlbertForMaskedLM,
     53     TFAlbertForMultipleChoice,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_albert.py in <module>()
     22 import tensorflow as tf
     23 
---> 24 from .activations_tf import get_tf_activation
     25 from .configuration_albert import AlbertConfig
     26 from .file_utils import (

/usr/local/lib/python3.6/dist-packages/transformers/activations_tf.py in <module>()
     52     "gelu": tf.keras.layers.Activation(gelu),
     53     "relu": tf.keras.activations.relu,
---> 54     "swish": tf.keras.activations.swish,
     55     "silu": tf.keras.activations.swish,
     56     "gelu_new": tf.keras.layers.Activation(gelu_new),

AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.activations' has no attribute 'swish'
```
I have tried running the colab after this change and it seems to work fine (all the cells run with no errors).

* Update notebooks/02-transformers.ipynb

only need to upgrade tensorflow, not tensorflow-gpu.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-23 09:58:52 -05:00
Yossi Synett
18c8cf000b
Fix bug in x-attentions output for roberta and harden test to catch it (#8660) 2020-11-23 13:28:29 +01:00
Tony
48cc224703
[model_cards] Add card for gpt2-rnm (#8673) 2020-11-23 05:52:29 -05:00
Nguyen Van Nha
52585e40af
create README.md (#8682)
* create README.md

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-23 05:51:54 -05:00
Sagor Sarker
b5187e317f
added bangla-bert-sentiment model card (#8687) 2020-11-23 05:51:16 -05:00
moniquebm
b6d864e2f0
Create README.md (#8630)
* Create README.md

* correct metrics id

cc @lhoestq

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-23 04:48:10 -05:00
Santiago Castro
e1f3156b21
Fix many typos (#8708) 2020-11-21 22:58:10 -05:00
Patrick von Platen
9c0afdaf7b
fix flaky ci (#8694) 2020-11-20 22:07:21 +01:00
Binoy Dalal
29bdb88368
Vectorize RepetitionPenaltyLogitsProcessor to improve performance (#8598)
* refactored exisiting nested loops to vectorized implementation

* replaced explicit indexing with torch.where

* modifying score for previous input_ids only
2020-11-20 19:59:06 +01:00
Roman Kalyakin
2594bd8b73
moved temperature wrapper before topP/topK (#8686) 2020-11-20 19:33:54 +01:00
Quentin Lhoest
8062fa63c5
Fix rag finetuning + add finetuning test (#8585)
* replace init_ddp_connection for index init

* style

* add finetune test

* add test data

* move generate tensors to device

* add test on EM metric

* style

* allow multi process test

* keep gloo process group for retrieval

* add multi-gpu test

* use custom accelerator

* clean test finetune

* minor

* style

* style

* typo

* use python call instead of imported main fumction

* return_dict fix in modeling_rag

* use float32 in retrieval

* store as float32 as well in the custom knowledge dataset example

* style

* rename to finetune_rag

* style

* update readme

* rename utils and callbacks to utils_rag and callbacks_rag

* fix test

* patrick's comments

* generate dummy data in the finetue test script

* remove dummy data files

* style
2020-11-20 19:05:03 +01:00
Sylvain Gugger
63e91f5fde
Document adam betas TrainingArguments (#8688) 2020-11-20 09:27:25 -05:00
Kevin Canwen Xu
94caaa93c2
Update the bibtex with EMNLP demo (#8678)
* Update the bibtex with EMNLP demo

* Update README.md

* Update README.md
2020-11-20 13:26:33 +08:00
Sylvain Gugger
6494910f27
Add sentencepiece to the CI and fix tests (#8672)
* Fix the CI and tests

* Fix quality

* Remove that m form nowhere
2020-11-19 16:44:20 -05:00
Stas Bekman
0ad45e108d
[examples/seq2seq] fix PL deprecation warning (#8577)
* fix deprecation warning

* fix
2020-11-19 21:46:04 +01:00
Arindum Roy
0e19a4c2d6
Update bert-base-multilingual-cased-README.md (#8668)
The heading was originally uncased, which did not reflect the contents of this README. Changed it to cased.
2020-11-19 15:45:06 -05:00
Stas Bekman
06518404cb
revert 2020-11-19 12:12:46 -08:00
Stas Bekman
297a29382f
Please fix your software not to ping master
You may be unaware but you're running some software that meddles with every commit on https://github.com/huggingface/transformers/

Something is wrong with the software you're using. It adds a reference to almost every PR in the master tree. Which is very wrong. Please check your software and please don't do it again.

Example:
see the bottom of this PR and most other PRs:
https://github.com/huggingface/transformers/pull/8639
2020-11-19 12:11:35 -08:00
Stas Bekman
42111f1d56
[tokenizers] convert_to_tensors: don't reconvert when the type is already right (#8283)
* don't reconvert when the type is already right

* better name

* adjust logic as suggested

* merge
2020-11-19 12:06:01 -08:00
Sylvain Gugger
20b658607e
Fix run_ner script (#8664)
* Fix run_ner script

* Pin datasets
2020-11-19 13:59:30 -05:00
Zhylko Dima
ca0109bd68
disable_ngram_loss fix for prophetnet (#8554)
* `disable_ngram_loss` fix for prophetnet

* add changes documentation

* fix _compute_loss to use mean reduction and -100 to masked tokens & remove unnecessary arguments

* mean label smoothing loss

* small refactor

* fix test

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-19 19:18:07 +01:00
Sylvain Gugger
0603564e93 Merge remote-tracking branch 'origin/master' 2020-11-19 12:18:57 -05:00
Sylvain Gugger
1e08af383a Forgot to save... 2020-11-19 12:18:50 -05:00
LysandreJik
d86b5ffc6f Release: v4.0.0-rc-1 2020-11-19 12:00:07 -05:00
Sylvain Gugger
cb3e5c33f7
Fix a few last paths for the new repo org (#8666) 2020-11-19 11:56:42 -05:00
Matthias
a79a96ddaa
fix small typo (#8644)
Fixed a small typo on the XLNet and permutation language modelling section
2020-11-19 11:24:11 -05:00
Sylvain Gugger
4208f496ee
Better filtering of the model outputs in Trainer (#8633)
* Better filtering of the model outputs in Trainer

* Fix examples tests

* Add test for Lysandre
2020-11-19 10:43:15 -05:00
Lysandre Debut
f2e07e7272
Fix a bunch of slow tests (#8634)
* CI should install `sentencepiece`

* Requiring TF

* Fixing some TFDPR bugs

* remove return_dict=False/True hack

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-19 10:41:41 -05:00
elk-cloner
5362bb8a6b
Tf longformer for sequence classification (#8231)
* working on LongformerForSequenceClassification

* add TFLongformerForMultipleChoice

* add TFLongformerForTokenClassification

* use add_start_docstrings_to_model_forward

* test TFLongformerForSequenceClassification

* test TFLongformerForMultipleChoice

* test TFLongformerForTokenClassification

* remove test from repo

* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice

* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes

* pass all tests except test_inputs_embeds

* sync with master

* pass all tests except test_inputs_embeds

* pass all tests

* pass all tests

* work on test_inputs_embeds

* fix style and quality

* make multi choice work

* fix TFLongformerForTokenClassification signature

* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature

* fix mult choice

* fix mc hint

* fix input embeds

* fix input embeds

* refactor input embeds

* fix copy issue

* apply sylvains changes and clean more

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-19 10:37:27 -05:00
Quentin Lhoest
62cd9ce9f8
fix missing return dict (#8653) 2020-11-19 15:17:18 +01:00
Amine Abdaoui
0c2677f529
[model card] : fix bert-base-15lang-cased (#8655)
the table was badly formatted because of a single line break
2020-11-19 05:41:02 -05:00
Amine Abdaoui
0a80959bdd
Add cards for all Geotrend models (#8617)
* docs(bert-base-15lang-cased): add model card

* add cards for all Geotrend models

* [model cards] fix language tag for all Geotrend models
2020-11-19 04:47:24 -05:00
cronoik
dcc9c64299
Updated the Extractive Question Answering code snippets (#8636)
* Updated the Extractive Question Answering code snippets

The Extractive Question Answering code snippets do not work anymore since the models return task-specific output objects. This commit fixes the pytorch and tensorflow examples but adding `.values()` to the model call.

* Update task_summary.rst
2020-11-18 18:56:47 -05:00
Tim Isbister
28d16e7ac5
Update README.md (#8635) 2020-11-18 18:35:23 -05:00
cronoik
b290195ac7
grammar (#8639) 2020-11-18 18:04:25 -05:00
Stas Bekman
d86d57faa3
[s2s] distillation apex breaks return_dict obj (#8631)
* apex breaks return_dict obj

* style
2020-11-18 12:51:29 -08:00
Perez Ogayo
bf3611b2ab
Created ModelCard for Hel-ach-en MT model (#8496)
* Updated ModelCard

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-18 14:42:13 -05:00
Yifan Peng
c95b26a719
Create README.md (#8362) 2020-11-18 13:37:14 -05:00
Manuel Romero
fdbbb6c17a
Model card: T5-base fine-tuned on QuaRTz (#8369)
* Model card: T5-base fine-tuned on QuaRTz

* Update model_cards/mrm8488/t5-base-finetuned-quartz/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-18 13:34:27 -05:00
Yifan Peng
6e6d24c5d8
Create README.md (#8363) 2020-11-18 13:33:04 -05:00