Commit Graph

15053 Commits

Author SHA1 Message Date
Julien Chaumond
60a42ef1c0 [model_cards] Fix CamemBERT table markdown
see https://github.com/huggingface/transformers/pull/3836
2020-04-17 20:21:15 -04:00
Julien Chaumond
88aecee6a2 [ci] GitHub-hosted runner has no space left on device 2020-04-17 20:16:00 -04:00
Benjamin Muller
73efa694e6
Update camembert-base-README.md (#3836) 2020-04-17 20:08:13 -04:00
Patrick von Platen
e9d0bc027a
[Config, Serialization] more readable config serialization (#3797)
* better config serialization

* finish configuration utils
2020-04-17 20:07:18 -04:00
Lysandre Debut
8b63a01d95
XLM tokenizer should encode with bos token (#3791)
* XLM tokenizer should encode with bos token

* Update tests
2020-04-17 11:28:55 -04:00
Patrick von Platen
1d4a35b396
Higher tolerance for past testing in TF T5 (#3844) 2020-04-17 11:26:16 -04:00
Patrick von Platen
d13eca11e2
Higher tolerance for past testing in T5 (#3843) 2020-04-17 11:25:14 -04:00
Harutaka Kawamura
b0c9fbb293
Add workflow to build docs (#3763) 2020-04-17 11:23:18 -04:00
Santiago Castro
c19727fd38
Add support for the null answer in QuestionAnsweringPipeline (#3441)
* Add support for the null answer in `QuestionAnsweringPipeline`

* black

* Fix min null score computation

* Fix a PR comment
2020-04-17 11:17:21 -04:00
Simon Böhm
edf0582c0b
Fix token_type_id in BERT question-answering example (#3790)
token_type_id is converted into the segment embedding. For question answering,
this needs to highlight whether a token belongs to sequence 0 or 1.
encode_plus takes care of correctly setting this parameter automatically.
2020-04-17 11:14:12 -04:00
Pierric Cistac
6d00033e97
Question Answering support for Albert and Roberta in TF (#3812)
* Add TFAlbertForQuestionAnswering

* Add TFRobertaForQuestionAnswering

* Update TFAutoModel with Roberta/Albert for QA

* Clean `super` TF Albert calls
2020-04-17 10:45:30 -04:00
Patrick von Platen
f399c00610
Update README 2020-04-17 09:42:22 +02:00
Sam Shleifer
f0c96fafd1
[examples] summarization/bart/finetune.py supports t5 (#3824)
renames `run_bart_sum.py` to `finetune.py`
2020-04-16 15:15:19 -04:00
Jonathan Sum
0cec4fab7d typo: fine-grained token-leven
Changing from "fine-grained token-leven" to "fine-grained token-level"
2020-04-16 15:11:23 -04:00
Aryansh Omray
14cdeee75a Tanh torch warnings 2020-04-16 15:10:35 -04:00
Sam Shleifer
16469fedbd
[PretrainedTokenizer] Factor out tensor conversion method (#3777) 2020-04-16 15:02:43 -04:00
Patrick von Platen
80a1694514
[Examples, T5] Change newstest2013 to newstest2014 and clean up (#3817)
* Refactored use of newstest2013 to newstest2014. Fixed bug where argparse consumed first command line argument as model_size argument rather than using default model_size by forcing explicit --model_size flag inclusion

* More pythonic file handling through 'with' context

* COSMETIC - ran Black and isort

* Fixed reference to number of lines in newstest2014

* Fixed failing test. More pythonic file handling

* finish PR from tholiao

* remove outcommented lines

* make style

* make isort happy

Co-authored-by: Thomas Liao <tholiao@gmail.com>
2020-04-16 20:00:41 +02:00
Lysandre Debut
d486795158
JIT not compatible with PyTorch/XLA (#3743) 2020-04-16 11:19:24 -04:00
Davide Fiocco
b1e2368b32
Typo fix (#3821) 2020-04-16 11:04:32 -04:00
Patrick von Platen
baca8fa8e6
clean pipelines (#3795) 2020-04-16 10:21:34 -04:00
Patrick von Platen
38f7461df3
[TFT5, Cache] Add cache to TFT5 (#3772)
* correct gpt2 test inputs

* make style

* delete modeling_gpt2 change in test file

* translate from pytorch

* correct tests

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* make tensorflow t5 caching work

* make style

* clean reorder cache

* remove unnecessary spaces

* fix test
2020-04-16 16:14:52 +02:00
Patrick von Platen
a5b249472e
change pad token id to config pad token id (#3793) 2020-04-16 15:58:57 +02:00
Sam Shleifer
dbd041243d
[cleanup] factor out get_head_mask, invert_attn_mask, get_exten… (#3806)
* Delete some copy pasted code
2020-04-16 09:55:25 -04:00
Patrick von Platen
d22894dfd4
[Docs] Add DialoGPT (#3755)
* add dialoGPT

* update README.md

* fix conflict

* update readme

* add code links to docs

* Update README.md

* Update dialo_gpt2.rst

* Update pretrained_models.rst

* Update docs/source/model_doc/dialo_gpt2.rst

Co-Authored-By: Julien Chaumond <chaumond@gmail.com>

* change filename of dialogpt

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-16 09:04:32 +02:00
Sam Shleifer
c59b1e682d
[examples] unit test for run_bart_sum (#3544)
- adds pytorch-lightning dependency
2020-04-15 18:35:01 -04:00
Patrick von Platen
301bf8d1b4
Create Modelcard for Reformer Model 2020-04-15 16:26:24 +02:00
Patrick von Platen
01c37dcdb5
[Config, Caching] Remove output_past everywhere and replace by use_cache argument (#3734)
* remove output_past from pt

* make style

* add optional input length for gpt2

* add use cache to prepare input

* save memory in gpt2

* correct gpt2 test inputs

* make past input optional for gpt2

* finish use_cache for all models

* make style

* delete modeling_gpt2 change in test file

* correct docstring

* correct is true statements for gpt2
2020-04-14 14:40:28 -04:00
Patrick von Platen
092cf881a5
[Generation, EncoderDecoder] Apply Encoder Decoder 1.5GB memory… (#3778) 2020-04-13 22:29:28 -04:00
Teven
352d5472b0
Shift labels internally within TransfoXLLMHeadModel when called with labels (#3716)
* Shifting labels inside TransfoXLLMHead

* Changed doc to reflect change

* Updated pytorch test

* removed IDE whitespace changes

* black reformat

Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
2020-04-13 18:11:23 +02:00
elk-cloner
5ebd898953
fix dataset shuffling for Distributed training (#huggingface#3721) (#3766) 2020-04-13 10:11:18 -04:00
HenrykBorzymowski
7972a4019f
updated dutch squad model card (#3736)
* added model_cards for polish squad models

* corrected mistake in polish design cards

* updated model_cards for squad2_dutch model

* added links to benchmark models

Co-authored-by: Henryk Borzymowski <henryk.borzymowski@pwc.com>
2020-04-11 06:44:59 -04:00
HUSEIN ZOLKEPLI
f8c1071c51
Added README huseinzol05/albert-tiny-bahasa-cased (#3746)
* add bert bahasa readme

* update readme

* update readme

* added xlnet

* added tiny-bert and fix xlnet readme

* added albert base

* added albert tiny
2020-04-11 06:42:06 -04:00
Jin Young Sohn
700ccf6e35
Fix glue_convert_examples_to_features API breakage (#3742) 2020-04-10 16:03:27 -04:00
Anthony MOI
b7cf9f43d2
Update tokenizers to 0.7.0-rc5 (#3705) 2020-04-10 14:23:49 -04:00
Jin Young Sohn
551b450527
Add run_glue_tpu.py that trains models on TPUs (#3702)
* Initial commit to get BERT + run_glue.py on TPU

* Add README section for TPU and address comments.

* Cleanup TPU bits from run_glue.py (#3)

TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.

We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.

* Cleanup TPU bits from run_glue.py

TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.

We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.

* No need to call `xm.mark_step()` explicitly (#4)

Since for gradient accumulation we're accumulating on batches from
`ParallelLoader` instance which on next() marks the step itself.

* Resolve R/W conflicts from multiprocessing (#5)

* Add XLNet in list of models for `run_glue_tpu.py` (#6)

* Add RoBERTa to list of models in TPU GLUE (#7)

* Add RoBERTa and DistilBert to list of models in TPU GLUE (#8)

* Use barriers to reduce duplicate work/resources (#9)

* Shard eval dataset and aggregate eval metrics (#10)

* Shard eval dataset and aggregate eval metrics

Also, instead of calling `eval_loss.item()` every time do summation with
tensors on device.

* Change defaultdict to float

* Reduce the pred, label tensors instead of metrics

As brought up during review some metrics like f1 cannot be aggregated
via averaging. GLUE task metrics depends largely on the dataset, so
instead we sync the prediction and label tensors so that the metrics can
be computed accurately on those instead.

* Only use tb_writer from master (#11)

* Apply huggingface black code formatting

* Style

* Remove `--do_lower_case` as example uses cased

* Add option to specify tensorboard logdir

This is needed for our testing framework which checks regressions
against key metrics writtern by the summary writer.

* Using configuration for `xla_device`

* Prefix TPU specific comments.

* num_cores clarification and namespace eval metrics

* Cache features file under `args.cache_dir`

Instead of under `args.data_dir`. This is needed as our test infra uses
data_dir with a read-only filesystem.

* Rename `run_glue_tpu` to `run_tpu_glue`

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-04-10 12:53:54 -04:00
Julien Chaumond
cbad305ce6
[docs] The use of do_lower_case in scripts is on its way to deprecation (#3738) 2020-04-10 12:34:04 -04:00
Julien Chaumond
b169ac9c2b
[examples] Generate argparsers from type hints on dataclasses (#3669)
* [examples] Generate argparsers from type hints on dataclasses

* [HfArgumentParser] way simpler API

* Restore run_language_modeling.py for easier diff

* [HfArgumentParser] final tweaks from code review
2020-04-10 12:21:58 -04:00
Sam Shleifer
7a7fdf71f8
Multilingual BART - (#3602)
- support mbart-en-ro weights
- add MBartTokenizer
2020-04-10 11:25:39 -04:00
Julien Chaumond
f98d0ef2a2
Big cleanup of glue_convert_examples_to_features (#3688)
* Big cleanup of `glue_convert_examples_to_features`

* Use batch_encode_plus

* Cleaner wrapping of glue_convert_examples_to_features for TF

@lysandrejik

* Cleanup syntax, thanks to @mfuntowicz

* Raise explicit error in case of user error
2020-04-10 10:20:18 -04:00
Patrick von Platen
ce2298fb5f
[T5, generation] Add decoder caching for T5 (#3682)
* initial commit to add decoder caching for T5

* better naming for caching

* finish T5 decoder caching

* correct test

* added extensive past testing for T5

* clean files

* make tests cleaner

* improve docstring

* improve docstring

* better reorder cache

* make style

* Update src/transformers/modeling_t5.py

Co-Authored-By: Yacine Jernite <yjernite@users.noreply.github.com>

* make set output past work for all layers

* improve docstring

* improve docstring

Co-authored-by: Yacine Jernite <yjernite@users.noreply.github.com>
2020-04-10 01:02:50 +02:00
calpt
9384e5f6de
Fix force_download of files on Windows (#3697) 2020-04-09 14:44:57 -04:00
Julien Chaumond
bc65afc4df [Exbert] Change style of button 2020-04-09 10:44:42 -04:00
LysandreJik
31baeed614 Update quotes
cc @julien-c
2020-04-09 09:09:00 -04:00
Teven
f8208fa456
Correct transformers-cli env call 2020-04-09 09:03:19 +02:00
Lysandre Debut
6435b9f908
Updating the TensorFlow models to work as expected with tokenizers v3.0.0 (#3684)
* Updating modeling tf files; adding tests

* Merge `encode_plus` and `batch_encode_plus`
2020-04-08 16:22:44 -04:00
LysandreJik
500aa12318 close #3699 2020-04-08 14:32:47 -04:00
Julien Chaumond
a594ee9c84
More doc for model cards (#3698)
see https://github.com/huggingface/transformers/pull/3679#pullrequestreview-389368270
2020-04-08 12:12:52 -04:00
Julien Chaumond
83703cd077 Update doc for {Summarization,Translation}Pipeline and other tweaks 2020-04-08 09:45:00 -04:00
Seyone Chithrananda
a1b3b4167e
Created README.md for model card ChemBERTa (#3666)
* created readme.md

* update readme with fixes

Fixes from PR comments
2020-04-08 09:10:20 -04:00
Lorenzo Ampil
747907dc5e Fix typo in FeatureExtractionPipeline docstring 2020-04-08 09:08:56 -04:00