Commit Graph

155 Commits

Author SHA1 Message Date
Patrick von Platen
068e6b5edd
make files independent (#8267) 2020-11-03 21:13:33 +01:00
Lysandre
eb6313e823
Fix Tatoeba skip 2020-11-03 10:35:00 -05:00
Sam Shleifer
b63beb743c
Skip tatoeba tests if Tatoeba-Challenge not cloned (#8260) 2020-11-03 09:49:29 -05:00
Patrick von Platen
9f1747f999
[Seq2Seq] Correct import in Seq2Seq Trainer (#8254) 2020-11-03 07:56:41 -05:00
Patrick von Platen
9bd30f7cf4
[Seq2SeqTrainer] Move import to init to make file self-contained (#8194)
* boom boom

* reverse order
2020-11-01 23:31:55 +01:00
Sam Shleifer
49e4fece5c
[s2s] distillBART docs for paper replication (#8150) 2020-10-29 12:01:15 -04:00
Santiago Castro
969859d5f6
Fix doc errors and typos across the board (#8139)
* Fix doc errors and typos across the board

* Fix a typo

* Fix the CI

* Fix more typos

* Fix CI

* More fixes

* Fix CI

* More fixes

* More fixes
2020-10-29 10:33:33 -04:00
Stas Bekman
825925dfaa
[s2s test] cleanup (#8131) 2020-10-28 16:50:36 -04:00
Sean Naren
5e24982e58
Upgrade PyTorch Lightning to 1.0.2 (#7852)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-10-28 14:59:14 -04:00
Stas Bekman
5423f2a9d4
[testing] port test_trainer_distributed to distributed pytest + TestCasePlus enhancements (#8107)
* move the helper code into testing_utils

* port test_trainer_distributed to work with pytest

* improve docs

* simplify notes

* doc

* doc

* style

* doc

* further improvements

* torch might not be available

* real fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-28 11:51:32 -04:00
Patrick von Platen
664c7ec453
[Seq2Seq Trainer] Make sure padding is implemented for models without pad_token (#8043)
* make sure padding is implemented for non-padding tokens models as well

* add better error message

* add better warning

* remove results files

* Update examples/seq2seq/seq2seq_trainer.py

* remove unnecessary copy line

* correct usage of labels

* delete test files
2020-10-26 17:28:16 +01:00
Patrick von Platen
3c682ea15c
[Examples] Allow EncoderDecoderModels to be trained with Seq2Seq (#7809)
* Make Seq2Seq Trainer more similar to Trainer

* fix typo

* fix seq2seq trainer

* remove from tests

* remove lock

* remove train files

* delete test files

* correct typo

* check at init

* make sure trainer is not slowed down on TPU

* correct isort

* remove use cache

* fix use cache

* add last use cache = false
2020-10-23 23:05:51 +02:00
Stas Bekman
023f0f3708
[s2s trainer] tests to use distributed on multi-gpu machine (#7965) 2020-10-22 17:26:22 -04:00
Stas Bekman
8b38173398
[seq2seq testing] multigpu test run via subprocess (#7281)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-10-21 17:20:53 -04:00
Stas Bekman
0e24e4c136
[s2s] create doc for pegasus/fsmt replication (#7934) 2020-10-20 15:07:52 -04:00
Stas Bekman
3e31e7f956
[testing] rename skip targets + docs (#7863)
* rename skip targets + docs

* fix quotes

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* small improvements

* fix
2020-10-20 04:39:13 -04:00
Stas Bekman
9f7b2b2432
[s2s testing] turn all to unittests, use auto-delete temp dirs (#7859) 2020-10-17 14:33:21 -04:00
Stas Bekman
1652ddad35
[seq2seq testing] improve readability (#7845) 2020-10-16 09:05:29 -04:00
Sam Shleifer
96e47d9229
[cleanup] assign todos, faster bart-cnn test (#7835)
* 2 beam output

* unassign/remove TODOs

* remove one more
2020-10-16 03:11:18 -04:00
Stas Bekman
2255c2c7a0
[seq2seq] get_git_info fails gracefully (#7843)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-10-16 00:22:43 -04:00
Sylvain Gugger
a1d1b332d0
Add predict step accumulation (#7767)
* Add eval_accumulation_step and clean distributed eval

* Add TPU test

* Add TPU stuff

* Fix arg name

* Fix Seq2SeqTrainer

* Fix total_size

* Update src/transformers/trainer_pt_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Doc and add test to TPU

* Add unit test

* Adapt name
2020-10-14 11:41:45 -04:00
Tiger
7e73c12805
fixed lots of typos. (#7758) 2020-10-13 10:00:20 -04:00
Sam Shleifer
9c2b2db2cd
[marian] Automate Tatoeba-Challenge conversion (#7709) 2020-10-12 12:24:25 -04:00
Sam Shleifer
827c519494
[examples] bump pl=0.9.0 (#7053) 2020-10-11 16:39:38 -04:00
Sam Shleifer
297233fa92
[s2s] Switch README urls to cdn (#7670) 2020-10-08 21:22:22 -04:00
Sam Shleifer
a1ecc90d6b
[pseudo] Switch URLS to CDN (#7661) 2020-10-08 14:12:39 -04:00
Suraj Patil
06a973fd2a
[s2s] configure lr_scheduler from command line (#7641) 2020-10-08 13:06:35 -04:00
Sam Shleifer
aba4e22944
[pseudolabels] cleanup markdown table (#7653) 2020-10-07 23:04:18 -04:00
Sam Shleifer
e2bb9abb6a
[s2s] release pseudolabel links and instructions (#7639) 2020-10-07 11:20:44 -04:00
Sylvain Gugger
08ba4b4902
Trainer callbacks (#7596)
* Initial callback proposal

* Finish various callbacks

* Post-rebase conflicts

* Fix tests

* Don't use something that's not set

* Documentation

* Remove unwanted print.

* Document all models can work

* Add tests + small fixes

* Update docs/source/internal/trainer_utils.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Fix TF tests

* Real fix this time

* This one should work

* Fix typo

* Really fix typo
2020-10-07 10:50:21 -04:00
Sam Shleifer
500be01c5d
[s2s] save first batch to json for debugging purposes (#6810) 2020-10-06 16:11:56 -04:00
Sam Shleifer
d5d2744aa7
Support T5 Distillation w/hidden state supervision (#7599) 2020-10-05 21:31:48 -04:00
Suraj Patil
99cb924bfb
[s2s] add config params like Dropout in Seq2SeqTrainingArguments (#7532) 2020-10-04 12:42:30 -04:00
Sam Shleifer
9bdce3a4f9
[s2s] fix lockfile and peg distillation constants (#7545) 2020-10-02 15:58:14 -04:00
Sam Shleifer
de4d7b004a
[s2s] Adafactor support for builtin trainer (#7522) 2020-10-01 17:27:45 -04:00
Sam Shleifer
d3a9601a11
[s2s] trainer scripts: Remove --run_name, thanks sylvain! (#7521) 2020-10-01 17:18:47 -04:00
Sylvain Gugger
bdcc4b78a2
Fix seq2seq example test (#7518)
* Fix seq2seq example test

* Fix bad copy-paste

* Also save the state
2020-10-01 14:13:29 -04:00
Sam Shleifer
2a358f45ef
[s2s] fix nltk pytest race condition with FileLock (#7515) 2020-10-01 12:51:09 -04:00
Suraj Patil
72d363d979
[examples/s2s] clean up finetune_trainer (#7509) 2020-10-01 12:19:29 -04:00
Sam Shleifer
48f23f92a8
[s2sTrainer] test + code cleanup (#7467) 2020-10-01 00:33:01 -04:00
Sam Shleifer
03e46c1de3
[s2s] fix kwargs style (#7488) 2020-09-30 17:00:06 -04:00
Sam Shleifer
6fe8a693eb
[s2s] Fix t5 warning for distributed eval (#7487) 2020-09-30 16:58:03 -04:00
Amanpreet Singh
c031d01023
Seq2SeqDataset: avoid passing src_lang everywhere (#7470)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-30 13:27:48 -04:00
Suraj Patil
08939cfdf7
[s2strainer] fix eval dataset loading (#7477) 2020-09-30 12:39:13 -04:00
Sam Shleifer
74d8d69bd4
[s2s] consistent output format across eval scripts (#7435) 2020-09-28 23:20:03 -04:00
Sam Shleifer
748425d47d
[T5] allow config.decoder_layers to control decoder size (#7409)
* Working asymmetrical T5

* rename decoder_layers -> num_decoder_layers

* Fix docstring

* Allow creation of asymmetric t5 students
2020-09-28 03:08:04 -04:00
Sam Shleifer
7296fea1d6
[s2s] rougeLSum expects \n between sentences (#7410)
Co-authored-by: Swetha Mandava <smandava@nvidia.com>
2020-09-27 16:27:19 -04:00
Suraj Patil
eab5f59682
[s2s] add create student script (#7290)
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-27 15:10:46 -04:00
Suraj Patil
415071b4c2
doc changes (#7385) 2020-09-25 08:00:36 -04:00
Suraj Patil
9e68d075a4
Seq2SeqTrainer (#6769)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-24 18:46:58 -04:00