Stas Bekman
8b38173398
[seq2seq testing] multigpu test run via subprocess ( #7281 )
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-10-21 17:20:53 -04:00
Stas Bekman
0e24e4c136
[s2s] create doc for pegasus/fsmt replication ( #7934 )
2020-10-20 15:07:52 -04:00
Stas Bekman
3e31e7f956
[testing] rename skip targets + docs ( #7863 )
* rename skip targets + docs
* fix quotes
* style
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* small improvements
* fix
2020-10-20 04:39:13 -04:00
Stas Bekman
9f7b2b2432
[s2s testing] turn all to unittests, use auto-delete temp dirs ( #7859 )
2020-10-17 14:33:21 -04:00
Stas Bekman
1652ddad35
[seq2seq testing] improve readability ( #7845 )
2020-10-16 09:05:29 -04:00
Sam Shleifer
96e47d9229
[cleanup] assign todos, faster bart-cnn test ( #7835 )
* 2 beam output
* unassign/remove TODOs
* remove one more
2020-10-16 03:11:18 -04:00
Stas Bekman
2255c2c7a0
[seq2seq] get_git_info fails gracefully ( #7843 )
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-10-16 00:22:43 -04:00
Sylvain Gugger
a1d1b332d0
Add predict step accumulation ( #7767 )
* Add eval_accumulation_step and clean distributed eval
* Add TPU test
* Add TPU stuff
* Fix arg name
* Fix Seq2SeqTrainer
* Fix total_size
* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Doc and add test to TPU
* Add unit test
* Adapt name
2020-10-14 11:41:45 -04:00
Tiger
7e73c12805
fixed lots of typos. ( #7758 )
2020-10-13 10:00:20 -04:00
Sam Shleifer
9c2b2db2cd
[marian] Automate Tatoeba-Challenge conversion ( #7709 )
2020-10-12 12:24:25 -04:00
Sam Shleifer
827c519494
[examples] bump pl=0.9.0 ( #7053 )
2020-10-11 16:39:38 -04:00
Sam Shleifer
297233fa92
[s2s] Switch README urls to cdn ( #7670 )
2020-10-08 21:22:22 -04:00
Sam Shleifer
a1ecc90d6b
[pseudo] Switch URLS to CDN ( #7661 )
2020-10-08 14:12:39 -04:00
Suraj Patil
06a973fd2a
[s2s] configure lr_scheduler from command line ( #7641 )
2020-10-08 13:06:35 -04:00
Sam Shleifer
aba4e22944
[pseudolabels] cleanup markdown table ( #7653 )
2020-10-07 23:04:18 -04:00
Sam Shleifer
e2bb9abb6a
[s2s] release pseudolabel links and instructions ( #7639 )
2020-10-07 11:20:44 -04:00
Sylvain Gugger
08ba4b4902
Trainer callbacks ( #7596 )
* Initial callback proposal
* Finish various callbacks
* Post-rebase conflicts
* Fix tests
* Don't use something that's not set
* Documentation
* Remove unwanted print.
* Document all models can work
* Add tests + small fixes
* Update docs/source/internal/trainer_utils.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Fix TF tests
* Real fix this time
* This one should work
* Fix typo
* Really fix typo
2020-10-07 10:50:21 -04:00
Sam Shleifer
500be01c5d
[s2s] save first batch to json for debugging purposes ( #6810 )
2020-10-06 16:11:56 -04:00
Sam Shleifer
d5d2744aa7
Support T5 Distillation w/hidden state supervision ( #7599 )
2020-10-05 21:31:48 -04:00
Suraj Patil
99cb924bfb
[s2s] add config params like Dropout in Seq2SeqTrainingArguments ( #7532 )
2020-10-04 12:42:30 -04:00
Sam Shleifer
9bdce3a4f9
[s2s] fix lockfile and peg distillation constants ( #7545 )
2020-10-02 15:58:14 -04:00
Sam Shleifer
de4d7b004a
[s2s] Adafactor support for builtin trainer ( #7522 )
2020-10-01 17:27:45 -04:00
Sam Shleifer
d3a9601a11
[s2s] trainer scripts: Remove --run_name, thanks sylvain! ( #7521 )
2020-10-01 17:18:47 -04:00
Sylvain Gugger
bdcc4b78a2
Fix seq2seq example test ( #7518 )
* Fix seq2seq example test
* Fix bad copy-paste
* Also save the state
2020-10-01 14:13:29 -04:00
Sam Shleifer
2a358f45ef
[s2s] fix nltk pytest race condition with FileLock ( #7515 )
2020-10-01 12:51:09 -04:00
Suraj Patil
72d363d979
[examples/s2s] clean up finetune_trainer ( #7509 )
2020-10-01 12:19:29 -04:00
Sam Shleifer
48f23f92a8
[s2sTrainer] test + code cleanup ( #7467 )
2020-10-01 00:33:01 -04:00
Sam Shleifer
03e46c1de3
[s2s] fix kwargs style ( #7488 )
2020-09-30 17:00:06 -04:00
Sam Shleifer
6fe8a693eb
[s2s] Fix t5 warning for distributed eval ( #7487 )
2020-09-30 16:58:03 -04:00
Amanpreet Singh
c031d01023
Seq2SeqDataset: avoid passing src_lang everywhere ( #7470 )
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-30 13:27:48 -04:00
Suraj Patil
08939cfdf7
[s2strainer] fix eval dataset loading ( #7477 )
2020-09-30 12:39:13 -04:00
Sam Shleifer
74d8d69bd4
[s2s] consistent output format across eval scripts ( #7435 )
2020-09-28 23:20:03 -04:00
Sam Shleifer
748425d47d
[T5] allow config.decoder_layers to control decoder size ( #7409 )
* Working asymmetrical T5
* rename decoder_layers -> num_decoder_layers
* Fix docstring
* Allow creation of asymmetric t5 students
2020-09-28 03:08:04 -04:00
Sam Shleifer
7296fea1d6
[s2s] rougeLSum expects \n between sentences ( #7410 )
Co-authored-by: Swetha Mandava <smandava@nvidia.com>
2020-09-27 16:27:19 -04:00
Suraj Patil
eab5f59682
[s2s] add create student script ( #7290 )
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-27 15:10:46 -04:00
Suraj Patil
415071b4c2
doc changes ( #7385 )
2020-09-25 08:00:36 -04:00
Suraj Patil
9e68d075a4
Seq2SeqTrainer ( #6769 )
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-24 18:46:58 -04:00
Sam Shleifer
d9d0f1140b
[s2s] distributed eval allows num_return_sequences > 1 ( #7254 )
2020-09-24 17:30:09 -04:00
Stas Bekman
eadd870b2f
[seq2seq] make it easier to run the scripts ( #7274 )
2020-09-24 15:23:48 -04:00
Sam Shleifer
78387cc63e
[s2s] only save metrics.json from rank zero ( #7331 )
2020-09-22 18:27:28 -04:00
Sam Shleifer
e53138a1b9
[s2s] add src_lang kwarg for distributed eval ( #7300 )
2020-09-22 18:26:37 -04:00
Sam Shleifer
25b0463d0b
[s2s] add supported architectures to MD ( #7252 )
2020-09-22 13:09:35 -04:00
Sam Shleifer
656c27c3a3
[s2s] save hostname with repo info ( #7301 )
* save hostname
2020-09-21 17:26:24 -04:00
Stas Bekman
af4b98ed97
[s2s] adjust finetune + test to work with fsmt ( #7263 )
2020-09-21 15:13:19 -04:00
Stas Bekman
8d562a2d1a
[s2s] s/alpha_loss_encoder/alpha_encoder_loss/ ( #7298 )
fix to match `distillation.py: self.alpha_encoder_loss`
2020-09-21 14:14:26 -04:00
Stas Bekman
cbb2f75a16
[s2s tests] fix test_run_eval_search ( #7297 )
2020-09-21 14:00:40 -04:00
Stas Bekman
7cbf0f722d
examples/seq2seq/__init__.py mutates sys.path ( #7194 )
2020-09-20 16:54:42 -04:00
Sam Shleifer
83dba10b8f
[s2s] distributed_eval.py saves better speed info ( #7242 )
2020-09-18 15:46:01 -04:00
Sam Shleifer
67d9fc50d9
[s2s] remove double assert ( #7223 )
2020-09-17 18:32:31 -04:00
Sam Shleifer
a5638b2b3a
[s2s] dynamic batch size with --max_tokens_per_batch ( #7030 )
2020-09-17 15:19:34 -04:00