transformers/examples/research_projects/codeparrot/scripts
Jia LI 4868a830db
Jia multi gpu eval (#16428)
* add simple multi GPU completion

* add human_eval_multi_gpu

* use a copy strategy to distribute prompts across GPUs and avoid padding (see the first sketch after this log)

* add docstring

* update code style

* use task id to arrange output

* truncate inputs to avoid zero padding

* Stop the copy mechanism

* update style

* restore copies to scale better in distributed mode

* update style

* replace human eval

* Apply suggestions from code review

1. Tokenize all inputs at the same time
2. Use attention_mask to get the input length
3. Other small fixes

* correct typo and update docstring

* update code style

* remove num sample division constraint

* remove max len calculation

* use accelerator.gather once to speed up (see the second sketch after this log)

* use accelerate set_seed; update accelerate version

* correct gather bug

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-04-11 11:24:32 +02:00
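
The copy strategy, one-shot tokenization, and attention_mask bullets above combine into a pattern roughly like the sketch below. This is a minimal illustration, not the actual human_eval.py code: the class name TokenizedPromptDataset, the n_copies default, and the use of the codeparrot/codeparrot-small checkpoint are assumptions of this sketch.

```python
# Minimal sketch of the "copy strategy" plus one-shot tokenization; names such
# as TokenizedPromptDataset and n_copies are illustrative, not taken from
# human_eval.py.
import torch
from torch.utils.data import IterableDataset
from transformers import AutoTokenizer

class TokenizedPromptDataset(IterableDataset):
    """Tokenize every prompt in a single call, then yield each prompt
    n_copies times so whole prompts, not padded fragments, are spread
    across GPUs."""

    def __init__(self, tokenizer, prompts, max_length=1024, n_copies=2):
        # One tokenizer call over the full prompt list; truncation keeps
        # over-long prompts from forcing extra padding on everyone else.
        self.encodings = tokenizer(
            prompts,
            truncation=True,
            max_length=max_length,
            padding=True,
            return_tensors="pt",
        )
        self.n_tasks = len(prompts)
        self.n_copies = n_copies

    def __iter__(self):
        for task_id in range(self.n_tasks):
            # attention_mask sums to the true, unpadded prompt length, so the
            # generation loop can slice padding off before calling generate.
            input_len = int(self.encodings["attention_mask"][task_id].sum())
            for _ in range(self.n_copies):
                yield {
                    "task_id": torch.tensor(task_id),
                    "input_ids": self.encodings["input_ids"][task_id],
                    "input_len": torch.tensor(input_len),
                }

tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot-small")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 style tokenizers ship no pad token
dataset = TokenizedPromptDataset(tokenizer, ["def add(a, b):", "def fib(n):"])
```

Carrying task_id through each copy is what lets the generations be regrouped per problem after gathering, as the "use task id to arrange output" bullet describes.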
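Likewise, the seeding and single-gather bullets might look like the following. Again only a sketch: generate_all, the fixed-width padding, and the seed value are assumptions, and it presumes every process ends up with equally shaped local tensors so a plain accelerator.gather works.

```python
# Sketch of seeding with Accelerate and gathering generations in a single
# collective at the end; not necessarily what human_eval.py does.
import torch
import torch.nn.functional as F
from accelerate import Accelerator
from accelerate.utils import set_seed

def generate_all(model, tokenizer, dataloader, max_length=512):
    accelerator = Accelerator()
    set_seed(0)  # seeds random, numpy and torch on every process

    model, dataloader = accelerator.prepare(model, dataloader)
    local_tokens, local_task_ids = [], []
    for batch in dataloader:
        with torch.no_grad():
            out = accelerator.unwrap_model(model).generate(
                input_ids=batch["input_ids"],
                do_sample=True,
                max_length=max_length,
                pad_token_id=tokenizer.pad_token_id,
            )
        # Pad every batch to the same width so the local tensors can be
        # concatenated (and gathered) in one shot.
        out = F.pad(out, (0, max_length - out.shape[1]), value=tokenizer.pad_token_id)
        local_tokens.append(out)
        local_task_ids.append(batch["task_id"])

    # One collective per tensor at the very end instead of one per batch.
    tokens = accelerator.gather(torch.cat(local_tokens))
    task_ids = accelerator.gather(torch.cat(local_task_ids))
    return tokens, task_ids
```

Gathering once means the processes synchronize at the end of the loop instead of after every batch, which is where the speed-up noted in the log comes from.
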
File                      Last commit                                                  Date
arguments.py              Code parrot minor fixes/niceties (#14666)                    2021-12-13 09:30:50 +01:00
bpe_training.py           fix: switch from slow to generic tokenizer class (#15122)   2022-01-12 09:12:43 -05:00
codeparrot_training.py    Code parrot minor fixes/niceties (#14666)                    2021-12-13 09:30:50 +01:00
human_eval.py             Jia multi gpu eval (#16428)                                  2022-04-11 11:24:32 +02:00
initialize_model.py       Add CodeParrot 🦜 codebase (#14536)                          2021-12-02 10:41:35 +01:00
preprocessing.py          Add CodeParrot 🦜 codebase (#14536)                          2021-12-02 10:41:35 +01:00
validation_loss.py        Add CodeParrot 🦜 codebase (#14536)                          2021-12-02 10:41:35 +01:00