Philip May
cfca638acb
Add MT5ForConditionalGeneration as supported arch. to summarization README ( #11961 )
...
* Add MT5ForConditionalGeneration as supported arch.
* Update README.md
2021-05-31 21:24:33 +05:30
Nicholas Vadivelu
1ab147d648
Remove redundant `nn.log_softmax` in `run_flax_glue.py` ( #11920 )
...
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`
`optax.softmax_cross_entropy` expects unnormalized logits and already calls `nn.log_softmax` itself, so I believe the extra call is not needed here. `nn.log_softmax` is idempotent, so mathematically it shouldn't have made a difference.
* Remove unused 'flax.linen' import
2021-05-31 15:29:04 +01:00
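The reasoning in 1ab147d648 ( #11920 ) can be checked with a small pure-Python stand-in. This is only a sketch: `log_softmax` and `softmax_cross_entropy` below are illustrative re-implementations written for this note, not the `flax` or `optax` functions themselves, and the logits and labels are made-up values.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def softmax_cross_entropy(logits, one_hot):
    """Cross-entropy that accepts unnormalized logits and normalizes internally."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))


logits = [2.0, -1.0, 0.5]   # made-up unnormalized scores
labels = [0.0, 1.0, 0.0]    # one-hot target

once = log_softmax(logits)
twice = log_softmax(once)   # applying it again changes nothing (idempotent)

loss_raw = softmax_cross_entropy(logits, labels)  # logits passed directly
loss_pre = softmax_cross_entropy(once, labels)    # redundant pre-normalization
```

Under these assumptions `once` and `twice` agree up to floating-point error, and so do `loss_raw` and `loss_pre`, which is why dropping the extra call should not change the loss.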
Avital Oliver
2df546918e
Link official Cloud TPU JAX docs ( #11892 )
2021-05-26 15:44:40 -04:00
Stas Bekman
1b6530104d
[Examples] create model with custom config on the fly ( #11798 )
...
* create custom model on the fly
* better wording
* add update_from_string
* cleanup
* cleanup
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more bool options
* style
* fix logger
* add test
* add the doc
* assert on conflict of options
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-25 10:40:49 -07:00
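The idea behind 1b6530104d ( #11798 ), overriding config values from a string, can be sketched roughly as follows. This is a hypothetical stand-alone helper, not the code added to `configuration_utils.py`: the function name `apply_overrides`, the comma-separated `key=value` format, the accepted boolean spellings, and the example keys are all assumptions made for illustration.

```python
# Assumed boolean spellings; the real accepted set may differ.
TRUE_STRINGS = {"true", "1", "y", "yes"}
FALSE_STRINGS = {"false", "0", "n", "no"}


def apply_overrides(config, update_str):
    """Return a copy of `config` with values overridden from 'k1=v1,k2=v2'.

    Each new value is cast to the type of the value it replaces.
    """
    updated = dict(config)
    for pair in update_str.split(","):
        key, raw = pair.split("=", 1)
        if key not in updated:
            raise ValueError(f"key {key!r} is not in the original config")
        old = updated[key]
        # bool is checked before int because bool is a subclass of int.
        if isinstance(old, bool):
            lowered = raw.lower()
            if lowered in TRUE_STRINGS:
                new = True
            elif lowered in FALSE_STRINGS:
                new = False
            else:
                raise ValueError(f"cannot read {raw!r} as a bool for {key!r}")
        elif isinstance(old, int):
            new = int(raw)
        elif isinstance(old, float):
            new = float(raw)
        elif isinstance(old, str):
            new = raw
        else:
            raise ValueError(f"unsupported type for key {key!r}")
        updated[key] = new
    return updated


# Hypothetical base config and override string.
base = {"n_embd": 768, "resid_pdrop": 0.1, "scale_attn_weights": True, "activation": "gelu"}
custom = apply_overrides(base, "n_embd=10,resid_pdrop=0.2,scale_attn_weights=false")
```

Casting to the existing value's type keeps the override string short and rejects keys that the base config does not define.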
Stas Bekman
6287c929c1
[lm examples] fix overflow in perplexity calc ( #11855 )
...
* fix overflow in perplexity calc
* use inf
* fix
2021-05-25 08:11:26 -07:00
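The overflow addressed in 6287c929c1 ( #11855 ) arises because perplexity is the exponential of the evaluation loss, and `math.exp` raises `OverflowError` for large inputs. A minimal sketch of the "use inf" approach, with `safe_perplexity` as a hypothetical helper name and made-up loss values:

```python
import math


def safe_perplexity(eval_loss):
    """Return exp(eval_loss), falling back to infinity when exp overflows."""
    try:
        return math.exp(eval_loss)
    except OverflowError:
        return float("inf")


small = safe_perplexity(2.0)     # ordinary loss: exp works normally
huge = safe_perplexity(1000.0)   # exp(1000) overflows a float, so inf is returned
```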
Sylvain Gugger
f086652b16
Add option to log only once in multinode training ( #11819 )
...
* Add option to log only once in multinode training
* Use an alternate property
2021-05-25 08:03:43 -04:00
Wang Ran (汪然)
b8344a274f
typo ( #11858 )
2021-05-25 04:23:46 -04:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error ( #11839 )
...
* fix_torch_device_generate_test
* remove @
* change pytorch import to flax import
2021-05-24 10:41:10 +01:00
Patrick von Platen
da22245ed9
Add flax text class colab ( #11824 )
...
* fix_torch_device_generate_test
* remove @
* add flax glue link
2021-05-21 23:11:58 +01:00
Patrick von Platen
82335185fe
[Flax] Small fixes in `run_flax_glue.py` ( #11820 )
...
* fix_torch_device_generate_test
* remove @
* correct best seed for flax fine-tuning
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 16:52:23 +01:00
Patrick von Platen
bd9871657b
[Flax] Align GLUE training script with mlm training script ( #11778 )
...
* speed up flax glue
* remove unnecessary line
* remove folder
* remove run in loop
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 09:36:56 +01:00
Keren Fuentes
223943872e
Fix failing test on Windows Platform ( #11589 )
...
* add separator for windows
* fixes test_is_copy_consistent on Windows
* fixing writing encoding issue on extended test (for Windows)
* resolving comments
2021-05-20 19:54:23 -04:00
Patrick von Platen
00440e350f
[Flax MLM] Refactor run mlm with optax ( #11745 )
...
* refactor
* update
* update
* update
* refactor run mlm
* finalize
* refactor more
* fix typo
* update
* finish refactor
* modify run mlm
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* small fixes
* upload
* upload
* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-19 12:00:58 +01:00
Tomy Hsieh
eb3e072a3b
Fix a small error in summarization example ( #11762 )
2021-05-18 14:38:36 -04:00
Avital Oliver
77f9bd18af
Add Flax Examples and Cloud TPU README ( #11753 )
...
* Add Flax Examples README
* Apply suggestions from code review
* Update examples/flax/README.md
* add nice table
* fix
* fix
* apply suggestions
* upload
* finish flax readme.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-18 17:45:16 +01:00
Philipp Schmid
04e25c6286
add `dataset_name` to data_args and added accuracy metric ( #11760 )
...
* add `dataset_name` to data_args and added accuracy metric
* added documentation for dataset_name
* spelling correction
2021-05-18 16:27:29 +02:00
Patrick von Platen
cebb96f53a
Add more subsections to main doc ( #11758 )
...
* add headers to main doc
* Apply suggestions from code review
* update
* upload
2021-05-18 14:38:56 +01:00
Tommy Chiang
da7e73b721
Fix incorrect newline in #11650 ( #11757 )
2021-05-18 15:28:13 +02:00
Sylvain Gugger
936b57158a
Use new evaluation loop in TrainerQA ( #11746 )
2021-05-17 10:10:13 -04:00
Marc van Zee
726e953d44
Improvements to Flax finetuning script ( #11727 )
...
* Add Cloud details to README
* Flax script and readme updates
* Some simplifications of Flax script
2021-05-17 09:26:33 +01:00
Marc van Zee
94a2348706
Add Cloud details to README ( #11706 )
...
* Add Cloud details to README
* Flax script and readme updates
2021-05-14 14:51:25 +01:00
Patrick von Platen
113eaa7575
correct example script ( #11726 )
2021-05-14 12:02:57 +01:00
Lysandre
d77eb0cf92
Docs for v4.7.0.dev0
2021-05-12 17:08:35 +02:00
Lysandre
64e78564a5
Release: v4.6.0
2021-05-12 17:03:03 +02:00
Philip May
77f4c46b50
remove defaults to None if optional ( #11703 )
2021-05-12 09:11:10 -04:00
Marc van Zee
6797cdc077
Updates README and fixes bug ( #11701 )
2021-05-12 13:52:52 +01:00
Marc van Zee
4ce6bcc310
Adds Flax BERT finetuning example on GLUE ( #11564 )
...
* Adds Flax BERT finetuning example
* fix traced jax tensor type
* Use Optax losses and learning schedulers
* Add 1GPU training results
* merge into master & make style
* fix input
* del file
* Fix bug in loss and add torch runs
* finish bert flax fine-tune
* Update examples/flax/text-classification/README.md
* Update examples/flax/text-classification/run_flax_glue.py
* add requirements
* finalize
* finalize
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-11 19:02:59 +01:00
Sylvain Gugger
a135f59536
Auto modelcard ( #11599 )
...
* Autogenerate model cards from the Trainer
* ModelCard deprecated
* Fix test
* Style
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
* Quality
* With all metadata
* Metadata
* Post-merge conflict mess
* Data args and all examples
* Default license and languages when possible
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-11 11:30:34 -04:00
Jonathan Chang
64232bc0df
Add --text_column to run_summarization_no_trainer ( #11673 )
2021-05-11 07:58:38 -04:00
Matt
ef8d32c5ea
Fix suggested by @bhadreshpsavani ( #11660 )
2021-05-10 14:28:04 +01:00
Quentin Lhoest
1a0b41781d
Update requirements.txt ( #11634 )
2021-05-10 11:19:52 +05:30
Tommy Chiang
7e406f4a65
[Examples] Fix invalid links after reorg ( #11650 )
2021-05-10 11:16:48 +05:30
Tommy Chiang
f2ffcaf49f
[Examples] Check key exists in datasets first ( #11503 )
2021-05-09 15:42:38 -04:00
Stas Bekman
ba0d50f214
[examples] fix sys.path in conftest.py ( #11636 )
...
* restore conftest.py
* fix conftest and make copies
* remove unneeded parts
* remove unwanted files
2021-05-07 14:44:22 -07:00
Jonathan Chang
6f40e31766
Fix comment in run_clm_no_trainer.py ( #11624 )
2021-05-07 12:32:30 +05:30
Vipul Raheja
f594090a93
fix typo in command ( #11605 )
2021-05-06 12:32:54 +05:30
Patrick von Platen
3e3e41ae20
Pytorch - Lazy initialization of models ( #11471 )
...
* lazy_init_weights
* remove ipdb
* save int
* add necessary code
* remove unnecessary utils
* Update src/transformers/models/t5/modeling_t5.py
* clean
* add tests
* correct
* finish tests
* finish tests
* fix some more tests
* fix xlnet & transfo-xl
* fix more tests
* make sure tests are independent
* fix tests more
* finish tests
* final touches
* Update src/transformers/modeling_utils.py
* Apply suggestions from code review
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* clean tests
* give arg positive name
* add more mock weights to xlnet
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-05-05 17:22:20 +02:00
Sylvain Gugger
6b241e0e3b
Reproducible checkpoint ( #11582 )
...
* Set generator in dataloader
* Use generator in all random samplers
* Checkpoint all RNG states
* Final version
* Quality
* Test
* Address review comments
* Quality
* Remove debug util
* Add python and numpy RNGs
* Split states in different files in distributed
* Quality
* local_rank for TPUs
* Only use generator when accepted
* Add test
* Set seed to avoid flakiness
* Make test less flaky
* Quality
2021-05-04 16:20:56 -04:00
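The "Checkpoint all RNG states" step in 6b241e0e3b ( #11582 ) can be illustrated with Python's built-in `random` module alone. This is a simplified sketch, not the Trainer's implementation: the helper names are hypothetical, the state is kept in memory rather than written to a checkpoint file, and the NumPy and framework RNGs mentioned in the commit are left out.

```python
import pickle
import random


def save_rng_state():
    """Serialize the current Python RNG state (stand-in for a checkpoint file)."""
    return pickle.dumps({"python": random.getstate()})


def load_rng_state(blob):
    """Restore the Python RNG state from a serialized checkpoint."""
    random.setstate(pickle.loads(blob)["python"])


random.seed(42)
_ = [random.random() for _ in range(3)]          # "training" before the checkpoint
checkpoint = save_rng_state()
expected = [random.random() for _ in range(5)]   # draws an uninterrupted run makes next

random.seed(0)                                   # simulate a fresh process with another state
load_rng_state(checkpoint)
resumed = [random.random() for _ in range(5)]    # draws after resuming from the checkpoint
```

Because the state is restored before any further draws, `resumed` matches `expected`, which is the property that makes resuming from a checkpoint reproducible.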
Patrick von Platen
084a187da3
[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py ( #11470 )
...
* add flax roberta
* make style
* correct initialization
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-04 19:57:59 +02:00
Sylvain Gugger
87dd1a00ef
Fix metric computation in `run_glue_no_trainer` ( #11569 )
2021-05-03 11:42:55 -04:00
Bhadresh Savani
84326a28f8
[Examples] Added support for test-file in QA examples with no trainer ( #11510 )
...
* added support for test-file
* fixed typo
* added suggested changes
* reformatted code
* modified files
* fix post processing error
* Trigger CI
* removed extra lines
2021-04-30 09:02:50 -04:00
Suraj Patil
57c8e822f7
resize token embeds ( #11524 )
2021-04-30 08:47:01 -04:00
Matt
20d6931e32
Update TF text classification example ( #11496 )
...
Big refactor, fixes and multi-GPU/TPU support
2021-04-30 13:45:33 +01:00
Manuel Romero
58c789e3d2
Update README.md ( #11489 )
...
Add link to code
2021-04-30 04:29:59 -04:00
Sylvain Gugger
b29eb247d3
Split checkpoint from model_name_or_path in examples ( #11492 )
...
* Split checkpoint from model_name_or_path in examples
* Address review comments
* Address review comments
2021-04-29 18:33:47 -04:00
Jaimeen Ahn
0661abc545
Variable Correction for Consistency in Distillation Example ( #11444 )
...
The error comes from an inconsistency in the variable holding the number of GPUs: the parser calls it 'gpus', while the train.py script uses 'n_gpu'. The correction makes the example work.
2021-04-26 13:30:48 -04:00
Bhadresh Savani
1d30ec95c7
[Examples] Fixes inconsistency around eval vs val and predict vs test ( #11380 )
...
* added changes for uniformity
* modified files
* corrected typo
* fixed qa scripts
* fix typos
* fixed predict typo in qa no trainer
* fixed test file
* reverted trainer changes
* reverted trainer changes in custom examples
* updated readme
* added changes in deepspeed test
* added changes for predict and eval
2021-04-26 09:24:31 -07:00
Amine Abdaoui
e3e70f9551
docs(examples): fix link to TPU launcher script ( #11427 )
2021-04-26 09:08:43 -04:00
Patrick von Platen
32dbb2d954
make style ( #11442 )
2021-04-26 13:50:34 +02:00
Sylvain Gugger
1ef152eb48
Default to accuracy metric ( #11405 )
2021-04-23 14:49:59 -04:00