Patrick von Platen
16c0efca2c
Add mlm pretraining xla torch readme ( #12011 )
* fix_torch_device_generate_test
* remove @
* upload
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/flax/language-modeling/README.md
* add more info
* finish
* fix
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-14 10:31:21 +01:00
Suraj Patil
15b498f3b8
Flax CLM script ( #12023 )
* first draft
* max_seq_length => block_size
* fix arg names
* fix typos
* fix loss calculation
* add max examples, fix train eval steps, metrics
* optimizer mask
* fix perplexity, metric logging
* fix logging
* data_collator => data_loader
* refactor loss_fn
* support single GPU
* pass distributed to write_metric
* fix jitting
* fix single device training
* fix single device metrics
* close inner progress bars once finished
* add overwrite_cache arg
* fix dataset caching issue
* add more logs
* few small fixes
* address nicholas suggestions
* fix docstr
* address patricks suggestions
* make flake happy
* pass new new_dropout_rng to apply_gradients
* reset train metrics after every epoch
* remove distributed logic, small fixes
2021-06-11 15:16:20 +05:30
Suraj Patil
d1500d9151
pass decay_mask fn to optimizer ( #12087 )
2021-06-09 18:49:27 +01:00
Patrick von Platen
242ec31aa5
[Flax] Refactor MLM ( #12013 )
* fix_torch_device_generate_test
* remove @
* finish refactor
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 16:31:32 +01:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error ( #11839 )
* fix_torch_device_generate_test
* remove @
* change pytorch import to flax import
2021-05-24 10:41:10 +01:00
Patrick von Platen
00440e350f
[Flax MLM] Refactor run mlm with optax ( #11745 )
* refactor
* update
* update
* update
* refactor run mlm
* finalize
* refactor more
* fix typo
* update
* finish refactor
* modify run mlm
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* small fixes
* upload
* upload
* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-19 12:00:58 +01:00
Patrick von Platen
084a187da3
[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py ( #11470 )
* add flax roberta
* make style
* correct initialization
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-04 19:57:59 +02:00
Patrick von Platen
b48cf7124c
correct typo ( #11393 )
2021-04-23 11:34:59 +02:00
Sylvain Gugger
dabeb15292
Examples reorg ( #11350 )
* Base move
* Examples reorganization
* Update references
* Put back test data
* Move conftest
* More fixes
* Move test data to test fixtures
* Update path
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00