Patrick von Platen | deecdd4939 | 2021-07-09 13:51:28 +01:00
[Flax] Fix cur step flax examples (#12608)
* fix_torch_device_generate_test
* remove @
* fix save problem

Sylvain Gugger | 6f1adc4334 | 2021-07-08 07:23:41 -04:00
Fix group_lengths for short datasets (#12558)

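The fix above concerns the chunking helper the language-modeling examples use to pack tokenized text into block_size pieces (commonly named group_texts in those scripts). Below is a hedged sketch of the guard such a fix needs, with illustrative names and a toy batch, not the exact diff:

```python
# Hedged sketch of the chunking helper (usually named group_texts in the example
# scripts); names and block_size are illustrative. The point of the fix: for a corpus
# shorter than block_size, flooring total_length to a multiple of block_size would
# yield 0 and silently drop all data, so the floor is only applied when at least one
# full block exists.
def group_texts(examples, block_size=8):
    # Concatenate every field (input_ids, attention_mask, ...) into one long list.
    concatenated = {k: sum(examples[k], []) for k in examples}
    total_length = len(concatenated[next(iter(examples))])
    # Only drop the trailing remainder when there is at least one full block.
    if total_length >= block_size:
        total_length = (total_length // block_size) * block_size
    return {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }

# Toy batch shorter than block_size: without the guard this would return no chunks.
print(group_texts({"input_ids": [[1, 2, 3], [4, 5]]}))  # {'input_ids': [[1, 2, 3, 4, 5]]}
```
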
Suraj Patil | 2d42915abe | 2021-07-07 11:50:30 +05:30
[examples/flax] add adafactor optimizer (#12544)
* add adafactor
* Update examples/flax/language-modeling/run_mlm_flax.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

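The commit above adds an Adafactor option to the Flax language-modeling example. A minimal sketch of how that typically looks with optax, where the use_adafactor flag and learning rate are illustrative stand-ins for the script's command-line arguments:

```python
# Minimal sketch of wiring Adafactor into a Flax training setup via optax; the
# use_adafactor flag and learning rate are illustrative, not the script's exact code.
import optax

learning_rate = 1e-3
use_adafactor = True

if use_adafactor:
    # Adafactor stores factored second-moment statistics, so it needs much less
    # optimizer memory than AdamW for large embedding / projection matrices.
    optimizer = optax.adafactor(learning_rate=learning_rate)
else:
    optimizer = optax.adamw(learning_rate=learning_rate, weight_decay=0.01)

# The chosen optimizer is then handed to the train state as usual, e.g.:
# state = train_state.TrainState.create(apply_fn=model.__call__, params=params, tx=optimizer)
```
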
Patrick von Platen | 208df208bf | 2021-07-06 19:41:51 +01:00
[Flax] Adapt examples to be able to use eval_steps and save_steps (#12543)
* fix_torch_device_generate_test
* remove @
* up
* up
* correct
* upload
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

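This commit, like the cur-step fix at the top of the list, concerns the step bookkeeping in the Flax training loops: a global cur_step counter checked against logging_steps, eval_steps and save_steps. Here is a self-contained sketch of that control flow with placeholder numbers, not the exact code from the scripts:

```python
# Self-contained sketch of the step bookkeeping: a global cur_step counter checked
# against logging_steps / eval_steps / save_steps. All numbers are placeholders and
# the real scripts run a pmapped train step where the comment line sits.
num_train_samples = 10_000
train_batch_size = 64
num_epochs = 2
logging_steps, eval_steps, save_steps = 50, 200, 500

steps_per_epoch = num_train_samples // train_batch_size

for epoch in range(num_epochs):
    for step in range(steps_per_epoch):
        # ... one optimization step would run here ...
        cur_step = epoch * steps_per_epoch + step + 1  # global step across epochs

        if cur_step % logging_steps == 0:
            print(f"[{cur_step}] write train metrics")
        if cur_step % eval_steps == 0:
            print(f"[{cur_step}] run the evaluation loop")
        if cur_step % save_steps == 0:
            print(f"[{cur_step}] save a checkpoint (optionally push to hub)")
```
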
Patrick von Platen | d0f7508abe | 2021-07-05 18:21:00 +01:00
[Flax] Correct logging steps flax (#12515)
* fix_torch_device_generate_test
* remove @
* push

Patrick von Platen | bb4ac2b5a8 | 2021-07-05 18:14:50 +01:00
[Flax] Correct flax training scripts (#12514)
* fix_torch_device_generate_test
* remove @
* add logging steps
* correct training scripts
* correct readme
* correct

Patrick von Platen | 813328682e | 2021-06-29 12:01:08 +01:00
[Flax] Example scripts - correct weight decay (#12409)
* fix_torch_device_generate_test
* remove @
* finish
* finish
* correct style

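The weight-decay correction concerns how decay is applied through optax in the example scripts. A hedged sketch of the usual pattern there: decay is masked so it hits weight matrices but not biases or LayerNorm parameters (the toy parameter tree is illustrative):

```python
# Hedged sketch of masked weight decay with optax: decay applies to weight matrices
# but not to biases or LayerNorm parameters. The toy parameter tree is illustrative.
import optax
from flax import traverse_util

def decay_mask_fn(params):
    flat_params = traverse_util.flatten_dict(params)
    # True -> this leaf gets weight decay; biases and LayerNorm scales do not.
    flat_mask = {
        path: (path[-1] != "bias" and path[-2:] != ("LayerNorm", "scale"))
        for path in flat_params
    }
    return traverse_util.unflatten_dict(flat_mask)

optimizer = optax.adamw(learning_rate=3e-4, weight_decay=0.01, mask=decay_mask_fn)

# Toy tree showing which leaves would be decayed:
params = {
    "dense": {"kernel": [[1.0]], "bias": [0.0]},
    "LayerNorm": {"scale": [1.0], "bias": [0.0]},
}
print(decay_mask_fn(params))
# {'dense': {'kernel': True, 'bias': False}, 'LayerNorm': {'scale': False, 'bias': False}}
```
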
Patrick von Platen | 2d70c91206 | 2021-06-28 19:23:35 +01:00
[Flax] Adapt flax examples to include push_to_hub (#12391)
* fix_torch_device_generate_test
* remove @
* finish
* correct summary writer
* correct push to hub
* fix indent
* finish
* finish
* finish
* finish
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

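push_to_hub support means each checkpoint save can also upload the weights to the Hugging Face Hub. A rough sketch of that save/upload call, with a randomly initialized toy model standing in for the trained one and placeholder step/output values (details of the real scripts may differ):

```python
# Rough sketch of the save/upload step; a tiny randomly initialized model stands in
# for the trained one, and the step and output directory are placeholders. Uploading
# requires a prior `huggingface-cli login`, so push_to_hub stays False here.
import jax
from transformers import BertConfig, FlaxBertForMaskedLM

config = BertConfig(hidden_size=128, num_hidden_layers=2, num_attention_heads=2, intermediate_size=256)
model = FlaxBertForMaskedLM(config)

params = model.params   # stand-in for the trained (possibly device-replicated) params
cur_step = 500
push_to_hub = False

model.save_pretrained(
    "./my-flax-run",
    params=jax.device_get(params),  # make sure params live on the host before writing
    push_to_hub=push_to_hub,
    commit_message=f"Saving weights and logs of step {cur_step}",
)
```
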
Stas Bekman | 4a872caef4 | 2021-06-25 13:20:14 -07:00
remove extra white space from log format (#12360)

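The whitespace fix targets the shared logging format string used across the example scripts; a minimal sketch of that logging setup, where the exact format string is an assumption:

```python
# Minimal sketch of the shared logging setup in the example scripts; the exact
# format string is an assumption (the commit simply removed a stray space in it).
import logging

logging.basicConfig(
    format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
    datefmt="%m/%d/%Y %H:%M:%S",
    level=logging.INFO,
)
logging.getLogger(__name__).info("logger configured")
```
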
Suraj Patil | 15b498f3b8 | 2021-06-11 15:16:20 +05:30
Flax CLM script (#12023)
* first draft
* max_seq_length => block_size
* fix arg names
* fix typos
* fix loss calculation
* add max examples, fix train eval steps, metrics
* optimizer mask
* fix perpelexity, metric logging
* fix logging
* data_collator = > data_loader
* refactor loss_fn
* support single GPU
* pass distributed to write_metric
* fix jitting
* fix single device training
* fix single device metrics
* close inner progress bars once finished
* add overwrite_cache arg
* ifx dataset caching issue
* add more logs
* few small fixes,
* address nicholas suggestions
* fix docstr
* address patricks suggestions
* make flake happy
* pass new new_dropout_rng to apply_gradients
* reset train metrics after every epoc
* remove distributed logis, small fixes

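The new causal-language-modeling script centers on a shifted-label cross-entropy loss: the prediction at position i is scored against the token at position i + 1. A hedged, self-contained sketch of that loss_fn with toy shapes (not copied from run_clm_flax.py):

```python
# Hedged sketch of a causal-LM loss: drop the prediction after the last token and
# the first label (nothing predicts it), then take softmax cross-entropy against
# one-hot targets. The toy shapes below are illustrative.
import jax
import jax.numpy as jnp
import optax

def loss_fn(logits, labels):
    shift_logits = logits[..., :-1, :]   # predictions for positions 0 .. n-2
    shift_labels = labels[..., 1:]       # targets for positions 1 .. n-1
    one_hot = jax.nn.one_hot(shift_labels, shift_logits.shape[-1])
    return optax.softmax_cross_entropy(shift_logits, one_hot).mean()

# Toy batch: batch_size=2, seq_len=4, vocab_size=5
logits = jax.random.normal(jax.random.PRNGKey(0), (2, 4, 5))
labels = jnp.array([[1, 2, 3, 4], [0, 1, 2, 3]])
print(loss_fn(logits, labels))
```
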