Nicholas Broad
69e16abf98
Switch from using sum for flattening lists of lists in group_texts ( #14472 )
...
* remove sum for list flattening
* change to chain(*)
* make chain object a list
* delete empty lines
per sgugger's suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
2021-11-22 16:17:26 -05:00
Suraj Patil
85a4bda4f4
bump flax version ( #14343 )
2021-11-09 22:15:22 +05:30
Suraj Patil
7db2a79b38
[examples/flax] use Repository API for push_to_hub ( #13672 )
...
* use Repository for push_to_hub
* update readme
* update other flax scripts
* update readme
* update qa example
* fix push_to_hub call
* fix typo
* fix more typos
* update readme
* use absolute path to get repo name
* fix glue script
2021-09-30 16:38:07 +05:30
Stefan Schweter
09549aa18c
examples: minor fixes in flax example readme ( #13502 )
2021-09-10 11:45:57 +05:30
Stefan Schweter
4046e66e40
examples: only use keep_linebreaks when reading TXT files ( #13320 )
...
* examples: only use keep_linebreaks when reading TXT files for all CLM examples
2021-08-28 16:22:29 +02:00
Stefan Schweter
319d840b46
examples: add keep_linebreaks option to CLM examples ( #13150 )
...
* examples: add keep_linebreaks option to text dataset loader for all CLM examples
* examples: introduce new keep_linebreaks option as data argument in CLM examples
2021-08-27 11:35:45 +02:00
Patrick von Platen
13a9c9a354
[Flax] Refactor gpt2 & bert example docs ( #13024 )
...
* fix_torch_device_generate_test
* remove @
* improve docs for clm
* speed-ups
* correct t5 example as well
* push final touches
* Update examples/flax/language-modeling/README.md
* correct docs for mlm
* Update examples/flax/language-modeling/README.md
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-08-09 13:37:50 +02:00
abhishek thakur
3ff2cde5ca
tfhub.de -> tfhub.dev ( #12565 )
2021-08-09 08:11:17 +02:00
Patrick von Platen
2e4082364e
[Flax T5] Speed up t5 training ( #13012 )
...
* fix_torch_device_generate_test
* remove @
* update
* up
* fix
* remove f-strings
* correct readme
* up
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-08-06 11:21:37 +02:00
Stefan Schweter
3d4b3bc3fd
examples: use correct way to get vocab size in flax lm readme ( #12947 )
2021-07-30 21:57:53 +05:30
Stefan Schweter
d3c3e722d6
[FLAX] Minor fixes in CLM example ( #12914 )
...
* readme: fix retrieval of vocab size for flax clm example
* examples: fix flax clm example when using training/evaluation files
2021-07-27 19:48:04 +05:30
Patrick von Platen
13fefdf340
Update README.md
...
cc @patil-suraj
2021-07-20 13:51:15 +02:00
fgaim
66197adc98
Flax MLM: Allow validation split when loading dataset from local file ( #12689 )
...
* Allow validation split when loading dataset from local file
* Flax clm & t5, enable validation split for datasets loaded from local file
2021-07-20 13:38:25 +02:00
Patrick von Platen
f4399ec570
Update README.md
2021-07-14 12:54:31 +01:00
Nick Doiron
5803a2a7ac
Add ByT5 option to example run_t5_mlm_flax.py ( #12634 )
...
* Allow ByT5 type in Flax T5 script
* use T5TokenizerFast
* change up tokenizer config
* model_args
* reorder imports
* Update run_t5_mlm_flax.py
2021-07-13 13:39:57 +01:00
Patrick von Platen
deecdd4939
[Flax] Fix cur step flax examples ( #12608 )
...
* fix_torch_device_generate_test
* remove @
* fix save problem
2021-07-09 13:51:28 +01:00
Sylvain Gugger
6f1adc4334
Fix group_lengths for short datasets ( #12558 )
2021-07-08 07:23:41 -04:00
Ibraheem Moosa
122d7dc34f
Remove logging of GPU count etc. ( #12569 )
...
Logging this successfully requires PyTorch. For the purposes of this script we are not using PyTorch.
2021-07-07 23:05:47 +01:00
Patrick von Platen
7d321b7689
[Flax] Allow retraining from save checkpoint ( #12559 )
...
* fix_torch_device_generate_test
* remove @
* finish
2021-07-07 19:13:43 +05:30
Suraj Patil
2d42915abe
[examples/flax] add adafactor optimizer ( #12544 )
...
* add adafactor
* Update examples/flax/language-modeling/run_mlm_flax.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-07-07 11:50:30 +05:30
Patrick von Platen
208df208bf
[Flax] Adapt examples to be able to use eval_steps and save_steps ( #12543 )
...
* fix_torch_device_generate_test
* remove @
* up
* up
* correct
* upload
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-07-06 19:41:51 +01:00
Patrick von Platen
4605b2b8ec
[Flax] Fix another bug in logging steps ( #12516 )
...
* fix_torch_device_generate_test
* remove @
* up
2021-07-05 18:35:22 +01:00
Patrick von Platen
d0f7508abe
[Flax] Correct logging steps flax ( #12515 )
...
* fix_torch_device_generate_test
* remove @
* push
2021-07-05 18:21:00 +01:00
Patrick von Platen
bb4ac2b5a8
[Flax] Correct flax training scripts ( #12514 )
...
* fix_torch_device_generate_test
* remove @
* add logging steps
* correct training scripts
* correct readme
* correct
2021-07-05 18:14:50 +01:00
Patrick von Platen
813328682e
[Flax] Example scripts - correct weight decay ( #12409 )
...
* fix_torch_device_generate_test
* remove @
* finish
* finish
* correct style
2021-06-29 12:01:08 +01:00
Patrick von Platen
31c3e7e75b
[Flax] Add T5 pretraining script ( #12355 )
...
* fix_torch_device_generate_test
* remove @
* add length computation
* finish masking
* finish
* upload
* fix some bugs
* finish
* fix dependency table
* correct tensorboard
* Apply suggestions from code review
* correct processing
* slight change init
* correct some more mistakes
* apply suggestions
* improve readme
* fix indent
* Apply suggestions from code review
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
* correct tokenizer
* finish
* finish
* finish
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2021-06-28 20:11:29 +01:00
Patrick von Platen
2d70c91206
[Flax] Adapt flax examples to include push_to_hub ( #12391 )
...
* fix_torch_device_generate_test
* remove @
* finish
* correct summary writer
* correct push to hub
* fix indent
* finish
* finish
* finish
* finish
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-28 19:23:35 +01:00
Stas Bekman
4a872caef4
remove extra white space from log format ( #12360 )
2021-06-25 13:20:14 -07:00
Avital Oliver
9b393240a2
Use a released version of optax rather than installing from Git. ( #12173 )
...
Use a released version of optax rather than installing from Git
2021-06-15 16:42:51 +05:30
Patrick von Platen
7566fefa69
[Flax] Add links to google colabs ( #12146 )
...
* fix_torch_device_generate_test
* remove @
* add colab links
2021-06-14 11:00:29 +01:00
Suraj Patil
d36fce8237
add readme for flax clm ( #12111 )
...
* add readme for flax clm
* use section link for tokenizer
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update metrics
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-14 15:03:55 +05:30
Patrick von Platen
16c0efca2c
Add mlm pretraining xla torch readme ( #12011 )
...
* fix_torch_device_generate_test
* remove @
* upload
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/flax/language-modeling/README.md
* add more info
* finish
* fix
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-14 10:31:21 +01:00
Suraj Patil
15b498f3b8
Flax CLM script ( #12023 )
...
* first draft
* max_seq_length => block_size
* fix arg names
* fix typos
* fix loss calculation
* add max examples, fix train eval steps, metrics
* optimizer mask
* fix perplexity, metric logging
* fix logging
* data_collator => data_loader
* refactor loss_fn
* support single GPU
* pass distributed to write_metric
* fix jitting
* fix single device training
* fix single device metrics
* close inner progress bars once finished
* add overwrite_cache arg
* fix dataset caching issue
* add more logs
* few small fixes
* address Nicholas's suggestions
* fix docstr
* address Patrick's suggestions
* make flake happy
* pass new new_dropout_rng to apply_gradients
* reset train metrics after every epoch
* remove distributed logic, small fixes
2021-06-11 15:16:20 +05:30
Suraj Patil
d1500d9151
pass decay_mask fn to optimizer ( #12087 )
2021-06-09 18:49:27 +01:00
Patrick von Platen
242ec31aa5
[Flax] Refactor MLM ( #12013 )
...
* fix_torch_device_generate_test
* remove @
* finish refactor
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 16:31:32 +01:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error ( #11839 )
...
* fix_torch_device_generate_test
* remove @
* change pytorch import to flax import
2021-05-24 10:41:10 +01:00
Patrick von Platen
00440e350f
[Flax MLM] Refactor run mlm with optax ( #11745 )
...
* refactor
* update
* update
* update
* refactor run mlm
* finalize
* refactor more
* fix typo
* update
* finish refactor
* modify run mlm
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* small fixes
* upload
* upload
* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-19 12:00:58 +01:00
Patrick von Platen
084a187da3
[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py ( #11470 )
...
* add flax roberta
* make style
* correct initialization
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-04 19:57:59 +02:00
Patrick von Platen
b48cf7124c
correct typo ( #11393 )
2021-04-23 11:34:59 +02:00
Sylvain Gugger
dabeb15292
Examples reorg ( #11350 )
...
* Base move
* Examples reorganization
* Update references
* Put back test data
* Move conftest
* More fixes
* Move test data to test fixtures
* Update path
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00