Kamal Raj
2bd950ca47
[Flax] token-classification model steps enumerate start from 1 ( #14547 )
...
* step start from 1
* Updated cur_step calcualtion
2021-11-29 21:55:59 +05:30
Nicholas Broad
69e16abf98
Switch from using sum for flattening lists of lists in group_texts ( #14472 )
...
* remove sum for list flattening
* change to chain(*)
* make chain object a list
* delete empty lines
per sgugger's suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-22 16:17:26 -05:00
Suraj Patil
85a4bda4f4
bump flax version ( #14343 )
2021-11-09 22:15:22 +05:30
Lysandre
b8fad022a0
v4.13.0.dev0
2021-10-28 12:56:46 -04:00
Lysandre
62bf536631
Release v4.12.0
2021-10-28 12:09:49 -04:00
Dhananjay Shettigar
319beb64eb
#12789 Replace assert statements with exceptions ( #13909 )
...
* #12789 Replace assert statements with exceptions
* fix-copies: made copy changes to utils_qa.py in examples/pytorch/question-answering and examples/tensorflow/question-answering
* minor refactor for clarity
2021-10-07 09:09:01 -04:00
Yih-Dar
a6ea244f99
Fix: save checkpoint after each epoch and push checkpoint to the hub ( #13872 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-10-05 16:30:13 +05:30
Suraj Patil
7db2a79b38
[examples/flax] use Repository API for push_to_hub ( #13672 )
...
* use Repository for push_to_hub
* update readme
* update other flax scripts
* update readme
* update qa example
* fix push_to_hub call
* fix typo
* fix more typos
* update readme
* use abosolute path to get repo name
* fix glue script
2021-09-30 16:38:07 +05:30
Lysandre
11c69b8045
Docs for version v4.11.0
2021-09-27 14:19:38 -04:00
Lysandre
dc193c906d
Release: v4.11.0
2021-09-27 14:14:09 -04:00
Kamal Raj
78807d86eb
[FLAX] Question Answering Example ( #13649 )
...
* flax qa example
* Updated README: Added Large model
* added utils_qa.py FULL_COPIES
* Updates:
1. Copyright Year updated
2. added dtype arg
3. passing seed and dtype to load model
4. Check eval flag before running eval
* updated README
* updated code comment
2021-09-21 18:34:48 +05:30
Avital Oliver
51e5eca612
Add long overdue link to the Google TRC project ( #13501 )
...
* Add long-overdue link to the Google TRC project
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
2021-09-14 13:41:55 +05:30
Stefan Schweter
09549aa18c
examples: minor fixes in flax example readme ( #13502 )
2021-09-10 11:45:57 +05:30
Kamal Raj
1c191efc3a
flax ner example ( #13365 )
...
* flax ner example
* added task to README
* updated readme
* 1. ArgumentParser -> HfArgumentParser
2. step-wise logging,eval and save
* added requirements.txt
* added progress bar
* updated README
* added check_min_version
* updated training data permuattion with JAX
* added metric lib to requirements
* updated readme table
* fixed imports
2021-09-09 10:12:57 +05:30
Stefan Schweter
4046e66e40
examples: only use keep_linebreaks when reading TXT files ( #13320 )
...
* examples: only use keep_linebreaks when reading TXT files for all CLM examples
* examples: only use keep_linebreaks when reading TXT files for all CLM examples
* examples: only use keep_linebreaks when reading TXT files for all CLM examples
2021-08-28 16:22:29 +02:00
Stefan Schweter
319d840b46
examples: add keep_linebreaks option to CLM examples ( #13150 )
...
* examples: add keep_linebreaks option to text dataset loader for all CLM examples
* examples: introduce new keep_linebreaks option as data argument in CLM examples
2021-08-27 11:35:45 +02:00
Patrick von Platen
13a9c9a354
[Flax] Refactor gpt2 & bert example docs ( #13024 )
...
* fix_torch_device_generate_test
* remove @
* improve docs for clm
* speed-ups
* correct t5 example as well
* push final touches
* Update examples/flax/language-modeling/README.md
* correct docs for mlm
* Update examples/flax/language-modeling/README.md
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-08-09 13:37:50 +02:00
abhishek thakur
3ff2cde5ca
tfhub.de -> tfhub.dev ( #12565 )
2021-08-09 08:11:17 +02:00
Patrick von Platen
2e4082364e
[Flax T5] Speed up t5 training ( #13012 )
...
* fix_torch_device_generate_test
* remove @
* update
* up
* fix
* remove f-stings
* correct readme
* up
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-08-06 11:21:37 +02:00
Chungman Lee
75b8990d90
fix typo in example/text-classification README ( #12974 )
...
* fix typo in example/text-classification README
* add space to align the table
2021-08-02 12:58:43 +02:00
Stefan Schweter
3d4b3bc3fd
examples: use correct way to get vocab size in flax lm readme ( #12947 )
2021-07-30 21:57:53 +05:30
Stefan Schweter
d3c3e722d6
[FLAX] Minor fixes in CLM example ( #12914 )
...
* readme: fix retrieval of vocab size for flax clm example
* examples: fix flax clm example when using training/evaluation files
2021-07-27 19:48:04 +05:30
Patrick von Platen
13fefdf340
Update README.md
...
cc @patil-suraj
2021-07-20 13:51:15 +02:00
fgaim
66197adc98
Flax MLM: Allow validation split when loading dataset from local file ( #12689 )
...
* Allow validation split when loading dataset from local file
* Flax clm & t5, enable validation split for datasets loaded from local file
2021-07-20 13:38:25 +02:00
Patrick von Platen
f4399ec570
Update README.md
2021-07-14 12:54:31 +01:00
Nick Doiron
5803a2a7ac
Add ByT5 option to example run_t5_mlm_flax.py ( #12634 )
...
* Allow ByT5 type in Flax T5 script
* use T5TokenizerFast
* change up tokenizer config
* model_args
* reorder imports
* Update run_t5_mlm_flax.py
2021-07-13 13:39:57 +01:00
Bhadresh Savani
de23ecea36
added test file ( #12630 )
2021-07-12 12:15:14 +05:30
Patrick von Platen
deecdd4939
[Flax] Fix cur step flax examples ( #12608 )
...
* fix_torch_device_generate_test
* remove @
* fix save problem
2021-07-09 13:51:28 +01:00
Sylvain Gugger
6f1adc4334
Fix group_lengths for short datasets ( #12558 )
2021-07-08 07:23:41 -04:00
Ibraheem Moosa
122d7dc34f
Remove logging of GPU count etc logging. ( #12569 )
...
Successfully logging this requires Pytorch. For the purposes of this script we are not using Pytorch.
2021-07-07 23:05:47 +01:00
Patrick von Platen
7d321b7689
[Flax] Allow retraining from save checkpoint ( #12559 )
...
* fix_torch_device_generate_test
* remove @
* finish
2021-07-07 19:13:43 +05:30
Suraj Patil
2d42915abe
[examples/flax] add adafactor optimizer ( #12544 )
...
* add adafactor
* Update examples/flax/language-modeling/run_mlm_flax.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-07-07 11:50:30 +05:30
Patrick von Platen
208df208bf
[Flax] Adapt examples to be able to use eval_steps and save_steps ( #12543 )
...
* fix_torch_device_generate_test
* remove @
* up
* up
* correct
* upload
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-07-06 19:41:51 +01:00
Patrick von Platen
4605b2b8ec
[Flax] Fix another bug in logging steps ( #12516 )
...
* fix_torch_device_generate_test
* remove @
* up
2021-07-05 18:35:22 +01:00
Patrick von Platen
d0f7508abe
[Flax] Correct logging steps flax ( #12515 )
...
* fix_torch_device_generate_test
* remove @
* push
2021-07-05 18:21:00 +01:00
Patrick von Platen
bb4ac2b5a8
[Flax] Correct flax training scripts ( #12514 )
...
* fix_torch_device_generate_test
* remove @
* add logging steps
* correct training scripts
* correct readme
* correct
2021-07-05 18:14:50 +01:00
Suraj Patil
f1c81d6b92
[Flax] ViT training example ( #12300 )
...
* begin script
* clean example, add readme
* update readme
* remove decay mask
* remove masking
* update readme & make flake happy
2021-07-05 18:23:03 +05:30
Patrick von Platen
813328682e
[Flax] Example scripts - correct weight decay ( #12409 )
...
* fix_torch_device_generate_test
* remove @
* finish
* finish
* correct style
2021-06-29 12:01:08 +01:00
Suraj Patil
aecae53377
[example/flax] add summarization readme ( #12393 )
...
* add readme
* update readme and add requirements
* Update examples/flax/summarization/README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-29 14:02:33 +05:30
Patrick von Platen
31c3e7e75b
[Flax] Add T5 pretraining script ( #12355 )
...
* fix_torch_device_generate_test
* remove @
* add length computatan
* finish masking
* finish
* upload
* fix some bugs
* finish
* fix dependency table
* correct tensorboard
* Apply suggestions from code review
* correct processing
* slight change init
* correct some more mistakes
* apply suggestions
* improve readme
* fix indent
* Apply suggestions from code review
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
* correct tokenizer
* finish
* finish
* finish
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2021-06-28 20:11:29 +01:00
Patrick von Platen
2d70c91206
[Flax] Adapt flax examples to include push_to_hub
( #12391 )
...
* fix_torch_device_generate_test
* remove @
* finish
* correct summary writer
* correct push to hub
* fix indent
* finish
* finish
* finish
* finish
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-28 19:23:35 +01:00
Stas Bekman
4a872caef4
remove extra white space from log format ( #12360 )
2021-06-25 13:20:14 -07:00
Suraj Patil
aef3823e1a
[examples/Flax] move the examples table up ( #12341 )
2021-06-24 16:03:37 +05:30
Suraj Patil
c0fe3c9a7a
Flax summarization script ( #12230 )
...
* add summrization script
* fix arguments, preprocessing, metrics
* add generation and metrics
* auto model, prediction loop
* prettify
* label smoothing
* adress Sylvain and Patricks suggestions
* dynamically import shift_tokens_right
* fix shift_tokens_right_fn call
2021-06-23 15:49:30 +05:30
Avital Oliver
9b393240a2
Use a released version of optax rather than installing from Git. ( #12173 )
...
Use a released version of optax rather than installing from Git
2021-06-15 16:42:51 +05:30
Patrick von Platen
7566fefa69
[Flax] Add links to google colabs ( #12146 )
...
* fix_torch_device_generate_test
* remove @
* add colab links
2021-06-14 11:00:29 +01:00
Suraj Patil
d36fce8237
add readme for flax clm ( #12111 )
...
* add readme for flax clm
* use section link for tokenizer
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update metrics
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-14 15:03:55 +05:30
Patrick von Platen
16c0efca2c
Add mlm pretraining xla torch readme ( #12011 )
...
* fix_torch_device_generate_test
* remove @
* upload
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/flax/language-modeling/README.md
* add more info
* finish
* fix
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-14 10:31:21 +01:00
Suraj Patil
15b498f3b8
Flax CLM script ( #12023 )
...
* first draft
* max_seq_length => block_size
* fix arg names
* fix typos
* fix loss calculation
* add max examples, fix train eval steps, metrics
* optimizer mask
* fix perpelexity, metric logging
* fix logging
* data_collator = > data_loader
* refactor loss_fn
* support single GPU
* pass distributed to write_metric
* fix jitting
* fix single device training
* fix single device metrics
* close inner progress bars once finished
* add overwrite_cache arg
* ifx dataset caching issue
* add more logs
* few small fixes,
* address nicholas suggestions
* fix docstr
* address patricks suggestions
* make flake happy
* pass new new_dropout_rng to apply_gradients
* reset train metrics after every epoc
* remove distributed logis, small fixes
2021-06-11 15:16:20 +05:30
Suraj Patil
d1500d9151
pass decay_mask fn to optimizer ( #12087 )
2021-06-09 18:49:27 +01:00
Patrick von Platen
242ec31aa5
[Flax] Refactor MLM ( #12013 )
...
* fix_torch_device_generate_test
* remove @
* finish refactor
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 16:31:32 +01:00
Nicholas Vadivelu
4674061b2a
Fix weight decay masking in run_flax_glue.py
( #11964 )
...
* Fix weight decay masking in `run_flax_glue.py`
Issues with the previous implementation:
- The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods.
- `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped.
- Flax's LayerNorm calls the scale parameter `scale` not `weight`
* Fix formatting with black
* adapt results
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 11:35:26 +01:00
Nicholas Vadivelu
1ab147d648
Remove redundant nn.log_softmax
in run_flax_glue.py
( #11920 )
...
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`
`optax.softmax_cross_entropy` expects unnormalized logits, and so it already calls `nn.log_softmax`, so I believe it is not needed here. `nn.log_softmax` is idempotent so mathematically it shouldn't have made a difference.
* Remove unused 'flax.linen' import
2021-05-31 15:29:04 +01:00
Avital Oliver
2df546918e
Link official Cloud TPU JAX docs ( #11892 )
2021-05-26 15:44:40 -04:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error ( #11839 )
...
* fix_torch_device_generate_test
* remove @
* change pytorch import to flax import
2021-05-24 10:41:10 +01:00
Patrick von Platen
da22245ed9
Add flax text class colab ( #11824 )
...
* fix_torch_device_generate_test
* remove @
* add flax glue link
2021-05-21 23:11:58 +01:00
Patrick von Platen
82335185fe
[Flax] Small fixes in run_flax_glue.py
( #11820 )
...
* fix_torch_device_generate_test
* remove @
* correct best seed for flax fine-tuning
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 16:52:23 +01:00
Patrick von Platen
bd9871657b
[Flax] Align GLUE training script with mlm training script ( #11778 )
...
* speed up flax glue
* remove unnecessary line
* remove folder
* remove run in loop
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 09:36:56 +01:00
Patrick von Platen
00440e350f
[Flax MLM] Refactor run mlm with optax ( #11745 )
...
* refactor
* update
* update
* update
* refactor run mlm
* finalize
* refactor more
* fix typo
* update
* finish refactor
* modify run mlm
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* small fixes
* upload
* upload
* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-19 12:00:58 +01:00
Avital Oliver
77f9bd18af
Add Flax Examples and Cloud TPU README ( #11753 )
...
* Add Flax Examples README
* Apply suggestions from code review
* Update examples/flax/README.md
* add nice table
* fix
* fix
* apply suggestions
* upload
* finish flax readme.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-18 17:45:16 +01:00
Marc van Zee
726e953d44
Improvements to Flax finetuning script ( #11727 )
...
* Add Cloud details to README
* Flax script and readme updates
* Some simplifications of Flax script
2021-05-17 09:26:33 +01:00
Marc van Zee
94a2348706
Add Cloud details to README ( #11706 )
...
* Add Cloud details to README
* Flax script and readme updates
2021-05-14 14:51:25 +01:00
Patrick von Platen
113eaa7575
correct example script ( #11726 )
2021-05-14 12:02:57 +01:00
Marc van Zee
6797cdc077
Updates README and fixes bug ( #11701 )
2021-05-12 13:52:52 +01:00
Marc van Zee
4ce6bcc310
Adds Flax BERT finetuning example on GLUE ( #11564 )
...
* Adds Flax BERT finetuning example
* fix traced jax tensor type
* Use Optax losses and learning schedulers
* Add 1GPU training results
* merge into master & make style
* fix input
* del file
* Fix bug in loss and add torch runs
* finish bert flax fine-tune
* Update examples/flax/text-classification/README.md
* Update examples/flax/text-classification/run_flax_glue.py
* add requirements
* finalize
* finalize
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-11 19:02:59 +01:00
Patrick von Platen
084a187da3
[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py ( #11470 )
...
* add flax roberta
* make style
* correct initialiazation
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-04 19:57:59 +02:00
Patrick von Platen
b48cf7124c
correct typo ( #11393 )
2021-04-23 11:34:59 +02:00
Sylvain Gugger
dabeb15292
Examples reorg ( #11350 )
...
* Base move
* Examples reorganization
* Update references
* Put back test data
* Move conftest
* More fixes
* Move test data to test fixtures
* Update path
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00