Commit Graph

218 Commits

Author SHA1 Message Date
Kamal Raj
2bd950ca47
[Flax] token-classification model steps enumerate start from 1 (#14547)
* step start from 1

* Updated cur_step calcualtion
2021-11-29 21:55:59 +05:30
Nicholas Broad
69e16abf98
Switch from using sum for flattening lists of lists in group_texts (#14472)
* remove sum for list flattening

* change to chain(*)

* make chain object a list

* delete empty lines

per sgugger's suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-22 16:17:26 -05:00
Suraj Patil
85a4bda4f4
bump flax version (#14343) 2021-11-09 22:15:22 +05:30
Lysandre
b8fad022a0 v4.13.0.dev0 2021-10-28 12:56:46 -04:00
Lysandre
62bf536631 Release v4.12.0 2021-10-28 12:09:49 -04:00
Dhananjay Shettigar
319beb64eb
#12789 Replace assert statements with exceptions (#13909)
* #12789 Replace assert statements with exceptions

* fix-copies: made copy changes to utils_qa.py in examples/pytorch/question-answering and examples/tensorflow/question-answering

* minor refactor for clarity
2021-10-07 09:09:01 -04:00
Yih-Dar
a6ea244f99
Fix: save checkpoint after each epoch and push checkpoint to the hub (#13872)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-10-05 16:30:13 +05:30
Suraj Patil
7db2a79b38
[examples/flax] use Repository API for push_to_hub (#13672)
* use Repository for push_to_hub

* update readme

* update other flax scripts

* update readme

* update qa example

* fix push_to_hub call

* fix typo

* fix more typos

* update readme

* use abosolute path to get repo name

* fix glue script
2021-09-30 16:38:07 +05:30
Lysandre
11c69b8045 Docs for version v4.11.0 2021-09-27 14:19:38 -04:00
Lysandre
dc193c906d Release: v4.11.0 2021-09-27 14:14:09 -04:00
Kamal Raj
78807d86eb
[FLAX] Question Answering Example (#13649)
* flax qa example

* Updated README:  Added Large model

* added utils_qa.py FULL_COPIES

* Updates:
1. Copyright Year updated
2. added dtype arg
3. passing seed and dtype to load model
4. Check eval flag before running eval

* updated README

* updated code comment
2021-09-21 18:34:48 +05:30
Avital Oliver
51e5eca612
Add long overdue link to the Google TRC project (#13501)
* Add long-overdue link to the Google TRC project

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
2021-09-14 13:41:55 +05:30
Stefan Schweter
09549aa18c
examples: minor fixes in flax example readme (#13502) 2021-09-10 11:45:57 +05:30
Kamal Raj
1c191efc3a
flax ner example (#13365)
* flax ner example

* added task to README

* updated readme

* 1. ArgumentParser -> HfArgumentParser
2. step-wise logging,eval and save

* added requirements.txt

* added progress bar

* updated README

* added check_min_version

* updated training data permuattion with JAX

* added metric lib to requirements

* updated readme table

* fixed imports
2021-09-09 10:12:57 +05:30
Stefan Schweter
4046e66e40
examples: only use keep_linebreaks when reading TXT files (#13320)
* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples
2021-08-28 16:22:29 +02:00
Stefan Schweter
319d840b46
examples: add keep_linebreaks option to CLM examples (#13150)
* examples: add keep_linebreaks option to text dataset loader for all CLM examples

* examples: introduce new keep_linebreaks option as data argument in CLM examples
2021-08-27 11:35:45 +02:00
Patrick von Platen
13a9c9a354
[Flax] Refactor gpt2 & bert example docs (#13024)
* fix_torch_device_generate_test

* remove @

* improve docs for clm

* speed-ups

* correct t5 example as well

* push final touches

* Update examples/flax/language-modeling/README.md

* correct docs for mlm

* Update examples/flax/language-modeling/README.md

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-08-09 13:37:50 +02:00
abhishek thakur
3ff2cde5ca
tfhub.de -> tfhub.dev (#12565) 2021-08-09 08:11:17 +02:00
Patrick von Platen
2e4082364e
[Flax T5] Speed up t5 training (#13012)
* fix_torch_device_generate_test

* remove @

* update

* up

* fix

* remove f-stings

* correct readme

* up

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-08-06 11:21:37 +02:00
Chungman Lee
75b8990d90
fix typo in example/text-classification README (#12974)
* fix typo in example/text-classification README

* add space to align the table
2021-08-02 12:58:43 +02:00
Stefan Schweter
3d4b3bc3fd
examples: use correct way to get vocab size in flax lm readme (#12947) 2021-07-30 21:57:53 +05:30
Stefan Schweter
d3c3e722d6
[FLAX] Minor fixes in CLM example (#12914)
* readme: fix retrieval of vocab size for flax clm example

* examples: fix flax clm example when using training/evaluation files
2021-07-27 19:48:04 +05:30
Patrick von Platen
13fefdf340
Update README.md
cc @patil-suraj
2021-07-20 13:51:15 +02:00
fgaim
66197adc98
Flax MLM: Allow validation split when loading dataset from local file (#12689)
* Allow validation split when loading dataset from local file

* Flax clm & t5, enable validation split for datasets loaded from local file
2021-07-20 13:38:25 +02:00
Patrick von Platen
f4399ec570
Update README.md 2021-07-14 12:54:31 +01:00
Nick Doiron
5803a2a7ac
Add ByT5 option to example run_t5_mlm_flax.py (#12634)
* Allow ByT5 type in Flax T5 script

* use T5TokenizerFast

* change up tokenizer config

* model_args

* reorder imports

* Update run_t5_mlm_flax.py
2021-07-13 13:39:57 +01:00
Bhadresh Savani
de23ecea36
added test file (#12630) 2021-07-12 12:15:14 +05:30
Patrick von Platen
deecdd4939
[Flax] Fix cur step flax examples (#12608)
* fix_torch_device_generate_test

* remove @

* fix save problem
2021-07-09 13:51:28 +01:00
Sylvain Gugger
6f1adc4334
Fix group_lengths for short datasets (#12558) 2021-07-08 07:23:41 -04:00
Ibraheem Moosa
122d7dc34f
Remove logging of GPU count etc logging. (#12569)
Successfully logging this requires Pytorch. For the purposes of this script we are not using Pytorch.
2021-07-07 23:05:47 +01:00
Patrick von Platen
7d321b7689
[Flax] Allow retraining from save checkpoint (#12559)
* fix_torch_device_generate_test

* remove @

* finish
2021-07-07 19:13:43 +05:30
Suraj Patil
2d42915abe
[examples/flax] add adafactor optimizer (#12544)
* add adafactor

* Update examples/flax/language-modeling/run_mlm_flax.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-07-07 11:50:30 +05:30
Patrick von Platen
208df208bf
[Flax] Adapt examples to be able to use eval_steps and save_steps (#12543)
* fix_torch_device_generate_test

* remove @

* up

* up

* correct

* upload

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-07-06 19:41:51 +01:00
Patrick von Platen
4605b2b8ec
[Flax] Fix another bug in logging steps (#12516)
* fix_torch_device_generate_test

* remove @

* up
2021-07-05 18:35:22 +01:00
Patrick von Platen
d0f7508abe
[Flax] Correct logging steps flax (#12515)
* fix_torch_device_generate_test

* remove @

* push
2021-07-05 18:21:00 +01:00
Patrick von Platen
bb4ac2b5a8
[Flax] Correct flax training scripts (#12514)
* fix_torch_device_generate_test

* remove @

* add logging steps

* correct training scripts

* correct readme

* correct
2021-07-05 18:14:50 +01:00
Suraj Patil
f1c81d6b92
[Flax] ViT training example (#12300)
* begin script

* clean example, add readme

* update readme

* remove decay mask

* remove masking

* update readme & make flake happy
2021-07-05 18:23:03 +05:30
Patrick von Platen
813328682e
[Flax] Example scripts - correct weight decay (#12409)
* fix_torch_device_generate_test

* remove @

* finish

* finish

* correct style
2021-06-29 12:01:08 +01:00
Suraj Patil
aecae53377
[example/flax] add summarization readme (#12393)
* add readme

* update readme and add requirements

* Update examples/flax/summarization/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-29 14:02:33 +05:30
Patrick von Platen
31c3e7e75b
[Flax] Add T5 pretraining script (#12355)
* fix_torch_device_generate_test

* remove @

* add length computatan

* finish masking

* finish

* upload

* fix some bugs

* finish

* fix dependency table

* correct tensorboard

* Apply suggestions from code review

* correct processing

* slight change init

* correct some more mistakes

* apply suggestions

* improve readme

* fix indent

* Apply suggestions from code review

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* correct tokenizer

* finish

* finish

* finish

* finish

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2021-06-28 20:11:29 +01:00
Patrick von Platen
2d70c91206
[Flax] Adapt flax examples to include push_to_hub (#12391)
* fix_torch_device_generate_test

* remove @

* finish

* correct summary writer

* correct push to hub

* fix indent

* finish

* finish

* finish

* finish

* finish

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-28 19:23:35 +01:00
Stas Bekman
4a872caef4
remove extra white space from log format (#12360) 2021-06-25 13:20:14 -07:00
Suraj Patil
aef3823e1a
[examples/Flax] move the examples table up (#12341) 2021-06-24 16:03:37 +05:30
Suraj Patil
c0fe3c9a7a
Flax summarization script (#12230)
* add summrization script

* fix arguments, preprocessing, metrics

* add generation and metrics

* auto model, prediction loop

* prettify

* label smoothing

* adress Sylvain and Patricks suggestions

* dynamically import shift_tokens_right

* fix shift_tokens_right_fn call
2021-06-23 15:49:30 +05:30
Avital Oliver
9b393240a2
Use a released version of optax rather than installing from Git. (#12173)
Use a released version of optax rather than installing from Git
2021-06-15 16:42:51 +05:30
Patrick von Platen
7566fefa69
[Flax] Add links to google colabs (#12146)
* fix_torch_device_generate_test

* remove @

* add colab links
2021-06-14 11:00:29 +01:00
Suraj Patil
d36fce8237
add readme for flax clm (#12111)
* add readme for flax clm

* use section link for tokenizer

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update metrics

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-14 15:03:55 +05:30
Patrick von Platen
16c0efca2c
Add mlm pretraining xla torch readme (#12011)
* fix_torch_device_generate_test

* remove @

* upload

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Update examples/flax/language-modeling/README.md

* add more info

* finish

* fix

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-14 10:31:21 +01:00
Suraj Patil
15b498f3b8
Flax CLM script (#12023)
* first draft

* max_seq_length => block_size

* fix arg names

* fix typos

* fix loss calculation

* add max examples, fix  train eval steps, metrics

* optimizer mask

* fix perpelexity, metric logging

* fix logging

* data_collator = > data_loader

* refactor loss_fn

* support single GPU

* pass distributed to write_metric

* fix jitting

* fix single device training

* fix single device metrics

* close inner progress bars once finished

* add overwrite_cache arg

* ifx dataset caching issue

* add more logs

* few small fixes,

* address nicholas suggestions

* fix docstr

* address patricks suggestions

* make flake happy

* pass new new_dropout_rng to apply_gradients

* reset train metrics after every epoc

* remove distributed logis, small fixes
2021-06-11 15:16:20 +05:30
Suraj Patil
d1500d9151
pass decay_mask fn to optimizer (#12087) 2021-06-09 18:49:27 +01:00
Patrick von Platen
242ec31aa5
[Flax] Refactor MLM (#12013)
* fix_torch_device_generate_test

* remove @

* finish refactor

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 16:31:32 +01:00
Nicholas Vadivelu
4674061b2a
Fix weight decay masking in run_flax_glue.py (#11964)
* Fix weight decay masking in `run_flax_glue.py`

Issues with the previous implementation:
- The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods.
- `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped.
- Flax's LayerNorm calls the scale parameter `scale` not `weight`

* Fix formatting with black

* adapt results

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 11:35:26 +01:00
Nicholas Vadivelu
1ab147d648
Remove redundant nn.log_softmax in run_flax_glue.py (#11920)
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`

`optax.softmax_cross_entropy` expects unnormalized logits, and so it already calls `nn.log_softmax`, so I believe it is not needed here. `nn.log_softmax` is idempotent so mathematically it shouldn't have made a difference.

* Remove unused 'flax.linen' import
2021-05-31 15:29:04 +01:00
Avital Oliver
2df546918e
Link official Cloud TPU JAX docs (#11892) 2021-05-26 15:44:40 -04:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error (#11839)
* fix_torch_device_generate_test

* remove @

* change pytorch import to flax import
2021-05-24 10:41:10 +01:00
Patrick von Platen
da22245ed9
Add flax text class colab (#11824)
* fix_torch_device_generate_test

* remove @

* add flax glue link
2021-05-21 23:11:58 +01:00
Patrick von Platen
82335185fe
[Flax] Small fixes in run_flax_glue.py (#11820)
* fix_torch_device_generate_test

* remove @

* correct best seed for flax fine-tuning

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 16:52:23 +01:00
Patrick von Platen
bd9871657b
[Flax] Align GLUE training script with mlm training script (#11778)
* speed up flax glue

* remove unnecessary line

* remove folder

* remove run in loop

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 09:36:56 +01:00
Patrick von Platen
00440e350f
[Flax MLM] Refactor run mlm with optax (#11745)
* refactor

* update

* update

* update

* refactor run mlm

* finalize

* refactor more

* fix typo

* update

* finish refactor

* modify run mlm

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* small fixes

* upload

* upload

* finish run mlm script

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-19 12:00:58 +01:00
Avital Oliver
77f9bd18af
Add Flax Examples and Cloud TPU README (#11753)
* Add Flax Examples README

* Apply suggestions from code review

* Update examples/flax/README.md

* add nice table

* fix

* fix

* apply suggestions

* upload

* finish flax readme.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-18 17:45:16 +01:00
Marc van Zee
726e953d44
Improvements to Flax finetuning script (#11727)
* Add Cloud details to README

* Flax script and readme updates

* Some simplifications of Flax script
2021-05-17 09:26:33 +01:00
Marc van Zee
94a2348706
Add Cloud details to README (#11706)
* Add Cloud details to README

* Flax script and readme updates
2021-05-14 14:51:25 +01:00
Patrick von Platen
113eaa7575
correct example script (#11726) 2021-05-14 12:02:57 +01:00
Marc van Zee
6797cdc077
Updates README and fixes bug (#11701) 2021-05-12 13:52:52 +01:00
Marc van Zee
4ce6bcc310
Adds Flax BERT finetuning example on GLUE (#11564)
* Adds Flax BERT finetuning example

* fix traced jax tensor type

* Use Optax losses and learning schedulers

* Add 1GPU training results

* merge into master & make style

* fix input

* del file

* Fix bug in loss and add torch runs

* finish bert flax fine-tune

* Update examples/flax/text-classification/README.md

* Update examples/flax/text-classification/run_flax_glue.py

* add requirements

* finalize

* finalize

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-11 19:02:59 +01:00
Patrick von Platen
084a187da3
[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py (#11470)
* add flax roberta

* make style

* correct initialiazation

* modify model to save weights

* fix copied from

* fix copied from

* correct some more code

* add more roberta models

* Apply suggestions from code review

* merge from master

* finish

* finish docs

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-04 19:57:59 +02:00
Patrick von Platen
b48cf7124c
correct typo (#11393) 2021-04-23 11:34:59 +02:00
Sylvain Gugger
dabeb15292
Examples reorg (#11350)
* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00