transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-14 18:18:24 +06:00

Author	SHA1	Message	Date
Patrick von Platen	4605b2b8ec	[Flax] Fix another bug in logging steps (#12516 ) * fix_torch_device_generate_test * remove @ * up	2021-07-05 18:35:22 +01:00
Patrick von Platen	d0f7508abe	[Flax] Correct logging steps flax (#12515 ) * fix_torch_device_generate_test * remove @ * push	2021-07-05 18:21:00 +01:00
Patrick von Platen	bb4ac2b5a8	[Flax] Correct flax training scripts (#12514 ) * fix_torch_device_generate_test * remove @ * add logging steps * correct training scripts * correct readme * correct	2021-07-05 18:14:50 +01:00
Suraj Patil	f1c81d6b92	[Flax] ViT training example (#12300 ) * begin script * clean example, add readme * update readme * remove decay mask * remove masking * update readme & make flake happy	2021-07-05 18:23:03 +05:30
Patrick von Platen	813328682e	[Flax] Example scripts - correct weight decay (#12409 ) * fix_torch_device_generate_test * remove @ * finish * finish * correct style	2021-06-29 12:01:08 +01:00
Suraj Patil	aecae53377	[example/flax] add summarization readme (#12393 ) * add readme * update readme and add requirements * Update examples/flax/summarization/README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-29 14:02:33 +05:30
Patrick von Platen	31c3e7e75b	[Flax] Add T5 pretraining script (#12355 ) * fix_torch_device_generate_test * remove @ * add length computatan * finish masking * finish * upload * fix some bugs * finish * fix dependency table * correct tensorboard * Apply suggestions from code review * correct processing * slight change init * correct some more mistakes * apply suggestions * improve readme * fix indent * Apply suggestions from code review Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> * correct tokenizer * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2021-06-28 20:11:29 +01:00
Patrick von Platen	2d70c91206	[Flax] Adapt flax examples to include `push_to_hub` (#12391 ) * fix_torch_device_generate_test * remove @ * finish * correct summary writer * correct push to hub * fix indent * finish * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-28 19:23:35 +01:00
Stas Bekman	4a872caef4	remove extra white space from log format (#12360 )	2021-06-25 13:20:14 -07:00
Suraj Patil	aef3823e1a	[examples/Flax] move the examples table up (#12341 )	2021-06-24 16:03:37 +05:30
Suraj Patil	c0fe3c9a7a	Flax summarization script (#12230 ) * add summrization script * fix arguments, preprocessing, metrics * add generation and metrics * auto model, prediction loop * prettify * label smoothing * adress Sylvain and Patricks suggestions * dynamically import shift_tokens_right * fix shift_tokens_right_fn call	2021-06-23 15:49:30 +05:30
Avital Oliver	9b393240a2	Use a released version of optax rather than installing from Git. (#12173 ) Use a released version of optax rather than installing from Git	2021-06-15 16:42:51 +05:30
Patrick von Platen	7566fefa69	[Flax] Add links to google colabs (#12146 ) * fix_torch_device_generate_test * remove @ * add colab links	2021-06-14 11:00:29 +01:00
Suraj Patil	d36fce8237	add readme for flax clm (#12111 ) * add readme for flax clm * use section link for tokenizer * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update metrics Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-14 15:03:55 +05:30
Patrick von Platen	16c0efca2c	Add mlm pretraining xla torch readme (#12011 ) * fix_torch_device_generate_test * remove @ * upload * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Update examples/flax/language-modeling/README.md * add more info * finish * fix Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-14 10:31:21 +01:00
Suraj Patil	15b498f3b8	Flax CLM script (#12023 ) * first draft * max_seq_length => block_size * fix arg names * fix typos * fix loss calculation * add max examples, fix train eval steps, metrics * optimizer mask * fix perpelexity, metric logging * fix logging * data_collator = > data_loader * refactor loss_fn * support single GPU * pass distributed to write_metric * fix jitting * fix single device training * fix single device metrics * close inner progress bars once finished * add overwrite_cache arg * ifx dataset caching issue * add more logs * few small fixes, * address nicholas suggestions * fix docstr * address patricks suggestions * make flake happy * pass new new_dropout_rng to apply_gradients * reset train metrics after every epoc * remove distributed logis, small fixes	2021-06-11 15:16:20 +05:30
Suraj Patil	d1500d9151	pass decay_mask fn to optimizer (#12087 )	2021-06-09 18:49:27 +01:00
Patrick von Platen	242ec31aa5	[Flax] Refactor MLM (#12013 ) * fix_torch_device_generate_test * remove @ * finish refactor Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-03 16:31:32 +01:00
Nicholas Vadivelu	4674061b2a	Fix weight decay masking in `run_flax_glue.py` (#11964 ) * Fix weight decay masking in `run_flax_glue.py` Issues with the previous implementation: - The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods. - `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped. - Flax's LayerNorm calls the scale parameter `scale` not `weight` * Fix formatting with black * adapt results Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-03 11:35:26 +01:00
Nicholas Vadivelu	1ab147d648	Remove redundant `nn.log_softmax` in `run_flax_glue.py` (#11920 ) * Remove redundant `nn.log_softmax` in `run_flax_glue.py` `optax.softmax_cross_entropy` expects unnormalized logits, and so it already calls `nn.log_softmax`, so I believe it is not needed here. `nn.log_softmax` is idempotent so mathematically it shouldn't have made a difference. * Remove unused 'flax.linen' import	2021-05-31 15:29:04 +01:00
Avital Oliver	2df546918e	Link official Cloud TPU JAX docs (#11892 )	2021-05-26 15:44:40 -04:00
Patrick von Platen	f580604157	[Flax] Fix PyTorch import error (#11839 ) * fix_torch_device_generate_test * remove @ * change pytorch import to flax import	2021-05-24 10:41:10 +01:00
Patrick von Platen	da22245ed9	Add flax text class colab (#11824 ) * fix_torch_device_generate_test * remove @ * add flax glue link	2021-05-21 23:11:58 +01:00
Patrick von Platen	82335185fe	[Flax] Small fixes in `run_flax_glue.py` (#11820 ) * fix_torch_device_generate_test * remove @ * correct best seed for flax fine-tuning Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-21 16:52:23 +01:00
Patrick von Platen	bd9871657b	[Flax] Align GLUE training script with mlm training script (#11778 ) * speed up flax glue * remove unnecessary line * remove folder * remove run in loop Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-21 09:36:56 +01:00
Patrick von Platen	00440e350f	[Flax MLM] Refactor run mlm with optax (#11745 ) * refactor * update * update * update * refactor run mlm * finalize * refactor more * fix typo * update * finish refactor * modify run mlm * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * small fixes * upload * upload * finish run mlm script Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-19 12:00:58 +01:00
Avital Oliver	77f9bd18af	Add Flax Examples and Cloud TPU README (#11753 ) * Add Flax Examples README * Apply suggestions from code review * Update examples/flax/README.md * add nice table * fix * fix * apply suggestions * upload * finish flax readme.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-18 17:45:16 +01:00
Marc van Zee	726e953d44	Improvements to Flax finetuning script (#11727 ) * Add Cloud details to README * Flax script and readme updates * Some simplifications of Flax script	2021-05-17 09:26:33 +01:00
Marc van Zee	94a2348706	Add Cloud details to README (#11706 ) * Add Cloud details to README * Flax script and readme updates	2021-05-14 14:51:25 +01:00
Patrick von Platen	113eaa7575	correct example script (#11726 )	2021-05-14 12:02:57 +01:00
Marc van Zee	6797cdc077	Updates README and fixes bug (#11701 )	2021-05-12 13:52:52 +01:00
Marc van Zee	4ce6bcc310	Adds Flax BERT finetuning example on GLUE (#11564 ) * Adds Flax BERT finetuning example * fix traced jax tensor type * Use Optax losses and learning schedulers * Add 1GPU training results * merge into master & make style * fix input * del file * Fix bug in loss and add torch runs * finish bert flax fine-tune * Update examples/flax/text-classification/README.md * Update examples/flax/text-classification/run_flax_glue.py * add requirements * finalize * finalize Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-11 19:02:59 +01:00
Patrick von Platen	084a187da3	[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py (#11470 ) * add flax roberta * make style * correct initialiazation * modify model to save weights * fix copied from * fix copied from * correct some more code * add more roberta models * Apply suggestions from code review * merge from master * finish * finish docs Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-04 19:57:59 +02:00
Patrick von Platen	b48cf7124c	correct typo (#11393 )	2021-04-23 11:34:59 +02:00
Sylvain Gugger	dabeb15292	Examples reorg (#11350 ) * Base move * Examples reorganization * Update references * Put back test data * Move conftest * More fixes * Move test data to test fixtures * Update path * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments and clean Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-21 11:11:20 -04:00
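
The message for commit 4674061b2a above names three points about weight decay masking: flattened keys are tuples of strings, a mask entry of True means "apply the transformation here", and Flax's LayerNorm calls its scale parameter `scale`. The following is a minimal sketch of that logic in plain Python, not the code from `run_flax_glue.py`. The `flatten_dict` helper is a hypothetical stand-in for `traverse_util.flatten_dict`, and the parameter tree is made up for illustration.

```python
def flatten_dict(tree, prefix=()):
    """Hypothetical stand-in for flax's traverse_util.flatten_dict.

    Returns a flat dict whose keys are tuples of strings (the path to
    each leaf), not one period-separated string.
    """
    flat = {}
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten_dict(value, path))
        else:
            flat[path] = value
    return flat


def decay_mask(params):
    """True where weight decay SHOULD be applied.

    Biases and LayerNorm scales are excluded, so they map to False.
    """
    flat = flatten_dict(params)
    return {
        path: (path[-1] != "bias" and path[-2:] != ("LayerNorm", "scale"))
        for path in flat
    }


# Made-up parameter tree, for illustration only.
params = {
    "encoder": {
        "dense": {"kernel": 1.0, "bias": 0.0},
        "LayerNorm": {"scale": 1.0, "bias": 0.0},
    }
}
mask = decay_mask(params)
for path, apply_decay in mask.items():
    print(path, apply_decay)
```

Only the `kernel` entry ends up True here. Matching on a period-joined string such as `"LayerNorm.weight"` would never hit a tuple key, which is the first issue the commit message describes.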


135 Commits
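
The message for commit 1ab147d648 states that `nn.log_softmax` is idempotent, so applying it before a cross-entropy that already normalizes its logits should not change the result. The following is a small numerical check of that claim using only the Python standard library. The two functions are illustrative re-implementations written for this sketch, not the Flax or Optax code, and the logits and labels are made up.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def softmax_cross_entropy(logits, labels):
    """Cross-entropy that takes unnormalized logits and normalizes them itself."""
    return -sum(l * lp for l, lp in zip(labels, log_softmax(logits)))


# Made-up example values.
logits = [2.0, -1.0, 0.5]
labels = [0.0, 1.0, 0.0]

once = log_softmax(logits)
twice = log_softmax(once)  # should match `once` up to rounding

loss_raw = softmax_cross_entropy(logits, labels)
loss_pre = softmax_cross_entropy(once, labels)  # redundant extra log_softmax

print(max(abs(a - b) for a, b in zip(once, twice)))
print(abs(loss_raw - loss_pre))
```

Both printed differences should be at the level of floating-point rounding, which matches the commit's remark that the extra call was redundant rather than incorrect.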