transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-05 22:00:09 +06:00

Author	SHA1	Message	Date
Nicholas Broad	69e16abf98	Switch from using sum for flattening lists of lists in group_texts (#14472 ) * remove sum for list flattening * change to chain() make chain object a list * delete empty lines per sgugger's suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Nicholas Broad <nicholas@nmbroad.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-22 16:17:26 -05:00
Suraj Patil	85a4bda4f4	bump flax version (#14343 )	2021-11-09 22:15:22 +05:30
Suraj Patil	7db2a79b38	[examples/flax] use Repository API for push_to_hub (#13672 ) * use Repository for push_to_hub * update readme * update other flax scripts * update readme * update qa example * fix push_to_hub call * fix typo * fix more typos * update readme * use absolute path to get repo name * fix glue script	2021-09-30 16:38:07 +05:30
Stefan Schweter	09549aa18c	examples: minor fixes in flax example readme (#13502 )	2021-09-10 11:45:57 +05:30
Stefan Schweter	4046e66e40	examples: only use keep_linebreaks when reading TXT files (#13320 ) * examples: only use keep_linebreaks when reading TXT files for all CLM examples	2021-08-28 16:22:29 +02:00
Stefan Schweter	319d840b46	examples: add keep_linebreaks option to CLM examples (#13150 ) * examples: add keep_linebreaks option to text dataset loader for all CLM examples * examples: introduce new keep_linebreaks option as data argument in CLM examples	2021-08-27 11:35:45 +02:00
Patrick von Platen	13a9c9a354	[Flax] Refactor gpt2 & bert example docs (#13024 ) * fix_torch_device_generate_test * remove @ * improve docs for clm * speed-ups * correct t5 example as well * push final touches * Update examples/flax/language-modeling/README.md * correct docs for mlm * Update examples/flax/language-modeling/README.md Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-08-09 13:37:50 +02:00
abhishek thakur	3ff2cde5ca	tfhub.de -> tfhub.dev (#12565 )	2021-08-09 08:11:17 +02:00
Patrick von Platen	2e4082364e	[Flax T5] Speed up t5 training (#13012 ) * fix_torch_device_generate_test * remove @ * update * up * fix * remove f-strings * correct readme * up Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-08-06 11:21:37 +02:00
Stefan Schweter	3d4b3bc3fd	examples: use correct way to get vocab size in flax lm readme (#12947 )	2021-07-30 21:57:53 +05:30
Stefan Schweter	d3c3e722d6	[FLAX] Minor fixes in CLM example (#12914 ) * readme: fix retrieval of vocab size for flax clm example * examples: fix flax clm example when using training/evaluation files	2021-07-27 19:48:04 +05:30
Patrick von Platen	13fefdf340	Update README.md cc @patil-suraj	2021-07-20 13:51:15 +02:00
fgaim	66197adc98	Flax MLM: Allow validation split when loading dataset from local file (#12689 ) * Allow validation split when loading dataset from local file * Flax clm & t5, enable validation split for datasets loaded from local file	2021-07-20 13:38:25 +02:00
Patrick von Platen	f4399ec570	Update README.md	2021-07-14 12:54:31 +01:00
Nick Doiron	5803a2a7ac	Add ByT5 option to example run_t5_mlm_flax.py (#12634 ) * Allow ByT5 type in Flax T5 script * use T5TokenizerFast * change up tokenizer config * model_args * reorder imports * Update run_t5_mlm_flax.py	2021-07-13 13:39:57 +01:00
Patrick von Platen	deecdd4939	[Flax] Fix cur step flax examples (#12608 ) * fix_torch_device_generate_test * remove @ * fix save problem	2021-07-09 13:51:28 +01:00
Sylvain Gugger	6f1adc4334	Fix group_lengths for short datasets (#12558 )	2021-07-08 07:23:41 -04:00
Ibraheem Moosa	122d7dc34f	Remove logging of GPU count etc. (#12569 ) Successfully logging this requires PyTorch. For the purposes of this script we are not using PyTorch.	2021-07-07 23:05:47 +01:00
Patrick von Platen	7d321b7689	[Flax] Allow retraining from save checkpoint (#12559 ) * fix_torch_device_generate_test * remove @ * finish	2021-07-07 19:13:43 +05:30
Suraj Patil	2d42915abe	[examples/flax] add adafactor optimizer (#12544 ) * add adafactor * Update examples/flax/language-modeling/run_mlm_flax.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-07-07 11:50:30 +05:30
Patrick von Platen	208df208bf	[Flax] Adapt examples to be able to use eval_steps and save_steps (#12543 ) * fix_torch_device_generate_test * remove @ * up * up * correct * upload Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-07-06 19:41:51 +01:00
Patrick von Platen	4605b2b8ec	[Flax] Fix another bug in logging steps (#12516 ) * fix_torch_device_generate_test * remove @ * up	2021-07-05 18:35:22 +01:00
Patrick von Platen	d0f7508abe	[Flax] Correct logging steps flax (#12515 ) * fix_torch_device_generate_test * remove @ * push	2021-07-05 18:21:00 +01:00
Patrick von Platen	bb4ac2b5a8	[Flax] Correct flax training scripts (#12514 ) * fix_torch_device_generate_test * remove @ * add logging steps * correct training scripts * correct readme * correct	2021-07-05 18:14:50 +01:00
Patrick von Platen	813328682e	[Flax] Example scripts - correct weight decay (#12409 ) * fix_torch_device_generate_test * remove @ * finish * finish * correct style	2021-06-29 12:01:08 +01:00
Patrick von Platen	31c3e7e75b	[Flax] Add T5 pretraining script (#12355 ) * fix_torch_device_generate_test * remove @ * add length computation * finish masking * finish * upload * fix some bugs * finish * fix dependency table * correct tensorboard * Apply suggestions from code review * correct processing * slight change init * correct some more mistakes * apply suggestions * improve readme * fix indent * Apply suggestions from code review Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> * correct tokenizer * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2021-06-28 20:11:29 +01:00
Patrick von Platen	2d70c91206	[Flax] Adapt flax examples to include `push_to_hub` (#12391 ) * fix_torch_device_generate_test * remove @ * finish * correct summary writer * correct push to hub * fix indent * finish * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-28 19:23:35 +01:00
Stas Bekman	4a872caef4	remove extra white space from log format (#12360 )	2021-06-25 13:20:14 -07:00
Avital Oliver	9b393240a2	Use a released version of optax rather than installing from Git. (#12173 ) Use a released version of optax rather than installing from Git	2021-06-15 16:42:51 +05:30
Patrick von Platen	7566fefa69	[Flax] Add links to google colabs (#12146 ) * fix_torch_device_generate_test * remove @ * add colab links	2021-06-14 11:00:29 +01:00
Suraj Patil	d36fce8237	add readme for flax clm (#12111 ) * add readme for flax clm * use section link for tokenizer * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update metrics Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-14 15:03:55 +05:30
Patrick von Platen	16c0efca2c	Add mlm pretraining xla torch readme (#12011 ) * fix_torch_device_generate_test * remove @ * upload * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Update examples/flax/language-modeling/README.md * add more info * finish * fix Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-14 10:31:21 +01:00
Suraj Patil	15b498f3b8	Flax CLM script (#12023 ) * first draft * max_seq_length => block_size * fix arg names * fix typos * fix loss calculation * add max examples, fix train eval steps, metrics * optimizer mask * fix perplexity, metric logging * fix logging * data_collator => data_loader * refactor loss_fn * support single GPU * pass distributed to write_metric * fix jitting * fix single device training * fix single device metrics * close inner progress bars once finished * add overwrite_cache arg * fix dataset caching issue * add more logs * few small fixes * address nicholas suggestions * fix docstr * address patricks suggestions * make flake happy * pass new new_dropout_rng to apply_gradients * reset train metrics after every epoch * remove distributed logic, small fixes	2021-06-11 15:16:20 +05:30
Suraj Patil	d1500d9151	pass decay_mask fn to optimizer (#12087 )	2021-06-09 18:49:27 +01:00
Patrick von Platen	242ec31aa5	[Flax] Refactor MLM (#12013 ) * fix_torch_device_generate_test * remove @ * finish refactor Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-03 16:31:32 +01:00
Patrick von Platen	f580604157	[Flax] Fix PyTorch import error (#11839 ) * fix_torch_device_generate_test * remove @ * change pytorch import to flax import	2021-05-24 10:41:10 +01:00
Patrick von Platen	00440e350f	[Flax MLM] Refactor run mlm with optax (#11745 ) * refactor * update * update * update * refactor run mlm * finalize * refactor more * fix typo * update * finish refactor * modify run mlm * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * small fixes * upload * upload * finish run mlm script Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-19 12:00:58 +01:00
Patrick von Platen	084a187da3	[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py (#11470 ) * add flax roberta * make style * correct initialization * modify model to save weights * fix copied from * fix copied from * correct some more code * add more roberta models * Apply suggestions from code review * merge from master * finish * finish docs Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-04 19:57:59 +02:00
Patrick von Platen	b48cf7124c	correct typo (#11393 )	2021-04-23 11:34:59 +02:00
Sylvain Gugger	dabeb15292	Examples reorg (#11350 ) * Base move * Examples reorganization * Update references * Put back test data * Move conftest * More fixes * Move test data to test fixtures * Update path * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments and clean Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-21 11:11:20 -04:00

90 Commits