Commit Graph

1642 Commits

Author SHA1 Message Date
Stas Bekman
ebe5413589
[trainer] 2 bug fixes and a rename (#12309)
* bug fixes and a rename

* add extended DDP test
2021-06-22 11:13:23 -07:00
Patrick von Platen
64029abe4c
[Flax] Main doc for event orga (#12305)
* fix_torch_device_generate_test

* remove @

* push

* finish

* some typos

* add more info on communication

* add suggestions
2021-06-22 18:02:52 +01:00
Stas Bekman
dad414d5f9
[trainer + examples] set log level from CLI (#12276)
* set log level from CLI

* add log_level_replica + test + extended docs

* cleanup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename datasets objects to allow datasets module

* improve the doc

* style

* doc improve

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-21 19:30:50 -07:00
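A hedged sketch of the knobs this PR describes, using the `log_level` / `log_level_replica` argument names from the commit; the values shown are illustrative, not defaults.

```python
# Hedged sketch: setting the trainer log level as the PR above describes;
# exact accepted values may differ between releases.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    log_level="info",             # verbosity on the main process
    log_level_replica="warning",  # keep replica processes quieter in multi-node/DDP runs
)
# On the command line this corresponds to:
#   python run_glue.py ... --log_level info --log_level_replica warning
```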
Matt
e3cb7a0b60
Tensorflow QA example (#12252)
* New Tensorflow QA example!

* Style pass

* Updating README.md for the new example

* flake8 fixes

* Update examples/tensorflow/question-answering/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-21 16:37:28 +01:00
Vishal Burman
b53bc55ba9
Fix for making student ProphetNet for Seq2Seq Distillation (#12130)
* make_student.py: fix to make student ProphetNet

* reformat
2021-06-21 09:36:44 -04:00
Bhavitvya Malik
e43e11260f
update desc for map in all examples (#12226)
* update desc for map in all examples

* added plm

* suggestions
2021-06-17 15:37:31 -04:00
Lysandre
0daadc1919 Docs for v4.8.0 2021-06-17 18:17:42 +02:00
Lysandre
7a6c9fab8e Release: v4.7.0 2021-06-17 17:57:42 +02:00
Sylvain Gugger
7d7ceca396
Model card defaults (#12122)
* [WIP] Model card defaults

* finetuned_from default value

* Add all mappings to the mapping file

* Be more defensive on finetuned_from arg

* Add default task tag

* Separate tags from tasks

* Edge case for dataset

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-06-15 16:01:37 -04:00
kumapo
955b2b97a6
Enable add_prefix_space if model_type is roberta or gpt2 (#12116) 2021-06-15 09:33:21 -04:00
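A short sketch of the behaviour in the commit above, following the usual AutoConfig/AutoTokenizer flow of the example scripts; the checkpoint name is illustrative.

```python
# Byte-level BPE tokenizers (RoBERTa, GPT-2) need add_prefix_space=True when
# they are fed pre-split words, e.g. in the token-classification examples.
from transformers import AutoConfig, AutoTokenizer

model_name = "roberta-base"  # illustrative checkpoint
config = AutoConfig.from_pretrained(model_name)
tokenizer_kwargs = (
    {"add_prefix_space": True} if config.model_type in ("roberta", "gpt2") else {}
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True, **tokenizer_kwargs)
```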
Avital Oliver
9b393240a2
Use a released version of optax rather than installing from Git. (#12173)
Use a released version of optax rather than installing from Git
2021-06-15 16:42:51 +05:30
Stas Bekman
88e84186e5
[style] consistent nn. and nn.functional: part 4 examples (#12156)
* consistent nn. and nn.functional: p4 examples

* restore
2021-06-14 12:28:24 -07:00
Kumar Abhishek
9de62cfbce
[lm examples] Replicate --config_overrides addition to other LM examples (#12135)
* [lm examples] Replicate --config_overrides addition to other LM examples

* Removing no trainer files changes

* Update README

Co-authored-by: Kumar Abhishek <kabhishek@expedia.com>
2021-06-14 08:12:22 -04:00
Nicholas Broad
cd7961b632
Use text_column_name variable instead of "text" (#12132)
* Use text_column_name variable instead of "text"

`text_column_name` was already defined above the changed lines and was also used below them.

This is a very minor change. If a dataset does not use "text" as its column name, `tokenize_function` now reads from whichever column `text_column_name` points to; `text_column_name` falls back to the first column when "text" is absent. This makes the function a little more robust, though I would assume that 90%+ of datasets use "text" anyway.

* black formatting

* make style

Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
2021-06-14 08:11:13 -04:00
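A small sketch of the fallback described in the commit above; the dataset and tokenizer are illustrative, not the script's defaults.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

raw_datasets = load_dataset("wikitext", "wikitext-2-raw-v1")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

column_names = raw_datasets["train"].column_names
# Fall back to the first column when the dataset doesn't call its text column "text".
text_column_name = "text" if "text" in column_names else column_names[0]

def tokenize_function(examples):
    # Read from whichever column actually holds the text.
    return tokenizer(examples[text_column_name])

tokenized = raw_datasets.map(tokenize_function, batched=True, remove_columns=column_names)
```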
Sylvain Gugger
b8ab541340
Don't log anything before logging is setup in examples (#12121)
* Don't log anything before logging is setup in examples

* Last example
2021-06-14 08:03:33 -04:00
Patrick von Platen
7566fefa69
[Flax] Add links to google colabs (#12146)
* fix_torch_device_generate_test

* remove @

* add colab links
2021-06-14 11:00:29 +01:00
Suraj Patil
d36fce8237
add readme for flax clm (#12111)
* add readme for flax clm

* use section link for tokenizer

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update metrics

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-14 15:03:55 +05:30
Patrick von Platen
16c0efca2c
Add mlm pretraining xla torch readme (#12011)
* fix_torch_device_generate_test

* remove @

* upload

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Update examples/flax/language-modeling/README.md

* add more info

* finish

* fix

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-14 10:31:21 +01:00
Suraj Patil
15b498f3b8
Flax CLM script (#12023)
* first draft

* max_seq_length => block_size

* fix arg names

* fix typos

* fix loss calculation

* add max examples, fix train/eval steps, metrics

* optimizer mask

* fix perplexity, metric logging

* fix logging

* data_collator => data_loader

* refactor loss_fn

* support single GPU

* pass distributed to write_metric

* fix jitting

* fix single device training

* fix single device metrics

* close inner progress bars once finished

* add overwrite_cache arg

* fix dataset caching issue

* add more logs

* a few small fixes

* address Nicholas's suggestions

* fix docstr

* address Patrick's suggestions

* make flake happy

* pass new_dropout_rng to apply_gradients

* reset train metrics after every epoch

* remove distributed logic, small fixes
2021-06-11 15:16:20 +05:30
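Two of the bullets above ("pass new_dropout_rng to apply_gradients", "reset train metrics after every epoch") come down to per-step RNG and metric bookkeeping. A compact sketch with stubbed-out names, not the actual script:

```python
import jax

def train_step(params, batch, dropout_rng):
    # Stub: a real step would apply the model with dropout_rng and update params.
    return params, {"loss": 0.0}

rng = jax.random.PRNGKey(0)
dropout_rng, rng = jax.random.split(rng)
params, batches = {}, [0, 1, 2]  # placeholders

for epoch in range(2):
    train_metrics = []  # reset the accumulated train metrics each epoch
    for batch in batches:
        # split so every step gets a fresh dropout key instead of reusing one
        dropout_rng, step_rng = jax.random.split(dropout_rng)
        params, metrics = train_step(params, batch, step_rng)
        train_metrics.append(metrics)
```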
Bhavitvya Malik
d2753dcbec
add relevant description to tqdm in examples (#11927)
* add relevant `desc` in examples

* require_version datasets>=1.8.0
2021-06-10 15:59:55 -04:00
Matt
bebbdd0fc9
Appending label2id and id2label to models to ensure inference works properly (#12102) 2021-06-10 15:25:04 +01:00
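A hedged sketch of the idea in the commit above: store the label maps on the config so downstream inference (e.g. pipelines) can resolve class names; the labels and checkpoint are illustrative.

```python
from transformers import AutoConfig, TFAutoModelForSequenceClassification

label_list = ["negative", "positive"]  # illustrative label set
model_name = "distilbert-base-uncased"

config = AutoConfig.from_pretrained(model_name, num_labels=len(label_list))
config.label2id = {label: i for i, label in enumerate(label_list)}
config.id2label = {i: label for label, i in config.label2id.items()}
model = TFAutoModelForSequenceClassification.from_pretrained(model_name, config=config)
```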
Matt
4cda08decb
Minor style edits 2021-06-10 15:10:57 +01:00
Matt
7f08dbd10a
Update README.md to cover the TF GLUE example. 2021-06-10 14:33:42 +01:00
Sylvain Gugger
d72e5a3a6d Fix quality 2021-06-10 09:27:11 -04:00
Matt
73a532651a
New TF GLUE example (#12028)
* Pushing partially-complete new GLUE example

* First draft of the new TF GLUE example! Needs a little more testing to be sure but it's almost ready.

* Fix to the fit() call

* Bugfixes, making sure TPU and multi-GPU support is ready

* Remove logger line that depends on Pytorch

* Style pass

* Deleting old TF GLUE example

* Include label2id and id2label in the saved model config

* Don't clobber the existing model.config.label2id

* Style fixes

* Update examples/tensorflow/text-classification/run_glue.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-10 14:14:37 +01:00
kumapo
472a867626
Add text_column_name and label_column_name to run_ner and run_ner_no_trainer args (#12083)
* Add text_column_name and label_column_name to run_ner args

* Minor fix: grouping for text and label column name
2021-06-10 08:03:20 -04:00
Stas Bekman
61e191987d
rm require_version_examples (#12088) 2021-06-09 11:02:52 -07:00
Suraj Patil
d1500d9151
pass decay_mask fn to optimizer (#12087) 2021-06-09 18:49:27 +01:00
Anton Lozhkov
d472bd7b18
Wav2Vec2 Pretraining (#11306)
* Working quantizer forward

* Working quantizer forward

* Clean up unused model parts, test reproducibility

* Working quantizer forward

* Clean up unused model parts, test reproducibility

* Remove custom outputs from the shared ones

* correct conversion

* correct bug

* add first pretrain script

* save intermediate

* static shapes

* save intermediate

* finish first pretrain script version

* more refactor

* remove wandb

* refactor more

* improve test

* correct perplexity compute bug

* finish model implementation

* add to docs

* finish docs

* finish pretraining script

* finish pretraining script

* remove wandb

* finish PR for merge

* finish config

* finish

* make deepspeed work

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

* fix flaky test

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-09 18:40:56 +01:00
Stas Bekman
d14e0af274
sync LayerDrop for Wav2Vec2Encoder + tests (#12076) 2021-06-09 13:21:03 +01:00
Koichi Yasuoka
82a2b76c95
Update run_ner.py with id2label config (#12001) 2021-06-09 07:27:05 -04:00
Stas Bekman
11d86d3de4
[Deepspeed Wav2vec2] integration (#11638)
* wip

* wip - but working with https://github.com/microsoft/DeepSpeed/pull/1044

* cleanup

* workaround

* working 5/8 modes

* solve fp32 distributed zero3

* style

* sync

* sync

* rework

* deprecation

* cleanup

* https://github.com/microsoft/DeepSpeed/pull/1044 pr was merged

* clean up

* add a guide

* more prose

* more prose

* fix

* more prose

* sub_group_size was too big

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor

* bug fix

* make the true check explicit

* new deepspeed release

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-08 12:32:03 -07:00
Sylvain Gugger
fd6902838a
Properly indent block_size (#12070) 2021-06-08 10:27:02 -04:00
cdleong
49bee0aea4
Add torch to requirements.txt in language-modeling (#12040)
* Add torch to requirements.txt in language-modeling

* Update examples/pytorch/language-modeling/requirements.txt

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-08 09:02:35 -04:00
Mario Šaško
f5eec0d8e9
Replace legacy tensor.Tensor with torch.tensor/torch.empty (#12027)
* Replace legacy torch.Tensor constructor with torch.{tensor, empty}

* Remove torch.Tensor in examples
2021-06-08 13:58:38 +01:00
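A quick illustration of why the replacement above matters: the legacy `torch.Tensor` constructor mixes up sizes and data and returns uninitialized memory, while `torch.tensor` and `torch.empty` are explicit about intent.

```python
import torch

legacy = torch.Tensor(2, 3)       # legacy: a 2x3 tensor of uninitialized values
data = torch.tensor([2.0, 3.0])   # explicit data -> tensor([2., 3.])
buffer = torch.empty(2, 3)        # explicit shape, intentionally uninitialized
```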
Shamane Siri
e33085d648
updated the original RAG implementation to be compatible with latest Pytorch-Lightning (#11806)
* updated the original RAG implementation to be compatible with the latest PL version

* updated the requirements.txt file

* execute make style

* code quality test

* code quality

* conflict resolved in requirements.txt

* code quality

* changed the MyDDP class name to CustomDDP
2021-06-08 13:42:49 +01:00
Russell Klopfer
e363e1d936
adds metric prefix. (#12057)
* adds metric prefix.

* update tests to include prefix
2021-06-07 22:34:10 -04:00
Patrick von Platen
242ec31aa5
[Flax] Refactor MLM (#12013)
* fix_torch_device_generate_test

* remove @

* finish refactor

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 16:31:32 +01:00
Nicholas Vadivelu
4674061b2a
Fix weight decay masking in run_flax_glue.py (#11964)
* Fix weight decay masking in `run_flax_glue.py`

Issues with the previous implementation:
- The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods.
- `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped.
- Flax's LayerNorm calls the scale parameter `scale`, not `weight`

* Fix formatting with black

* adapt results

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 11:35:26 +01:00
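A sketch along the lines the commit body describes, assuming a mask function passed to `optax.adamw(mask=...)`: flattened keys are tuples of strings, the mask is True exactly where decay should apply, and Flax LayerNorm parameters are named "scale".

```python
from flax import traverse_util

def decay_mask_fn(params):
    flat_params = traverse_util.flatten_dict(params)
    # True where weight decay should apply: everything except biases and
    # LayerNorm scales (Flax names the LayerNorm weight "scale").
    flat_mask = {
        path: not (path[-1] == "bias" or path[-2:] == ("LayerNorm", "scale"))
        for path in flat_params
    }
    return traverse_util.unflatten_dict(flat_mask)

# Typically wired up as: optax.adamw(learning_rate, weight_decay=0.01, mask=decay_mask_fn)
```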
dependabot[bot]
6db3a87de2
Bump urllib3 from 1.25.8 to 1.26.5 in /examples/research_projects/lxmert (#11983)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.8 to 1.26.5.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.25.8...1.26.5)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-02 03:40:20 -04:00
Fan Zhang
7e73601f32
modify qa-trainer (#11872)
* modify qa-trainer

* fix flax model
2021-06-01 08:28:41 -04:00
Shamane Siri
9ec0f01b6c
RAG-2nd2end-revamp (#11893)
* initial

* code quality test

* code quality

* added test functions in test_modeling_rag.py and test_retrieval_rag.py to test the end2end retriever

* minor change in test_modeling_rag

* fixed tests

* Update examples/research_projects/rag-end2end-retriever/README.md

typo corrected as suggested by lhoestq

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* Update examples/research_projects/rag-end2end-retriever/finetune_rag.py

type change suggested by lhoestq

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* Update src/transformers/models/rag/retrieval_rag.py

Adding this change as mentioned by lhoestq.

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* completed the minor changes suggested by the reviewers

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
2021-06-01 07:32:26 +01:00
Philip May
cfca638acb
Add MT5ForConditionalGeneration as supported arch. to summarization README (#11961)
* Add MT5ForConditionalGeneration as supported arch.

* Update README.md
2021-05-31 21:24:33 +05:30
Nicholas Vadivelu
1ab147d648
Remove redundant nn.log_softmax in run_flax_glue.py (#11920)
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`

`optax.softmax_cross_entropy` expects unnormalized logits and already applies `nn.log_softmax` internally, so the extra call is not needed here. Since `nn.log_softmax` is idempotent, it shouldn't have made a mathematical difference anyway.

* Remove unused 'flax.linen' import
2021-05-31 15:29:04 +01:00
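A tiny sketch of the point above: `optax.softmax_cross_entropy` consumes raw logits (it applies `log_softmax` internally), so normalizing first is redundant.

```python
import jax
import jax.numpy as jnp
import optax

logits = jnp.array([[2.0, 0.5, -1.0]])        # raw, unnormalized model outputs
labels = jax.nn.one_hot(jnp.array([0]), 3)    # one-hot targets
loss = optax.softmax_cross_entropy(logits, labels).mean()
print(loss)  # same value whether or not log_softmax was applied first
```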
Avital Oliver
2df546918e
Link official Cloud TPU JAX docs (#11892) 2021-05-26 15:44:40 -04:00
Stas Bekman
1b6530104d
[Examples] create model with custom config on the fly (#11798)
* create custom model on the fly

* better wording

* add update_from_string

* cleanup

* cleanup

* Update src/transformers/configuration_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more bool options

* style

* fix logger

* add test

* add the doc

* assert on conflict of options

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-25 10:40:49 -07:00
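A hedged sketch of the "custom config on the fly" workflow this PR enables, using the `update_from_string` helper it adds; the override keys are GPT-2 config fields chosen for illustration.

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("gpt2")
# Comma-separated key=value overrides, as driven by the examples' --config_overrides flag.
config.update_from_string("resid_pdrop=0.2,scale_attn_weights=false")
model = AutoModelForCausalLM.from_config(config)  # fresh weights with the tweaked config
```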
Stas Bekman
6287c929c1
[lm examples] fix overflow in perplexity calc (#11855)
* fix overflow in perplexity calc

* use inf

* fix
2021-05-25 08:11:26 -07:00
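A minimal sketch of the guard described above: `math.exp` overflows for large evaluation losses, so fall back to infinity ("use inf") instead of crashing.

```python
import math

def perplexity_from_loss(eval_loss: float) -> float:
    try:
        return math.exp(eval_loss)
    except OverflowError:
        return float("inf")

print(perplexity_from_loss(2.3))   # ~9.97
print(perplexity_from_loss(1e4))   # inf rather than an OverflowError
```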
Sylvain Gugger
f086652b16
Add option to log only once in multinode training (#11819)
* Add option to log only once in multinode training

* Use an alternate property
2021-05-25 08:03:43 -04:00
Wang Ran (汪然)
b8344a274f
typo (#11858) 2021-05-25 04:23:46 -04:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error (#11839)
* fix_torch_device_generate_test

* remove @

* change pytorch import to flax import
2021-05-24 10:41:10 +01:00