transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-20 04:58:22 +06:00

Stas Bekman 2df34f4aba [trainer] deepspeed integration (#9211)	2021-01-12 19:05:18 -08:00

* deepspeed integration
* style
* add test
* ds wants to do its own backward
* fp16 assert
* Update src/transformers/training_args.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* for clarity extract what args are being passed to deepspeed
* introduce the concept of self.wrapped_model
* s/self.wrapped_model/self.model_wrapped/
* complete transition to self.wrapped_model / self.model
* fix
* doc
* give ds its own init
* add custom overrides, handle bs correctly
* fix test
* clean up model_init logic, fix small bug
* complete fix
* collapse --deepspeed_config into --deepspeed
* style
* start adding doc notes
* style
* implement hf2ds optimizer and scheduler configuration remapping
* oops
* call get_num_training_steps absolutely when needed
* workaround broken auto-formatter
* deepspeed_config arg is no longer needed - fixed in deepspeed master
* use hf's fp16 args in config
* clean
* start on the docs
* rebase cleanup
* finish up --fp16
* clarify the supported stages
* big refactor thanks to discovering deepspeed.init_distributed
* cleanup
* revert fp16 part
* add checkpoint-support
* more init ds into integrations
* extend docs
* cleanup
* unfix docs
* clean up old code
* imports
* move docs
* fix logic
* make it clear which file it's referring to
* document nodes/gpus
* style
* wrong format
* style
* deepspeed handles gradient clipping
* easier to read
* major doc rewrite
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docs
* switch to AdamW optimizer
* style
* Apply suggestions from code review
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* clarify doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
_static	v4.1.1 docs	2020-12-17 11:28:38 -05:00
imgs	Guide to fixed-length model perplexity evaluation (#5449)	2020-07-07 16:04:15 -06:00
internal	Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150)	2021-01-06 17:11:42 +01:00
main_classes	[trainer] deepspeed integration (#9211)	2021-01-12 19:05:18 -08:00
model_doc	Improve LayoutLM (#9476)	2021-01-12 09:26:32 -05:00
benchmarks.rst	Make doc styler detect lists on rst (#9488)	2021-01-11 08:53:41 -05:00
bertology.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
conf.py	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
contributing.md	Update installation page and add contributing to the doc (#5084)	2020-06-17 14:01:10 -04:00
converting_tensorflow_models.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
custom_datasets.rst	correct docs (#9378)	2021-01-04 17:27:29 +01:00
examples.md	per_device instead of per_gpu/error thrown when argument unknown (#4618)	2020-05-27 11:36:55 -04:00
favicon.ico	Adding usage examples for common tasks (#2850)	2020-02-25 13:48:24 -05:00
glossary.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
index.rst	[TFBart] Split TF-Bart (#9497)	2021-01-12 02:06:32 +01:00
installation.md	Copyright (#8970)	2020-12-07 18:36:34 -05:00
migration.md	Copyright (#8970)	2020-12-07 18:36:34 -05:00
model_sharing.rst	up (#9454)	2021-01-07 11:51:02 +01:00
model_summary.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
multilingual.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
notebooks.md	Update notebooks (#3620)	2020-04-06 14:32:39 -04:00
perplexity.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
philosophy.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
preprocessing.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
pretrained_models.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
quicktour.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
serialization.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
task_summary.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
testing.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
tokenizer_summary.rst	Minor documentation revisions from copyediting (#9266)	2020-12-23 10:15:49 -05:00
training.rst	[trainer] deepspeed integration (#9211)	2021-01-12 19:05:18 -08:00