transformers/docs/source/main_classes
Stas Bekman c6d664849b
[DeepSpeed] ZeRO Stage 3 (#10753)
* synced gpus

* fix

* fix

* need to use t5-small for quality tests

* notes

* complete merge

* fix a disappearing std stream problem

* start zero3 tests

* wip

* tune params

* sorting out the pre-trained model loading

* reworking generate loop wip

* wip

* style

* fix tests

* split the tests

* refactor tests

* wip

* parameterized

* fix

* work out the resume from non-ds checkpoint pass + test

* cleanup

* remove no longer needed code

* split getter/setter functions

* complete the docs

* suggestions

* gpus and their compute capabilities link

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* style

* remove invalid param

* automatically configure zero3 params that rely on hidden size

* make _get_resized_embeddings zero3-aware

* add test exercising resize_token_embeddings()

* add docstring

2021-04-08 09:53:01 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| callback.rst | Add example for registering callbacks with trainers (#10928) | 2021-04-05 12:27:23 -04:00 |
| configuration.rst | Copyright (#8970) | 2020-12-07 18:36:34 -05:00 |
| feature_extractor.rst | Add ImageFeatureExtractionMixin (#10905) | 2021-03-26 11:23:56 -04:00 |
| logging.rst | Logging propagation (#10092) | 2021-02-09 10:27:49 -05:00 |
| model.rst | [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) | 2020-12-16 13:03:32 +01:00 |
| optimizer_schedules.rst | Seq2seq trainer (#9241) | 2020-12-22 11:33:44 -05:00 |
| output.rst | Remove unsupported methods from ModelOutput doc (#10505) | 2021-03-03 14:55:18 -05:00 |
| pipelines.rst | TableQuestionAnsweringPipeline (#9145) | 2020-12-16 12:31:50 -05:00 |
| processors.rst | Fix documentation links always pointing to master. (#9217) | 2021-01-05 06:18:48 -05:00 |
| tokenizer.rst | Documentation about loading a fast tokenizer within Transformers (#11029) | 2021-04-05 10:51:16 -04:00 |
| trainer.rst | [DeepSpeed] ZeRO Stage 3 (#10753) | 2021-04-08 09:53:01 -07:00 |