* doc: Trainer.hyperparameter_search docstring discrepancy solved
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
19d58d31f has introduced a context manager to manage subtests of
test_training_gradient_checkpointing. However, test body was not
moved under "with" statement. Thus, while tests are correctly
marked as skipped, test bodies were still executed. In some cases,
as with llama this caused attribute errors.
Fixes: #34722
Fixes: 19d58d31f ("Add MLLama (#33703)")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* Add model skeletion with transformers-cli add-new-model-like
* Convert config to modular, add rms_norm_eps, delete clip_qkv
* Convert model to modular, add RMSNorm
* Add flash attention with qk norm and no qkv clipping
* Add decoder layer with RMSNorm after attention/feedforward layers
* Add base and causal model
* Add converter improvements from OLMo repo
* Update weight loading in OLMo to HF converter
* Set correct default for rms_norm_eps
* Set correct pipeline_model_mapping in test
* Run make fixup
* Fix model type
* Re-run modular conversion
* Manually set config docs to fix build errors
* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors
* Start updating tests
* Update tests
* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124
* Rename input_layernorm and post_attention_layernorm to reflect their ops better
* Use correct tokenizer
* Remove test unsupported by GPT2 tokenizer
* Create GenerationConfig outside of from_pretrained call
* Use simpler init file structure
* Add explicit __all__ to support simplified init
* Make safetensor serialization the default
* Update OLMo November 2024 docs
* remove v4.44 deprecations
* PR comments
* deprecations scheduled for v4.50
* hub version update
* make fiuxp
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove FSDP wrapping from sub-models.
* solve conflict trainer.py
* make fixup
* add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size
* put back extract_model_from_parallel
* use transformers unwrap_model
* Retain newlines in chat template when
* Add try/except
* Add regression test
* Simplify test
* Apply suggestions from code review
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* add XPU path
* use accelerate API
* Update docs/source/en/tasks/semantic_segmentation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update more places with accelerate API
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/torchscript.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Merge troubleshooting.md with this Branch
* Update _toctree.yml
* Update torchscript.md
* Update troubleshooting.md
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update llm_engine.py
- Added support for optional token and max_tokens parameters in the constructor.
- Provided usage examples and detailed documentation for each method.
* Add docs/source/ar/trainer.md to Add_docs_source_ar_trainer.md
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update trainer.md
* Update trainer.md
* Update trainer.md
* Create _toctree.yml
* Delete docs/source/ar/_toctree.yml
* Update _toctree.yml - add trainer
* Update _toctree.yml
* merge serialization.md into this branch
* merge sagemaker.md into this PR
* Update _toctree.yml
* Update docs/source/ar/trainer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ar/trainer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* save/load sub-configs
* nit forgot these
* fix copies
* move test to common
* use dict for sub-configs
* add load-save-laod test
* clean up modeling check
* oops this are correct keys
* fix some tests, missed some composite configs
* this model was missed
FIX Broken repr of TorchAoConfig
The __repr__ method references a non-existent self.kwargs. This is now
fixed.
There does not appear to be a uniform way of defining __repr__ for
quantization configs. I copied the method as implemented for HQQ:
e2ac16b28a/src/transformers/utils/quantization_config.py (L285-L287)
* Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized.
* Propagate the quantization state using a context manager
* make fixup