transformers/tests/models/cohere
Zach Mueller · d9f733625c · 2024-10-23 11:24:57 -04:00
Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)

* Enable grad accum fix across all models + trainer fully in forward()
* handle peft case
* Account for DDP: need to run scale tests
* Use accelerator state
* Quality
* Guard
* Experiment w/ only fairseq fix
* Fairseq only
* Revert multiply_grads fix
* Mult by grad accum to fully bring back solution
* Style
* Good to go now
* Skip fx tests for now
* Bookmark
* Working now
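The latest commit tracks PR #34283, which moves the gradient accumulation loss correction fully into the models' forward(). The underlying idea, roughly, is that averaging a mean loss per micro-batch biases the gradient when micro-batches contain different numbers of non-padded tokens, so the trainer instead supplies the total label-token count for the whole accumulated batch and the loss is normalized once. A minimal sketch of that idea, assuming a hypothetical causal_lm_loss helper and a num_items_in_batch count computed by the trainer (names are illustrative, not the exact transformers API):

import torch.nn.functional as F

def causal_lm_loss(logits, labels, num_items_in_batch=None):
    # Shift so each position predicts the next token, then flatten.
    logits = logits[..., :-1, :].contiguous().view(-1, logits.size(-1))
    labels = labels[..., 1:].contiguous().view(-1)

    if num_items_in_batch is None:
        # Per-micro-batch mean: fine without accumulation, but averaging these
        # means across micro-batches with different numbers of non-padded
        # tokens skews the effective per-token weight.
        return F.cross_entropy(logits, labels, ignore_index=-100, reduction="mean")

    # Accumulation-aware path: sum the token losses and divide by the number
    # of label tokens in the full accumulated batch, so every token
    # contributes equally regardless of which micro-batch it lands in.
    loss = F.cross_entropy(logits, labels, ignore_index=-100, reduction="sum")
    return loss / num_items_in_batch

In this sketch the trainer would count the non-ignored label tokens across all micro-batches of one optimizer step and pass that total into forward(); the file listing below shows test_modeling_cohere.py was last touched by that change.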
__init__.py · Cohere Model Release (#29622) · 2024-03-15 14:29:11 +01:00
test_modeling_cohere.py · Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283) · 2024-10-23 11:24:57 -04:00
test_tokenization_cohere.py · Skip tests properly (#31308) · 2024-06-26 21:59:08 +01:00