Mirror of https://github.com/huggingface/transformers.git, synced 2025-07-31 10:12:23 +06:00
* Enable grad accum fix across all models + trainer fully in forward()
* Handle PEFT case
* Account for DDP: need to run scale tests
* Use accelerator state
* Quality
* Guard
* Experiment with only the fairseq fix
* Fairseq only
* Revert multiply_grads fix
* Multiply by grad accum to fully bring back solution
* Style
* Good to go now
* Skip fx tests for now
* Bookmark
* Working now
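For context on the commit above: the gradient accumulation bug it addresses arises when micro-batches contain uneven numbers of (non-padded) items. Taking the mean loss per micro-batch and then dividing by the number of accumulation steps is not equivalent to training on one large batch. Moving the normalization "fully in forward()" means summing the loss and dividing by the total item count across all accumulated micro-batches. The following is a minimal, hypothetical sketch of that idea only; the toy model, the data, and the `num_items_in_batch` name are illustrative assumptions, not the actual Trainer code.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Sum the loss inside the forward pass instead of averaging per micro-batch.
loss_fct = nn.CrossEntropyLoss(reduction="sum")

grad_accum_steps = 4
# Uneven micro-batch sizes are what make the naive scaling wrong.
micro_batches = [
    (torch.randn(n, 8), torch.randint(0, 4, (n,)))
    for n in (2, 7, 3, 5)
]
# Total item count over the whole accumulated batch, computed up front.
num_items_in_batch = sum(labels.numel() for _, labels in micro_batches)

optimizer.zero_grad()
for inputs, labels in micro_batches:
    logits = model(inputs)
    # Correct: normalize the summed loss by the global item count.
    # Naive/buggy alternative: per-micro-batch mean divided by
    # grad_accum_steps, which over-weights small micro-batches.
    loss = loss_fct(logits, labels) / num_items_in_batch
    loss.backward()
optimizer.step()
```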
__init__.py
test_modeling_cohere.py
test_tokenization_cohere.py