transformers/tests/models/cohere
Zach Mueller · d9f733625c · 2024-10-23 11:24:57 -04:00
Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)

* Enable grad accum fix across all models + trainer fully in forward()
* handle peft case
* Account for DDP: need to run scale tests
* Use accelerator state
* Quality
* Guard
* Experiment w/ only fairseq fix
* Fairseq only
* Revert multiply_grads fix
* Mult by grad accum to fully bring back solution
* Style
* Good to go now
* Skip fx tests for now
* Bookmark
* Working now
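The latest commit tracks PR #34283, which moves the gradient accumulation loss correction fully into the models' forward(). The underlying idea, roughly, is that averaging a mean loss per micro-batch biases the gradient when micro-batches contain different numbers of non-padded tokens, so the trainer instead supplies the total label-token count for the whole accumulated batch and the loss is normalized once. A minimal sketch of that idea, assuming a hypothetical causal_lm_loss helper and a num_items_in_batch count computed by the trainer (names are illustrative, not the exact transformers API):

import torch.nn.functional as F

def causal_lm_loss(logits, labels, num_items_in_batch=None):
    # Shift so each position predicts the next token, then flatten.
    logits = logits[..., :-1, :].contiguous().view(-1, logits.size(-1))
    labels = labels[..., 1:].contiguous().view(-1)

    if num_items_in_batch is None:
        # Per-micro-batch mean: fine without accumulation, but averaging these
        # means across micro-batches with different numbers of non-padded
        # tokens skews the effective per-token weight.
        return F.cross_entropy(logits, labels, ignore_index=-100, reduction="mean")

    # Accumulation-aware path: sum the token losses and divide by the number
    # of label tokens in the full accumulated batch, so every token
    # contributes equally regardless of which micro-batch it lands in.
    loss = F.cross_entropy(logits, labels, ignore_index=-100, reduction="sum")
    return loss / num_items_in_batch

In this sketch the trainer would count the non-ignored label tokens across all micro-batches of one optimizer step and pass that total into forward(); the file listing below shows test_modeling_cohere.py was last touched by that change.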
__init__.py · Cohere Model Release (#29622) · 2024-03-15 14:29:11 +01:00
test_modeling_cohere.py · Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283) · 2024-10-23 11:24:57 -04:00
test_tokenization_cohere.py · Skip tests properly (#31308) · 2024-06-26 21:59:08 +01:00