3 Commits

Author SHA1 Message Date
Ilyas Moutawwakil
18e0cae207
Fix many HPU failures in the CI (#39066)
* more torch.hpu patches

* increase top_k because it results in flaky behavior when Temperature, TopP and TopK are used together, which ends up killing beams early.

* remove temporary fix

* fix scatter operation when input and src are the same

* trigger

* fix and reduce

* skip finding batch size as it makes the hpu go loco

* fix fsdp (yay all are passing)

* fix checking equal nan values

* style

* remove models list

* order

* rename to cuda_extensions

* Update src/transformers/trainer.py
2025-07-03 11:17:27 +02:00
Yih-Dar
fbb41cd420
consistent job / pytest report / artifact name correspondence (#30392)
* better names

* run better names

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-24 22:32:42 +02:00
Yih-Dar
4735866141
Split daily CI using 2 level matrix (#28773)
* update / add new workflow files

* Add comment

* Use env.NUM_SLICES

* use scripts

* use scripts

* use scripts

* Fix

* using one script

* Fix

* remove unused file

* update

* fail-fast: false

* remove unused file

* fix

* fix

* use matrix

* inputs

* style

* update

* fix

* fix

* no model name

* add doc

* allow args

* style

* pass argument

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-31 18:04:43 +01:00