Matt
508a704055
No more Tuple, List, Dict ( #38797 )
...
* No more Tuple, List, Dict
* make fixup
* More style fixes
* Docstring fixes with regex replacement
* Trigger tests
* Redo fixes after rebase
* Fix copies
* [test all]
* update
* [test all]
* update
* [test all]
* make style after rebase
* Patch the hf_argparser test
* Patch the hf_argparser test
* style fixes
* style fixes
* style fixes
* Fix docstrings in Cohere test
* [test all]
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-17 19:37:18 +01:00
Yih-Dar
e8b292e35f
Fix utils/notification_service.py ( #38556 )
...
* fix
* fix
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-03 13:59:31 +00:00
ivarflakstad
5f49e180a6
Add mi300 to amd daily ci workflows definition ( #38415 )
2025-05-28 09:17:41 +02:00
Yih-Dar
eb74cf977b
Use one utils/notification_service.py ( #38379 )
...
* step 1
* step 2
* step 3
* step 4
* step 5
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-26 16:15:29 +02:00
Yih-Dar
4a03044ddb
Hot fix for AMD CI workflow ( #38349 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-25 11:15:31 +02:00
Yih-Dar
d0c9c66d1c
new failure CI reports for all jobs ( #38298 )
...
* new failures
* report_repo_id
* report_repo_id
* report_repo_id
* More fixes
* More fixes
* More fixes
* ruff
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-24 19:15:02 +02:00
Yih-Dar
feec294dea
CI reporting improvements ( #38230 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-20 19:34:58 +02:00
Yih-Dar
b1375177fc
add job links to new model failure report ( #37973 )
...
* update for job link
* style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-06 15:10:29 +02:00
ivarflakstad
afbc293e2b
More fault tolerant notification service ( #37924 )
...
* Let notification service succeed even when artifacts and reported jobs on github have mismatch
* Use default trace msg if no trace msg available
* Add pop_default helper fn
* style
2025-05-05 15:19:48 +02:00
Yuanyuan Chen
da4ff2a5f5
Add Optional to remaining types ( #37808 )
...
More Optional typing
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-28 14:20:45 +01:00
Yih-Dar
d9e76656ae
Fix new failure reports not including anything other than tests/models/ ( #37415 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-10 14:47:23 +02:00
Yih-Dar
4f139f5a50
Send trainer/fsdp/deepspeed CI job reports to a single channel ( #37411 )
...
* send trainer/fsdp/deepspeed channel
* update
* change name
* no .
* final
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-10 13:17:31 +02:00
Yih-Dar
c6814b4ee8
Update ruff to 0.11.2 ( #36962 )
...
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 16:00:11 +01:00
Joao Gante
90b46e983f
Remove old benchmark code ( #35730 )
...
* remove traces of the old deprecated benchmarks
* also remove old tf benchmark example, which uses deleted code
* run doc builder
2025-01-21 17:56:43 +00:00
Yih-Dar
40821a2478
Fix CI slack reporting issue ( #34833 )
...
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-20 21:36:13 +01:00
Yih-Dar
9360f1827d
Tiny update after #34383 ( #34404 )
...
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 12:01:05 +01:00
Yih-Dar
fce1fcfe71
Ping team members for new failed tests in daily CI ( #34171 )
...
* ping
* fix
* fix
* fix
* remove runner
* update members
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-17 16:11:52 +02:00
Yih-Dar
f2122cc6eb
Upload new model failure report to Hub ( #32264 )
...
upload
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-29 09:42:54 +02:00
Yih-Dar
d4564df1d4
Revive Nightly/Past CI ( #31159 )
...
* build
* build
* build
* build
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-20 18:57:24 +02:00
Yih-Dar
3714f3f86b
Upload (daily) CI results to Hub ( #31168 )
...
* build
* build
* build
* build
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-04 21:20:54 +02:00
Yih-Dar
a3cdff417b
save the list of new model failures ( #31013 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 15:20:25 +02:00
Yih-Dar
1432f641b8
Finally fix the missing new model failure CI report ( #30968 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-22 17:48:26 +02:00
Yih-Dar
82c1625ec3
Save other CI jobs' result (torch/tf pipeline, example, deepspeed etc) ( #30699 )
...
* update
* update
* update
* update
* update
* update
* update
* update
* Update utils/notification_service.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-13 17:27:44 +02:00
Yih-Dar
884e3b1c53
Rename artifact name prev_ci_results to ci_results ( #30697 )
...
* rename
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-07 16:59:16 +02:00
Yih-Dar
fbb41cd420
consistent job / pytest report / artifact name correspondence ( #30392 )
...
* better names
* run better names
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-24 22:32:42 +02:00
Marc Sun
58a939c6b7
Fix quantization tests ( #29914 )
...
* revert back to torch 2.1.1
* run test
* switch to torch 2.2.1
* update dockerfile
* fix awq tests
* fix test
* run quanto tests
* update tests
* split quantization tests
* fix
* fix again
* final fix
* fix report artifact
* build docker again
* Revert "build docker again"
This reverts commit 399a5f9d93.
* debug
* revert
* style
* new notification system
* testing notification
* rebuild docker
* fix_prev_ci_results
* typo
* remove warning
* fix typo
* fix artifact name
* debug
* issue fixed
* debug again
* fix
* fix time
* test notif with failing test
* typo
* issues again
* final fix ?
* run all quantization tests again
* remove name to clear space
* revert modification done on workflow
* fix
* build docker
* build only quant docker
* fix quantization ci
* fix
* fix report
* better quantization_matrix
* add print
* revert to the basic one
2024-04-09 17:10:29 +02:00
Yih-Dar
b17b54d3dd
Refactor daily CI workflow ( #30012 )
...
* separate jobs
* separate jobs
* use channel name directly instead of ID
* use channel name directly instead of ID
* use channel name directly instead of ID
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-05 15:49:51 +02:00
Marc Sun
f54d82cace
[CI] Quantization workflow ( #29046 )
...
* [CI] Quantization workflow
* build dockerfile
* fix dockerfile
* update self-scheduled.yml
* test build dockerfile on push
* fix torch install
* update to python 3.10
* update aqlm version
* uncomment build dockerfile
* tests if the scheduler works
* fix docker
* do not trigger on push again
* add additional runs
* test again
* all good
* style
* Update .github/workflows/self-scheduled.yml
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* test build dockerfile with torch 2.2.0
* fix extra
* clean
* revert changes
* Revert "revert changes"
This reverts commit 4cb52b8822.
* revert correct change
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-28 10:09:25 -05:00
Yih-Dar
4735866141
Split daily CI using 2 level matrix ( #28773 )
...
* update / add new workflow files
* Add comment
* Use env.NUM_SLICES
* use scripts
* use scripts
* use scripts
* Fix
* using one script
* Fix
* remove unused file
* update
* fail-fast: false
* remove unused file
* fix
* fix
* use matrix
* inputs
* style
* update
* fix
* fix
* no model name
* add doc
* allow args
* style
* pass argument
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-31 18:04:43 +01:00
Yih-Dar
95346e9dcd
Add artifact name in job step to maintain job / artifact correspondence ( #28682 )
...
* avoid using job name
* apply to other files
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-31 15:58:17 +01:00
Yih-Dar
79e7655906
Fix notification_service.py ( #27903 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-08 14:55:02 +01:00
Yih-Dar
9f1f11a2e7
Show new failing tests in a more clear way in slack report ( #27881 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-07 15:09:30 +01:00
Yih-Dar
e0d2e69582
restructure AMD scheduled CI ( #27743 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-04 15:32:05 +01:00
fxmarty
f93c1e9ece
Add RoCm scheduled CI & upgrade RoCm CI to PyTorch 2.1 ( #26940 )
...
* add scheduled ci on amdgpu
* fix likely typo
* more tests, avoid parallelism
* precise comment
* fix report channel
* trigger docker build on this branch
* fix
* fix
* run rocm scheduled ci
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-11-21 14:55:13 +01:00
Yih-Dar
9dc4ce9ea7
Disable CI runner check ( #27170 )
...
Disable runner check
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-31 11:59:21 +01:00
Yih-Dar
211ad4c9cc
Fix slack report failing for doctest ( #27042 )
...
* fix slack report for doctest
* separate reports
* style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 10:48:24 +01:00
Yih-Dar
12cc123359
Better way to run AMD CI with different flavors ( #26634 )
...
* Enable testing against mi250
* Change BERT to trigger tests
* Revert BERT's change
* AMD CI
* AMD CI
---------
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-16 16:24:30 +02:00
Yih-Dar
41a8fa4e14
Add the number of model test failures to slack CI report ( #24207 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 21:27:10 +02:00
Yih-Dar
60f9649653
Fix DeepSpeed CI job link in Past CI ( #22967 )
...
* Fix job link
* fix artifact name logic
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-25 09:52:19 +02:00
Yih-Dar
5166c30e29
Fix a minor bug in CI slack report ( #22906 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-21 20:36:35 +02:00
Yih-Dar
3080fb714f
Fix Slack report for Nightly CI and Past CI ( #22901 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-21 11:23:16 +02:00
Yih-Dar
435abb22cb
Fix counting in Slack report for some jobs ( #22913 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-21 11:22:23 +02:00
Yih-Dar
648bd5a8aa
Show diff between 2 CI runs on Slack reports ( #22798 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-19 19:27:37 +02:00
Yih-Dar
0fe6c6bdca
(Re-)Enable Nightly + Past CI ( #22393 )
...
* Enable Nightly + Past CI
* put schedule
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-30 21:06:35 +02:00
Yih-Dar
90a7c95496
Show the number of huggingface_hub warnings in CI report ( #22054 )
...
* show hfh warnings
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-09 15:39:05 +01:00
Yih-Dar
99c5c6079d
Update notification_service.py ( #21992 )
...
* better check
* better check
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-07 14:20:39 +01:00
Yih-Dar
aab895c396
Make Slack CI reporting stronger ( #21823 )
...
* Use token
* Avoid failure
* better error
* Fix
* fix style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-28 17:12:44 +01:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting ( #21480 )
...
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
2023-02-06 18:10:56 -05:00
Yih-Dar
e8d448edcf
extract warnings in GH workflows ( #20487 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-29 15:58:54 +01:00
Yih-Dar
0cea8d5555
Add offline runners info in the Slack report ( #19169 )
...
* send slack report for offline runners
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-23 19:23:05 +02:00