Cyril Vallez
|
60226c6ff3
|
TP initialization module-by-module (#35996)
* module-by-module loading!
* Update modeling_utils.py
* dtyle and comments
* Update modeling_utils.py
* Update modeling_utils.py
* Update test
* Update modeling_utils.py
* Update modeling_utils.py
* Update test_tp.py
* Update test_tp.py
* Update modeling_utils.py
* re-trigger CIs
* re-trigger CIs
|
2025-02-19 14:04:57 +01:00 |
|
Arthur
|
7eecdf2a86
|
Update-tp test (#35844)
* update test for now
* up
* cleanup
* update todo
|
2025-02-03 09:37:02 +01:00 |
|
Cyril Vallez
|
f48ecd7608
|
Fix TP initialization (#35860)
* fix tp
* Update modeling_utils.py
* style
* style
* Update test_tp.py
* Update test_tp.py
* style
* Update test_tp.py
* Update test_tp.py
* Update test_tp.py
* Update test_tp.py
|
2025-01-28 15:07:37 +01:00 |
|
Ke Wen
|
20142ab542
|
Simplify Tensor Parallel implementation with PyTorch TP (#34184)
* Simplify Tensor Parallel implementation with PyTorch TP
* Move tp_plan to config
* Lint
* Format and warning
* Disable copy-from check
* Conditionally get attr from config
* make fix-copies
* Move base_model_tp_plan to PretrainedConfig
* Move TP into from_pretrained
* Add device context for load
* Do not serialize
* Move _tp_plan setting to post_init
* Add has_tp_plan
* Add test_tp
* Add 'Multi-gpu inference' doc
* Add backward support for device type identification
* Auto-detect accelerator
* supports_tp_plan
* copyright year
* Fix copy
|
2024-11-18 19:51:49 +01:00 |
|