Anthony Song
346f1eebbd
docs: fix typo ( #37567 )
Co-authored-by: Anthony <anthony.song@capitalone.com>
2025-04-17 14:54:44 +01:00
Nouamane Tazi
51ed61e2f0
Mention UltraScale Playbook 🌌 in docs ( #36589 )
2025-03-06 14:48:11 -08:00
co63oc
37508816d6
chore: Fix typos in docs and examples ( #36524 )
Fix typos in docs and examples
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-04 13:47:41 +00:00
Steven Liu
c0f8d055ce
[docs] Redesign ( #31757 )
* toctree
* not-doctested.txt
* collapse sections
* feedback
* update
* rewrite get started sections
* fixes
* fix
* loading models
* fix
* customize models
* share
* fix link
* contribute part 1
* contribute pt 2
* fix toctree
* tokenization pt 1
* Add new model (#32615 )
* v1 - working version
* fix
* fix
* fix
* fix
* rename to correct name
* fix title
* fixup
* rename files
* fix
* add copied from on tests
* rename to `FalconMamba` everywhere and fix bugs
* fix quantization + accelerate
* fix copies
* add `torch.compile` support
* fix tests
* fix tests and add slow tests
* copies on config
* merge the latest changes
* fix tests
* add few lines about instruct
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* "to be not" -> "not to be" (#32636 )
* "to be not" -> "not to be"
* Update sam.md
* Update trainer.py
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* fix hfoption tag
* tokenization pt. 2
* image processor
* fix toctree
* backbones
* feature extractor
* fix file name
* processor
* update not-doctested
* update
* make style
* fix toctree
* revision
* make fixup
* fix toctree
* fix
* make style
* fix hfoption tag
* pipeline
* pipeline gradio
* pipeline web server
* add pipeline
* fix toctree
* not-doctested
* prompting
* llm optims
* fix toctree
* fixes
* cache
* text generation
* fix
* chat pipeline
* chat stuff
* xla
* torch.compile
* cpu inference
* toctree
* gpu inference
* agents and tools
* gguf/tiktoken
* finetune
* toctree
* trainer
* trainer pt 2
* optims
* optimizers
* accelerate
* parallelism
* fsdp
* update
* distributed cpu
* hardware training
* gpu training
* gpu training 2
* peft
* distrib debug
* deepspeed 1
* deepspeed 2
* chat toctree
* quant pt 1
* quant pt 2
* fix toctree
* fix
* fix
* quant pt 3
* quant pt 4
* serialization
* torchscript
* scripts
* tpu
* review
* model addition timeline
* modular
* more reviews
* reviews
* fix toctree
* reviews reviews
* continue reviews
* more reviews
* modular transformers
* more review
* zamba2
* fix
* all frameworks
* pytorch
* supported model frameworks
* flashattention
* rm check_table
* not-doctested.txt
* rm check_support_list.py
* feedback
* updates/feedback
* review
* feedback
* fix
* update
* feedback
* updates
* update
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
Mehant Kammakomati
c3ba53303b
feat: add support for tensor parallel training workflow with accelerate ( #34194 )
* feat: add support for tensor parallel flow using accelerate
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* fix: add tp degree to env variable
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* fix: add version check for accelerate to allow TP
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* docs: tensor parallelism
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* nit: rename plugin name
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* fix: guard accelerate version before allow tp
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
* docs: add more docs and updates related to TP
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
---------
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-18 14:05:46 +01:00
Stas Bekman
9dc1efa5d4
DeepSpeed github repo move sync ( #36021 )
deepspeed github repo move
2025-02-05 08:19:31 -08:00
Henry Hyeonmok Ko
31299670cd
Multiple typo fixes in Tutorials docs ( #35035 )
* Fixed typo in multi gpu docs and OLMoE version
* Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction
* Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs
2024-12-02 15:26:34 +00:00
Rémy Léone
22b41b3f8a
Update perf_train_gpu_many.md ( #31451 )
* Update perf_train_gpu_many.md
* Update docs/source/en/perf_train_gpu_many.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_train_gpu_many.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-06-18 11:00:26 -07:00
Lysandre Debut
f497f564bb
Update all references to canonical models ( #29001 )
* Script & Manual edition
* Update
2024-02-16 08:16:58 +01:00
Klaus Hipp
fe3df9d5b3
[Docs] Add language identifiers to fenced code blocks ( #28955 )
Add language identifiers to code blocks
2024-02-12 10:48:31 -08:00
Peter Götz
2875195887
[docs] Improve visualization for vertical parallelism ( #28583 )
The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change visualizes the model indeed vertically.
2024-01-25 17:55:11 +00:00
Steven Liu
01c081d138
[docs] Trainer docs ( #28145 )
* fsdp, debugging, gpu selection
* fix hfoption
* fix
2023-12-20 10:37:23 -08:00
Peter Pan
ce31508134
docs: replace torch.distributed.run by torchrun ( #27528 )
* docs: replace torch.distributed.run by torchrun
`transformers` now officially support pytorch >= 1.10.
The entrypoint `torchrun` is present from 1.10 onwards.
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
* Update src/transformers/trainer.py
with @ArthurZucker's suggestion
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-11-27 16:26:33 +00:00
Maria Khalusova
9beb2737d7
[docs] fixed links with 404 ( #27327 )
* fixed links with 404
* make style
2023-11-06 19:45:03 +00:00
Maria Khalusova
9333bf0769
[docs] Performance docs refactor p.2 ( #26791 )
* initial edits
* improvements for clarity and flow
* improvements for clarity and flow, removed the repeated section
* removed two docs that had no content
* Revert "removed two docs that had no content"
This reverts commit e98fa2fa0d.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feedback addressed
* more feedback addressed
* feedback addressed
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-24 13:10:06 -04:00
statelesshz
8ba26c18cf
deprecate `sharded_ddp` training argument ( #24825 )
* deprecate fairscale's ShardedDDP
* fix code style
* roll back
* deprecate the `sharded_ddp` training argument
---------
Co-authored-by: jihuazhong <jihuazhong1@huawei.com>
2023-07-17 06:57:42 -04:00
Sylvain Gugger
eb849f6604
Migrate doc files to Markdown. ( #24376 )
* Rename index.mdx to index.md
* With saved modifs
* Address review comment
* Treat all files
* .mdx -> .md
* Remove special char
* Update utils/tests_fetcher.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-06-20 18:07:47 -04:00