Raushan Turganbay
e26ae89281
[docs] update cache docs with new info (#38775)
...
* update docs with new info
* Update docs/source/en/kv_cache.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-13 07:10:56 +00:00
Joao Gante
beaed8ce01
[generate] move `SinkCache` to a `custom_generate` repo (#38399)
...
remove sink cache
2025-06-02 12:13:30 +02:00
Manuel de Prada Corral
2c60a442f3
fix link in kv_cache.md (#37652)
...
fix typo in kv_cache.md
2025-04-21 09:01:11 -07:00
Steven Liu
c0f8d055ce
[docs] Redesign (#31757)
...
* toctree
* not-doctested.txt
* collapse sections
* feedback
* update
* rewrite get started sections
* fixes
* fix
* loading models
* fix
* customize models
* share
* fix link
* contribute part 1
* contribute pt 2
* fix toctree
* tokenization pt 1
* Add new model (#32615)
* v1 - working version
* fix
* fix
* fix
* fix
* rename to correct name
* fix title
* fixup
* rename files
* fix
* add copied from on tests
* rename to `FalconMamba` everywhere and fix bugs
* fix quantization + accelerate
* fix copies
* add `torch.compile` support
* fix tests
* fix tests and add slow tests
* copies on config
* merge the latest changes
* fix tests
* add few lines about instruct
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* "to be not" -> "not to be" (#32636)
* "to be not" -> "not to be"
* Update sam.md
* Update trainer.py
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* fix hfoption tag
* tokenization pt. 2
* image processor
* fix toctree
* backbones
* feature extractor
* fix file name
* processor
* update not-doctested
* update
* make style
* fix toctree
* revision
* make fixup
* fix toctree
* fix
* make style
* fix hfoption tag
* pipeline
* pipeline gradio
* pipeline web server
* add pipeline
* fix toctree
* not-doctested
* prompting
* llm optims
* fix toctree
* fixes
* cache
* text generation
* fix
* chat pipeline
* chat stuff
* xla
* torch.compile
* cpu inference
* toctree
* gpu inference
* agents and tools
* gguf/tiktoken
* finetune
* toctree
* trainer
* trainer pt 2
* optims
* optimizers
* accelerate
* parallelism
* fsdp
* update
* distributed cpu
* hardware training
* gpu training
* gpu training 2
* peft
* distrib debug
* deepspeed 1
* deepspeed 2
* chat toctree
* quant pt 1
* quant pt 2
* fix toctree
* fix
* fix
* quant pt 3
* quant pt 4
* serialization
* torchscript
* scripts
* tpu
* review
* model addition timeline
* modular
* more reviews
* reviews
* fix toctree
* reviews reviews
* continue reviews
* more reviews
* modular transformers
* more review
* zamba2
* fix
* all frameworks
* pytorch
* supported model frameworks
* flashattention
* rm check_table
* not-doctested.txt
* rm check_support_list.py
* feedback
* updates/feedback
* review
* feedback
* fix
* update
* feedback
* updates
* update
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
Fanli Lin
531d1511f5
[docs] no hard-coding cuda (#36043)
...
make device-agnostic
2025-02-05 08:22:33 -08:00
Joao Gante
ece8c42488
Test: generate with `torch.compile(model.forward)` as a fast test (#34544)
2025-01-28 14:10:38 +00:00
Steven Liu
f11f57c925
[doctest] Fixes (#35863)
...
doctest fixes
2025-01-26 15:26:38 -08:00
Fanli Lin
baa3b22137
[docs] add a comment that offloading requires CUDA GPU (#35055)
...
* add comment to offloading
* Update docs/source/en/kv_cache.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-04 07:48:34 -08:00
Joao Gante
80b774eb29
Cache: don't show warning in forward passes when `past_key_values` is None (#33541)
2024-09-19 12:02:46 +01:00
Joao Gante
2b789f27f3
Docs: add more cross-references to the KV cache docs (#33323)
...
* add more cross-references
* nit
* import guard
* more import guards
* nit
* Update src/transformers/generation/configuration_utils.py
2024-09-06 10:22:00 +01:00
Raushan Turganbay
ebbe8d8014
Cache docs: update (#32929)
...
* some changes
* more updates
* fix cache copy
* nits
* nits
* add tests
2024-09-04 15:05:31 +05:00
Gerben van V
5129671290
Add a static cache that offloads to the CPU or other device (#32161)
...
* Add a static cache that offloads to the CPU or other device
* Fix PR comments, add unit-tests
2024-08-29 11:51:09 +02:00
Raushan Turganbay
37c5ca5eb9
Cache: create docs (#32150)
...
* draft
* updates
* works?
* try adding python example in hidden section
* another try
* how do i render python
* format as html code?
* Update docs/source/en/kv_cache.md
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update docs/source/en/kv_cache.md
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update docs/source/en/kv_cache.md
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update docs/source/en/kv_cache.md
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update docs/source/en/kv_cache.md
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* one more small update
* should render hidden section now
* add outputs
* fix links
* check links
* update all links
* update with offloaded cache
* all cache is importable, so they appear in docs
* fix copies
* docstring...
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-08-06 10:24:19 +05:00