Lysandre Debut
d538293f62
Transformers cli clean command (#37657)
* transformers-cli -> transformers
* Chat command works with positional argument
* update doc references to transformers-cli
* doc headers
* deepspeed
---------
Co-authored-by: Joao Gante <joao@huggingface.co>
2025-04-30 12:15:43 +01:00
AfafEL
cfe666919e
Update model card for Gemma (#37674)
* Update Gemma model card
* Updated after review
* Update following review
2025-04-24 09:58:46 -07:00
Steven Liu
c0f8d055ce
[docs] Redesign (#31757)
* toctree
* not-doctested.txt
* collapse sections
* feedback
* update
* rewrite get started sections
* fixes
* fix
* loading models
* fix
* customize models
* share
* fix link
* contribute part 1
* contribute pt 2
* fix toctree
* tokenization pt 1
* Add new model (#32615)
* v1 - working version
* fix
* fix
* fix
* fix
* rename to correct name
* fix title
* fixup
* rename files
* fix
* add copied from on tests
* rename to `FalconMamba` everywhere and fix bugs
* fix quantization + accelerate
* fix copies
* add `torch.compile` support
* fix tests
* fix tests and add slow tests
* copies on config
* merge the latest changes
* fix tests
* add few lines about instruct
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* "to be not" -> "not to be" (#32636)
* "to be not" -> "not to be"
* Update sam.md
* Update trainer.py
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* fix hfoption tag
* tokenization pt. 2
* image processor
* fix toctree
* backbones
* feature extractor
* fix file name
* processor
* update not-doctested
* update
* make style
* fix toctree
* revision
* make fixup
* fix toctree
* fix
* make style
* fix hfoption tag
* pipeline
* pipeline gradio
* pipeline web server
* add pipeline
* fix toctree
* not-doctested
* prompting
* llm optims
* fix toctree
* fixes
* cache
* text generation
* fix
* chat pipeline
* chat stuff
* xla
* torch.compile
* cpu inference
* toctree
* gpu inference
* agents and tools
* gguf/tiktoken
* finetune
* toctree
* trainer
* trainer pt 2
* optims
* optimizers
* accelerate
* parallelism
* fsdp
* update
* distributed cpu
* hardware training
* gpu training
* gpu training 2
* peft
* distrib debug
* deepspeed 1
* deepspeed 2
* chat toctree
* quant pt 1
* quant pt 2
* fix toctree
* fix
* fix
* quant pt 3
* quant pt 4
* serialization
* torchscript
* scripts
* tpu
* review
* model addition timeline
* modular
* more reviews
* reviews
* fix toctree
* reviews reviews
* continue reviews
* more reviews
* modular transformers
* more review
* zamba2
* fix
* all frameworks
* pytorch
* supported model frameworks
* flashattention
* rm check_table
* not-doctested.txt
* rm check_support_list.py
* feedback
* updates/feedback
* review
* feedback
* fix
* update
* feedback
* updates
* update
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
Joseph Enguehard
07bf2dff78
Add TokenClassification for Mistral, Mixtral and Qwen2 (#29878)
* Add MistralForTokenClassification
* Add tests and docs
* Add token classification for Mixtral and Qwen2
* Save llama for token classification draft
* Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2
* Formatting
* Add token classification support for Qwen2Moe model
* Add dropout layer to each ForTokenClassification model
* Add copied from in tests
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Propagate suggested changes
* Style
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-05-20 10:06:57 +02:00
Arthur
594c1277b2
[gemma] Adds support for Gemma 💎 (#29167)
* initial commit
* update
* update conversion checkpoint
* update conversion script
* nits
* some fixes
* nits
* merge
* fix permute
* nits
* fix
* nits
* nits
* nits
* fix rope
* fix both rope
* nits
* style
* make sure flax works
* fix flax init code
* fix forward
* nits
* print flax generation out
* current code
* nits
* SIIIIIIIIIIIIIIIIIII
* update
* add new tokenizer
* correct fast tokenizer
* fix conversion
* more comments
* fix modeling and conversion
* nits and nits
* nits testing
* add some tokenization tests
* add some edge cases
* add slow tests and fix them
* fixup
* fix copies for modeling
* fix copies
* add 7B slow tests
* fix
* fix
* fix tests
* make tokenizer cis go green
* styling
* last tokenizer nits
* update jax tests
* fix flax for 7b
* add jit testing 🤗
* cleanups
* isolated nit, inv_freq for rotary_emb.inv_freq
* propagate to jax
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* adjust test
* fix conversion script
* change name
* correct file names
* update conversion script
* Fix bos and eos token ids in the model configuration (#3)
* update modelling
* update conversion script
* add static cache for gemma
* fix sdpa generate
* fix batched
* multiple fixes
* fix FA2
* final fix
* Rename a few missing strings and filenames (#4)
* merge with upstream main
* fix copies
* fix copies
* fix fixup
* fix fixup
* fix
* fix
* final tests
* fix fx gemma tests
* fix fx bf16/fp16 tests
* update slow fx tests
* fx slow tests: one logits, one generation
* move jit test standalone
* Apply suggestions from code review
* nits
* tokenizer updates
* more tokenization updates: custom GemmaSentencepieceExtrator
* style
* Update src/transformers/cache_utils.py
* Update src/transformers/models/gemma/__init__.py
* Update tests/models/gemma/test_modeling_flax_gemma.py
* small nits
* style
* update tokenization test
* fix the rotary embedding
* with style
* fix slow tests
* WARNING this commit might be very important for precisions
* Update tests/models/gemma/test_modeling_flax_gemma.py
* Update src/transformers/models/gemma/configuration_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Update src/transformers/models/gemma/modeling_flax_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* small nits here and there!
* forgotten nit
* remove on the fly computation of inv_freq
* revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_flax_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* nit conversion script link
* fix some tests
* add not doctest and pr doctest
* repo consistency
* fix last CIs 🚀
* update all readmes
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-02-21 14:21:28 +01:00