Steven Liu
c0f8d055ce
[docs] Redesign ( #31757 )
...
* toctree
* not-doctested.txt
* collapse sections
* feedback
* update
* rewrite get started sections
* fixes
* fix
* loading models
* fix
* customize models
* share
* fix link
* contribute part 1
* contribute pt 2
* fix toctree
* tokenization pt 1
* Add new model (#32615 )
* v1 - working version
* fix
* fix
* fix
* fix
* rename to correct name
* fix title
* fixup
* rename files
* fix
* add copied from on tests
* rename to `FalconMamba` everywhere and fix bugs
* fix quantization + accelerate
* fix copies
* add `torch.compile` support
* fix tests
* fix tests and add slow tests
* copies on config
* merge the latest changes
* fix tests
* add few lines about instruct
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* "to be not" -> "not to be" (#32636 )
* "to be not" -> "not to be"
* Update sam.md
* Update trainer.py
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* fix hfoption tag
* tokenization pt. 2
* image processor
* fix toctree
* backbones
* feature extractor
* fix file name
* processor
* update not-doctested
* update
* make style
* fix toctree
* revision
* make fixup
* fix toctree
* fix
* make style
* fix hfoption tag
* pipeline
* pipeline gradio
* pipeline web server
* add pipeline
* fix toctree
* not-doctested
* prompting
* llm optims
* fix toctree
* fixes
* cache
* text generation
* fix
* chat pipeline
* chat stuff
* xla
* torch.compile
* cpu inference
* toctree
* gpu inference
* agents and tools
* gguf/tiktoken
* finetune
* toctree
* trainer
* trainer pt 2
* optims
* optimizers
* accelerate
* parallelism
* fsdp
* update
* distributed cpu
* hardware training
* gpu training
* gpu training 2
* peft
* distrib debug
* deepspeed 1
* deepspeed 2
* chat toctree
* quant pt 1
* quant pt 2
* fix toctree
* fix
* fix
* quant pt 3
* quant pt 4
* serialization
* torchscript
* scripts
* tpu
* review
* model addition timeline
* modular
* more reviews
* reviews
* fix toctree
* reviews reviews
* continue reviews
* more reviews
* modular transformers
* more review
* zamba2
* fix
* all frameworks
* pytorch
* supported model frameworks
* flashattention
* rm check_table
* not-doctested.txt
* rm check_support_list.py
* feedback
* updates/feedback
* review
* feedback
* fix
* update
* feedback
* updates
* update
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
Anton Vlasjuk
7434c0ed21
Mistral-related models for QnA ( #34045 )
...
* mistral qna start
* mixtral qna
* oops
* qwen2 qna
* qwen2moe qna
* add missing input embed methods
* add copied to all methods, can't directly from llama due to the prefix
* make top level copied from
2024-10-14 08:53:32 +02:00
Dr. Artificial曾小健
5f4ee98a7a
Update qwen2.md ( #32108 )
...
* Update qwen2.md
outdated description
* Update qwen2.md
amended
* Update qwen2.md
Update
* Update qwen2.md
fix wrong version code, now good to go
2024-07-24 11:54:41 +01:00
Joseph Enguehard
07bf2dff78
Add TokenClassification for Mistral, Mixtral and Qwen2 ( #29878 )
...
* Add MistralForTokenClassification
* Add tests and docs
* Add token classification for Mixtral and Qwen2
* Save llma for token classification draft
* Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2
* Formatting
* Add token classification support for Qwen2Moe model
* Add dropout layer to each ForTokenClassification model
* Add copied from in tests
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Propagate suggested changes
* Style
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-05-20 10:06:57 +02:00
Arthur
89c64817ce
[Doc
] update model doc qwen2 ( #29238 )
...
* update model doc qwen2
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-23 10:43:31 +01:00
Junyang Lin
d6ffe74dfa
Add qwen2 ( #28436 )
...
* add config, modeling, and tokenization
* add auto and init
* update readme
* update readme
* update team name
* fixup
* fixup
* update config
* update code style
* update for fixup
* update for fixup
* update for fixup
* update for testing
* update for testing
* fix bug for config and tokenization
* fix bug for bos token
* not doctest
* debug tokenizer
* not doctest
* debug tokenization
* debug init for tokenizer
* fix style
* update init
* delete if in token auto
* add tokenizer doc
* add tokenizer in init
* Update dummy_tokenizers_objects.py
* update
* update
* debug
* Update tokenization_qwen2.py
* debug
* Update convert_slow_tokenizer.py
* add copies
* add copied from and make style
* update files map
* update test
* fix style
* fix merge reading and update tests
* fix tests
* fix tests
* fix style
* debug a variable in readme
* Update src/transformers/models/qwen2/configuration_qwen2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update test and copied from
* fix style
* update qwen2 tokenization and tests
* Update tokenization_qwen2.py
* delete the copied from after property
* fix style
* update tests
* update tests
* add copied from
* fix bugs
* update doc
* add warning for sliding window attention
* update qwen2 tokenization
* fix style
* Update src/transformers/models/qwen2/modeling_qwen2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix tokenizer fast
---------
Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-17 16:02:22 +01:00