amyeroberts
|
1de7dc7403
|
Skip tests properly (#31308)
* Skip tests properly
* [test_all]
* Add 'reason' as kwarg for skipTest
* [test_all] Fix up
* [test_all]
|
2024-06-26 21:59:08 +01:00 |
|
Cyril Vallez
|
8bcf9c8dd4
|
Fix jetmoe model (#31279)
* Fix jetmoe model
* Remove skip-tests
|
2024-06-07 11:51:41 +02:00 |
|
amyeroberts
|
940fde8daf
|
Skip failing JetMOE generation tests (#31266)
Skip failing tests for now
|
2024-06-05 19:06:46 +01:00 |
|
Yikang Shen
|
ccdabc5642
|
Add JetMoE model (#30005)
* init jetmoe code
* update archive maps
* remove flax import
* fix import error
* update README
* ruff fix
* update readme
* fix
* update config
* fix issue
* merge files
* fix model bug
* fix test
* auto fix
* model size
* add comments
* fix form
* add flash attention support
* fix attention head number
* fix init
* fix support list
* sort auto mapping
* fix test
* fix docs
* update test
* fix test
* fix test
* change variable name
* fix config
* fix init
* update format
* clean code
* fix config
* fix config
* change default config
* update config
* fix issues
* update format
* update config argument
* update format
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* change to mixtral aux loss
* change to cache_position
* debug
* fix bugs
* debug
* fix format
* fix format
* fix copy
* fix format
* fix format
* fix sort
* fix sort
* fix sort
* add copy comment
* add copy from
* remove debug code
* revert readme update
* add copy
* debug
* remove debug code
* fix flash attention
* add comments
* clean code
* clean format
* fix format
* fix format
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* change variable name
* add copied from
* fix variable name
* remove deprecated functions
* sync to llama implementation
* fix format
* fix copy
* fix format
* update format
* remove repr
* add comment for moe weight
* fix copy
* Update src/transformers/models/jetmoe/configuration_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add comments and reformat config
* fix format
* fix format
* fix format
* update test
* update doc string in config
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update config doc
* update attention cache
* fix format
* fix copy
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
|
2024-05-14 16:32:01 +02:00 |
|