Arthur
dcb183f4bd
[MPT] Add MosaicML's MPT model to transformers (#24629)
* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke and updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match to 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix nit
* add another slow test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-25 14:32:40 +02:00