pglorio
f319ba16fa
Add Zamba (#30950)
* Update index.md
* Rebase
* Rebase
* Updates from make fixup
* Update zamba.md
* Batched inference
* Update
* Fix tests
* Fix tests
* Fix tests
* Fix tests
* Update docs/source/en/model_doc/zamba.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/zamba.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update configuration_zamba.py
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update modeling_zamba.py
* Update modeling_zamba.py
* Update modeling_zamba.py
* Update configuration_zamba.py
* Update modeling_zamba.py
* Update modeling_zamba.py
* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba
* Update ZambaForCausalLM
* Update ZambaForCausalLM
* Describe diffs with original mamba layer
* Moved mamba init into `_init_weights`
* make fixup fixes
* quality test fixes
* Fix Zamba model path
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* Update
* circleci fixes
* fix zamba test from merge
* fix ValueError for disabling mamba kernels
* add HF copyright
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* shared_transf --> shared_transformer
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fixes
* Move attention head dim to config
* Fix circle/ci tests
* Update modeling_zamba.py
* apply GenerationMixin inheritance change from upstream
* apply import ordering
* update needed transformers version for zamba
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add contribution author
* add @slow to avoid CI
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Define attention_hidden_size
* Added doc for attention_head_size
* trigger CI
* Fix doc of attention_hidden_size
* [run-slow] zamba
* Fixed shared layer logic, swapped up<->gate in mlp
* shared_transformer -> shared_transf
* reformat HybridLayer __init__
* fix docstrings in zamba config
* added definition of _get_input_ids_and_config
* fixed formatting of _get_input_ids_and_config
---------
Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-10-04 22:28:05 +02:00