transformers/utils
pglorio f319ba16fa
Add Zamba (#30950)
* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* Moved mamba init into `_init_weights`

* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Moved mamba init into `_init_weights`

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* make fixup fixes

* quality test fixes

* Fix Zamba model path

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* Update

* circleci fixes

* fix zamba test from merge

* fix ValueError for disabling mamba kernels

* add HF copyright

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* shared_transf --> shared_transformer

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fixes

* Move attention head dim to config

* Fix circle/ci tests

* Update modeling_zamba.py

* apply GenerationMixin inheritance change from upstream

* apply import ordering

* update needed transformers version for zamba

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add contribution author

* add @slow to avoid CI

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Define attention_hidden_size

* Added doc for attention_head_size

* trigger CI

* Fix doc of attention_hidden_size

* [run-slow] zamba

* Fixed shared layer logic, swapped up<->gate in mlp

* shared_transformer -> shared_transf

* reformat HybridLayer __init__

* fix docstrings in zamba config

* added definition of _get_input_ids_and_config

* fixed formatting of _get_input_ids_and_config

---------

Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-10-04 22:28:05 +02:00
..
test_module AutoImageProcessor (#20111) 2022-11-08 19:54:41 +00:00
tf_ops Check TF ops for ONNX compliance (#10025) 2021-02-15 07:55:10 -05:00
add_pipeline_model_mapping_to_test.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
check_build.py Fix import of FalconMambaForCausalLM (#33381) 2024-09-10 09:14:54 +02:00
check_config_attributes.py Add Zamba (#30950) 2024-10-04 22:28:05 +02:00
check_config_docstrings.py Granitemoe (#33207) 2024-09-21 01:43:50 +02:00
check_copies.py Improve error message for mismatched copies in code blocks (#31535) 2024-06-25 13:55:11 +02:00
check_doc_toc.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
check_docstrings.py Add Flax Dinov2 (#31960) 2024-08-19 09:28:13 +01:00
check_doctest_list.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
check_dummies.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
check_inits.py Fix import of FalconMambaForCausalLM (#33381) 2024-09-10 09:14:54 +02:00
check_model_tester.py Add a new script to check model testers' config (#22063) 2023-03-13 19:11:19 +01:00
check_modular_conversion.py Modular transformers: modularity and inheritance for new model additions (#33248) 2024-09-24 15:54:07 +02:00
check_repo.py Add Idefics 3! (#32473) 2024-09-25 21:28:49 +02:00
check_self_hosted_runner.py Tiny fix for check_self_hosted_runner.py (#24052) 2023-06-06 18:17:41 +02:00
check_support_list.py [RoBERTa-based] Add support for sdpa (#30510) 2024-08-28 10:26:00 +02:00
check_table.py Add OmDet-Turbo (#31843) 2024-09-25 13:26:28 -04:00
check_tf_ops.py Check TF ops for ONNX compliance (#10025) 2021-02-15 07:55:10 -05:00
create_dependency_mapping.py Modular transformers: modularity and inheritance for new model additions (#33248) 2024-09-24 15:54:07 +02:00
create_dummy_models.py CI: fix efficientnet pipeline timeout and prevent future similar issues due to large image size (#33123) 2024-08-27 11:58:27 +01:00
custom_init_isort.py Import structure & first three model refactors (#31329) 2024-09-10 11:10:53 +02:00
deprecate_models.py Remove copied froms for deprecated models (#31153) 2024-06-03 09:42:53 +01:00
download_glue_data.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
extract_warnings.py update github actions packages' version to suppress warnings (#30249) 2024-04-15 15:08:09 +02:00
get_ci_error_statistics.py Add artifact name in job step to maintain job / artifact correspondence (#28682) 2024-01-31 15:58:17 +01:00
get_github_job_time.py Make Slack CI reporting stronger (#21823) 2023-02-28 17:12:44 +01:00
get_modified_files.py exclude deleted files in the fixup script (#21436) 2023-02-03 12:57:02 -05:00
get_previous_daily_ci.py Update workflow_id in utils/get_previous_daily_ci.py (#30695) 2024-05-07 16:58:50 +02:00
get_test_info.py CI: fix efficientnet pipeline timeout and prevent future similar issues due to large image size (#33123) 2024-08-27 11:58:27 +01:00
important_models.txt ENH: [CI] Add new workflow to run slow tests of important models on push main if they are modified (#29235) 2024-04-12 10:01:28 +02:00
models_to_deprecate.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
modular_model_converter.py [modular] fixes! (#33820) 2024-09-30 16:43:55 +02:00
not_doctested.txt Add Zamba (#30950) 2024-10-04 22:28:05 +02:00
notification_service_doc_tests.py Refactor doctest (#30210) 2024-04-15 13:20:36 +02:00
notification_service_quantization.py Revive Nightly/Past CI (#31159) 2024-06-20 18:57:24 +02:00
notification_service.py Upload new model failure report to Hub (#32264) 2024-07-29 09:42:54 +02:00
past_ci_versions.py (Re-)Enable Nightly + Past CI (#22393) 2023-03-30 21:06:35 +02:00
patch_helper.py helper (#31152) 2024-05-31 08:49:33 +02:00
pr_slow_ci_models.py Avoid duplication in PR slow CI model list (#30634) 2024-05-03 18:19:30 +02:00
print_env.py Print more library versions in CI (#17384) 2022-06-02 10:24:16 +02:00
process_test_artifacts.py fix the parallel number of CI nodes when it is smaller than number of tests (#33276) 2024-09-03 16:53:21 +02:00
release.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
set_cuda_devices_for_ci.py Fix Cohere CI (#31263) 2024-06-10 15:16:58 +02:00
slow_documentation_tests.txt Update CodeLlama references (#30218) 2024-05-09 22:57:52 +02:00
sort_auto_mappings.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
split_doctest_jobs.py Refactor doctest (#30210) 2024-04-15 13:20:36 +02:00
split_model_tests.py consistent job / pytest report / artifact name correspondence (#30392) 2024-04-24 22:32:42 +02:00
tests_fetcher.py Migrate the CI runners to the new clusters (#33849) 2024-10-03 14:39:49 +02:00
update_metadata.py update ruff version (#30932) 2024-05-22 06:40:15 +02:00
update_tiny_models.py Mention model_info.id instead of model_info.modelId (#32106) 2024-07-22 14:14:47 +01:00