transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-02 19:21:31 +06:00

pglorio f319ba16fa Add Zamba (#30950)	2024-10-04 22:28:05 +02:00

* Update index.md
* Rebase
* Updates from make fixup
* Update zamba.md
* Batched inference
* Update
* Fix tests
* Update docs/source/en/model_doc/zamba.md
* Update configuration_zamba.py
* Update src/transformers/models/zamba/modeling_zamba.py
* Update modeling_zamba.py
* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba
* Update ZambaForCausalLM
* Describe diffs with original mamba layer
* Moved mamba init into `_init_weights`
* make fixup fixes
* quality test fixes
* Fix Zamba model path
* circleci fixes
* fix zamba test from merge
* fix ValueError for disabling mamba kernels
* add HF copyright
* shared_transf --> shared_transformer
* Fixes
* Move attention head dim to config
* Fix circle/ci tests
* apply GenerationMixin inheritance change from upstream
* apply import ordering
* update needed transformers version for zamba
* add contribution author
* add @slow to avoid CI
* Define attention_hidden_size
* Added doc for attention_head_size
* trigger CI
* Fix doc of attention_hidden_size
* [run-slow] zamba
* Fixed shared layer logic, swapped up<->gate in mlp
* shared_transformer -> shared_transf
* reformat HybridLayer __init__
* fix docstrings in zamba config
* added definition of _get_input_ids_and_config
* fixed formatting of _get_input_ids_and_config

---------

Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
..
agents	Decorator for easier tool building (#33439)	2024-09-18 11:07:51 +02:00
benchmark
bettertransformer	Fixed malapropism error (#26660)	2023-10-09 11:04:57 +02:00
deepspeed	Trainer - deprecate tokenizer for processing_class (#32385)	2024-10-02 14:08:46 +01:00
extended	[tests] skip tests for xpu (#33553)	2024-09-19 19:28:04 +01:00
fixtures	Implementation of SuperPoint and AutoModelForKeypointDetection (#28966)	2024-03-19 14:43:02 +00:00
fsdp	🚨🚨🚨 Update min version of accelerate to 0.26.0 (#32627)	2024-08-20 11:42:36 +02:00
generation	Add Zamba (#30950)	2024-10-04 22:28:05 +02:00
models	Add Zamba (#30950)	2024-10-04 22:28:05 +02:00
optimization	fix: Fixed the `1st argument` name in classmethods (#31907)	2024-07-11 12:11:50 +01:00
peft_integration	[PEFT] Support low_cpu_mem_usage option for PEFT loading adapters (#33725)	2024-10-03 16:15:36 +02:00
pipelines	Make ASR pipeline compliant with Hub spec + add tests (#33769)	2024-10-01 18:15:04 +01:00
quantization	Enables CPU AWQ model with IPEX version. (#33460)	2024-10-04 16:25:10 +02:00
repo_utils	Refactor CI: more explicit (#30674)	2024-08-30 18:17:25 +02:00
sagemaker	Trainer - deprecate tokenizer for processing_class (#32385)	2024-10-02 14:08:46 +01:00
tokenization	Fix for slow the bug tokenizer adding spaces to single id decodes (#32564)	2024-09-18 12:32:02 +02:00
trainer	Trainer - deprecate tokenizer for processing_class (#32385)	2024-10-02 14:08:46 +01:00
utils	Ignore keys on `validate_rope` (#33753)	2024-10-04 12:39:37 +02:00
__init__.py
test_backbone_common.py	Align backbone stage selection with out_indices & out_features (#27606)	2023-12-20 18:33:17 +00:00
test_configuration_common.py	Refactor: Removed un-necessary `object` base class (#32230)	2024-07-26 10:33:02 +02:00
test_feature_extraction_common.py
test_image_processing_common.py	Update kwargs validation for `preprocess` with decorator (#32024)	2024-08-06 11:33:05 +01:00
test_image_transforms.py	fix: center_crop occasionally outputs off-by-one dimension matrix (#30934)	2024-05-21 13:56:52 +01:00
test_modeling_common.py	Fix attn mask ignore logic in training-time trace (#32613)	2024-10-04 19:00:45 +02:00
test_modeling_flax_common.py	add sdpa to ViT [follow up of #29325] (#30555)	2024-05-16 10:56:11 +01:00
test_modeling_tf_common.py	Port IDEFICS to tensorflow (#26870)	2024-05-13 15:59:46 +01:00
test_pipeline_mixin.py	Make ASR pipeline compliant with Hub spec + add tests (#33769)	2024-10-01 18:15:04 +01:00
test_processing_common.py	Uniformize model processors (#31368)	2024-10-02 10:41:08 +02:00
test_sequence_feature_extraction_common.py	Fix typo (#25966)	2023-09-05 10:12:25 +02:00
test_tokenization_common.py	Trainer - deprecate tokenizer for processing_class (#32385)	2024-10-02 14:08:46 +01:00