transformers/tests
Anton Vlasjuk 5a2aedca1e
[Mamba2] Fix caching, slow path, and multi-gpu (#35154)
* fixup mamba2 - caching and several other small fixes

* fixup cached forward

* correct fix this time

* fixup cache - we do not need to extend the attn mask it's handled by generate (gives total ids + mask at each step)

* remove unnecessary (un)squeeze

* fixup cache position

* simplify a few things

* [run-slow] mamba2

* multi gpu attempt two

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* add newer slow path fix

* [run-slow] mamba2
2024-12-20 09:27:47 +01:00
agents Add token cost + runtime monitoring to Agent and HfEngine children (#34548) 2024-12-03 13:14:52 +01:00
benchmark [Test refactor 1/5] Per-folder tests reorganization (#15725) 2022-02-23 15:46:28 -05:00
bettertransformer Fixed malapropism error (#26660) 2023-10-09 11:04:57 +02:00
deepspeed Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
extended [tests] skip tests for xpu (#33553) 2024-09-19 19:28:04 +01:00
fixtures Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) 2024-03-19 14:43:02 +00:00
fsdp FSDP grad accum fix (#34645) 2024-11-15 22:28:06 +01:00
generation Add the Bamba Model (#34982) 2024-12-18 20:18:17 +01:00
models [Mamba2] Fix caching, slow path, and multi-gpu (#35154) 2024-12-20 09:27:47 +01:00
optimization fix: Fixed the 1st argument name in classmethods (#31907) 2024-07-11 12:11:50 +01:00
peft_integration [PEFT] Better Trainer error when prompt learning with loading best model at the end (#35087) 2024-12-11 12:44:39 +01:00
pipelines Fix seamless TTS generate (#34968) 2024-12-11 15:38:42 +01:00
quantization change bnb tests (#34713) 2024-12-18 09:49:59 -05:00
repo_utils Refactor CI: more explicit (#30674) 2024-08-30 18:17:25 +02:00
sagemaker Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
tokenization VLM: special multimodal Tokenizer (#34461) 2024-11-04 16:37:51 +01:00
tp Simplify Tensor Parallel implementation with PyTorch TP (#34184) 2024-11-18 19:51:49 +01:00
trainer Fix GA loss bugs and add unit test (#35121) 2024-12-09 09:57:41 +01:00
utils 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
__init__.py GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
test_backbone_common.py Align backbone stage selection with out_indices & out_features (#27606) 2023-12-20 18:33:17 +00:00
test_configuration_common.py Load sub-configs from composite configs (#34410) 2024-11-05 11:34:01 +01:00
test_feature_extraction_common.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_image_processing_common.py Fall back to slow image processor in ImageProcessingAuto when no fast processor available (#34785) 2024-12-15 14:00:36 -05:00
test_image_transforms.py fix: center_crop occasionally outputs off-by-one dimension matrix (#30934) 2024-05-21 13:56:52 +01:00
test_modeling_common.py Fix some fa2 tests (#35340) 2024-12-19 17:05:25 +01:00
test_modeling_flax_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_modeling_tf_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_pipeline_mixin.py Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
test_processing_common.py Separate chat templates into a single file (#33957) 2024-11-26 14:18:04 +00:00
test_sequence_feature_extraction_common.py Fix typo (#25966) 2023-09-05 10:12:25 +02:00
test_tokenization_common.py Separate chat templates into a single file (#33957) 2024-11-26 14:18:04 +00:00