transformers/tests
Pablo Montalvo a25f7d3c12
Paligemma causal attention mask (#30967)
* PaliGemma working causal attention

* Formatting

* Style

* Docstrings + remove commented code

* Update docstring for PaliGemma Config

* PaliGemma - add separator ind to model/labels

* Refactor + docstring paligemma processor method

* Style

* return token type ids when tokenizing labels

* use token type ids when building causal mask

* add token type ids to tester

* remove separator from config

* fix style

* don't ignore separator

* add processor documentation

* simplify tokenization

* fix causal mask

* style

* fix label propagation, revert suffix naming

* fix style

* fix labels tokenization

* [run-slow]paligemma

* add eos if suffixes are present

* [run-slow]paligemma

* [run-slow]paligemma

* add missing tokens to fast version

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix style

* [run-slow]paligemma

---------

Co-authored-by: Peter Robicheaux <peter@roboflow.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-05-22 19:37:15 +02:00
Name | Last commit | Date
agents | Reboot Agents (#30387) | 2024-05-07 12:59:49 +02:00
benchmark | |
bettertransformer | |
deepspeed | Update ds_config_zero3.json (#30829) | 2024-05-15 10:02:31 -04:00
extended | CI: update to ROCm 6.0.2 and test MI300 (#30266) | 2024-05-13 18:14:36 +02:00
fixtures | Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) | 2024-03-19 14:43:02 +00:00
fsdp | Add FSDP config for CPU RAM efficient loading through accelerate (#30002) | 2024-04-22 13:15:28 +01:00
generation | Generation: get special tokens from model config (#30899) | 2024-05-22 18:15:41 +02:00
models | Paligemma causal attention mask (#30967) | 2024-05-22 19:37:15 +02:00
optimization | Add WSD scheduler (#30231) | 2024-04-25 12:07:21 +01:00
peft_integration | FIX [CI]: Fix failing tests for peft integration (#29330) | 2024-02-29 03:56:16 +01:00
pipelines | [Whisper] Strip prompt before finding common subsequence (#27836) | 2024-05-22 17:25:47 +01:00
quantization | FEAT / Bitsandbytes: Add dequantize API for bitsandbytes quantized models (#30806) | 2024-05-15 17:17:09 +02:00
repo_utils | Allow # Ignore copy (#27328) | 2023-12-07 10:00:08 +01:00
sagemaker | update ruff version (#30932) | 2024-05-22 06:40:15 +02:00
tokenization | update ruff version (#30932) | 2024-05-22 06:40:15 +02:00
trainer | Enforce saving at end of training if saving option chosen (#30160) | 2024-05-21 07:50:11 -04:00
utils | 🚨 out_indices always a list (#30941) | 2024-05-22 15:23:04 +01:00
__init__.py | |
test_backbone_common.py | Align backbone stage selection with out_indices & out_features (#27606) | 2023-12-20 18:33:17 +00:00
test_cache_utils.py | Generate: add tests for caches with pad_to_multiple_of (#29462) | 2024-03-06 10:57:04 +00:00
test_configuration_common.py | |
test_configuration_utils.py | [tests] remove deprecated tests for model loading (#29450) | 2024-03-15 14:18:41 +00:00
test_feature_extraction_common.py | |
test_feature_extraction_utils.py | [tests] remove deprecated tests for model loading (#29450) | 2024-03-15 14:18:41 +00:00
test_image_processing_common.py | Raise unused kwargs image processor (#29063) | 2024-02-20 16:20:20 +01:00
test_image_processing_utils.py | [tests] remove deprecated tests for model loading (#29450) | 2024-03-15 14:18:41 +00:00
test_image_transforms.py | fix: center_crop occasionally outputs off-by-one dimension matrix (#30934) | 2024-05-21 13:56:52 +01:00
test_modeling_common.py | Fix low cpu mem usage tests (#30808) | 2024-05-22 14:09:01 +02:00
test_modeling_flax_common.py | add sdpa to ViT [follow up of #29325] (#30555) | 2024-05-16 10:56:11 +01:00
test_modeling_flax_utils.py | Enable safetensors conversion from PyTorch to other frameworks without the torch requirement (#27599) | 2024-01-23 10:28:23 +01:00
test_modeling_tf_common.py | Port IDEFICS to tensorflow (#26870) | 2024-05-13 15:59:46 +01:00
test_modeling_tf_utils.py | Cast bfloat16 to float32 for Numpy conversions (#29755) | 2024-03-21 14:04:11 +00:00
test_modeling_utils.py | Llama: fix custom 4D masks, v2 (#30348) | 2024-05-13 13:46:06 +02:00
test_pipeline_mixin.py | Image Feature Extraction pipeline (#28216) | 2024-02-05 14:50:07 +00:00
test_processing_common.py | Don't save processor_config.json if a processor has no extra attribute (#28584) | 2024-01-19 09:59:14 +00:00
test_sequence_feature_extraction_common.py | |
test_tokenization_common.py | update ruff version (#30932) | 2024-05-22 06:40:15 +02:00
test_tokenization_utils.py | [tests] remove deprecated tests for model loading (#29450) | 2024-03-15 14:18:41 +00:00