transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-24 23:08:57 +06:00

Pablo Montalvo 1360801a69 Add PaliGemma (#30814)	2024-05-14 22:07:15 +02:00

* add new model like
* add state dict slicing + new model config
* update palma config and weights, passes vision activations
* fix
* update
* reorder loading/unpacking
* clean up
* add debug statements
* change device
* fix
* debugging
* fix noncausal mask
* fixup sdpa + causal mask
* fix activation function
* remove debug before changing modeling file
* add variants
* debug attention mask in generate
* revert to non-debug sdpa
* revert gemma modifications
* add custom language modeling
* use Processor
* add language modeling file to init
* try thin wrapper around generate
* Update
* update mask
* breakpoints galore
* remove conflict
* switch to left-padding
* add incomplete model doc
* add paligemma global files
* batch rename paligemma
* make generation match outputs and captioning
* style
* style
* remove copied from + doc
* remove more copied from
* remove copy from projector
* minor fix
* update config and style
* add readme - dummy
* CORRECT image captioning
* moving to args
* add siglip proper + fix merging image + text features
* take update_causal_mask from upstream
* remove breakpoint
* leverage AutoModel
* fix input_ids slicing
* make siglip head conditional
* remove encoder_decoder value
* remove unneeded modeling file
* add commented 4d attention mask
* FIXED generation with 4D mask
* Update src/transformers/models/siglip/modeling_siglip.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix left padding detection
* shuffle order of verifications
* fix missing labels for training
* fix
* vectorize merging of features, improve slicing
* improve testing before conversion
* handle merging in processor
* image token index depends on checkpoint
* add variants, save processor too
* save processors, base tokenizer off spm file
* expand model embeddings due to additional image token
* pass image processing args
* add convert rgb to siglip processor
* add \n token separately
* fix tokenizer and prompts
* fix docstrings
* change to camel
* fix casing
* debug pos_ids and sdpa
* pass and use cache_position
* add flag for newline tokenization
* Update src/transformers/models/paligemma/processing_paligemma.py Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
* simplify conversion script
* add copied from
* add precision to conversion script
* Update src/transformers/models/paligemma/modeling_paligemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* clean up
* Shift attention mask from `1:` After discussion with @molbap
* add docs, fix quality
* quality, tied weights inheritance, and logits/label alignment
* fix more tests
* pass attn_implementation to language model correctly
* add SiglipVisionTransformer to no split modules
* skip paligemma test for sdpa dispatch to flash
* skip incompatible tests
* quality
* [broken archive maps]
* Apply suggestions - remove archive lists - style - take shape of inputs_embeds for batch Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/utils/dummy_pt_objects.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* simplify conversion script
* add suggestions
* add suggestions
* add copied from
* fix
* move labels out
* revert
* fix
* remove placeholder labels if None
* use cache_position
* fix quality + docstrings
* fix quality
* fix paligemma 4d gemma mask incompatibility
* fix config docstring
* fix query and attn_mask dtype

---------

Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

internal	Add Watermarking LogitsProcessor and WatermarkDetector (#29676)	2024-05-14 13:31:39 +05:00
main_classes	Add Watermarking LogitsProcessor and WatermarkDetector (#29676)	2024-05-14 13:31:39 +05:00
model_doc	Add PaliGemma (#30814)	2024-05-14 22:07:15 +02:00
tasks	Update object detection guide (#30683)	2024-05-08 15:16:14 +01:00
_config.py	[#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888)	2024-04-08 14:21:16 +01:00
_redirects.yml	Extended semantic segmentation to image segmentation (#27039)	2023-11-23 15:58:21 +00:00
_toctree.yml	Add PaliGemma (#30814)	2024-05-14 22:07:15 +02:00
accelerate.md	Fix typos (#25936)	2023-09-04 11:15:12 +01:00
add_new_model.md	Remove add-new-model in favor of add-new-model-like (#30424)	2024-04-24 09:38:18 +02:00
add_new_pipeline.md	add `push_to_hub` to pipeline (#29172)	2024-04-16 15:34:04 +01:00
agents.md	Reboot Agents (#30387)	2024-05-07 12:59:49 +02:00
attention.md	[Docs] Fix broken links and syntax issues (#28918)	2024-02-08 14:13:35 -08:00
autoclass_tutorial.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
benchmarks.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
bertology.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
big_models.md	[docs] Big model loading (#29920)	2024-04-01 18:47:32 -07:00
chat_templating.md	Deprecate default chat templates (#30346)	2024-04-19 15:41:26 +01:00
community.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
contributing.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
conversations.md	Add sidebar tutorial for chat models (#30401)	2024-04-25 19:38:48 +01:00
create_a_model.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
custom_models.md	[Docs] Add language identifiers to fenced code blocks (#28955)	2024-02-12 10:48:31 -08:00
debugging.md	[Docs] Fix spelling and grammar mistakes (#28825)	2024-02-02 08:45:00 +01:00
deepspeed.md	Rename torch.run to torchrun (#30405)	2024-04-23 09:04:17 -07:00
fast_tokenizers.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
fsdp.md	[docs] Trainer docs (#28145)	2023-12-20 10:37:23 -08:00
generation_strategies.md	Add Watermarking LogitsProcessor and WatermarkDetector (#29676)	2024-05-14 13:31:39 +05:00
glossary.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
hf_quantizer.md	[CI] Quantization workflow (#29046)	2024-02-28 10:09:25 -05:00
hpo_train.md	Remove-auth-token (#27060)	2023-11-13 14:20:54 +01:00
index.md	Add PaliGemma (#30814)	2024-05-14 22:07:15 +02:00
installation.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
llm_optims.md	Cache: Static cache as a standalone object (#30476)	2024-04-30 16:37:19 +01:00
llm_tutorial_optimization.md	F.scaled_dot_product_attention support (#26572)	2023-12-09 05:38:14 +09:00
llm_tutorial.md	Generate: update links on LLM tutorial doc (#30550)	2024-04-30 18:14:12 +01:00
model_memory_anatomy.md	🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190)	2024-04-18 12:49:43 -04:00
model_sharing.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
model_summary.md	model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702)	2024-03-23 18:29:39 -07:00
multilingual.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
notebooks.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
pad_truncation.md	[Doc] Spanish translation of pad_truncation.md (#27890)	2023-12-08 10:32:18 -08:00
peft.md	[`Peft`] `modules_to_save` support for peft integration (#27466)	2023-11-14 10:32:57 +01:00
perf_hardware.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_infer_cpu.md	[Docs] Fix spelling and grammar mistakes (#28825)	2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md	Add PaliGemma (#30814)	2024-05-14 22:07:15 +02:00
perf_torch_compile.md	Fix rendering for `torch.compile()` docs (#25432)	2023-08-10 13:25:00 +02:00
perf_train_cpu_many.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_train_cpu.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_train_gpu_one.md	Fix minor typo: softare => software (#29602)	2024-03-12 10:39:56 +00:00
perf_train_special.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
performance.md	[docs] Update CPU/GPU inference docs (#26881)	2023-10-31 09:44:51 -07:00
perplexity.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
philosophy.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
pipeline_tutorial.md	More fixes for doctest (#30265)	2024-04-16 11:58:55 +02:00
pipeline_webserver.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
pr_checks.md	[Docs] Fix spelling and grammar mistakes (#28825)	2024-02-02 08:45:00 +01:00
preprocessing.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
quantization.md	Add HQQ quantization support (#29637)	2024-05-02 17:51:49 +01:00
quicktour.md	Add HQQ quantization support (#29637)	2024-05-02 17:51:49 +01:00
run_scripts.md	Fix broken link to Transformers notebooks (#30512)	2024-04-29 10:57:51 +01:00
sagemaker.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
serialization.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
task_summary.md	More fixes for doctest (#30265)	2024-04-16 11:58:55 +02:00
tasks_explained.md	[docs] Spanish translation of tasks_explained.md (#29224)	2024-02-26 08:18:15 -08:00
testing.md	[doc] fix some typos and add `xpu` to the testing documentation (#29894)	2024-03-28 09:42:49 +00:00
tf_xla.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
tflite.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
tokenizer_summary.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
torchscript.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
trainer.md	🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190)	2024-04-18 12:49:43 -04:00
training.md	Added the necessay import of module (#30804)	2024-05-14 18:45:06 +01:00
troubleshooting.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00