transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 03:58:25 +06:00

Raushan Turganbay	24cfcc2114	Chameleon: add model (#31534)	2024-07-17 10:41:43 +05:00

* Chameleon model integration
  Co-authored-by: Jacob Kahn <jacobkahn1@gmail.com>
  Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
* fix 7B, again. mask away image tokens
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove pretrained_config_map
* make fixup passing up to utils/check_config_docstrings.py; vqgan moved to the modeling file
* remove tokenizer (use llama's); remove codechameleon tests
* a few copied from statements and minor changes
* copied from in ChameleonModel
* some copies in ChameleonForCausalLM
* a few more copies
* VQModel moved to ChameleonModel (as opposed to being in the processor)
* ChameleonProcessor ready
* Fix chameleon weights convert
* update conversion script
* clean-up processing
* update modeling a bit
* update
* update (throws error...)
* correct conversion ready
* fix tests
* fix docs
* docs
* ve swin norm
* fix device for vocab map
* add normalization
* update
* update script with rope rotations
* final fix on model conversion
* add slow tests
* more info in docs
* fix repo consistency tests
* fix repo tests
* fix-copies
* hope this will make CI happy
* fix for 30b model
* Update docs/source/en/index.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address comments
* remove assertion in conversion script
* add image processor test
* not copied
* port changes for qk layernorm
* fix-copies
* read token decorator for tests
* [run-slow] chameleon
* one more read-token
* address some comments
* qk norm changes
* tests and repo check
* moved rope permutations to conversion, YAY!
* fix past kv check
* docs
* layernorm done!
* let's be consistent in naming
* fix slow tests
* weird thing with slow CI, but let's see
* once more try
* remove past-kv as tuple following llama
* ignore
* style

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: jacobkahn <jacobkahn1@gmail.com>
Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
Co-authored-by: Leonid Shamis <lshamis@meta.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

internal	[whisper] static kv cache (#31166)	2024-07-02 13:24:15 +01:00
main_classes	Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771)	2024-07-16 09:32:01 -04:00
model_doc	Chameleon: add model (#31534)	2024-07-17 10:41:43 +05:00
quantization	Docs / AQLM: Clarify `torch.compile` support for AQLM (#31473)	2024-06-19 11:26:25 +02:00
tasks	Update depth estimation task guide (#31860)	2024-07-09 22:13:30 +03:00
_config.py	[#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888)	2024-04-08 14:21:16 +01:00
_redirects.yml	Docs / Quantization: Redirect deleted page (#31063)	2024-05-28 18:29:22 +02:00
_toctree.yml	Chameleon: add model (#31534)	2024-07-17 10:41:43 +05:00
accelerate.md	Fix typos (#25936)	2023-09-04 11:15:12 +01:00
add_new_model.md	Remove add-new-model in favor of add-new-model-like (#30424)	2024-04-24 09:38:18 +02:00
add_new_pipeline.md	add `push_to_hub` to pipeline (#29172)	2024-04-16 15:34:04 +01:00
agents.md	Adds final answer tool for all agents (#31703)	2024-07-03 11:36:09 +02:00
attention.md	[Docs] Fix broken links and syntax issues (#28918)	2024-02-08 14:13:35 -08:00
autoclass_tutorial.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
benchmarks.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
bertology.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
big_models.md	[docs] Big model loading (#29920)	2024-04-01 18:47:32 -07:00
chat_templating.md	Repeating an important warning in the chat template docs (#31796)	2024-07-05 15:30:24 +01:00
community.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
contributing.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
conversations.md	Add sidebar tutorial for chat models (#30401)	2024-04-25 19:38:48 +01:00
create_a_model.md	Enable HF pretrained backbones (#31145)	2024-06-06 22:02:38 +01:00
custom_models.md	[Docs] Add language identifiers to fenced code blocks (#28955)	2024-02-12 10:48:31 -08:00
debugging.md	[Docs] Fix spelling and grammar mistakes (#28825)	2024-02-02 08:45:00 +01:00
deepspeed.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
fast_tokenizers.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
fsdp.md	[docs] Trainer docs (#28145)	2023-12-20 10:37:23 -08:00
generation_strategies.md	Generate: Add new decoding strategy "DoLa" in `.generate()` (#29619)	2024-07-09 17:37:38 +01:00
gguf.md	Add Qwen2 GGUF loading support (#31175)	2024-06-03 14:55:10 +01:00
glossary.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
hpo_train.md	Remove-auth-token (#27060)	2023-11-13 14:20:54 +01:00
index.md	Chameleon: add model (#31534)	2024-07-17 10:41:43 +05:00
installation.md	Use `HF_HUB_OFFLINE` + fix has_file in offline mode (#31016)	2024-05-29 11:55:43 +01:00
llm_optims.md	Add torch.compile for Mistral (#30642)	2024-05-20 16:27:24 +02:00
llm_tutorial_optimization.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
llm_tutorial.md	Generate: update links on LLM tutorial doc (#30550)	2024-04-30 18:14:12 +01:00
model_memory_anatomy.md	🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190)	2024-04-18 12:49:43 -04:00
model_sharing.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
model_summary.md	model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702)	2024-03-23 18:29:39 -07:00
multilingual.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
notebooks.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
pad_truncation.md	[Doc] Spanish translation of pad_truncation.md (#27890)	2023-12-08 10:32:18 -08:00
peft.md	Docs / Quantization: Replace all occurences of `load_in_8bit` with bnb config (#31136)	2024-05-30 16:47:35 +02:00
perf_hardware.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
perf_infer_cpu.md	[Docs] Fix spelling and grammar mistakes (#28825)	2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md	Chameleon: add model (#31534)	2024-07-17 10:41:43 +05:00
perf_torch_compile.md	docs: fix style (#31340)	2024-06-10 09:53:25 +01:00
perf_train_cpu_many.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_train_cpu.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md	Update perf_train_gpu_many.md (#31451)	2024-06-18 11:00:26 -07:00
perf_train_gpu_one.md	Add torch_empty_cache_steps to TrainingArguments (#31546)	2024-07-04 13:20:49 -04:00
perf_train_special.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
performance.md	[docs] Update CPU/GPU inference docs (#26881)	2023-10-31 09:44:51 -07:00
perplexity.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
philosophy.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
pipeline_tutorial.md	Allow FP16 or other precision inference for Pipelines (#31342)	2024-07-05 17:21:50 +01:00
pipeline_webserver.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
pr_checks.md	[Docs] Fix spelling and grammar mistakes (#28825)	2024-02-02 08:45:00 +01:00
preprocessing.md	chore: remove duplicate words (#31853)	2024-07-09 10:38:29 +01:00
quicktour.md	docs: fix broken link (#31370)	2024-06-12 11:33:00 +01:00
run_scripts.md	Fix broken link to Transformers notebooks (#30512)	2024-04-29 10:57:51 +01:00
sagemaker.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
serialization.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
task_summary.md	More fixes for doctest (#30265)	2024-04-16 11:58:55 +02:00
tasks_explained.md	[docs] Spanish translation of tasks_explained.md (#29224)	2024-02-26 08:18:15 -08:00
testing.md	chore: remove duplicate words (#31853)	2024-07-09 10:38:29 +01:00
tf_xla.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
tflite.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
tokenizer_summary.md	[docs] Spanish translation of tokenizer_summary.md (#31154)	2024-06-03 16:52:23 -07:00
torchscript.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
trainer.md	FEAT / Trainer: LOMO optimizer support (#30178)	2024-05-21 10:16:37 +02:00
training.md	Added the necessay import of module (#30804)	2024-05-14 18:45:06 +01:00
troubleshooting.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00