transformers/docs/source/en
Yaswanth Gali a2ef3cf537
Add Janus model (#36053)
* Iterative generation using input embeds

* Add Janus model

* discard changes

* Janus imports

* Refactor config and processor

* Added Vision tower of Janus

* Import Janus Image processor

* Vision tower fixes

* Refactor code

* Added VQ Model

* Complete model integration

* temp conversion script

* processor refactor

* Adding files to facilitate pulling

* Fixes after debugging

* Skip test for these models

* Add Janus Model

* discard changes

* Janus imports

* Refactor config and processor

* Added Vision tower of Janus

* Import Janus Image processor

* Vision tower fixes

* Refactor code

* Added VQ Model

* Complete model integration

* temp conversion script

* processor refactor

* Adding files to facilitate pulling

* Fixes after debugging

* Refactor to Text config

* Added generate function

* Saving intermediate convert file. Still need to read configs from the hub and convert them to our format.

* Adding version that reads from the JSON files. Still have to tweak some parameters manually.

* relative imports

* Initial tests

* Refactor image processor

* Seemingly working version of the conversion script, will need to test further.

* Adding command message

* Fixing conflicting JanusTextConfig class

* Incorporating some of the discussed changes.

* Small fix to create dir.

* Removing system from JINJA template

* Adding draft processor tests

* style fixes

* Minor fixes and enhancement

* added generation config

* Initial tests

* Small modifications, tests are now passing.

* Small changes I noticed while reading code.

* more fixes

* Added JanusModel class

* Small merge adaptations

* Small merge adaptations

* Image processing tests passing

* More tests and fixes

* Convert script updated and refactored

* Tests and cleanup

* make style

* Postprocessing for image generation

* generate refactor

* fixes

* - Passing tests that write a part of the model to cpu (e.g. test_cpu_offload)
- Passing tests of dispatching SDPA
- Only gradient checkpointing tests are left.

* Removing temporary code

* Changes

* Writing change to modular

* Added JanusVisionModel. SDPA dispatch tests pass more robustly. Gradient checkpoint tests are next

* Gradient checkpoint tests passing

* Removing debug code

* Major generate refactor 😮‍💨

* Temp changes for testing

* Green quality CI

* 2 out of 4 integration tests passing

* breadcrumbs

* Usage Examples

* Regenerate modeling after merge

* dirty code

* JanusIntegrationTest are passing

* breadcrumbs

* happy CI

* fixes

* Changing template

* nits

* Text generation logits matching original codebase at 100% precision

* Remove ./tmp from git tracking

* Remove ./tmp from git tracking

* Checkpointing changes after reviewing

* Fixing code in docstrings

* Changing comments and small bug in convert file

* Fixing bug in image_token_id for 7B version

* Removing line that was added by both of us

* Pushing changes after discussion. Only one left is to change the key mapping for convert file.

* Updating module file

* New convert file using dict. Tested that it is equivalent to the old one by:
- comparing keys in a script
- comparing checksums of the output files between versions generated with the current convert script and those generated with the old script. This is a more reliable test.

* revert changes

* mistake

* consistency change for CI

* make style

* doc fixes

* more fixes

* experimenting with masking out pad token

* checkpoint

* Batched generation with multi-images working for 1B models. Will test 7B next.

* Device fix.

* Writing changes to modular, previous ones were written to modeling just for quick testing.

* Using passed processor attention mask (only in modeling for now)

* Matching performance done in the non-standard way

* Working version of batched generation. Will change how some args are passed to make it more similar to language case

* More compliant version of the code

* Removed duplicated `_prepare_4d_causal_attention_mask_with_cache_position`

* Updating modular file, making masked filling with paddings more efficient

* Slightly more efficient version

* Modifying JanusVisionModel to be a wrapper

* Fixing test to comply with new names

* Modular overhaul

* More refactoring

* - Changing JanusVisionModel back
- Changing forward pass
- Adding boi token to the comparison

* - Removing whole context model_ids
- Using inherited implementation of prepare_inputs_for_generation

* Moving the way boi token is passed to the model

* Fixing sdpa test

* Minor changes

* testing changes

* Minor fix

* - Adding postprocessing test
- checking values of generated image on integration test

* changes

* Removing pooled attention vision module, fixing convert script as a consequence

* More changes

* Fixes

* Draft after merge

* Bug fixes

* More bug fix

* Fixing docs

* Nits

* Refactor return dict

* Moving image post processing test to main processor post process

* Passing guidance_scale as kwarg

* make style

* 🔥 refactor

* make style

* Update and green CI

* Nits and tests update

* up

* Added MID block

* fix

* Dead code

* update testcase

* update

* model_id change

* init_weight changes

---------

Co-authored-by: hsilva664 <metallic-silver@hotmail.com>
2025-04-17 09:18:51 +02:00
internal Simplify soft dependencies and update the dummy-creation process (#36827) 2025-04-11 11:08:36 +02:00
main_classes [agents] remove agents 🧹 (#37368) 2025-04-11 18:42:37 +01:00
model_doc Add Janus model (#36053) 2025-04-17 09:18:51 +02:00
quantization Update quantization docs (#37439) 2025-04-16 15:44:53 +02:00
tasks VDR task guide (#37485) 2025-04-15 08:55:13 -07:00
_config.py Add optimized PixtralImageProcessorFast (#34836) 2024-11-28 16:04:05 +01:00
_redirects.yml Docs / Quantization: Redirect deleted page (#31063) 2024-05-28 18:29:22 +02:00
_toctree.yml Add Janus model (#36053) 2025-04-17 09:18:51 +02:00
accelerate.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
add_new_model.md Add support for fast image processors in add-new-model-like CLI (#36313) 2025-03-13 14:16:37 -04:00
add_new_pipeline.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
agents.md [agents] remove agents 🧹 (#37368) 2025-04-11 18:42:37 +01:00
attention_interface.md Fix AttentionInterface following feedback (#37010) 2025-03-28 18:00:35 +01:00
attention.md [Docs] Fix broken links and syntax issues (#28918) 2024-02-08 14:13:35 -08:00
backbones.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
cache_explanation.md Fix typos (#36910) 2025-03-24 14:08:29 +00:00
chat_extras.md Update chat_extras.md with content correction (#36599) 2025-03-07 13:09:02 +00:00
chat_templating_multimodal.md [chat-template] Unify tests and clean up 🧼 (#37275) 2025-04-10 14:42:32 +02:00
chat_templating_writing.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
chat_templating.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
community.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
conversations.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
custom_models.md Fix typos (#36910) 2025-03-24 14:08:29 +00:00
debugging.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
deepspeed.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
executorch.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
fast_tokenizers.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
feature_extractors.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
fsdp.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
generation_features.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
generation_strategies.md fix typos in the docs directory (#36639) 2025-03-11 09:41:41 -07:00
gguf.md Fix gguf docs (#36601) 2025-03-11 15:29:14 +01:00
glossary.md Fix typos (#31819) 2024-07-08 11:52:47 +01:00
gpu_selection.md Fix typos (#36910) 2025-03-24 14:08:29 +00:00
how_to_hack_models.md fix typos in the docs directory (#36639) 2025-03-11 09:41:41 -07:00
hpo_train.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
image_processors.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
index.md Adding Qwen3 and Qwen3MoE (#36878) 2025-03-31 09:50:49 +02:00
installation.md byebye torch 2.0 (#37277) 2025-04-07 15:19:47 +02:00
kv_cache.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
llm_optims.md [CI] green llama tests (#37244) 2025-04-03 14:15:53 +01:00
llm_tutorial_optimization.md fix typos in the docs directory (#36639) 2025-03-11 09:41:41 -07:00
llm_tutorial.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
model_memory_anatomy.md Enable BNB multi-backend support (#31098) 2024-09-24 03:40:56 -06:00
model_sharing.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
model_summary.md model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702) 2024-03-23 18:29:39 -07:00
models.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
modular_transformers.md Support custom dosctrings in modular (#36726) 2025-03-18 14:00:54 -04:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
optimizers.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
pad_truncation.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
peft.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_hardware.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
perf_infer_cpu.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_infer_gpu_multi.md enable tp on CPU (#36299) 2025-03-31 10:55:47 +02:00
perf_infer_gpu_one.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_torch_compile.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_cpu_many.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_cpu.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_gpu_many.md Mention UltraScale Playbook 🌌 in docs (#36589) 2025-03-06 14:48:11 -08:00
perf_train_gpu_one.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_special.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_tpu_tf.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perplexity.md [docs] use device-agnostic API instead of cuda (#34913) 2024-11-26 09:23:34 -08:00
philosophy.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
pipeline_gradio.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
pipeline_tutorial.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
pipeline_webserver.md fix and enhance pipeline_webserver.md (#36992) 2025-04-15 08:35:05 -07:00
pr_checks.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
processors.md [docs] Fix image link (#36869) 2025-03-25 11:34:21 -07:00
quicktour.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
run_scripts.md Remove research projects (#36645) 2025-03-11 13:47:38 +00:00
serialization.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
serving.md [docs] Serving LLMs (#36522) 2025-03-10 13:14:19 -07:00
task_summary.md [doctest] Fixes (#35863) 2025-01-26 15:26:38 -08:00
tasks_explained.md fix: Wrong task mentioned in docs (#34757) 2024-11-18 18:42:28 +00:00
testing.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
tf_xla.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
tflite.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
tokenizer_summary.md [docs] Spanish translation of tokenizer_summary.md (#31154) 2024-06-03 16:52:23 -07:00
tools.md [agents] remove agents 🧹 (#37368) 2025-04-11 18:42:37 +01:00
torchscript.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
trainer.md (Part 2) feat: allow for tp_size attr for tplizing the model (#37054) 2025-04-10 17:44:09 +02:00
training.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
troubleshooting.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00