transformers/docs/source/en
Yoach Lacombe 9ba021ea75
Moshi integration (#33624)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* first moshi WIP

* converting weights working + configuration + generation configuration

* finalize converting script - still missing tokenizer and FE and processor

* fix saving model w/o default config

* working generation

* use GenerationMixin instead of inheriting

* add delay pattern mask

* fix right order: moshi codes then user codes

* unconditional inputs + generation config

* get rid of MoshiGenerationConfig

* blank user inputs

* update convert script: fix conversion, add tokenizer, feature extractor and bf16

* add and correct Auto classes

* update modeling code, configuration and tests

* make fixup

* fix some copies

* WIP: add integration tests

* add dummy objects

* propose better readability and code organisation

* update tokenization tests

* update docstrings, eval and modeling

* add .md

* make fixup

* add MoshiForConditionalGeneration to ignore Auto

* revert mimi changes

* re

* further fix

* Update moshi.md

* correct md formatting

* move prepare causal mask to class

* fix copies

* fix depth decoder causal

* fix and correct some tests

* make style and update .md

* correct config checkpoint

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* make style

* Update src/transformers/models/moshi/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* change firm in copyrights

* update config with nested dict

* replace einsum

* make style

* change split to True

* add back split=False

* remove tests in convert

* Update tests/models/moshi/test_modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add default config repo + add model to FA2 docstrings

* remove logits float

* fix some tokenization tests and ignore some others

* make style tokenization tests

* update modeling with sliding window + update modeling tests

* [run-slow] moshi

* remove prepare for generation from CausalLM

* isort

* remove copied from

* ignore offload tests

* update causal mask and prepare 4D mask aligned with recent changes

* further test refine + add back prepare_inputs_for_generation for depth decoder

* correct conditional use of prepare mask

* update slow integration tests

* fix multi-device forward

* remove previous solution to device_map

* save_load is flaky

* fix generate multi-devices

* fix device

* move tensor to int

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-10-16 11:21:49 +02:00
internal Add a static cache that offloads to the CPU or other device (#32161) 2024-08-29 11:51:09 +02:00
main_classes Fix flax failures (#33912) 2024-10-11 14:38:35 +02:00
model_doc Moshi integration (#33624) 2024-10-16 11:21:49 +02:00
quantization [Docs] Update compressed_tensors.md (#33961) 2024-10-10 15:22:41 +02:00
tasks Add auto model for image-text-to-text (#32472) 2024-10-08 14:26:43 +02:00
_config.py [Doc]: Broken link in Kubernetes doc (#33879) 2024-10-04 11:20:56 +02:00
_redirects.yml Docs / Quantization: Redirect deleted page (#31063) 2024-05-28 18:29:22 +02:00
_toctree.yml Moshi integration (#33624) 2024-10-16 11:21:49 +02:00
accelerate.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
add_new_model.md Model addition timeline (#33762) 2024-09-27 17:15:13 +02:00
add_new_pipeline.md add push_to_hub to pipeline (#29172) 2024-04-16 15:34:04 +01:00
agents_advanced.md Decorator for easier tool building (#33439) 2024-09-18 11:07:51 +02:00
agents.md Decorator for easier tool building (#33439) 2024-09-18 11:07:51 +02:00
attention.md [Docs] Fix broken links and syntax issues (#28918) 2024-02-08 14:13:35 -08:00
autoclass_tutorial.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
benchmarks.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
bertology.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
big_models.md [docs] Big model loading (#29920) 2024-04-01 18:47:32 -07:00
chat_templating.md Add a section on writing tool templates to the chat template docs (#33924) 2024-10-04 14:40:44 +01:00
community.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
conversations.md [docs] change temperature to a positive value (#32077) 2024-07-23 17:47:51 +01:00
create_a_model.md Enable HF pretrained backbones (#31145) 2024-06-06 22:02:38 +01:00
custom_models.md Updated the custom_models.md changed cross_entropy code (#33118) 2024-08-26 13:15:43 +02:00
debugging.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
deepspeed.md Fix typos (#31819) 2024-07-08 11:52:47 +01:00
fast_tokenizers.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
fsdp.md [docs] Trainer docs (#28145) 2023-12-20 10:37:23 -08:00
generation_strategies.md Universal Assisted Generation: Assisted generation with any assistant model (by Intel Labs) (#33383) 2024-10-10 14:41:53 +02:00
gguf.md Add GGUF for starcoder2 (#34094) 2024-10-14 10:22:49 +02:00
glossary.md Fix typos (#31819) 2024-07-08 11:52:47 +01:00
how_to_hack_models.md [Docs] Add Developer Guide: How to Hack Any Transformers Model (#33979) 2024-10-07 10:08:20 +02:00
hpo_train.md Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
index.md Moshi integration (#33624) 2024-10-16 11:21:49 +02:00
installation.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
kv_cache.md Cache: don't show warning in forward passes when past_key_values is None (#33541) 2024-09-19 12:02:46 +01:00
llm_optims.md Docs: add more cross-references to the KV cache docs (#33323) 2024-09-06 10:22:00 +01:00
llm_tutorial_optimization.md Enable BNB multi-backend support (#31098) 2024-09-24 03:40:56 -06:00
llm_tutorial.md Fix: typo (#33880) 2024-10-02 09:12:21 +01:00
model_memory_anatomy.md Enable BNB multi-backend support (#31098) 2024-09-24 03:40:56 -06:00
model_sharing.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
model_summary.md model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702) 2024-03-23 18:29:39 -07:00
modular_transformers.md Improve modular converter (#33991) 2024-10-08 14:53:58 +02:00
multilingual.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
peft.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
perf_hardware.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
perf_infer_cpu.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md Moshi integration (#33624) 2024-10-16 11:21:49 +02:00
perf_torch_compile.md fix(docs): Fixed a link in docs (#32274) 2024-07-29 10:50:43 +01:00
perf_train_cpu_many.md [Doc]: Broken link in Kubernetes doc (#33879) 2024-10-04 11:20:56 +02:00
perf_train_cpu.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md Update perf_train_gpu_many.md (#31451) 2024-06-18 11:00:26 -07:00
perf_train_gpu_one.md Corrected max number for bf16 in transformer/docs (#33658) 2024-09-25 19:20:51 +02:00
perf_train_special.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
performance.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
perplexity.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
philosophy.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
pipeline_tutorial.md Docs: Fixed whisper-large-v2 model link in docs (#32871) 2024-08-19 09:50:35 -07:00
pipeline_webserver.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
pr_checks.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
preprocessing.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
quicktour.md Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
run_scripts.md [docs] refine the doc for train with a script (#33423) 2024-09-12 10:16:12 -07:00
sagemaker.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
serialization.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
task_summary.md More fixes for doctest (#30265) 2024-04-16 11:58:55 +02:00
tasks_explained.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
testing.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
tf_xla.md fix(docs): Fixed a link in docs (#32274) 2024-07-29 10:50:43 +01:00
tflite.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
tiktoken.md Support reading tiktoken tokenizer.model file (#31656) 2024-09-06 14:24:02 +02:00
tokenizer_summary.md [docs] Spanish translation of tokenizer_summary.md (#31154) 2024-06-03 16:52:23 -07:00
torchscript.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
trainer.md Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
training.md Added the necessary import of module (#30804) 2024-05-14 18:45:06 +01:00
troubleshooting.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00