transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 17:52:35 +06:00

Jaeyong Sung 583db52bc6 Add Dia model (#38405 )	2025-06-26 11:04:23 +00:00
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
internal	Remove all traces of `low_cpu_mem_usage` (#38792 )	2025-06-12 16:39:33 +02:00
main_classes	Use HF papers (#38184 )	2025-06-13 11:07:09 +00:00
model_doc	Add Dia model (#38405 )	2025-06-26 11:04:23 +00:00
quantization	Use HF papers (#38184 )	2025-06-13 11:07:09 +00:00
reference	Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag (#36835 )	2025-05-23 16:39:47 +00:00
tasks	No more Tuple, List, Dict (#38797 )	2025-06-17 19:37:18 +01:00
_config.py	Add optimized `PixtralImageProcessorFast` (#34836 )	2024-11-28 16:04:05 +01:00
_redirects.yml	Docs / Quantization: Redirect deleted page (#31063 )	2024-05-28 18:29:22 +02:00
_toctree.yml	Add Dia model (#38405 )	2025-06-26 11:04:23 +00:00
accelerate.md	change fsdp_strategy to fsdp in TrainingArguments in accelerate doc (#38807 )	2025-06-13 15:32:40 +00:00
accelerator_selection.md	[docs] add xpu environment variable for gpu selection (#38194 )	2025-05-30 16:05:07 +00:00
add_new_model.md	No more Tuple, List, Dict (#38797 )	2025-06-17 19:37:18 +01:00
add_new_pipeline.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
agents.md	[agents] remove agents 🧹 (#37368 )	2025-04-11 18:42:37 +01:00
attention_interface.md	No more Tuple, List, Dict (#38797 )	2025-06-17 19:37:18 +01:00
auto_docstring.md	[`AutoDocstring`] Based on inspect parsing of the signature (#33771 )	2025-05-08 17:46:07 -04:00
backbones.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
cache_explanation.md	[docs] Format fix (#38414 )	2025-06-03 09:53:23 -07:00
chat_extras.md	Update chat_extras.md with content correction (#36599 )	2025-03-07 13:09:02 +00:00
chat_templating_multimodal.md	[chat-template] Unify tests and clean up 🧼 (#37275 )	2025-04-10 14:42:32 +02:00
chat_templating_writing.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
chat_templating.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
community.md	Fixed Majority of the Typos in `transformers[en]` Documentation (#33350 )	2024-09-09 10:47:24 +02:00
contributing.md	Enable doc in Spanish (#16518 )	2022-04-04 10:25:46 -04:00
conversations.md	[`chat`] generate parameterization powered by `GenerationConfig` and UX-related changes (#38047 )	2025-05-12 14:04:41 +01:00
custom_models.md	No more Tuple, List, Dict (#38797 )	2025-06-17 19:37:18 +01:00
debugging.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
deepspeed.md	chore: Fix typos in docs and examples (#36524 )	2025-03-04 13:47:41 +00:00
executorch.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
fast_tokenizers.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
feature_extractors.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
fsdp.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
generation_features.md	chore: Fix typos in docs and examples (#36524 )	2025-03-04 13:47:41 +00:00
generation_strategies.md	Fix custom generate from local directory (#38916 )	2025-06-20 17:36:57 +01:00
gguf.md	Fix gguf docs (#36601 )	2025-03-11 15:29:14 +01:00
glossary.md	Use HF papers (#38184 )	2025-06-13 11:07:09 +00:00
how_to_hack_models.md	[doc] fix bugs in `how_to_hack_models.md` (#38198 )	2025-05-19 10:37:54 -07:00
hpo_train.md	[Nit] Add Note on SigOpt being in Public Archive Mode (#38610 )	2025-06-05 14:07:23 -07:00
image_processors.md	🔴 Video processors as a separate class (#35206 )	2025-05-12 11:55:51 +02:00
index.md	[docs] Update docs moved to the course (#38800 )	2025-06-13 12:02:27 -07:00
installation.md	byebye torch 2.0 (#37277 )	2025-04-07 15:19:47 +02:00
kv_cache.md	[docs] update cache docs with new info (#38775 )	2025-06-13 07:10:56 +00:00
llm_optims.md	[CI] green llama tests (#37244 )	2025-04-03 14:15:53 +01:00
llm_tutorial_optimization.md	Use HF papers (#38184 )	2025-06-13 11:07:09 +00:00
llm_tutorial.md	No more Tuple, List, Dict (#38797 )	2025-06-17 19:37:18 +01:00
model_memory_anatomy.md	Use HF papers (#38184 )	2025-06-13 11:07:09 +00:00
model_sharing.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
models.md	Fix grammatical error in models documentation (#39019 )	2025-06-25 14:55:22 +00:00
modular_transformers.md	[modular] CLI allows positional arguments, and more defaults names for the optional arg (#38979 )	2025-06-23 12:40:01 +02:00
notebooks.md	Enable doc in Spanish (#16518 )	2022-04-04 10:25:46 -04:00
optimizers.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
pad_truncation.md	Fixed Majority of the Typos in `transformers[en]` Documentation (#33350 )	2024-09-09 10:47:24 +02:00
peft.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
perf_hardware.md	chore: Fix typos in docs and examples (#36524 )	2025-03-04 13:47:41 +00:00
perf_infer_cpu.md	remove ipex_optimize_model usage (#38632 )	2025-06-06 20:04:44 +02:00
perf_infer_gpu_multi.md	Fix: make docs work better with doc builder (#38213 )	2025-05-20 08:23:03 +00:00
perf_infer_gpu_one.md	Small typo lines 47 and 199 perf_infer_gpu_one.md (#37938 )	2025-05-06 14:32:55 +01:00
perf_torch_compile.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
perf_train_cpu_many.md	remove ipex_optimize_model usage (#38632 )	2025-06-06 20:04:44 +02:00
perf_train_cpu.md	remove ipex_optimize_model usage (#38632 )	2025-06-06 20:04:44 +02:00
perf_train_gaudi.md	Add Intel Gaudi doc (#37855 )	2025-04-29 13:28:06 -07:00
perf_train_gpu_many.md	docs: fix typo (#37567 )	2025-04-17 14:54:44 +01:00
perf_train_gpu_one.md	[docs] Typos - Single GPU efficient training features (#38964 )	2025-06-23 12:33:10 -07:00
perf_train_special.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
perf_train_tpu_tf.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
perplexity.md	[docs] use device-agnostic API instead of cuda (#34913 )	2024-11-26 09:23:34 -08:00
philosophy.md	[docs] fixed links with 404 (#27327 )	2023-11-06 19:45:03 +00:00
pipeline_gradio.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
pipeline_tutorial.md	chore: Fix typos in docs and examples (#36524 )	2025-03-04 13:47:41 +00:00
pipeline_webserver.md	fix and enhance pipeline_webserver.md (#36992 )	2025-04-15 08:35:05 -07:00
pr_checks.md	Fixed Majority of the Typos in `transformers[en]` Documentation (#33350 )	2024-09-09 10:47:24 +02:00
processors.md	[docs] add Audio import (#38195 )	2025-05-19 13:16:35 +00:00
quicktour.md	Add Hugging Face authentication procedure for IDEs (PyCharm, VS Code,… (#38954 )	2025-06-24 11:48:15 -07:00
run_scripts.md	Remove research projects (#36645 )	2025-03-11 13:47:38 +00:00
serialization.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
serving.md	fix docs serving typos. (#37936 )	2025-05-06 14:32:44 +01:00
testing.md	[tests] remove TF tests (uses of `require_tf`) (#38944 )	2025-06-25 17:29:10 +00:00
tf_xla.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
tflite.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
tokenizer_summary.md	Use HF papers (#38184 )	2025-06-13 11:07:09 +00:00
tools.md	[agents] remove agents 🧹 (#37368 )	2025-04-11 18:42:37 +01:00
torchscript.md	Fix wording in `torchscript.md` (#38004 )	2025-05-08 16:47:45 +01:00
trainer.md	feat: add flexible Liger Kernel configuration to TrainingArguments (#38911 )	2025-06-19 15:54:08 +00:00
training.md	[docs] Redesign (#31757 )	2025-03-03 10:33:46 -08:00
troubleshooting.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
video_processors.md	🔴 Video processors as a separate class (#35206 )	2025-05-12 11:55:51 +02:00