transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 12:08:22 +06:00

Author	SHA1	Message	Date
amyeroberts	e71bf70e33	Pixtral update example checkpoint (#33633 ) * Update pixtral example checkpoint * Fix typo	2024-09-21 01:01:16 +01:00
Mayank Mishra	e472e077c2	Granitemoe (#33207 ) * first commit * drop tokenizer * drop tokenizer * drop tokenizer * drop convert * granite * drop tokenization test * mup * fix * reformat * reformat * reformat * fix docs * stop checking for checkpoint * update support * attention multiplier * update model * tiny drop * saibo drop * skip test * fix test * fix test * drop * drop useless imports * update docs * drop flash function * copied from * drop pretraining tp * drop pretraining tp * drop pretraining tp * drop unused import * drop code path * change name * softmax scale * head dim * drop legacy cache * rename params * cleanup * fix copies * comments * add back legacy cache * multipliers * multipliers * multipliers * text fix * fix copies * merge * multipliers * attention multiplier * drop unused imports * add granitemoe * add decoration * remove moe from sequenceclassification * fix test * fix * fix * fix * move rope? * merge * drop bias * drop bias * Update src/transformers/models/granite/configuration_granite.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * Update src/transformers/models/granite/modeling_granite.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix * fix * fix * drop * drop * fix * fix * cleanup * cleanup * fix * fix granite tests * fp32 test * fix * drop jitter * fix * rename * rename * fix config * add gen test --------- Co-authored-by: Yikang Shen <yikang.shn@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-09-21 01:43:50 +02:00
Omar Salman	653eb40425	Add sdpa for BioGpt (#33592 ) * Add sdpa for BioGpt * Updates * Add the docs * [run_slow] biogpt * Use the copy mechanism to ensure consistency * [run_slow] biogpt	2024-09-20 14:27:32 +01:00
Yoni Gozlan	f111d5b783	Uniformize kwargs for Paligemma processor and update docs (#33571 ) * Uniformize paligemma processor * nit	2024-09-19 14:14:06 -04:00
Joao Gante	80b774eb29	Cache: don't show warning in forward passes when `past_key_values` is None (#33541 )	2024-09-19 12:02:46 +01:00
Yoach Lacombe	5af7d41e49	Codec integration (#33565 ) * clean mimi commit * some nits suggestions from Arthur * make fixup * rename repo id + change readme * Update docs/source/en/model_doc/mimi.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add flaky flag to batching equivalence due to audio_codes failing sometimes --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-09-18 19:23:44 +02:00
Aymeric Roucher	e6d9f39dd7	Decorator for easier tool building (#33439 ) * Decorator for tool building	2024-09-18 11:07:51 +02:00
Yoni Gozlan	d8500cd229	Uniformize kwargs for Pixtral processor (#33521 ) * add uniformized pixtral and kwargs * update doc * fix _validate_images_text_input_order * nit	2024-09-17 14:44:27 -04:00
Antoine Dussolle	763548427d	Add explicit example for RAG chat templating (#33503 ) * Add explicit example for RAG chat templating * Add Tip box and reformulate Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2024-09-17 16:08:05 +01:00
Max Buckley	ac5a0556f1	Update chameleon.md — fix runtime type error (#33494 ) Update chameleon.md Fix error RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same	2024-09-17 13:32:49 +02:00
Ahmed Almaghz	c2d05897bf	[i18n-ar] Add File : `docs/source/ar/_toctree.yml` (#32696 ) * Update ar lang build_documentation.yml * Update ar lang build_pr_documentation.yml * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/pipeline_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/autoclass_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/preprocessing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/training.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/run_scripts.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/run_scripts.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/run_scripts.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/run_scripts.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/run_scripts.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/run_scripts.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/run_scripts.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/accelerate.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/accelerate.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/accelerate.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/accelerate.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/accelerate.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/accelerate.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Create _config.py * Update _toctree.yml * Update _toctree.yml * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/peft.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update _toctree.yml * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/model_sharing.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/conversations.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/agents.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/llm_tutorial.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update llm_tutorial.md * Update _toctree.yml * Update autoclass_tutorial.md * Update autoclass_tutorial.md * Update preprocessing.md * Update glossary.md * Update run_scripts.md * Update run_scripts.md * Update run_scripts.md --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>	2024-09-16 10:02:03 -07:00
Sergio Paniego Blanco	c7a91f5adf	`Agents, supercharged - Multi-agents, External tools, and more` docs typo fixed (#33478 ) * Typo fixed in Agents, supercharged	2024-09-16 18:52:27 +02:00
Merve Noyan	ce62a41880	Add keypoint-detection task guide (#33274 ) --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-09-16 13:08:31 +02:00
Arthur	8bd2b1e8c2	Add support for Pixtral (#33449 ) * initial commit * gloups * updates * work * weights match * nits * nits * updates to support the tokenizer :) * updates * Pixtral processor (#33454) * rough outline * Add in image break and end tokens * Fix * Udo some formatting changes * Set patch_size default * Fix * Fix token expansion * nit in conversion script * Fix image token list creation * done * add expected results * Process list of list of images (#33465) * updates * working image and processor * this is the expected format * some fixes * push current updated * working mult images! * add a small integration test * Uodate configuration docstring * Formatting * Config docstring fix * simplify model test * fixup modeling and etests * Return BatchMixFeature in image processor * fix some copies * update * nits * Update model docstring * Apply suggestions from code review * Fix up * updates * revert modeling changes * update * update * fix load safe * addd liscence * update * use pixel_values as required by the model * skip some tests and refactor * Add pixtral image processing tests (#33476) * Image processing tests * Add processing tests * woops * defaults reflect pixtral image processor * fixup post merge * images -> pixel values * oups sorry Mr docbuilder * isort * fix * fix processor tests * small fixes * nit * update * last nits * oups this was really breaking! * nits * is composition needs to be true --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-09-14 12:28:39 +02:00
Sergio Paniego Blanco	e39b6c1c7c	Corrected `Agents and tools` documentation links typos (#33471 ) * Corrected agents task link typo * Corrected chat templating link * Corrected chat templating link 2	2024-09-13 17:15:20 +02:00
Fanli Lin	a05ce550bf	[docs] refine the doc for `train with a script` (#33423 ) * add xpu note * add one more case * add more * Update docs/source/en/run_scripts.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-09-12 10:16:12 -07:00
Raushan Turganbay	2f611d30d9	Qwen2-VL: clean-up and add more tests (#33354 ) * clean-up on qwen2-vl and add generation tests * add video tests * Update tests/models/qwen2_vl/test_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix and add better tests * Update src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update docs and address comments * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update * remove size at all --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-09-12 18:24:04 +02:00
Sergio Paniego Blanco	516ee6adc2	Fix incomplete sentence in `Zero-shot object detection` documentation (#33430 ) Rephrase sentence in zero-shot object detection docs	2024-09-12 11:25:44 +02:00
Michael Currin	e0ff4321d1	Docs - update formatting of llama3 model card (#33438 ) update formatting of llama3 content	2024-09-12 11:24:56 +02:00
Fanli Lin	cea9ec086a	[docs] add the missing tokenizer when pushing models to huggingface hub (#33428 ) * add tokenizer * typo	2024-09-11 09:56:55 -07:00
Fanli Lin	c403441339	[docs] add the missing huggingface hub username (#33431 ) * add username * update username * add username	2024-09-11 09:56:40 -07:00
Guang Yang	f38590dade	Make StaticCache configurable at model construct time (#32830 ) * Make StaticCache configurable at model construct time * integrations import structure * add new doc file to toc --------- Co-authored-by: Guang Yang <guangyang@fb.com> Co-authored-by: Joao Gante <joao@huggingface.co>	2024-09-10 16:35:57 +01:00
Alazar	96429e74a8	Add support for GGUF Phi-3 (#31844 ) * Update docs for GGUF supported models * Add tensor mappings and define class GGUFPhi3Converter * Fix tokenizer * Working version * Attempt to fix some CI failures * Run ruff format * Add vocab, merges, decoder methods like LlamaConverter * Resolve conflicts since Qwen2Moe was added to gguf - I missed one place when resolving conflict - I also made a mistake with tests_ggml.py and now has been fixed to reflect its master version.	2024-09-10 13:32:38 +02:00
Nilay Bhatnagar	eedd21b9e7	Fixed Majority of the Typos in `transformers[en]` Documentation (#33350 ) * Fixed typo: insted to instead * Fixed typo: relase to release * Fixed typo: nighlty to nightly * Fixed typos: versatible, benchamarks, becnhmark to versatile, benchmark, benchmarks * Fixed typo in comment: quantizd to quantized * Fixed typo: architecutre to architecture * Fixed typo: contibution to contribution * Fixed typo: Presequities to Prerequisites * Fixed typo: faste to faster * Fixed typo: extendeding to extending * Fixed typo: segmetantion_maps to segmentation_maps * Fixed typo: Alternativelly to Alternatively * Fixed incorrectly defined variable: output to output_disabled * Fixed typo in library name: tranformers.onnx to transformers.onnx * Fixed missing import: import tensorflow as tf * Fixed incorrectly defined variable: token_tensor to tokens_tensor * Fixed missing import: import torch * Fixed incorrectly defined variable and typo: uromaize to uromanize * Fixed incorrectly defined variable and typo: uromaize to uromanize * Fixed typo in function args: numpy.ndarry to numpy.ndarray * Fixed Inconsistent Library Name: Torchscript to TorchScript * Fixed Inconsistent Class Name: OneformerProcessor to OneFormerProcessor * Fixed Inconsistent Class Named Typo: TFLNetForMultipleChoice to TFXLNetForMultipleChoice * Fixed Inconsistent Library Name Typo: Pytorch to PyTorch * Fixed Inconsistent Function Name Typo: captureWarning to captureWarnings * Fixed Inconsistent Library Name Typo: Pytorch to PyTorch * Fixed Inconsistent Class Name Typo: TrainingArgument to TrainingArguments * Fixed Inconsistent Model Name Typo: Swin2R to Swin2SR * Fixed Inconsistent Model Name Typo: EART to BERT * Fixed Inconsistent Library Name Typo: TensorFLow to TensorFlow * Fixed Broken Link for Speech Emotion Classification with Wav2Vec2 * Fixed minor missing word Typo * Fixed minor missing word Typo * Fixed minor missing word Typo * Fixed minor missing word Typo * Fixed minor missing word Typo * Fixed minor missing word Typo * Fixed minor missing word Typo * Fixed minor missing word Typo * Fixed Punctuation: Two commas * Fixed Punctuation: No Space between XLM-R and is * Fixed Punctuation: No Space between [~accelerate.Accelerator.backward] and method * Added backticks to display model.fit() in codeblock * Added backticks to display openai-community/gpt2 in codeblock * Fixed Minor Typo: will to with * Fixed Minor Typo: is to are * Fixed Minor Typo: in to on * Fixed Minor Typo: inhibits to exhibits * Fixed Minor Typo: they need to it needs * Fixed Minor Typo: cast the load the checkpoints To load the checkpoints * Fixed Inconsistent Class Name Typo: TFCamembertForCasualLM to TFCamembertForCausalLM * Fixed typo in attribute name: outputs.last_hidden_states to outputs.last_hidden_state * Added missing verbosity level: fatal * Fixed Minor Typo: take To takes * Fixed Minor Typo: heuristic To heuristics * Fixed Minor Typo: setting To settings * Fixed Minor Typo: Content To Contents * Fixed Minor Typo: millions To million * Fixed Minor Typo: difference To differences * Fixed Minor Typo: while extract To which extracts * Fixed Minor Typo: Hereby To Here * Fixed Minor Typo: addition To additional * Fixed Minor Typo: supports To supported * Fixed Minor Typo: so that benchmark results TO as a consequence, benchmark * Fixed Minor Typo: a To an * Fixed Minor Typo: a To an * Fixed Minor Typo: Chain-of-though To Chain-of-thought	2024-09-09 10:47:24 +02:00
Aymeric Roucher	489cbfd6d3	Add visit webpage tool (#33353 ) * Add VisitWebpageTool	2024-09-09 10:32:42 +02:00
Wing Lian	62aecd85ff	schedulefree optimizers (#30079 ) * schedulefree optimizers * fix train instead of eval for optimizer * fixes and update docs * chore: lint * add tests and drop overly-verbose _32bit suffix * chore: lint * fix for docs * fix code review issues * use duck-typing to avoid per-optimizer patches * fixup style * fixup style * warn if incorrect accelerate version with schedule free Co-authored-by: Aman Gupta Karmani <aman@tmm1.net> --------- Co-authored-by: Aman Karmani <aman@tmm1.net>	2024-09-09 09:51:39 +02:00
Nicholas Broad	66bc4def95	add sdpa mbart (#32033 ) * add sdpa mbart useful for donut * update sdpa docs * formatting * add self._use_sdpa in mbartencoder * use self.config to check attn * retrigger checks * [run-slow] mbart	2024-09-06 17:31:24 -07:00
Daniel Lok	a70286f827	Update author for QLorA/PEFT community notebook (#33338 ) update author Signed-off-by: Daniel Lok <daniel.lok@databricks.com>	2024-09-06 22:50:26 +02:00
Matt	d7b04ea14d	Fix Prefill docs (#33352 ) last -> final	2024-09-06 17:57:54 +01:00
Ita Zaporozhets	e48e5f1f13	Support reading tiktoken tokenizer.model file (#31656 ) * use existing TikTokenConverter to read tiktoken tokenizer.model file * del test file * create titktoken integration file * adding tiktoken llama test * ALTNATIVE IMPLEMENTATION: supports llama 405B * fix one char * remove redundant line * small fix * rm unused import * flag for converting from tiktokeng * remove unneeded file * ruff * remove llamatiktokenconverter, stick to general converter * tiktoken support v2 * update test * remove stale changes * udpate doc * protect import * use is_protobuf_available * add templateprocessor in tiktokenconverter * reverting templateprocessor from tiktoken support * update test * add require_tiktoken * dev-ci * trigger build * trigger build again * dev-ci * [build-ci-image] tiktoken * dev-ci * dev-ci * dev-ci * dev-ci * change tiktoken file name * feedback review * feedback rev * applying feedback, removing tiktoken converters * conform test * adding docs for review * add doc file for review * add doc file for review * add doc file for review * support loading model without config.json file * Revert "support loading model without config.json file" This reverts commit 2753602e51c34cef2f184eb11f36d2ad1b02babb. * remove dev var * updating docs * safely import protobuf * fix protobuf import error * fix protobuf import error * trying isort to fix ruff error * fix ruff error * try to fix ruff again * try to fix ruff again * try to fix ruff again * doc table of contents * add fix for consistency.dockerfile torchaudio * ruff * applying feedback * minor typo * merging with push-ci-image * clean up imports * revert dockerfile consistency	2024-09-06 14:24:02 +02:00
Joao Gante	2b789f27f3	Docs: add more cross-references to the KV cache docs (#33323 ) * add more cross-references * nit * import guard * more import guards * nit * Update src/transformers/generation/configuration_utils.py	2024-09-06 10:22:00 +01:00
Daniel Lok	5792c459ed	Add a community notebook for fine-tuning with QLoRA, PEFT, and MLflow (#33319 ) add notebook for finetuning with mlflow Signed-off-by: Daniel Lok <daniel.lok@databricks.com>	2024-09-06 09:35:01 +02:00
Vladislav Bronzov	5d11de4a2f	Add Qwen2Moe GGUF loading support (#33264 ) * update gguf doc, config and tensor mapping * add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests * apply code style fixes * reformat files * assign GGUFQwen2Converter to qwen2_moe	2024-09-05 17:42:03 +02:00
Niklas Muennighoff	03164ba14e	Add paper link (#33305 )	2024-09-05 15:49:28 +02:00
Raushan Turganbay	43df47d8e7	Llava Onevision: add model (#32673 ) * working version * fix copies * update * tests * update docs * codestyle * add more tests * add returns for docs * clean up * Update src/transformers/models/llava_onevision/processing_llava_onevision.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updates * codestyle * style * shouldn't be reversed * [run-slow] llava_onevision * [run-slow] llava_onevision * add pooling in videos * [run-slow] llava_onevision * num-logits-to-keep * [run-slow] llava_onevision * [run-slow] llava_onevision * Update tests/test_modeling_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * video matched orig impl * fix tests * chat template was modified * Update docs/source/en/model_doc/llava_onevision.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add morer info in the doc page --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-09-05 14:43:20 +05:00
Aymeric Roucher	cfd92c64f5	Add new documentation page for advanced agent usage (#33265 ) * Add new documentation page for advanced agent usage	2024-09-04 18:19:54 +02:00
Matt	01c8c6c419	Add a warning to the chat template docs about the tool_calls format (#33277 ) * Add a warning to the chat template docs * Add a warning to the chat template docs * Add a warning to the chat template docs	2024-09-04 17:13:34 +01:00
Raushan Turganbay	ebbe8d8014	Cache docs: update (#32929 ) * some changes * more updates * fix cache copy * nits * nits * add tests	2024-09-04 15:05:31 +05:00
Niklas Muennighoff	ecd61c6286	Add OLMoE (#32406 ) * Add OLMoE * Add OLMoE * Updates * Make norm optional; add keys * Add output * Add * Fix dtype * Fix eos config * Update * Add OLMoE * Fix OLMoE path * Format * Format * Rmv copy statement * Rmv copy statement * Format * Add copies * Cp rotary * Fix aming * Fix naming * Update RoPE integration; num_logits_to_keep; Add copy statements * Add eps to config * Format * Add aux loss * Adapt router_aux_loss_coef * Update md * Adapt * adapt tests	2024-09-03 18:43:12 +02:00
Omar Salman	03c12d0d63	Add sdpa support for Albert (#32092 ) * Add sdpa support for Albert * [run_slow] albert * Add benchmarks and PR suggestion * Fix quality * Fix * [run_slow] albert	2024-09-03 14:01:00 +01:00
Matt	0d86727354	Update chat template docs to remove Blenderbot (#33254 ) * Update docs to remove obsolete Blenderbot * Remove another reference to Blenderbot	2024-09-03 12:18:04 +01:00
Isotr0py	edeca4387c	🚨 Support dequantization for most GGML types (#32625 ) * use gguf internal dequantize * add Q5_0 test * add iq1 test * add remained test * remove duplicated test * update docs * add gguf version limit * make style * update gguf import catch * revert vocab_size patch * make style * use GGUF_MIN_VERSION everywhere	2024-09-03 12:58:14 +02:00
Sergio Paniego Blanco	28952248b1	Fixed typo repeated word in DETR docs (#33250 )	2024-09-02 17:19:18 +02:00
Matt	52a0213755	Add assistant prefill for chat templates and TextGenerationPipeline (#33198 ) * Add assistant prefill to chat templates * Add assistant prefill to pipeline * Add assistant prefill to pipeline * Tweak another test that ended in assistant message * Update tests that ended in assistant messages * Update tests that ended in assistant messages * Replace assistant_prefill with continue_final_message * Allow passing continue_final_message to pipeline * Small fixup * Add continue_final_message as a pipeline kwarg * Update docstrings * Move repos to hf-internal-testing! * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Add explanatory comment * make fixup * Update chat templating docs to explain continue_last_message --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-09-02 13:23:47 +01:00
Aymeric Roucher	1ca9ff5c91	Add duckduckgo search tool (#32882 ) * Add duckduckgo search tool	2024-09-02 09:56:20 +02:00
Merve Noyan	2e3f8f7474	Add video text to text docs (#33164 ) --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-09-01 12:06:31 +03:00
Yijun Lee	db70426854	🌐 [i18n-KO] Translated `llm_optims.md` to Korean (#32325 ) * docs: ko: llm_optims.md * feat: nmt draft * fix toc title * fix: manual edits * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com> * Update docs/source/ko/llm_optims.md Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com> * Update llm_optims.md * fix: resolve suggestions * fix: resolve suggestions * Apply suggestions from code review fix: resolve suggestions Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> --------- Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com>	2024-08-30 09:52:41 -07:00
Aymeric Roucher	c79bfc71b8	Create local Transformers Engine (#33218 ) * Create local Transformers Engine	2024-08-30 18:22:27 +02:00
Gerben van V	5129671290	Add a static cache that offloads to the CPU or other device (#32161 ) * Add a static cache that offloads to the CPU or other device * Fix PR comments, add unit-tests	2024-08-29 11:51:09 +02:00
JB (Don)	f1a385b1de	[RoBERTa-based] Add support for sdpa (#30510 ) * Adding SDPA support for RoBERTa-based models * add not is_cross_attention * fix copies * fix test * add minimal test for camembert and xlm_roberta as their test class does not inherit from ModelTesterMixin * address some review comments * use copied from * style * consistency * fix lists --------- Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-28 10:26:00 +02:00
Mayank Mishra	c35d2ccf5a	Granite language models (#31502 ) * first commit * drop tokenizer * drop tokenizer * drop tokenizer * drop convert * granite * drop tokenization test * mup * fix * reformat * reformat * reformat * fix docs * stop checking for checkpoint * update support * attention multiplier * update model * tiny drop * saibo drop * skip test * fix test * fix test * drop * drop useless imports * update docs * drop flash function * copied from * drop pretraining tp * drop pretraining tp * drop pretraining tp * drop unused import * drop code path * change name * softmax scale * head dim * drop legacy cache * rename params * cleanup * fix copies * comments * add back legacy cache * multipliers * multipliers * multipliers * text fix * fix copies * merge * multipliers * attention multiplier * drop unused imports * fix * fix * fix * move rope? * Update src/transformers/models/granite/configuration_granite.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * Update src/transformers/models/granite/modeling_granite.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix * fix * fix * fix-copies * torch rmsnorm * add authors * change model path * fix * test * drop static cache test * uupdate readme * drop non-causal * readme * drop useless imports * Update docs/source/en/model_doc/granite.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/granite.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/granite.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-27 21:27:21 +02:00
Juan Pizarro	7591ca5bc5	🚨 Add Blip2ForImageTextRetrieval (#29261 ) * add Blip2ForImageTextRetrieval * use one line and remove unnecessary space in tests Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * use value from the config, rather than hardcoded * change order of params in Blip2QFormerModel.forward * update docstring * fix style * update test_inference_opt * move embeddings out of Blip2QFormerModel * remove from_vision_qformer_configs * remove autocast float16 in Blip2QFormerModel * rename fiels into vision_projection,text_projection,use_image_text_matching_head * use CLIPOutput for Blip2ImageTextMatchingModelOutput * remove past_key_values_length from Blip2TextEmbeddings * fix small typo in the CLIPOutput docstring * add Blip2ForImageTextRetrieval to Zero Shot Image Classification mapping * update docstring and add require_torch_fp16 * rollback test_inference_opt * use use_image_text_matching_head=True in convert * skip test_model_get_set_embeddings * fix create_rename_keys error on new itm fields * revert to do scale after dot product between "query" and "key" * fix ValueError on convert script for blip2-opt-2.7b * update org of paths to Salesforce * add is_pipeline_test_to_skip for VisualQuestionAnsweringPipelineTests * [run_slow] blip_2 * removed Blip2ForImageTextRetrieval from IGNORE_NON_AUTO_CONFIGURED * fix docstring of Blip2ImageTextMatchingModelOutput * [run_slow] blip_2 * fix multi-gpu tests * [run_slow] blip_2 * [run_slow] blip_2 --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-08-27 18:50:27 +01:00
Ali Salamatian	27903de7ec	Very small change to one of the function parameters (#32548 ) Very small change to one of the parameters np.random.randint second parameter is not included in the possible options. Therefore, we want the upper range to be 2, so that we have some 1 labels in our classification as well.	2024-08-27 09:29:05 -07:00
Sae_Chan_Oh	6101d934a1	🌐 [i18n-KO] Translated `conversations.md` to Korean (#32468 ) * docs: ko: conversations.md * feat: hand-crafted translate docs * fix: modify typo after Grammar Check * Update docs/source/ko/conversations.md 감사합니다 Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * fix: accept suggestions about anchor and spacing * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/conversations.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update docs/source/ko/conversations.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * fix: anchor 'what happened inside piepeline?' be removed question mark * fix: translate the comments in the code block --------- Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>	2024-08-27 09:25:41 -07:00
Vaibhav Srivastav	6f0ecf1049	[docs] add quick usage snippet to Whisper. (#31289 ) * [docs] add quick usage snippet to Whisper. * Apply suggestions from review. * 💉 Fix the device for pipeline.	2024-08-27 14:11:52 +02:00
Pablo Montalvo	26f043bd4d	quickfix documentation (#32566 ) * fix documentation * update config	2024-08-26 17:49:44 +02:00
Ritik Nandwal	a378a54a57	Add changes for uroman package to handle non-Roman characters (#32404 ) * Add changes for uroman package to handle non-Roman characters * Update docs for uroman changes * Modifying error message to warning, for backward compatibility * Update instruction for user to install uroman * Update docs for uroman python version dependency and backward compatibility * Update warning message for python version compatibility with uroman * Refine docs	2024-08-26 17:07:01 +02:00
Shijie	19e6e80e10	support qwen2-vl (#32318 ) * support-qwen2-vl * tidy * tidy * tidy * tidy * tidy * tidy * tidy * hyphen->underscore * make style * add-flash2-tipd * delete-tokenize=False * remove-image_processor-in-init-file * add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES * format-doct * support-Qwen2VLVisionConfig * remove-standardize_cache_format * fix-letter-varaibles * remove-torch-in-image-processor * remove-useless-docstring * fix-one-letter-varaible-name * change-block-name * default-quick-gelu-in-vision * remove-useless-doc * use-preimplemented-flash-forward * fix-doc * fix-image-processing-doc * fix-apply-rotary-embed * fix-flash-attn-sliding-window * refactor * remove-default_template * remove-reorder_cache * simple-get-rope_deltas * update-prepare_inputs_for_generation * update-attention-mask * update-rotary_seq_len * remove-state * kv_seq_length * remove-warning * _supports_static_cache * remove-legacy-cache * refactor * fix-replace * mrope-section-doc * code-quality * code-quality * polish-doc * fix-image-processing-test * update readme * Update qwen2_vl.md * fix-test * Update qwen2_vl.md * nit * processor-kwargs * hard-code-norm_layer * code-quality * discard-pixel-values-in-gen * fix-inconsistent-error-msg * unify-image-video * hidden_act * add-docstring * vision-encode-as-PreTrainedModel * pixel-to-target-dtype * update doc and low memoryvit * format * format * channel-foramt * fix vit_flashatt * format * inherit-Qwen2VLPreTrainedModel * simplify * format-test * remove-one-line-func-in-image-processing * avoid-one-line-reshape * simplify-rotary_seq_len * avoid-single-letter-variable * no-for-loop-sdpa * avoid-single-letter-variable * remove-one-line-reshape * remove-one-line-reshape * remove-no-rope-in-vit-logic * default-mrope * add-copied-from * more-docs-for-mrope * polish-doc * comment-and-link * polish-doc * single-letter-variables * simplify-image-processing * video->images * kv_seq_len-update * vision-rope-on-the-fly * vision-eager-attention * change-processor-order --------- Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com> Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>	2024-08-26 15:16:44 +02:00
S M Jishanul Islam	8defc95df3	Updated the custom_models.md changed cross_entropy code (#33118 )	2024-08-26 13:15:43 +02:00
Matt	0a7af19f4d	Update Jinja docs with new functions and general cleanup (#33097 )	2024-08-23 17:40:06 +01:00
Jason (Siyu) Zhu	adb91179b9	Integrate Liger (Linkedin GPU Efficient Runtime) Kernel to Trainer (#32860 ) * add liger integration * fix syntax * fix import issue * add trainer.md * Use _apply_liger_kernel() * Fixed log message * Update docs/source/en/trainer.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update docs/source/en/trainer.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Byron Hsu <byronhsu1230@gmail.com> * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Byron Hsu <byronhsu1230@gmail.com> * Update docs/source/en/trainer.md Co-authored-by: Byron Hsu <byronhsu1230@gmail.com> * Fixed checkstyle and updated readme * Added test * Fixed checkstyle * fix docstring * rename use_liger to use_liger_kernel * Trigger Build * Added test * add fix-copies * Fixed copy inconsistencies --------- Co-authored-by: shimizust <sshimizu@linkedin.com> Co-authored-by: Steven Shimizu <shimizust@gmail.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>	2024-08-23 13:20:49 +02:00
Joao Gante	970a16ec7f	Forbid `PretrainedConfig` from saving `generate` parameters; Update deprecations in `generate`-related code 🧹 (#32659 ) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-08-23 11:12:53 +01:00
Jinuk	09e6579d2d	🌐 [i18n-KO] Translated `knowledge_distillation_for_image_classification.md to Korean" (#32334 ) * docs: ko: tasks/knowledge_distillation_for_image_classification.md * feat: nmt draft * fix: manual edits * Apply suggestions from code review Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> * Apply suggestions from code review Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> * Apply suggestions from code review Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com> * Apply suggestions from code review Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com> * Apply suggestions from code review Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com> * Apply suggestions from code review Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> * Apply suggestions from code review Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> * Apply suggestions from code review Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review --------- Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>	2024-08-22 10:42:39 -07:00
Shubham Ugare	9282413611	Add SynCode to llm_tutorial (#32884 )	2024-08-22 15:30:22 +02:00
Matt	85345bb439	Add tip to clarify tool calling (#32883 )	2024-08-19 18:37:35 +01:00
Sai-Suraj-27	37204848f1	Docs: Fixed `whisper-large-v2` model link in docs (#32871 ) Fixed whisper-large-v2 model link in docs.	2024-08-19 09:50:35 -07:00
Kamil Akesbi	8260cb311e	Add Descript-Audio-Codec model (#31494 ) * dac model * original dac works * add dac model * dac can be instatiated * add forward pass * load weights * all weights are used * convert checkpoint script ready * test * add feature extractor * up * make style * apply cookicutter * fix tests * iterate on FeatureExtractor * nit * update dac doc * replace nn.Sequential with nn.ModuleList * nit * apply review suggestions 1/2 * Update src/transformers/models/dac/modeling_dac.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * up * apply review suggestions 2/2 * update padding in FeatureExtractor * apply review suggestions * iterate on design and tests * add integration tests * feature extractor tests * make style * all tests pass * make style * fixup * apply review suggestions * fix-copies * apply review suggestions * apply review suggestions * Update docs/source/en/model_doc/dac.md Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * Update docs/source/en/model_doc/dac.md Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * anticipate transfer weights to descript * up * make style * apply review suggestions * update slow test values * update slow tests * update test values * update with CI values * update with vorace values * update test with slice * make style --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>	2024-08-19 10:21:51 +01:00
MAHIR DAIYAN	843e5e20ca	Add Flax Dinov2 (#31960 ) * tfmsenv restored in main * installed flax * forward pass done and all tests passed * make fix-copies and cleaning the scripts * fixup attempt 1 * fixup attempt 2 * fixup third attempt * fixup attempt 4 * fixup attempt 5 * dinov2 doc fixed * FlaxDinov2Model + ForImageClassification added to OBJECTS_TO_IGNORE * external pos_encoding layer removed * fixup attempt 6 * fixed integration test values * fixup attempt 7 * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * comments removed * comment removed from the test * fixup * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * new fixes 1 * interpolate_pos_encoding function removed * droppath rng fixed, pretrained beit copied-from still not working * modeling_flax_dinov2.py reformatted * Update tests/models/dinov2/test_modeling_flax_dinov2.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * added Copied from, to the tests * copied from statements removed from tests * fixed copied from statements in the tests * [run_slow] dinov2 --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2024-08-19 09:28:13 +01:00
Yangshen⚡Deng	a27182b7fc	Fix AutoConfig and AutoModel support for Llava-Next-Video (#32844 ) * Fix: fix all model_type of Llava-Next-Video to llava_next_video * Fix doc for llava_next_video * * Fix formatting issues * Change llava-next-video.md file name into llava_next_video.md to make it compatible with implementation * Fix docs TOC for llava-next-video	2024-08-16 12:41:05 +01:00
Joao Gante	cf32ee1753	Cache: use `batch_size` instead of `max_batch_size` (#32657 ) * more precise name * better docstrings * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-16 11:48:45 +01:00
Joao Gante	70d5df6107	Generate: unify `LogitsWarper` and `LogitsProcessor` (#32626 )	2024-08-16 11:20:41 +01:00
Dina Suehiro Jones	6577c77d93	Update the distributed CPU training on Kubernetes documentation (#32669 ) * Update the Kubernetes CPU training example * Add namespace arg Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com> --------- Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>	2024-08-14 09:36:43 -07:00
Jerry Zhang	78d78cdf8a	Add TorchAOHfQuantizer (#32306 ) * Add TorchAOHfQuantizer Summary: Enable loading torchao quantized model in huggingface. Test Plan: local test Reviewers: Subscribers: Tasks: Tags: * Fix a few issues * style * Added tests and addressed some comments about dtype conversion * fix torch_dtype warning message * fix tests * style * TorchAOConfig -> TorchAoConfig * enable offload + fix memory with multi-gpu * update torchao version requirement to 0.4.0 * better comments * add torch.compile to torchao README, add perf number link --------- Co-authored-by: Marc Sun <marc@huggingface.co>	2024-08-14 16:14:24 +02:00
Eric Hartford	481e15604a	Add support for GrokAdamW optimizer (#32521 ) * add grokadamw * reformat * code review feedback, unit test * reformat * reformat	2024-08-13 13:20:28 +01:00
Quentin Gallouédec	f1c8542ff7	"to be not" -> "not to be" (#32636 ) * "to be not" -> "not to be" * Update sam.md * Update trainer.py * Update modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py	2024-08-12 20:20:17 +01:00
Ahnjj_DEV	7f777ab7d9	🌐 [i18n-KO] Translated `awq.md`to Korean (#32324 ) * fix: manual edits * Apply suggestions from code review Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> * fix:manual edits - 잘못된 경로에 번역본 파일을 생성해서 옮김 * Delete docs/source/ko/tasks/awq.md * Update docs/source/ko/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-08-12 10:12:48 -07:00
YONGSANG	4996990d61	🌐 [i18n-KO] Translated `deepspeed.md` to Korean (#32431 ) * Update _toctree.yml * docs: ko: deepspeed.md * Apply suggestions from code review Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> * Update docs/source/ko/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ko/deepspeed.md * Update docs/source/ko/deepspeed.md Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * Apply suggestions from code review Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> * Update docs/source/ko/_toctree.yml --------- Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>	2024-08-12 10:07:31 -07:00
Matt	b7ea171403	Cleanup tool calling documentation and rename doc (#32337 ) * Rename "Templates for Chat Models" doc to "Chat Templates" * Small formatting fix * Small formatting fix * Small formatting fix * Cleanup tool calling docs as well * Remove unneeded 'revision' * Move tip to below main code example * Little bonus section on template editing	2024-08-12 16:20:14 +01:00
Younes Belkada	7c11491208	Add new model (#32615 ) * v1 - working version * fix * fix * fix * fix * rename to correct name * fix title * fixup * rename files * fix * add copied from on tests * rename to `FalconMamba` everywhere and fix bugs * fix quantization + accelerate * fix copies * add `torch.compile` support * fix tests * fix tests and add slow tests * copies on config * merge the latest changes * fix tests * add few lines about instruct * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix tests --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-12 08:22:47 +02:00
wony617	48101cf8d1	🌐 [i18n-KO] Translated `agent.md` to Korean (#32351 ) * docs: ko: main_classes/agent * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com> Co-authored-by: SeungAhSon <gongsoonyee@gmail.com> * fix: resolve suggestions * fix: resolve code line number --------- Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com> Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>	2024-08-09 09:58:52 -07:00
Steven Liu	85817d98fb	[docs] Translation guide (#32547 ) clarify	2024-08-08 13:43:14 -07:00
SeungAhSon	b01f9c484c	🌐 [i18n-KO] Translated `bitsandbytes.md` to Korean (#32408 ) * docs: ko: quantization/bitsandbytes.md * feat: nmt draft * fix: minor typos * fix: manual edits * fix: manual edits * fix: resolve suggestions Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * fix: resolve suggestions Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com> Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-08-08 09:40:50 -07:00
SeungYoun Lee	496207a166	🌐 [i18n-KO] Translated `fsdp.md` to Korean (#32261 ) * docs: ko: fsdp.md * feat: nmt draft * fix: manual edits * Apply suggestions from code review Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com> Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com> * fix: resolve suggestions * Update docs/source/ko/fsdp.md Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com> * Update docs/source/ko/fsdp.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com> Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-08-08 09:40:03 -07:00
HyeokJun SHIN	e0396bdaa0	🌐 [i18n-KO] Translated `eetq.md` to Korean (#32352 ) * docs: ko: quantization/eetq.md * feat: nmt draft * fix docs: ko: quantization/eetq.md * fix docs: ko: quantization/eetq.md * fix: resolve suggestions Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * fix: resolve suggestions * fix: resolve suggsetions --------- Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>	2024-08-08 09:39:35 -07:00
Chulhwa (Evan) Han	96ba7f0c51	🌐 [i18n-KO] Translated `trainer.md` to Korean (#32260 ) * docs: ko: ko-trainer * feat: nmt draft * fix: manual edits * fix: manual edits * fix: glossary * fix: glossary * Apply suggestions from code review Co-authored-by: Jinuk <45095330+JinukHong@users.noreply.github.com> Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> --------- Co-authored-by: Jinuk <45095330+JinukHong@users.noreply.github.com> Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>	2024-08-08 09:38:58 -07:00
010kim	43f3fe879c	🌐 [i18n-KO] Translated `ko-llm_tutorial_optimization.md` to Korean (#32372 ) * docs: ko: llm_tutorial_optimization.md * feat: nmt draft * fix: manual edits * Update docs/source/ko/llm_tutorial_optimization.md Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> * Update docs/source/ko/llm_tutorial_optimization.md Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> * fix: resolve suggestions - 1 Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com> Co-authored-by: boyunJang <gobook1234@naver.com> * fix: resolve suggestions - 2 Co-authored-by: boyunJang <gobook1234@naver.com> Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com> --------- Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com> Co-authored-by: boyunJang <gobook1234@naver.com>	2024-08-08 09:37:39 -07:00
Yunfei Chu	16ed0640be	Add Qwen2-Audio (#32137 ) * add qwen2audio * Update check_repo.py * fix style * fix test * fix style * add model size * Qwen2AudioEncoderModel->Qwen2AudioEncoder; add copy info * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * switch the attention_mask and the feature_attention_mask * add to PRIVATE_MODELS in check_repo.py; add to MODEL_NAMES_TO_IGNORE in check_table.py * fix initialization * update chat_template * fix consistency issue after copy * add docstrings to _merge_input_ids_with_audio_features * add copied from to prepare_inputs_for_generation * add more details to docs * rm comment * add init_std * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * update * Update docs/source/en/model_doc/qwen2_audio.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update tests * rm ignore_index * update processor * rm ffmpeg_read * Update tests/models/qwen2_audio/test_modeling_qwen2_audio.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_audio.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_audio.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_audio.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update * typo * [run_slow] qwen2_audio * [run_slow] qwen2_audio * [run_slow] qwen2_audio * fix quality * [run_slow] qwen2_audio * [run_slow] qwen2_audio * [run_slow] qwen2_audio * add official model --------- Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-08-08 15:47:24 +02:00
Wonseok Lee (Jack)	e28784f821	Change Phi3 `_supports_sdpa` to True (#32457 ) * Change `_supports_sdpa` to True * add phi3 to sdpa support list	2024-08-08 13:28:20 +02:00
doomdagadiggiedahdah	1c944ac1e1	Fix issue #32518 : Update llm_tutorial.md (#32523 ) Update llm_tutorial.md remove comma re: issue 32518 https://github.com/huggingface/transformers/issues/32518	2024-08-08 10:54:02 +01:00
Jiyoon	78566dbdf0	🌐 [i18n-KO] Translated `chat_templating.md` to Korean (#32362 ) * docs: ko: chat_templating.md * feat: nmt draft * fix: manual edits * Update docs/source/ko/chat_templating.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update docs/source/ko/chat_templating.md Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * fix: apply suggestions from code review - anchor Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * fix: manual edits Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com> * fix: manual edits * fix: delete 'default template' section --------- Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com>	2024-08-07 11:25:19 -07:00
Sai-Suraj-27	543df48914	Docs: Fixed WhisperModel.forward’s docstring link (#32498 ) Fixed WhisperModel.forward’s docstring link.	2024-08-07 11:01:33 -07:00
Jiwook Han	cba7bcf87b	🌐 [i18n-KO] Translated `image_feature_extraction.md` to Korean (#32239 ) * docs: ko: tasks/images_feature_extraction.md * feat: nmt draft * fix: manual edits * fix: manual edits * fix: manual edits * fix: manual edits * feat: manual edits * Update docs/source/ko/tasks/image_feature_extraction.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/tasks/image_feature_extraction.md Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * fix: manual edits --------- Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>	2024-08-07 09:56:23 -07:00
Sungmin Oh	fa59fd87dd	🌐 [i18n-KO] Translated `quantization/quanto.md` to Korean (#32281 ) * docs: ko: quantization/quanto.md * feat: nmt draft * fix: resolve suggestions Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com> Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com> * fix: resolve suggestions Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> --------- Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com> Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>	2024-08-07 09:52:57 -07:00
Chaewon Song	fcc4f2ae8f	🌐 [i18n-KO] Translated `prompting.md` to Korean (#32294 ) * docs: ko: tasks/prompting.md * feat: nmt-draft * fix: update translation in prompting.md * fix: update toctree.yml * fix: manual edits * fix: toctree edits * fix: resolve suggestions Co-authored-by: boyunJang <gobook1234@naver.com> Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com> --------- Co-authored-by: boyunJang <gobook1234@naver.com> Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>	2024-08-07 09:44:31 -07:00
Minki Kim	1124d95dbb	🌐 [i18n-KO] Translated `gptq.md` to Korean (#32293 ) * fix: manual edits * fix: manual edits2 * fix: delete files * fix: resolve suggestions Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com> * fix: resolve suggestions Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-08-07 09:19:35 -07:00
Joao Gante	b7fb393f68	Docs: alert for the possibility of manipulating logits (#32467 ) * logits * words	2024-08-07 16:34:46 +01:00
Aymeric Roucher	e0d82534cc	Agents use grammar (#31735 ) * Allow optional use of grammars to constrain generation	2024-08-07 11:42:52 +02:00
Raushan Turganbay	7ad784ae9d	Gemma2: add cache warning (#32279 ) * gemma2 fallback to dynamic cache * Update src/transformers/models/gemma2/modeling_gemma2.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/gemma2/modeling_gemma2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * raise error and dont fallback to dynamic cache * prev will break most forward calls/tests * Update src/transformers/models/gemma2/modeling_gemma2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update * fix copies --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-08-07 10:03:05 +05:00
HyunJi Shin	6af0854efa	🌐 [i18n-KO] Translated `image_to_image.md` to Korean (#32327 ) * docs: ko: tasks/image_to_image.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> * fix: handle remaining suggestions Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com> --------- Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>	2024-08-06 11:59:44 -07:00
boyunJang	3b193c7bae	🌐 [i18n-KO] Translated `idefics.md` to Korean (#32258 ) * docs: ko: tasks/idefics.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com> --------- Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>	2024-08-06 11:58:21 -07:00
timdalxx	5301b981d7	🌐 [i18n-KO] Translated `mask_generation.md` to Korean (#32257 ) * docs: ko: tasks/mask_generation.md * feat: nmt draft * fix : toc local * fix : manual edits * fix : ko-toctree * fix: resolve suggestions Co-authored-by: boyunJang <gobook1234@naver.com> Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> * fix: resolve suggestions Co-authored-by: boyunJang <gobook1234@naver.com> Co-authored-by: Chaewon Song <chaewon1019@ewhain.net> * fix: resolve suggestions * fix: resolve suggestions * fix: resolve suggestions --------- Co-authored-by: boyunJang <gobook1234@naver.com> Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>	2024-08-06 11:36:14 -07:00
Chris Toukmaji	50c3ba889a	Documentation: BOS token_id deprecation change for NLLB (#32443 ) Update nllb.md	2024-08-06 09:22:08 -07:00
Pablo Montalvo	80b90e7b2f	Add codestral mamba2 (#32080 ) * add new model like * draft cuda forward - mismatched keys (sharding on conv1) * match keys successfully * fix split * get generation/forward running (wrong gens, norm?) * :update * some refactoring * fixes * works up until copy to cache * fix * update * NON WORKING VERSION * version that work? * nit * fix config * fix conversion script * working cuda forward * nit * update * simplifcation * make mamba slow simple work * no einops * todo * fix style * no einops * update fix no einsum * nit * remove einops * bug: scan_output differs strongly * add rms norm option * fix fast + slow generation with and w/o cache ✔️ * draft integration tests * remove a big chunk of the einsum * fix slow, fast generations, without any einsum * fix copies * fix structure * fix up modeling and tests * fix tests * clamping is indeed worse * recover mamba2 cache test * fix copies * no cache position (yet) * fix tf tests * fix matmul for generate * fixup * skip cache tests for now * [run-slow]mamba2 * tune out hidden states for padding * test batched generation * propagate attention mask changes * fix past length * fix integration test * style * address comments * update readme * add mamba2 version check * fix tests * [run-slow]mamba2 * skip edge tests * [run-slow]mamba2 * last fixup * [run-slow]mamba2 * update README --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2024-08-06 16:39:52 +02:00
Ao Tang	6a03942db7	Add Nemotron HF Support (#31699 ) * Add nemotron support * fix inference * add unit test * add layernorm1p as a class to avoid meta device mismatch * test fixed * Add copied_from statements * remove pretraining_tp args * remove nemotronlayernorm * force LN computation done in FP32 * remove nemotrontokenizer and use llamatokenizer * license update * add option for kv_channels for minitron8b * remove assert * o_proj fixed * o_proj reshape * add gated_proj option * typo * remove todos * fix broken test after merging latest main * remove nezha/nat after meging main * chnage default config to 15b model * add nemo conversion script * rename conversion script * remove gate_proj option * pr comment resolved * fix unit test * rename kv_channels to head_dim * resolve PR issue * add nemotron md * fix broken tests * refactor rope for nemotron * test fix * remove linearscaling * whitespace and import * fix some copied-from * code style fix * reformatted * add position_embedding to nemotronattention * rope refactor to only use config, copied-from fix * format * Run make fix-copies * nemotron md with autodoc * doc fix * fix order * pass check_config_docstrings.py * fix config_attributes * remove all llama BC related code * Use PreTrainedTokenizerFast * ruff check examples * conversion script update * add nemotron to toctree	2024-08-06 15:42:05 +02:00
Raushan Turganbay	37c5ca5eb9	Cache: create docs (#32150 ) * draft * updates * works? * try adding python example in hidden section * another try * hwo do i render python * format as html code? * Update docs/source/en/kv_cache.md Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * one more small update * should render hidden secrtion now * add outputs * fix links * check links * update all links * update with offloaded cache * all cache is importable, so they appear in docs * fix copies * docstring... --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-08-06 10:24:19 +05:00
Francisco Kurucz	13dc6b0853	Fix documentation links and code reference to model llava-next (#32434 )	2024-08-05 15:14:50 -07:00
Raushan Turganbay	2af199c42b	Update docs (#32368 ) nits	2024-08-02 09:54:16 +05:00
Nikos Karampatziakis	ca59d6f77c	Offloaded KV Cache (#31325 ) * Initial implementation of OffloadedCache * enable usage via cache_implementation * Address feedback, add tests, remove legacy methods. * Remove flash-attn, discover synchronization bugs, fix bugs * Prevent usage in CPU only mode * Add a section about offloaded KV cache to the docs * Fix typos in docs * Clarifications and better explanation of streams	2024-08-01 14:42:07 +02:00
Sanchit Gandhi	e234061cdd	[whisper] compile compatibility with long-form decoding (#31772 ) * [whisper] compile compatibility with long-form decoding * clarify comment * fix after rebase * finalise * fix bsz * fix cache split * remove contiguous * style * finish * update doc * prevent cuda graph trace	2024-08-01 18:10:56 +08:00
Joao Gante	e68ec18ce2	Docs: formatting nits (#32247 ) * doc formatting nits * ignore non-autodocs * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/esm/modeling_esm.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/esm/modeling_esm.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-30 15:49:14 +01:00
Gilad Turok	3e8106d253	Docs: fix GaLore optimizer code example (#32249 ) Docs: fix GaLore optimizer example Fix incorrect usage of GaLore optimizer in Transformers trainer code example. The GaLore optimizer uses low-rank gradient updates to reduce memory usage. GaLore is quite popular and is implemented by the authors in [https://github.com/jiaweizzhao/GaLore](https://github.com/jiaweizzhao/GaLore). A few months ago GaLore was added to the HuggingFace Transformers library in https://github.com/huggingface/transformers/pull/29588. Documentation of the Trainer module includes a few code examples of how to use GaLore. However, the `optim_targe_modules` argument to the `TrainingArguments` function is incorrect, as discussed in https://github.com/huggingface/transformers/pull/29588#issuecomment-2006289512. This pull request fixes this issue.	2024-07-30 09:19:24 +02:00
Aymeric Roucher	a24a9a66f4	Add stream messages from agent run for gradio chatbot (#32142 ) * Add stream_to_gradio method for running agent in gradio demo	2024-07-29 20:12:44 +02:00
Joao Gante	7ffe25f2b9	Generate: end-to-end compilation (#30788 ) * mvp * added test (a few models need fixes) * fix a few test cases * test nits * harder test 😈 * revert changes in stablelm * test with improved condition * add todo * tmp commit * merged with main * nits * add todo * final corrections * add docs for generation compilation * docs nits * add tip * PR suggestions * add more details to the compilation docs * fix cache positions * cache is now init in generate; update docs * tag test as flaky * docs * post rebase make fixup and other nits * remove unintended changes * whisper (encoder-decoder) not supported * move token default updates to ; add tests for token defaults * push changes * manual rebase * chameleon doesn't support this * fix test_static_cache_mha_mqa_gqa (broken in another PR) * docs: dynamic is better with end-to-end compilation	2024-07-29 10:52:13 +01:00
Sai-Suraj-27	49928892d6	fix(docs): Fixed a link in docs (#32274 ) Fixed a link in docs.	2024-07-29 10:50:43 +01:00
Pavel Iakubovskii	9d6c0641c4	Fix code snippet for Grounding DINO (#32229 ) Fix code snippet for grounding-dino	2024-07-25 19:20:47 +01:00
Huazhong Ji	6ed0bf1e85	translate philosophy.md to chinese (#32177 ) * translate philosophy.md to chinese * add the missing link	2024-07-25 09:01:06 -07:00
Matt	edd68f4ed8	🚨 No more default chat templates (#31733 ) * No more default chat templates * Add the template to the GPT-SW3 tests since it's not available by default now * Fix GPT2 test * Fix Bloom test * Fix Bloom test * Remove default templates again	2024-07-24 17:36:32 +01:00
Dr. Artificial曾小健	5f4ee98a7a	Update qwen2.md (#32108 ) * Update qwen2.md outdated description * Update qwen2.md amended * Update qwen2.md Update * Update qwen2.md fix wrong version code, now good to go	2024-07-24 11:54:41 +01:00
Fanli Lin	c85510f958	[docs] change temperature to a positive value (#32077 ) fix	2024-07-23 17:47:51 +01:00
RhuiDih	9cf4f2aa9a	Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629 ) * add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formating for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py	2024-07-23 15:56:41 +02:00
Raushan Turganbay	3aefb4ec7f	LLaVaNeXT: pad on right if training (#32134 ) * pad on right if training * docs * add tests	2024-07-23 10:23:55 +05:00
James Thewlis	251a2409c6	Add llama3-llava-next-8b to llava_next conversion script (#31395 ) * Add llama3-llava-next-8b to llava_next conversion script Adds support for the lmms-lab/llama3-llava-next-8b model to the convert_llava_next_weights_to_hf.py script, along with an example prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT repo. * Exclude <\|begin_of_text\|> from prompt example This token gets added automatically, so it should not be included in the prompt example. * Add llava-next-72b and llava-next-110b Adds the Qwen-based LLaVA-Next models to the conversion script, along with changes to load the models on multiple GPUs for inference. * Add llama3 and qwen prompt formats to docs * Chat prompt and padding side left for llama3 batched * update * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove code * better naming --------- Co-authored-by: raushan <raushan@huggingface.co> Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-23 10:12:16 +05:00
Marc Sun	96a074fa7e	Add new quant method (#32047 ) * Add new quant method * update * fix multi-device * add test * add offload * style * style * add simple example * initial doc * docstring * style again * works ? * better docs * switch to non persistant * remove print * fix init * code review	2024-07-22 20:21:59 +02:00
Bertrand Thia	7987710696	[RoBERTa] Minor clarifications to model doc (#31949 ) * minor edits and clarifications * address comment Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-07-22 10:08:27 -07:00
Woojun Jung	d1ec36b94f	Update `ko/_toctree.yml` and remove `custom_tools.md` to reflect latest changes (#31969 ) update `ko/_toctree.yml` and remove `custom_tools.md`	2024-07-22 08:27:13 -07:00
Lucain	f2a1e3ca68	Mention model_info.id instead of model_info.modelId (#32106 )	2024-07-22 14:14:47 +01:00
Raushan Turganbay	fe008d6ebe	Chameleon: not supported with fast load (#32091 ) fixes	2024-07-19 19:21:45 +05:00
Merve Noyan	46835ec6ae	Add image-text-to-text task guide (#31777 ) * Add image-text-to-text task page * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Address comments * Fix heading * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address comments * Update image_text_to_text.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-19 13:40:40 +01:00
Merve Noyan	4bd8f12972	Fixes to chameleon docs (#32078 ) * Fixes * Let's not use auto	2024-07-19 12:50:34 +01:00
Raushan Turganbay	e316c5214f	VideoLLaVa: fix chat format in docs (#32083 ) fix chat format	2024-07-19 15:38:01 +05:00
NielsRogge	56a7745704	[Chameleon, Hiera] Improve docs (#32038 ) * Improve docs * Fix docs * Fix code snippet	2024-07-19 11:20:03 +03:00
Raushan Turganbay	b873234cb6	Llava: add default chat templates (#31691 ) * add default chat templates * Update src/transformers/models/llava/processing_llava.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/llava_next/processing_llava_next.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more clear docstring and docs * Update docs/source/en/model_doc/llava.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/vipllava.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add tests * remove default templates (see #31733) * load chat template from another file * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert some changes in docs * forgot vipllava * chat template file is not temporary hack * warn if loading from processor * not that file * similarly modify `save_pretrained` * Update tests/models/llava_next/test_processor_llava_next.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vipllava/test_processor_vipllava.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/vipllava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/vipllava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2024-07-19 10:08:56 +05:00
Raushan Turganbay	673d30b826	Chameleon: minor fixes after shipping (#32037 ) * fix merging * make chameleon conditional	2024-07-18 16:54:07 +05:00
Pavel Iakubovskii	1c37e8c1a6	Add `sdpa` and FA2 for CLIP (#31940 ) * Squashed commit of the following: commit 102842cd477219b9f9bcb23a0bca3a8b92bd732f Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Fri Jul 12 18:23:52 2024 +0000 Add model-specific sdpa tests commit 60e4c88581abf89ec098da84ed8e92aa904c997d Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Fri Jul 12 18:20:53 2024 +0000 Add fallback to eager (expensive operation) commit c29033d30e7ffde4327e8a15cbbc6bee37546f80 Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Thu Jul 11 17:09:55 2024 +0000 Fix attn_implementation propagation commit 783aed05f0f38cb2f99e758f81db6838ac55b9f8 Author: sayakpaul <spsayakpaul@gmail.com> Date: Sat May 25 09:05:27 2024 +0530 style commit e77e703ca75d00447cda277eca6b886cd32bddc0 Author: sayakpaul <spsayakpaul@gmail.com> Date: Sat May 25 09:04:57 2024 +0530 add comment to explain why I had to touch forbidden codebase. commit ab9d8849758e7773a31778ccba71588d18552623 Author: sayakpaul <spsayakpaul@gmail.com> Date: Sat May 25 09:03:02 2024 +0530 fix: flax attribute access. commit c570fc0abf9d1bd58c291aae3c7e384f995996d2 Author: sayakpaul <spsayakpaul@gmail.com> Date: Sat May 25 08:23:54 2024 +0530 fix tensorflow attribute name. commit 32c812871cfdb268d8a6e3e2c61c5c925c8ed47e Author: sayakpaul <spsayakpaul@gmail.com> Date: Sat May 25 07:57:10 2024 +0530 fix attribute access. commit 4f41a0138b6c417aed9c9332278f8bcd979cb7c2 Author: sayakpaul <spsayakpaul@gmail.com> Date: Sat May 25 07:44:02 2024 +0530 _from_config. commit 35aed64ff602422adcf41d7f677a0a24bd9eccae Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 24 18:46:52 2024 +0530 propagation of attn_implementation. commit 4c25c19845438b1dc1d35a5adf9436151c8c5940 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 24 09:24:36 2024 +0530 style again commit 5f7dc5c5015c0f8116408f737e8c318d1802c80c Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 24 09:19:05 2024 +0530 use from_config. commit b70c409956d0359fa6ae5372275d2a20ba7e3389 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 24 09:13:43 2024 +0530 quality commit a7b63beff53d0fc754c6564e2a7b51731ddee49d Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 10 14:35:10 2024 +0200 add benchmark numbers commit 455b0eaea50862b8458c8f422b60fe60ae40fdcb Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 10 13:50:16 2024 +0200 Revert "reflect feedback more" This reverts commit `dc123e71ef`. commit ca674829d28787349c2a9593a14e0f1d41f04ea4 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 10 13:50:05 2024 +0200 Revert "fix" This reverts commit `37a1cb35b8`. commit fab2dd8576c099eb1a3464958cb206a664d28247 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 10 13:47:46 2024 +0200 fix commit fbc6ae50fd6f2d36294d31e191761631b701d696 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 10 13:38:30 2024 +0200 reflect feedback more commit 87245bb020b2d60a89afe318a951df0159404fc9 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 3 08:54:34 2024 +0530 fixes commit 1057cc26390ee839251e7f8b3326c4207595fb23 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 3 07:49:03 2024 +0530 don't explicit set attn_implementation in tests commit e33f75916fc8a99f516b1cf449dbbe9d3aabda81 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 3 07:43:54 2024 +0530 explicitly override attn_implementation in the towers. commit 4cf41cb1bc885c39df7cb8f2a0694ebf23299235 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 3 07:38:42 2024 +0530 import in one-line. commit f2cc447ae9e74ccfacb448140cdf88259d4afc8c Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri May 3 07:34:58 2024 +0530 move sdpa mention to usage tips. commit 92884766c64dbb456926a3a84dd427be1349fa95 Author: sayakpaul <spsayakpaul@gmail.com> Date: Mon Apr 29 10:58:26 2024 +0530 fix: memory allocation problem. commit d7ffbbfe12f7750b7d0a361420f35c13e0ea787d Author: sayakpaul <spsayakpaul@gmail.com> Date: Mon Apr 29 09:56:59 2024 +0530 fix-copies commit 8dfc3731cedd02e36acd3fe56bb2e6d61efd25d8 Author: sayakpaul <spsayakpaul@gmail.com> Date: Fri Apr 26 20:16:12 2024 +0530 address arthur's comments. commit d2ed7b4ce4ff15ae9aa4d3d0500f1544e3dcd9e9 Author: Sayak Paul <spsayakpaul@gmail.com> Date: Fri Apr 26 20:08:15 2024 +0530 Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> commit 46e04361f37ded5c522ff05e9f725b9f82dce40e Author: sayakpaul <spsayakpaul@gmail.com> Date: Wed Apr 24 09:55:27 2024 +0530 add to docs. commit 831629158ad40d34d8983f209afb2740ba041af2 Author: sayakpaul <spsayakpaul@gmail.com> Date: Wed Apr 24 09:33:10 2024 +0530 styling.g commit d263a119c77314250f4b4c8469caf42559197f22 Author: sayakpaul <spsayakpaul@gmail.com> Date: Wed Apr 24 09:15:20 2024 +0530 up commit d44f9d3d7633d4c241a737a1bc317f791f6aedb3 Author: sayakpaul <spsayakpaul@gmail.com> Date: Tue Apr 23 18:40:42 2024 +0530 handle causal and attention mask commit 122f1d60153df6666b634a94e38d073f3f260926 Author: sayakpaul <spsayakpaul@gmail.com> Date: Tue Apr 23 15:18:21 2024 +0530 test fixes. commit 4382d8cff6fa1dee5dbcf0d06b3e2841231e36f5 Author: sayakpaul <spsayakpaul@gmail.com> Date: Tue Apr 23 09:39:25 2024 +0530 fix: scaling inside sdpa. commit 0f629989efc48b7315cf19405a81e02955efe7e5 Author: Sayak Paul <spsayakpaul@gmail.com> Date: Tue Apr 23 08:14:58 2024 +0530 Update src/transformers/models/clip/modeling_clip.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> commit 14367316877dc27ea40f767ad1aee38bbc97e4ce Author: sayakpaul <spsayakpaul@gmail.com> Date: Mon Apr 22 16:21:36 2024 +0530 add: sdpa support to clip. * Remove fallback for empty attention mask (expensive operation) * Fix typing in copies * Add flash attention * Add flash attention tests * List CLIP in FA docs * Fix embeddings attributes and tf * [run-slow] clip * Update clip documentation * Remove commented code, skip compile dynamic for CLIPModel * Fix doc * Fix doc 2 * Remove double transpose * Add torch version check for contiguous() * Add comment to test mixin * Fix copies * Add comment for mask * Update docs * [run-slow] clip	2024-07-18 10:30:37 +05:30
Dmitry Rogozhkin	bc36c26fa6	doc: fix broken BEiT and DiNAT model links on Backbone page (#32029 ) Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2024-07-17 20:24:10 +01:00
Raushan Turganbay	24cfcc2114	Chameleon: add model (#31534 ) * Chameleon model integration Co-authored-by: Jacob Kahn <jacobkahn1@gmail.com> Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com> * fix 7B, again. mask away image tokens * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove pretrained_config_map * make fixup passing up to utils/check_config_docstrings.py; vqgan moved to the modeling file * remove tokenizer (use llama's); remove codechameleon tests * a few copied from statements and minor changes * copied from in ChameleonModel * some copies in ChameleonForCausalLM * a few more copies * VQModel moved to ChameleonModel (as opposed to being in the processor) * ChameleonProcessor ready * Fix chameleon weights convert * update conversion script * clean-up processing * update modeling a bit * update * update (throws error...) * correct conversion ready * fix tests * fix docs * docs * ve swin norm * fix device for vocab map * add normalization * update * update script with rope rotations * final fix on model conversion * add slow tests * more info in docs * fix repo consistency tests * fix repo tests * fix-copies * hope this will make CI happy * fix for 30b model * Update docs/source/en/index.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/chameleon/processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/chameleon/test_modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/chameleon/test_modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/chameleon/test_modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * address comments * remove assertion in conversion script * add image processor test * not copied * port changes for qk layernorm * fix-copies * read token decorator for tests * [run-slow] chameleon * one more read-token * address some comments * qk norm changes * tests and repo check * moved rope permutations to conversion, YAY! * fix past kv check * docs * layernorm done! * let's be consistent in naming * fix slow tests * weird thing with slow CI, but let's see * once more try * remove past-kv as tuple following llama * ignore * style --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Co-authored-by: ArthurZucker <arthur.zucker@gmail.com> Co-authored-by: jacobkahn <jacobkahn1@gmail.com> Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com> Co-authored-by: Leonid Shamis <lshamis@meta.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-17 10:41:43 +05:00
Zach Mueller	e0dfd7bcaf	Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771 ) * 1,100%! * Clean * Don't touch DS * Experiment with dtype allocation * skip test_load_save_without_tied_weights test * A little faster * Include proper upscaling? * Fixup tests * Potentially skip? * Let's see if this fixes git history * Maintain new dtype * Fin * Rm hook idea for now * New approach, see what breaks * stage * Clean * Stash * Should be fin now, just need to mark failing models * Clean up * Simplify * Deal with weird models * Enc/Dec * Skip w/ reason * Adjust test * Fix test * one more test * Keep experimenting * Fix ref * TO REMOVE: testing feedback CI * Right push * Update tests/utils/test_modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * disable * Add new func * Test nits from Amy * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Adjust comment * Adjust comment on skip * make private * Fin * Should be a not flag * Clarify and rename test --------- Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-16 09:32:01 -04:00
Naman Garg	c1e139c2b0	Adding hiera (#30356 ) * initialized Structure * Updated variable names * Added Config class, basic HF setup, convert_to_hf * Fixed Convert function, added hiera to HF files, Initilized test files * better naming for x in forward pass * Moved utils to hiera * Change hiera -> hiera_model * Fixed integration into tranformers * Fix: Convert Checkpoint * added documentation for hiera * added documentation for hiera * added Docstings to models, Transformers based changes * make style and quality * make style and quality * Integration & Block tests running * Fixed bugs * initialized Structure * Updated variable names * Added Config class, basic HF setup, convert_to_hf * Fixed Convert function, added hiera to HF files, Initilized test files * better naming for x in forward pass * Moved utils to hiera * Change hiera -> hiera_model * Fixed integration into tranformers * Fix: Convert Checkpoint * added documentation for hiera * added documentation for hiera * added Docstings to models, Transformers based changes * make style and quality * make style and quality * Integration & Block tests running * Fixed bugs * Removed tim dependency * added HieraBlock * fixed: Model name * added tests for HieraModel, HieraBlock * fixed imports * fixed quality & copies * Fixes * Update docs/source/en/model_doc/hiera.md Fix name Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fixed formatting * Code quality & Import differences * quality and repo-consistency fix * fixed no torch error * Docstring fix * Docstring fix * doc string fix * fixed example usage * Resolved issues in modeling_hiera * Removed Hiera MAE * Added test and resolved bug * fixed doc string * First commit * Finished conversion script and model forward working * Resolved all issues * nits * Improving tests * Nits * More nits * Improving HieraForMaskedImageModeling * More improvements and nits * Fixed docstrings of outputs * More fixes * More imrpovments * Updated conversion script * Fixed docstrings * Improved tests * Fixed attentou outputs test * All tests green * Removed unnecessary file * contribution attribution * Resolved a few issues * Resolved Comments * Updated model repo id and fixed bugs * Removed loss print * Make tests green * Updated docstrings * Fix style * Fixed num_heads in config * Removed unnecessary video checkpoint related code in the conversion script * Fix style * Changed atol in conversion script * HieraConfig * Fix copies * Fixed typo * Resolved few issues * make * converted conv_nd -> nn.Module * Removed video complexities * Removed video complexities * fix style * Addressing comments * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix style * Fixed tests * Fixed typo * Fixed interpolate test * Made torch fx compatible * Made sure imageprocesor is correct * Addressed comments * Noise directly as torch * Remove unnecesary attr * Added return_dit * Update src/transformers/models/hiera/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated checkpoints * [run_slow] hiera * Fixed device mismatch * [run_slow] hiera * Fixed GPU tests * [run_slow] hiera --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com> Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-11 22:13:56 +01:00
NielsRogge	ec03d97b27	[RT-DETR] Add resources (#31815 ) * Add resources * Address comments	2024-07-10 16:34:53 +01:00
Raushan Turganbay	97aa3e2905	Add conversion for interleave llava (#31858 ) * add conversion for interleave llava * remove debug lines * remove unused imports * Update src/transformers/models/llava/convert_llava_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * small changes + docs --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-10 12:12:21 +05:00
Merve Noyan	e3a7d9bd47	Update depth estimation task guide (#31860 ) --------- Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-07-09 22:13:30 +03:00
Yung-Sung Chuang	d094d8d9ec	Generate: Add new decoding strategy "DoLa" in `.generate()` (#29619 ) Co-authored-by: Joao Gante <joao@huggingface.co>	2024-07-09 17:37:38 +01:00
hatti	350aed7076	chore: remove duplicate words (#31853 ) remove duplicate words	2024-07-09 10:38:29 +01:00
omahs	e5ca9b057c	Fix typos (#31819 ) * fix typo * fix typo * fix typos * fix typo * fix typos	2024-07-08 11:52:47 +01:00
Pavel Iakubovskii	a177821b24	Add FA2 and `sdpa` support for SigLIP (#31499 ) * Rebase to main * Fix attention implementation autoset for tex and vision configs * Fixup * Minor fixes * Fix copies * Fix attention_mask for FA2 * Add eqvivalence tests for siglip * Remove right padding test * Uncomment flaky * Fix import * Add to docs * Fix test message * Add sdpa * Add sdpa equivalence test * Add siglip sdpa to docs * Fix typing for attention output * Add sdpa tests * Fix signature of FA2 * Autoset attn_implementation in config * Rename bsz -> batch_size * Move back autoset attn method * Mark as flaky * Correct attention mask padding * [run-slow] siglip * Add FA2 and sdpa docs * Style fix * Remove flaky for FA2 test * Change attention implementation set * Change attn_implementaiton propogation * Fix typos * Add modality to assert message * Add more sdpa backends in test * [run slow] siglip * Add math sdpa backend for all options * [run slow] siglip	2024-07-08 11:10:02 +01:00
NielsRogge	06fd7972ac	Add ZoeDepth (#30136 ) * First draft * Add docs * Clean up code * Convert model * Add image processor * Convert Zoe_K * More improvements * Improve variable names and docstrings * Improve variable names * Improve variable names * Replace nn.sequential * More improvements * Convert ZoeD_NK * Fix most tests * Verify pixel values * Verify pixel values * Add squeeze * Update beit to support arbitrary window sizes * Improve image processor * Improve docstring * Improve beit * Improve model outputs * Add figure * Fix beit * Update checkpoint * Fix repo id * Add _keys_to_ignore_on_load_unexpected * More improvements * Address comments * Address comments * Address comments * Address comments * Rename variable name * Add backbone_hidden_size * Vectorize * Vectorize more * Address comments * Clarify docstring * Remove backbone_hidden_size * Fix image processor * Remove print statements * Remove print statement * Add integration test * Address comments * Address comments * Address comments * Address comments * Add requires_backends * Clean up * Simplify conversion script * Simplify more * Simplify more * Simplify more * Clean up * Make sure beit is loaded correctly * Address comment * Address bin_configurations * Use bin_configurations * Convert models, add integration tests * Fix doc test * Address comments * Unify regressor classes * Clarify arguments * Improve resize_image * Add num_relative_features * Address comment * [run-slow]beit,data2vec,zoedepth * [run-slow]beit,data2vec,zoedepth * Address comments * Address comment * Address comment * Replace nn.TransformerEncoderLayer and nn.TransformerEncoder * Replace nn.MultiheadAttention * Add attributes for patch transformer to config * Add tests for ensure_multiple_of * Update organization * Add tests * [run-slow] beit data2vec * Update ruff * [run-slow] beit data2vec * Add comment * Improve docstrings, add test * Fix interpolate_pos_encoding * Fix slow tests * Add docstring * Update src/transformers/models/zoedepth/image_processing_zoedepth.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/zoedepth/image_processing_zoedepth.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Improve tests and docstrings * Use run_common_tests * Improve docstrings * Improve docstrings * Improve tests * Improve tests * Remove print statements --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-08 11:43:33 +02:00
Pedro Cuenca	1082361a19	Depth Anything: update conversion script for V2 (#31522 ) * Depth Anything: update conversion script for V2 * Update docs * Style * Revert "Update docs" This reverts commit `be0ca47ea1`. * Add docs for depth anything v2 * Add depth_anything_v2 to MODEL_NAMES_MAPPING Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files * Add tip in original docs	2024-07-05 19:28:41 +01:00
Billy Cao	ac26260436	Allow FP16 or other precision inference for Pipelines (#31342 ) * cast image features to model.dtype where needed to support FP16 or other precision in pipelines * Update src/transformers/pipelines/image_feature_extraction.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use .to instead * Add FP16 pipeline support for zeroshot audio classification * Remove unused torch imports * Add docs on FP16 pipeline * Remove unused import * Add FP16 tests to pipeline mixin * Add fp16 placeholder for mask_generation pipeline test * Add FP16 tests for all pipelines * Fix formatting * Remove torch_dtype arg from is_pipeline_test_to_skip* * Fix format * trigger ci --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-05 17:21:50 +01:00
Matt	e786844425	Repeating an important warning in the chat template docs (#31796 ) * Repeating an important warning in the chat template docs * Update docs/source/en/chat_templating.md Co-authored-by: Lysandre Debut <hi@lysand.re> * Reword for clarity * Reword for clarity --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-07-05 15:30:24 +01:00
Billy Cao	1d3eaa6f7e	Add training support for SigLIP (#31495 ) * Add siglip loss function * Update docs * Enable training tests [experimental] enable GC training tests as it has worked for my own data * Remove test_training* overrides to enable training tests [run_slow] siglip * Skip training tests for Siglip text model and ImageClassificationModel [run_slow] siglip * Skip GC training tests for SiglipForImageClassification * Explicitly skip training tests for SiglipVisionModel Add skip reason for training tests for SiglipTextModel * Remove copied from to fix CI	2024-07-05 14:50:39 +01:00
Boris Feld	9e599d1d94	Update CometCallback to allow reusing of the running experiment (#31366 ) * Update CometCallback to allow reusing of the running experiment * Fixups * Remove useless TODO * Add checks for minimum version of the Comet SDK * Fix documentation and links. Also simplify how the Comet Experiment name is passed	2024-07-05 08:13:46 +02:00
Billy Cao	43ffb785c0	Add torch_empty_cache_steps to TrainingArguments (#31546 ) * Add torch_empty_cache_steps to TrainingArguments * Fix formatting * Add torch_empty_cache_steps to docs on single gpu training * Remove check for torch_empty_cache_steps <= max_steps * Captalize Tip * Be device agnostic * Fix linting	2024-07-04 13:20:49 -04:00
Aymeric Roucher	0fd885b91c	Adds final answer tool for all agents (#31703 ) * Adds final answer tool for all agents * Typo * Add clarification in doc * Put final_answer tool adition in agent for clarity	2024-07-03 11:36:09 +02:00
Jörg Bornschein	f91c16d270	Fix documentation for Gemma2. (#31682 ) * Fix documentation for Gemma2. Model sizes and Blog post URL are wrong in the documentation. * Update docs/source/en/model_doc/gemma2.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-02 23:04:53 +01:00
Sanchit Gandhi	a9701953ff	[whisper] static kv cache (#31166 ) * make work with cache abstraction * correct for static cache * hacks for compile * make fast * fix * fix pos ids * generate * fix sdpa * fix sdpa cache pos * fix fa2 * clean fa2 * integrate cache into generate * make style * copies * more copies * update eager * update sdpa * update fa2 * simplify * use cache pos * always compute cross-cache for debug * avoid recompiles Co-authored-by: Arthur Zucker <arthur@huggingface.co> * fix fix * fix fix fix * more fix * try encoder-decoder cache (too messy) * revert encoder-decoder cache * check cross-attn cache * use enc-dec dataclass * use richer enc-dec dataclass * clean-up * revert static cache changes * small fixes * revert to cpu flag * fix copies * add static slow test * past k/v docstring * more docstrings * cache_position docstrings * add to docs * add enc-dec cache to docs * make style * fix after rebase * fix beam * style * fix generation strategies * fix most decoder-only tests * style * skip test * more clean up * small docstrings * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * add todo * only crop self-attn * check cache in mixin * style * fix re-compile after rebase * move `is_updated` logic to enc-dec wrapper * revert back * revert cache back * finalise design * fix * fix fix * style * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * deprecate * updates * final updates * style * style --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-07-02 13:24:15 +01:00
Jade Choghari	e655029515	Add French version of run scripts tutorial (#31483 ) * Add French translation of run scripts tutorial * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Jade Choghari <chogharijade@icloud.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-06-28 18:02:30 +02:00
Steven Liu	464aa74659	[docs] Llama3 (#31662 ) quick usage to top	2024-06-27 10:32:51 -07:00
Arthur	75a6319864	Fix post gemma merge (#31660 ) * nit * toctree issue * protect gemma2 tests as well * sdpa supported	2024-06-27 17:51:42 +02:00
Arthur	0cf60f13ab	Add gemma 2 (#31659 ) * inital commit * Add doc * protect? * fixup stuffs * update tests * fix build documentation * mmmmmmm config attributes * style * nit * uodate * nit * Fix docs * protect some stuff --------- Co-authored-by: Lysandre <lysandre@huggingface.co>	2024-06-27 17:36:19 +02:00
amyeroberts	1de7dc7403	Skip tests properly (#31308 ) * Skip tests properly * [test_all] * Add 'reason' as kwarg for skipTest * [test_all] Fix up * [test_all]	2024-06-26 21:59:08 +01:00
Raushan Turganbay	e71f2863d7	Add LLaVa NeXT Video (#31252 ) * squash into single commit * run diff once more * docstring * tests * minor chnages and ready to go * Update src/transformers/models/llava_next_video/processing_llava_next_video.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vipllava/test_modeling_vipllava.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * [run-slow] llava-next-video * [run-slow] llava-next-video * [run-slow] llava_next_video * fix two tests * fix slow tests * remove logit checks due to numeric errors * run test once more * [run-slow] llava_next_video * final try to pass the test * [run-slow] llava_next_video * [run-slow] llava_next_video * [run-slow] llava_next_video * style * fix * style --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-26 21:52:28 +05:00
Pavel Iakubovskii	ac52084bf2	Update RT-DETR code snippet (#31631 ) Update code snippet	2024-06-26 14:42:20 +01:00
Anton Vlasjuk	b07770c5eb	[`GPT-NeoX`] Add SDPA support (#31031 ) * starting support for sdpa in `gptneox` models * small comment on tests * fix dropout * documentation and style * clarify concrete paths for reference * generalise attn projections and rope application added head mask check to sdpa mask creation handle sdpa memory backend bug via own version flag * update docs and style * move dtype casting outside of general attn_projection_and_rope function fix flash_attn_2 stuff * more generic attn warning if output_attns or head_mask * simplify head mask check by moving head mask creation to a later point * remove copied llama artifact * remove padding_mask from attention function signature * removing unnecessary comments, only "save" attn implementation once * [run_slow] gpt_neox	2024-06-26 13:56:36 +01:00
Raushan Turganbay	fc689d75a0	Add video modality for InstrucBLIP (#30182 ) * squash in single commit * add docs * dummy obj * more changes in diff converter * tiny fix * make docs happy * skip test * repo consistency tests * update docstring * style * fix tests * change diff imports * [run-slow] instructblipvideo * [run-slow] instructblipvideo * fix tests and remove logit check * [run-slow] instructblipvideo	2024-06-25 15:45:39 +05:00
Nicholi Caron	4b822560a1	Update mask_generation.md (#31543 ) Minor bug fixes -- rearrange import & add missing parentheses	2024-06-23 20:27:21 +01:00
Sangbum Daniel Choi	74a207404e	New model support RTDETR (#29077 ) * fill out docs string in configuration `75dcd3a0e8 (r1506391856)` * reduce the input image size for the tests * remove the unappropriate tests * only 5 failes exists * make style * fill up missed architecture for object detection in docs * fix auto modeling * simple fix in missing import * major change including backbone refactor and objectdetectionoutput refactor * minor fix only 4 fails left * intermediate fix * revert __init__.py * revert __init__.py * make style * fixes in pr_docs * intermediate fix * make style * two fixes * pass doctest * only one fix left * intermediate commit * all fixed * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/rt_detr/convert_rt_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/rt_detr/test_modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * function class above the model definition in dice_loss * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * simple fix * layernorm add config.layer_norm_eps * fix inputs_docstring * make style * simple fix * add custom coco loading test in image_processor * fix error in BaseModelOutput https://github.com/huggingface/transformers/pull/29077#discussion_r1516657790 * simple typo * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * intermediate fix * fix with load_backbone format * remove unused configuration * 3 fix test left * make style * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: Sounak Dey <dey.sounak@gmail.com> * change last_hidden_state to first index * all pass fix TO DO: minor update in comments * make fix-copies * remove deepcopy * pr_document fix * revert deepcopy due to the issue of unexpceted behavior in decoderlayer * add atol in final * add no_split_module * _no_split_modules = None * device transfer for model parallelism * minor fix * make fix-copies * fix typo * add test_image_processor with post_processing * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add config in RTDETRPredictionHead * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * set lru_cache with max_size 32 * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add lru_cache import and configuration change * change the order of definition * make fix-copies * add docs and change config error * revert strange make-fix * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * test pass * fix get_clones related and remove deepcopy * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * nit for paper section * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * rename denoising related parameters * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * check the image transformation logic * make style * make style * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * pe_encoding -> positional_encoding_temperature * remove TODO * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove eval_idx since transformer DETR is giving all decoder output * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * change variable name * make style and docs import update * Revert "Update src/transformers/models/rt_detr/image_processing_rt_detr.py" This reverts commit `74aa3e1de0`. * fix typo * add postprocessing in docs * move import scipy to top * change varaible name * make fix-copies * remove eval_idx in test * move to after first sentence * update image_processor since box loss requires normalized one * change appropriate name to auxiliary_outputs * Update src/transformers/models/rt_detr/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/rt_detr/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/rt_detr.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/rt_detr.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make style * remove panoptic related comments * make style * revert valid_processor_keys * fix aux related test * make style * change origination from config to backbone API * enable the dn_loss * fix test and conversion * renewal weight initialization * change initializer_range * make fix-up * fix the loss issue in the auxiliary output and denoising part * change weight loss to original RTDETR * fix in initialization * sync shape format of dn and aux * make style * stable fine-tuning and compatible conversion for resnet101 * make style * skip input_embed * change encoder related variable * enable converting rtdetr_r101 * add r101 related conversion code * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/rt_detr.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/rt_detr/image_processing_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change name _shape to _reshape * Update src/transformers/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * maket style * make fix-copies * remove deprecated import * more fix * remove last_hidden_state for task-specific model * Revert "remove last_hidden_state for task-specific model" This reverts commit `ccb7a34051`. * minore change in convert * remove print * make style and fix-copies * add custom rtdetr backbone for r18, r34 * remove print * change copied * add pad_size * make style * change layertype to optional to pass the CI * make style * add test in modeling_resnet_rt_detr * make fix-copies * skip tmp file test * fix comment * add docs * change to modeling_resnet file format * enabling resnet50 above * Update src/transformers/models/rt_detr/modeling_rt_detr.py Co-authored-by: Jason Wu <jasonkit@users.noreply.github.com> * enable all the rtdetr model :) * finish except CI * add RTDetrResNetBackbone * make fix-copies * fix TO DO: CI enable * make style * rename test * add docs * add special fix * revert resnet * Update src/transformers/models/rt_detr/modeling_rt_detr_resnet.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add more comment * remove swin comment * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * rename convert and add verify backbone * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/rt_detr.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/rt_detr.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make style * requests for docs * more general test docs * general script docs * make fix-copies * final commit * Revert "Update src/transformers/models/rt_detr/configuration_rt_detr.py" This reverts commit `d136225cd3`. * skip test_model_get_set_embeddings * remove target * add changes * make fix-copies * remove decoder_attention_mask * add load_backbone function for auto_backbone * remove comment * fix repo name * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * final commit * remove unused downsample_in_bottleneck * new test for autobackbone * change to appropriate indices * test fix * fix dict in test_image_processor * fix test * [run-slow] rt_detr, rt_detr_resnet * change the slow test * [run-slow] rt_detr * [run-slow] rt_detr, rt_detr_resnet * make in to same cuda in CSPRepLayer * [run-slow] rt_detr, rt_detr_resnet --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sounak Dey <dey.sounak@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Jason Wu <jasonkit@users.noreply.github.com> Co-authored-by: ChoiSangBum <choisangbum@ChoiSangBumui-MacBookPro.local>	2024-06-21 17:50:08 +01:00
Billy Cao	cd5f7c1790	Add docs on zeroshot image classification prompt templates (#31343 ) * Add docs on pipeline templates * Fix example and comments Update usage tips * Update docs/source/en/tasks/zero_shot_image_classification.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/siglip.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Trigger CI --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-06-19 11:11:44 +01:00
linlin	1c1aec2ef1	Update object_detection.md (#31488 ) Define MAX_SIZE before it is used.	2024-06-19 10:36:44 +01:00
Younes Belkada	7d683f7bae	Docs / AQLM: Clarify `torch.compile` support for AQLM (#31473 ) Update overview.md	2024-06-19 11:26:25 +02:00
Anton Vlasjuk	b275a41005	[`GPT2`] Add SDPA support (#31172 ) * `gpt2` sdpa support * fix (at least) one test, style, repo consistency * fix sdpa mask in forward --> fixes generation * test * test2 * test3 * test4 * simplify shapes for attn mask creation and small comments * hub fail test * benchmarks * flash attn 2 mask should not be inverted on enc-dec setup * fix comment * apply some suggestion from code review - only save _attn_implentation once - remove unnecessary comment * change elif logic * [run-slow] gpt2 * modify `test_gpt2_sample_max_time` to follow previous assertion patterns	2024-06-19 09:40:57 +02:00
Rémy Léone	22b41b3f8a	Update perf_train_gpu_many.md (#31451 ) * Update perf_train_gpu_many.md * Update docs/source/en/perf_train_gpu_many.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_train_gpu_many.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-06-18 11:00:26 -07:00
Matt	6e56b83453	Update chat template docs and bump Jinja version (#31455 ) * Update chat template docs * Minor bug in the version check * Update docs/source/en/chat_templating.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Update docs/source/en/chat_templating.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Update docs/source/en/chat_templating.md Co-authored-by: Joshua Lochner <admin@xenova.com> * Replace backticks with bolding because the doc builder was trying to parse them * Replace backticks with bolding because the doc builder was trying to parse them * Replace backticks with bolding because the doc builder was trying to parse them * More cleanups to avoid upsetting the doc builder * Add one more tip at the end --------- Co-authored-by: Joshua Lochner <admin@xenova.com>	2024-06-18 14:16:30 +01:00
Matt	dabf01973a	Make "tool_use" the default chat template key when tools are passed (#31429 ) * Make "tool_use" the default when tools are passed * Add some opinionated text to the docs * Add some opinionated text to the docs	2024-06-18 13:54:42 +01:00
Jade Choghari	67a4ef89d4	Add missing French translation of tutoriel_pipeline.md (#31396 ) * Update french translation of tutoriel_pipeline.md * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update docs/source/fr/tutoriel_pipeline.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Jade Choghari <chogharijade@icloud.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-06-13 17:48:54 +02:00
谭九鼎	84351d57eb	docs: fix broken link (#31370 ) * docs: fix broken link * fix link	2024-06-12 11:33:00 +01:00
Jade Choghari	35a6d9d648	Add french translation of AutoBackbone (#31300 )	2024-06-11 18:28:52 +01:00
amyeroberts	f53fe35b29	Fast image processor (#28847 ) * Draft fast image processors * Draft working fast version * py3.8 compatible cache * Enable loading fast image processors through auto * Tidy up; rescale behaviour based on input type * Enable tests for fast image processors * Smarter rescaling * Don't default to Fast * Safer imports * Add necessary Pillow requirement * Woops * Add AutoImageProcessor test * Fix up * Fix test for imagegpt * Fix test * Review comments * Add warning for TF and JAX input types * Rearrange * Return transforms * NumpyToTensor transformation * Rebase - include changes from upstream in ImageProcessingMixin * Safe typing * Fix up * convert mean/std to tesnor to rescale * Don't store transforms in state * Fix up * Update src/transformers/image_processing_utils_fast.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Warn if fast image processor available * Update src/transformers/models/vit/image_processing_vit_fast.py * Transpose incoming numpy images to be in CHW format * Update mapping names based on packages, auto set fast to None * Fix up * Fix * Add AutoImageProcessor.from_pretrained(checkpoint, use_fast=True) test * Update src/transformers/models/vit/image_processing_vit_fast.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Add equivalence and speed tests * Fix up --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-06-11 15:47:38 +01:00
Matt	edc1dffd00	Chat Template support for function calling and RAG (#30621 ) * First draft, still missing automatic function conversion * First draft of the automatic schema generator * Lots of small fixes * the walrus has betrayed me * please stop committing your debug breakpoints * Lots of cleanup and edge cases, looking better now * Comments and bugfixes for the type hint parser * More cleanup * Add tests, update schema generator * Update tests, proper handling of return values * Small docstring change * More doc updates * More doc updates * Add json_schema decorator * Clean up the TODOs and finish the docs * self.maxDiff = None to see the whole diff for the nested list test * add import for add_json_schema * Quick test fix * Fix something that was bugging me in the chat template docstring * Less "anyOf" when unnecessary * Support return types for the templates that need them * Proper return type tests * Switch to Google format docstrings * Update chat templating docs to match new format * Stop putting the return type in with the other parameters * Add Tuple support * No more decorator - we just do it implicitly! * Add enum support to get_json_schema * Update docstring * Add copyright header * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/chat_templating.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/chat_template_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/chat_template_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add copyright header * make fixup * Fix indentation * Reformat chat_template_utils * Correct return value * Make regexes module-level * Support more complex, multi-line arg docstrings * Update error message for ... * Update ruff * Add document type validation * Refactor docs * Refactor docs * Refactor docs * Clean up Tuple error * Add an extra test for very complex defs and docstrings and clean everything up for it * Document enum block * Quick test fixes * Stop supporting type hints in docstring to fix bugs and simplify the regex * Update docs for the regex change * Clean up enum regex * Wrap functions in {"type": "function", "function": ...} * Update src/transformers/utils/chat_template_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Temporary tool calling commit * Add type hints to chat template utils, partially update docs (incomplete!) * Code cleanup based on @molbap's suggestion * Add comments to explain regexes * Fix up type parsing for unions and lists * Add custom exception types and adjust tests to look for them * Update docs with a demo! * Docs cleanup * Pass content as string * Update tool call formatting * Update docs with new function format * Update docs * Update docs with a second tool to show the model choosing correctly --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-06-11 15:46:38 +01:00
Pavel Iakubovskii	517df566f5	Decorators for deprecation and named arguments validation (#30799 ) * Fix do_reduce_labels for maskformer image processor * Deprecate reduce_labels in favor to do_reduce_labels * Deprecate reduce_labels in favor to do_reduce_labels (segformer) * Deprecate reduce_labels in favor to do_reduce_labels (oneformer) * Deprecate reduce_labels in favor to do_reduce_labels (maskformer) * Deprecate reduce_labels in favor to do_reduce_labels (mask2former) * Fix typo * Update mask2former test * fixup * Update segmentation examples * Update docs * Fixup * Imports fixup * Add deprecation decorator draft * Add deprecation decorator * Fixup * Add deprecate_kwarg decorator * Validate kwargs decorator * Kwargs validation (beit) * fixup * Kwargs validation (mask2former) * Kwargs validation (maskformer) * Kwargs validation (oneformer) * Kwargs validation (segformer) * Better message * Fix oneformer processor save-load test * Update src/transformers/utils/deprecation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/utils/deprecation.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Better handle classmethod warning * Fix typo, remove warn * Add header * Docs and `additional_message` * Move to filter decorator ot generic * Proper deprecation for semantic segm scripts * Add to __init__ and update import * Basic tests for filter decorator * Fix doc * Override `to_dict()` to pop depracated `_max_size` * Pop unused parameters * Fix trailing whitespace * Add test for deprecation * Add deprecation warning control parameter * Update generic test * Fixup deprecation tests * Introduce init service kwargs * Revert popping unused params * Revert oneformer test * Allow "metadata" to pass * Better docs * Fix test * Add notion in docstring * Fix notification for both names * Add func name to warning message * Fixup --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2024-06-10 12:35:10 +01:00
谭九鼎	4fa4dcb2be	docs/zh: fix style (#31334 )	2024-06-10 11:40:40 +01:00
谭九鼎	807483edba	docs: fix style (#31340 )	2024-06-10 09:53:25 +01:00
Matt	065729a692	Remove ConversationalPipeline and Conversation object (#31165 ) * Remove ConversationalPipeline and Conversation object, as they have been deprecated for some time and are due for removal * Update not-doctested.txt * Fix JA and ZH docs * Fix JA and ZH docs some more * Fix JA and ZH docs some more	2024-06-07 17:50:18 +01:00
amyeroberts	bdf36dcd48	Enable HF pretrained backbones (#31145 ) * Enable load HF or tim backbone checkpoints * Fix up * Fix test - pass in proper out_indices * Update docs * Fix tvp tests * Fix doc examples * Fix doc examples * Try to resolve DPT backbone param init * Don't conditionally set to None * Add condition based on whether backbone is defined * Address review comments	2024-06-06 22:02:38 +01:00
Jack Yang	a3d351c00f	Update text-to-speech.md (#31269 ) SpeechBrain usage has changed	2024-06-06 21:59:22 +01:00
Lucain	9ef93fccad	Switch from `cached_download` to `hf_hub_download` in remaining occurrences (#31284 ) Switch from hf_hub_url to hf_hub_download in remaining occurences	2024-06-06 12:05:59 +01:00
Vaibhav Srivastav	4a6024921f	doc: add info about wav2vec2 bert in older wav2vec2 models. (#31120 ) * doc: add info about wav2vec2 bert in older wav2vec2 models. * apply suggestions from review. * forward contrib credits from review --------- Co-authored-by: Sanchit Gandhi <sanchit-gandhi@users.noreply.github.com>	2024-06-05 11:56:11 +01:00
Younes Belkada	485d913dfb	Blip: Deprecate `BlipModel` (#31235 ) * deprecate blip * mention deprecation on docs	2024-06-04 18:29:45 +02:00
Aaron Jimenez	c73ee1333d	[docs] Spanish translation of tokenizer_summary.md (#31154 ) * add tokenizer_summary to es/_toctree.yml * add tokenizer_summary to es/ * fix link to Transformes XL in en/ * translate until Subword tokenization section * fix GPT link in en/ * fix other GPT link in en/ * fix typo in en/ * translate the doc * run make fixup * Remove .md in Transformer XL link * fix some link issues in es/ * fix typo	2024-06-03 16:52:23 -07:00
Jade Choghari	98dd842339	Wrong translation FR : Contents = Contenu (#31186 ) Update index.md - Contents = Contenu French typo - Contents = Contenu	2024-06-03 17:40:14 +02:00
Isotr0py	e4628434d8	Add Qwen2 GGUF loading support (#31175 ) * add qwen2 gguf support * Update docs * fix qwen2 tokenizer * add qwen2 gguf test * fix typo in qwen2 gguf test * format code * Remove mistral, clarify the error message * format code * add typing and update docstring	2024-06-03 14:55:10 +01:00
Pavel Iakubovskii	cdc813113a	Instance segmentation examples (#31084 ) * Initial setup * Metrics * Overfit on two batches * Train 40 epochs * Memory leak debugging * Trainer fine-tuning * Draft * Fixup * Trained end-to-end * Add requirements * Rewrite evaluator * nits * Add readme * Add instance-segmentation to the table * Support void masks * Remove sh * Update docs * Add pytorch test * Add accelerate test * Update examples/pytorch/instance-segmentation/README.md * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Fix consistency oneformer * Fix imports * Fix imports sort * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Add resources to docs * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove explicit model_type argument * Fix tests * Update readme * Note about other models --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-31 16:56:17 +01:00
Aymeric Roucher	9837a25481	Add streaming, various fixes (#30838 ) * Implement streaming run in ReAct agents * Allow additional imports in code agents * Python interpreter: support classes and exceptions, fixes	2024-05-31 14:16:23 +02:00
Asif Ajrof	bd9d1ddf41	Update sam.md (#31130 ) `mask` variable is not defined. probably a writing mistake. it should be `segmentation_map`. `segmentation_map` should be a `1` channel image rather than `RGB`. [on a different note, the `mask_url` is the same as `raw_image`. could provide a better example.	2024-05-31 12:34:29 +02:00
Younes Belkada	f5590deaa8	Docs / Quantization: Replace all occurences of `load_in_8bit` with bnb config (#31136 ) Replace all occurences of `load_in_8bit` with bnb config	2024-05-30 16:47:35 +02:00
Younes Belkada	cb879c5801	FIX / Docs: Fix GPTQ expected number of bits (#31111 ) Update overview.md	2024-05-29 15:56:28 +02:00
Lucain	c3044ec2f3	Use `HF_HUB_OFFLINE` + fix has_file in offline mode (#31016 ) * Fix has_file in offline mode * harmonize env variable for offline mode * Switch to HF_HUB_OFFLINE * fix test * revert test_offline to test TRANSFORMERS_OFFLINE * Add new offline test * merge conflicts * docs	2024-05-29 11:55:43 +01:00
amyeroberts	a564d10afe	Deprecate low use models (#30781 ) * Deprecate models - graphormer - time_series_transformer - xlm_prophetnet - qdqbert - nat - ernie_m - tvlt - nezha - mega - jukebox - vit_hybrid - x_clip - deta - speech_to_text_2 - efficientformer - realm - gptsan_japanese * Fix up * Fix speech2text2 imports * Make sure message isn't indented * Fix docstrings * Correctly map for deprecated models from model_type * Uncomment out * Add back time series transformer and x-clip * Import fix and fix-up * Fix up with updated ruff	2024-05-28 18:07:07 +01:00
Younes Belkada	7f08817be4	Docs / Quantization: Redirect deleted page (#31063 ) Update _redirects.yml	2024-05-28 18:29:22 +02:00
Younes Belkada	4f98b14465	Docs / PEFT: Add PEFT API documentation (#31078 ) * add peft references * add peft references * Update docs/source/en/peft.md * Update docs/source/en/peft.md	2024-05-28 15:04:43 +02:00
NielsRogge	90da0b1c9f	[SuperPoint, PaliGemma] Update docs (#31025 ) * Update docs * Add PaliGemma resources * Address comment * Update docs	2024-05-28 13:22:06 +02:00

... 2 3 4 5 6 ...

2890 Commits