Commit Graph

3009 Commits

Author SHA1 Message Date
Abhishek Maurya
65753d6065
Remove graph breaks for torch.compile() in flash_attention_forward when Llama model is padding-free tuned (#33932)
* fix: fixes for graph breaks

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: formatting

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: import error

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: Add Fa2Kwargs

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* Revert "PR changes"

This reverts commit 39d2868e5c.

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* addition of documentation

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* change in _flash_attention_forward

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* revert make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix copies

* style

* loss kwargs typing

* style and pull latest changes

---------

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-24 11:02:54 +02:00
Joao Gante
b0f0c61899
Add SynthID (watermarking by Google DeepMind) (#34350)
* Add SynthIDTextWatermarkLogitsProcessor

* Resolving comments.

* Resolving comments.

* Resolving commits,

* Improving SynthIDWatermark tests.

* switch to PT version

* detector as pretrained model + style

* update training + style

* rebase

* Update logits_process.py

* Improving SynthIDWatermark tests.

* Shift detector training to wikitext negatives and stabilize with lower learning rate.

* Clean up.

* in for 7B

* cleanup

* Support python 3.8.

* README and final cleanup.

* HF Hub upload and initialize.

* Update requirements for synthid_text.

* Adding SynthIDTextWatermarkDetector.

* Detector testing.

* Documentation changes.

* Copyrights fix.

* Fix detector api.

* ironing out errors

* ironing out errors

* training checks

* make fixup and make fix-copies

* docstrings and add to docs

* copyright

* BC

* test docstrings

* move import

* protect type hints

* top level imports

* watermarking example

* direct imports

* tpr fpr meaning

* process_kwargs

* SynthIDTextWatermarkingConfig docstring

* assert -> exception

* example updates

* no immutable dict (cant be serialized)

* pack fn

* einsum equivalent

* import order

* fix test on gpu

* add detector example

---------

Co-authored-by: Sumedh Ghaisas <sumedhg@google.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: sumedhghaisas2 <138781311+sumedhghaisas2@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
2024-10-23 21:18:52 +01:00
Steven Liu
5ba85de7a4
[docs] Fix Korean toctree (#34324)
fix
2024-10-23 10:52:51 +02:00
wony617
644d5287b2
🌐 [i18n-KO] Translated model_doc/bartpho.md to Korean (#33981)
* docs: ko: model_doc/bartpho.md

* feat: nmt draft

* Update docs/source/ko/model_doc/bartpho.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:52 -07:00
Ahnjj_DEV
b03dc0a87e
🌐 [i18n-KO] Translated bert-japanese.md to Korean (#33890)
* docs: ko: bert-japanese.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

---------

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:31 -07:00
Ahnjj_DEV
4b14aa1bcd
🌐 [i18n-KO] Translated executorch.md to Korean (#33888)
* docs: ko: executorch.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/main_classes/executorch.md

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/_toctree.yml

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:20 -07:00
Fanli Lin
688eeac81e
[docs] fix typo (#34235)
fix typo
2024-10-22 09:46:07 -07:00
Alexandros Benetatos
c31a6ff474
Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (#32550)
* add colorize_depth and matplotlib availability check

* add post_process_depth_estimation for zoedepth + tests

* add post_process_depth_estimation for DPT + tests

* add post_process_depth_estimation in DepthEstimationPipeline & special case for zoedepth

* run `make fixup`

* fix import related error on tests

* fix more import related errors on test

* forgot some `torch` calls in declarations

* remove `torch` call in zoedepth tests that caused error

* updated docs for depth estimation

* small fix for `colorize` input/output types

* remove `colorize_depth`, fix various names, remove matplotlib dependency

* fix formatting

* run fixup

* different images for test

* update examples in `forward` functions

* fixed broken links

* fix output types for docs

* possible format fix inside `<Tip>`

* Readability related updates

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Readability related update

* cleanup after merge

* refactor `post_process_depth_estimation` to return dict; simplify ZoeDepth's `post_process_depth_estimation`

* rewrite dict merging to support python 3.8

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-22 15:50:54 +02:00
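
The item "rewrite dict merging to support python 3.8" in the commit above most likely concerns the dict union operator `|`, which only exists from Python 3.9 onward. The sketch below shows a 3.8-compatible merge using unpacking; the helper name `merge_dicts` and the sample dictionaries are hypothetical and are not taken from the actual change.

```python
def merge_dicts(base, extra):
    """Return a new dict holding the keys of both inputs.

    Uses ``{**a, **b}`` unpacking, which works on Python 3.5+,
    instead of ``a | b``, which requires Python 3.9+.
    Keys in ``extra`` override keys in ``base``.
    """
    return {**base, **extra}


# Hypothetical sample data, for illustration only.
model_outputs = {"predicted_depth": [0.5, 1.0], "source": "model"}
post_processed = {"depth": "image placeholder", "source": "post_process"}

merged = merge_dicts(model_outputs, post_processed)
```

Neither input is modified, so the same pattern is safe to use when the original dictionaries are still needed afterwards.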
regisss
93352e81f5
Fix Korean doc _toctree.yml (#34293)
Fix korean doc _toctree.yml
2024-10-22 11:05:56 +02:00
Raushan Turganbay
21d5025826
Attn implementation for composite models (#32238)
* first try

* codestyle

* idefics2 is happy

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma

* fix-copies

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo

* blip-2 needs to init vision from config

* when was this removed O_o

* minor fix

* tests

* this way?

* tests

* model-agnostic code

* codestyle

* add tests for idefics

* modify general test for VLMs

* no generation test for vlm yet!

* no generation test here also

* warn in VIT-SDPA if output attn

* add more tests

* user can pass dict as attn impl

* repo consistency

* update

* musicgen

* no prints

* forgot speech enc-dec and clip

* how many composite models we have?

* musicgen melody is same as musicgen

* +siglip

* fix tests + add some more

* remove idefics custom overridden code

* make idefics2 automappable

* nits

* skip tests

* doctests

* Update src/transformers/models/idefics2/configuration_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/clip/test_modeling_clip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* major update, no need for automap

* clean up

* add FA2 test

* more tests

* style

* skip tests

* why did these start failing now?

* no attributes for FA2 needed

* one tiny test

* address comment about FA2 false warning

* style

* add new models and resolve conflicts

* fix copies

* let it be this way for now, come back tomorrow to review

* some more fixes

* update

* more updates

* update

* fix copies

* style and tests

* another big update

* fix tests

* fix tests

* update

* another update

* fix tests

* fix copies

* fix tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-22 06:54:44 +02:00
Andrés Marafioti
32590b5ecb
Fix method name which changes in tutorial (#34252)
The method `model_download_tool` was called `model_download_counter` earlier in the tutorial; this raises an error when following the code.
2024-10-21 14:21:52 -03:00
Matt
f701b98e4a
Add a doc section on writing generation prompts (#34248)
Add a section on writing generation prompts
2024-10-21 14:35:57 +01:00
Yoni Gozlan
a4122813d1
Add DetrImageProcessorFast (#34063)
* add fully functioning image_processing_detr_fast

* Create tensors on the correct device

* fix copies

* fix doc

* add tests equivalence cpu gpu

* fix doc en

* add relative imports and copied from

* Fix copies and nit
2024-10-21 09:05:05 -04:00
Cyril Vallez
6604764007
add Glm (#33823)
* Create modular_glm.py

* Update modular_glm.py

* Finalize architecture without all attentions

* Add all attentions modules

* Finalize modular

* Update given last version

* Last update

* Finalize model

* Finalize converter

* Update convert_glm_weights_to_hf.py

* style

* style

* Create __init__.py

* Add all inits

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Correct the rotary embeddings

* Remove apply_residual_connection_post_layernorm (always false)

* remove use_rms_norm (always true)

* remove past_layer_norm (always true)

* Update __init__.py

* Update config and license

* start adding tests and doc

* Add doc + style

* Update test_modeling_glm.py

* Add dummies

* Apply correct modeling

* Refactor attention to follow llama

* Update __init__.py

* Update convert_glm_weights_to_hf.py

* Correct bias

* remove linear_bias and pdrop (never used)

* apply modular

* Simplify converter

* remove dummies + style

* add model_input_names

* Add pretraining_tp to config for when eager attention is used

* Update modular to remove all pretraining_tp

* Update test_modeling_glm.py

* Update the __all__

* Update __all__

* Update __init__.py

* Update test_modeling_glm.py

* add revisions

* Add the correct repos and revisions

* style

* Update __init__.py

* update exports

* remove import of modular files

* style

* Apply Llama changes + refine converter

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* style

* Use new modular converter

* add pretrainedmodel to init

* style

* Update test_modeling_glm.py

* Move config outside modular to please CI about docstrings

* Add dummies to please CI

* Update glm.md

* Update glm.md
2024-10-18 17:41:12 +02:00
Yoach Lacombe
9ba021ea75
Moshi integration (#33624)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* first moshi WIP

* converting weights working + configuration + generation configuration

* finalize converting script - still missing tokenizer and FE and processor

* fix saving model w/o default config

* working generation

* use GenerationMixin instead of inheriting

* add delay pattern mask

* fix right order: moshi codes then user codes

* unconditional inputs + generation config

* get rid of MoshiGenerationConfig

* blank user inputs

* update convert script: fix conversion, add tokenizer, feature extractor and bf16

* add and correct Auto classes

* update modeling code, configuration and tests

* make fixup

* fix some copies

* WIP: add integration tests

* add dummy objects

* propose better readability and code organisation

* update tokenization tests

* update docstrings, eval and modeling

* add .md

* make fixup

* add MoshiForConditionalGeneration to ignore Auto

* revert mimi changes

* re

* further fix

* Update moshi.md

* correct md formatting

* move prepare causal mask to class

* fix copies

* fix depth decoder causal

* fix and correct some tests

* make style and update .md

* correct config checkpoint

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* make style

* Update src/transformers/models/moshi/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* change firm in copyrights

* update config with nested dict

* replace einsum

* make style

* change split to True

* add back split=False

* remove tests in convert

* Update tests/models/moshi/test_modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add default config repo + add model to FA2 docstrings

* remove logits float

* fix some tokenization tests and ignore some others

* make style tokenization tests

* update modeling with sliding window + update modeling tests

* [run-slow] moshi

* remove prepare for generation from CausalLM

* isort

* remove copied from

* ignore offload tests

* update causal mask and prepare 4D mask aligned with recent changes

* further test refine + add back prepare_inputs_for_generation for depth decoder

* correct conditional use of prepare mask

* update slow integration tests

* fix multi-device forward

* remove previous solution to device_map

* save_load is flaky

* fix generate multi-devices

* fix device

* move tensor to int

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-10-16 11:21:49 +02:00
Chulhwa (Evan) Han
9d6998c759
🌐 [i18n-KO] Translated blip-2.md to Korean (#33516)
* docs: ko: model_doc/blip-2

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/model_doc/blip-2.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2024-10-15 11:21:22 -07:00
Yijun Lee
554ed5d1e0
🌐 [i18n-KO] Translated trainer_utils.md to Korean (#33817)
* docs: ko: trainer_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2024-10-15 11:21:05 -07:00
Yijun Lee
8c33cf4eec
🌐 [i18n-KO] Translated gemma2.md to Korean (#33937)
* docs: ko: gemma2.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions
2024-10-15 11:20:46 -07:00
Jiwook Han
67acb0b123
🌐 [i18n-KO] Translated vivit.md to Korean (#33935)
* docs: ko: model_doc/vivit.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits
2024-10-15 10:31:44 -07:00
Prakarsh Kaushik
293e6271c6
Add sdpa for Vivit (#33757)
* chore:add sdpa to vivit

* fix:failing slow test_inference_interpolate_pos_encoding(failing on main branch too)

* chore:fix nits

* ci:fix repo consistency failure

* chore:add info and benchmark to model doc

* [run_slow] vivit

* chore:revert interpolation test fix for new issue

* [run_slow] vivit

* [run_slow] vivit

* [run_slow] vivit

* chore:add fallback for output_attentions being True

* [run_slow] vivit

* style:make fixup

* [run_slow] vivit
2024-10-15 11:27:54 +02:00
Vladislav Bronzov
cb5ca3265f
Add GGUF for starcoder2 (#34094)
* add starcoder2 arch support for gguf

* fix q6 test
2024-10-14 10:22:49 +02:00
Anton Vlasjuk
7434c0ed21
Mistral-related models for QnA (#34045)
* mistral qna start

* mixtral qna

* oops

* qwen2 qna

* qwen2moe qna

* add missing input embed methods

* add copied to all methods, can't directly from llama due to the prefix

* make top level copied from
2024-10-14 08:53:32 +02:00
Lysandre Debut
f052e94bcc
Fix flax failures (#33912)
* Few fixes here and there

* Remove typos

* Remove typos
2024-10-11 14:38:35 +02:00
Michael Goin
b2f09fb90f
[Docs] Update compressed_tensors.md (#33961)
* Update compressed_tensors.md

Fix some unfinished sections

* Update docs/source/en/quantization/compressed_tensors.md

Co-authored-by: Xiao Yuan <yuanx749@gmail.com>

---------

Co-authored-by: Xiao Yuan <yuanx749@gmail.com>
2024-10-10 15:22:41 +02:00
Daniel Korat
fb0c6b521d
Universal Assisted Generation: Assisted generation with any assistant model (by Intel Labs) (#33383)
* Update candidate_generator.py

* Update utils.py

* add lookbehind params to _get_candidate_generator

* make fixup

* add unit tests

* fix failing tests

* add docstrings

* fix docstrings; remove non-optimized AnyTokenizer

* added any tokenizer generation correctness test

* make fixup

* fix assertion syntax

* PR review fixes

* address additional PR comments

* fix tests

* remove stopping criteria arg

* make fixup

* add AssistantConfig

* fix prev_tokens branching

* pass tokenizers through `generate()` kwargs

* fix lookbehind values; tokenizer params WIP

* fixup

* AssistantConfig

* remove AssistantConfig; apply PR suggestions

* restructure tests

* fixup

* fix assistant_tokenizer arg validation

* fixup

* fix tests in TestAssistedCandidateGeneratorDifferentTokenizers

* fix class docstring

* PR suggestions

* doc

* doc update and improvements to `_validate_assistant()`

---------

Co-authored-by: mosheber <moshe.berchansky@intel.com>
2024-10-10 14:41:53 +02:00
Vladislav Bronzov
c9afee5392
Add gguf support for gpt2 (#34044)
* add gpt2 gguf support

* add doc change

* small refactoring
2024-10-10 13:42:18 +02:00
Avishai Elmakies
a265600c60
add sdpa to OPT (#33298)
* add sdpa to OPT

* chore: remove redundant whitespace in OPTDecoder class

* fixup

* bug fix

* add sdpa and attention generate test

* fixup

* Refactor OPTAttention forward method for improved readability and maintainability

* undo refactor for _shape and key,val states

* add OPT to doc, fixup didn't find it for some reason

* change order

* change default attn_implementation in testing to eager

* [run-slow] opt

* change test_eager_matches_sdpa_generate to the one llama

* Update default attention implementation in testing common

* [run-slow] opt

* remove unneeded print

* [run-slow] opt

* refactor model testers to have attn_implementation="eager"

* [run-slow] opt

* convert test_eager_matches_sdpa_generate to opt-350M

* bug fix when creating mask for opt

* [run-slow] opt

* if layer head mask default to eager

* if head mask is not none fall to eager

* [run-slow] opt

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Clean up Unpack imports (#33631)

clean up Unpack imports

* Fix DPT /Dinov2 sdpa regression on main (#33660)

* fallback to eager if output attentions.

* fix copies

* handle dependency errors in check_imports (#33622)

* handle dependency errors in check_imports

* change log level to warning

* add back self.max_position_embeddings = config.max_position_embeddings (#33550)

* add back self.max_position_embeddings = config.max_position_embeddings

* fix-copies

* Fix Llava conversion for LlavaQwen2ForCausalLM with Clip vision tower (#33613)

fix llavaqwen2 model conversion

* Uniformize kwargs for Udop processor and update docs (#33628)

* Add optional kwargs and uniformize udop

* cleanup Unpack

* nit Udop

* Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin`  (#33203)

* Enable BNB multi-backend support (#31098)

* enable cpu bnb path

* fix style

* fix code style

* fix 4 bit path

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* add multi backend refactor tests

* fix style

* tweak 4bit quantizer + fix corresponding tests

* tweak 8bit quantizer + *try* fixing corresponding tests

* fix dequant bnb 8bit

* account for Intel CPU in variability of expected outputs

* enable cpu and xpu device map

* further tweaks to account for Intel CPU

* fix autocast to work with both cpu + cuda

* fix comments

* fix comments

* switch to testing_utils.torch_device

* allow for xpu in multi-gpu tests

* fix tests 4bit for CPU NF4

* fix bug with is_torch_xpu_available needing to be called as func

* avoid issue where test reports attr err due to other failure

* fix formatting

* fix typo from resolving of merge conflict

* polish based on last PR review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix CI

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix error log

* fix error msg

* add \n in error log

* make quality

* rm bnb cuda restriction in doc

* cpu model don't need dispatch

* fix doc

* fix style

* check cuda available in testing

* fix tests

* Update docs/source/en/model_doc/chameleon.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix doc

* fix check multibackends

* fix import sort

* remove check torch in bnb

* docs: update bitsandbytes references with multi-backend info

* docs: fix small mistakes in bnb paragraph

* run formatting

* revert bnb check

* move bnb multi-backend check to import_utils

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix bnb check

* minor fix for bnb

* check lib first

* fix code style

* Revert "run formatting"

This reverts commit ac108c6d6b.

* fix format

* give warning when bnb version is low and no cuda found

* fix device assignment check to be multi-device capable

* address akx feedback on get_avlbl_dev fn

* revert partially, as we don't want the function that public, as docs would be too much (enforced)

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix error string after refactoring into get_chat_template (#33652)

* Fix error string after refactoring into get_chat_template

* Take suggestion from CR

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* uniformize git processor (#33668)

* uniformize git processor

* update docstring

* Modular `transformers`: modularity and inheritance for new model additions (#33248)

* update example

* update

* push the converted diff files for testing and ci

* correct one example

* fix class attributes and docstring

* nits

* oups

* fixed config!

* update

* nitd

* class attributes are not matched against the other, this is missing

* fixed overwriting self.xxx now onto the attributes I think

* partial fix, now order with docstring

* fix docstring order?

* more fixes

* update

* fix missing docstrings!

* examples don't all work yet

* fixup

* nit

* updated

* hick

* update

* delete

* update

* update

* update

* fix

* all default

* no local import

* fix more diff

* some fix related to "safe imports"

* push fixed

* add helper!

* style

* add a check

* all by default

* add the

* update

* FINALLY!

* nit

* fix config dependencies

* man that is it

* fix fix

* update diffs

* fix the last issue

* re-default to all

* all the fixes

* nice

* fix properties vs setter

* fixup

* updates

* update dependencies

* make sure to install what needs to be installed

* fixup

* quick fix for now

* fix!

* fixup

* update

* update

* updates

* whitespaces

* nit

* fix

* simplify everything, and make it file agnostic (should work for image processors)

* style

* finish fixing all import issues

* fixup

* empty modeling should not be written!

* Add logic to find who depends on what

* update

* cleanup

* update

* update gemma to support positions

* some small nits

* this is the correct docstring for gemma2

* fix merging of docstrings

* update

* fixup

* update

* take doc into account

* styling

* update

* fix hidden activation

* more fixes

* final fixes!

* fixup

* fixup instruct  blip video

* update

* fix bugs

* align gemma2 with the rest as well

* updates

* revert

* update

* more reversion

* grind

* more

* arf

* update

* order will matter

* finish del stuff

* update

* rename to modular

* fixup

* nits

* update makefile

* fixup

* update order of the checks!

* fix

* fix docstring that has a call inside

* fix conversion check

* style

* add some initial documentation

* update

* update doc

* some fixup

* updates

* yups

* Mostly todo gimme a minut

* update

* fixup

* revert some stuff

* Review docs for the modular transformers (#33472)

Docs

* good update

* fixup

* mmm current updates lead to this code

* okay, this fixes it

* cool

* fixes

* update

* nit

* updates

* nits

* fix doc

* update

* revert bad changes

* update

* updates

* proper update

* update

* update?

* up

* update

* cool

* nits

* nits

* bon bon

* fix

* ?

* minimise changes

* update

* update

* update

* updates?

* fixed gemma2

* kind of a hack

* nits

* update

* remove `diffs` in favor of `modular`

* fix make fix copies

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Fix CIs post merging modular transformers (#33681)

update

* Fixed docstring for cohere model regarding unavailability of prune_he… (#33253)

* Fixed docstring for cohere model regarding unavailability of prune_head() methods

The docstring mentions that cohere model supports prune_heads() methods. I have fixed the docstring by explicitly mentioning that it doesn't support that functionality.

* Update src/transformers/models/cohere/modeling_cohere.py

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Generation tests: update imagegpt input name, remove unused functions (#33663)

* Improve Error Messaging for Flash Attention 2 on CPU (#33655)

Update flash-attn error message on CPU

Rebased to latest branch

* Gemma2: fix config initialization (`cache_implementation`) (#33684)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used (#33556)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used

* Fixed formatting with `ruff`.

* Uniformize kwargs for image-text-to-text processors (#32544)

* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* Fix imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino

* 🚨🚨 Setting default behavior of assisted decoding (#33657)

* tests: fix pytorch tensor placement errors (#33485)

This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"

According to the PyTorch documentation, torch.Tensor.numpy(force=False)
performs the conversion only if the tensor is on the CPU (plus a few other
restrictions), which is not the case here. We need force=True since we just
need the data and don't care about tensor coherency.

Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* bump tokenizers, fix added tokens fast (#32535)

* update based on tokenizers release

* update

* nits

* update

* revert re addition

* don't break that yet

* fmt

* revert unwanted

* update tokenizers version

* update dep table

* update

* update in conversion script as well

* some fix

* revert

* fully revert

* fix training

* remove set trace

* fixup

* update

* update

* [Pixtral] Improve docs, rename model (#33491)

* Improve docs, rename model

* Fix style

* Update repo id

* fix code quality after merge

* HFQuantizer implementation for compressed-tensors library (#31704)

* Add compressed-tensors HFQuantizer implementation

* flag serializable as False

* run

* revive lines deleted by ruff

* fixes to load+save from sparseml, edit config to quantization_config, and load back

* address satrat comment

* compressed_tensors to compressed-tensors and revert back is_serializable

* rename quant_method from sparseml to compressed-tensors

* tests

* edit tests

* clean up tests

* make style

* cleanup

* cleanup

* add test skip for when compressed tensors is not installed

* remove pydantic import + style

* delay torch import in test

* initial docs

* update main init for compressed tensors config

* make fix-copies

* docstring

* remove fill_docstring

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* review comments

* review comments

* comments - suppress warnings on state dict load, tests, fixes

* bug-fix - remove unnecessary call to apply quant lifecycle

* run_compressed compatibility

* revert changes not needed for compression

* no longer need unexpected keys fn

* unexpected keys not needed either

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add to_diff_dict

* update docs and expand testing

* Update _toctree.yml with compressed-tensors

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update doc

* add note about saving a loaded model

---------

Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>

* update model card for opt

* add batch size to inference table

* [slow-run] opt

* [run-slow] opt

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: chengchengpei <5881383+chengchengpei@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Tibor Reiss <75096465+tibor-reiss@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Muhammad Naufil <m.naufil1@gmail.com>
Co-authored-by: sizhky <yyeshr@gmail.com>
Co-authored-by: Umar Butler <umar@umar.au>
Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
2024-10-10 11:49:34 +02:00
Ahmed Almaghz
69b5ccb887
Add Translate docs into Arabic - section files CONCEPTUAL GUIDES (#33982)
Add Translate docs into Arabic - section files CONCEPTUAL GUIDES
---------------------------------------------------------------------------------------
 Philosophy [i18n-ar] Translated file : docs/source/ar/philosophy.md into Arabic #33064
 Glossary [i18n-ar] Translated file : docs/source/ar/glossary.md into Arabic #33038
 What 🤗 Transformers can do [i18n-ar] Translated file : docs/source/ar/task_summary.md into Arabic #33073
 How 🤗 Transformers solve tasks [i18n-ar] Translated file : docs/source/ar/tasks_explained.md into Arabic #33074
 The Transformer model family [i18n-ar] Translated file : docs/source/ar/model_summary.md into Arabic #33047
 Summary of the tokenizers [i18n-ar] Translated file : docs/source/ar/tokenizer_summary.md into Arabic #33078
 Attention [i18n-ar] Translated file : docs/source/ar/attention.md into Arabic #33021
 Padding and truncation [i18n-ar] Translated file : docs/source/ar/pad_truncation.md into Arabic #33050
 BERTology [i18n-ar] Translated file : docs/source/ar/bertology.md into Arabic #33024
 Perplexity of fixed-length models [i18n-ar] Translated file : docs/source/ar/perplexity.md into Arabic #33063
 Pipelines for webserver inference [i18n-ar] Translated file : docs/source/ar/pipeline_webserver.md into Arabic #33066
 Model training anatomy [i18n-ar] Translated file : docs/source/ar/model_memory_anatomy.md into Arabic #33045
 Getting the most out of LLMs [i18n-ar] Translated file : docs/source/ar/llm_tutorial_optimization.md into Arabic #33043
2024-10-09 14:51:19 -07:00
Yijun Lee
88d01d9119
🌐 [i18n-KO] Translated generation_utils.md to Korean (#33818)
* docs: ko: generation_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update generation_utils.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:55:07 -07:00
wony617
c02cf48729
🌐 [i18n-KO] Translated main_classes/callback.md to Korean (#33572)
* docs: ko: callback.md

* feat: nmt draft & manual edits

* fix: resolve suggestions

* Update docs/source/ko/main_classes/callback.md

* Apply suggestions from code review

* Apply suggestions from code review

Confirmed! Thank you so much for the detailed review!

Co-authored-by: boyunJang <gobook1234@naver.com>

* Update _toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:54:38 -07:00
Yijun Lee
0354d44926
🌐 [i18n-KO] Translated text_generation.md to Korean (#33777)
* docs: ko: text_generation.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:20:01 -07:00
Sungmin Oh
973e6066d4
🌐 [i18n-KO] Translated model_doc/patchtst.md to Korean (#33589)
* docs: ko: model_doc/patchtst.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

---------

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:15:24 -07:00
Sungmin Oh
61a6dce7e4
🌐 [i18n-KO] Translated main_classes/data_collator.md to Korean (#33954)
* docs: ko: main_classes/data_collator.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:14:43 -07:00
Yijun Lee
6ac5f25bb6
🌐 [i18n-KO] Translated modeling_utils.md to Korean (#33808)
* docs: ko: modeling_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
2024-10-09 10:50:03 -07:00
Sungmin Oh
8dca259826
🌐 [i18n-KO] Translated model_doc/graphormer.md to Korean (#33569)
* docs: ko: model_doc/graphormer.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-09 10:44:28 -07:00
Sungmin Oh
4ad923344d
🌐 [i18n-KO] Translated model_doc/informer.md to Korean (#33585)
* docs: ko: model_doc/informer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-09 10:41:06 -07:00
Sungmin Oh
04f51c42c8
🌐 [i18n-KO] Translated model_doc/time_series_transformer.md to Korean (#33596)
* docs: ko: model_doc/time_series_transformer.md

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:40:48 -07:00
Sungmin Oh
32cc15c6a2
🌐 [i18n-KO] Translated model_doc/trajectory_transformer.md to Korean (#33597)
* docs: ko: model_doc/trajectory_transformer.md

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-09 10:40:36 -07:00
Sungmin Oh
f0fbef1c63
🌐 [i18n-KO] Translated main_classes/model.md to Korean (#33606)
* feat: nmt draft

* fix: manual edits

* docs: ko: main_classes/model.md

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:40:06 -07:00
Sungmin Oh
48b54205d0
🌐 [i18n-KO] Translated model_doc/mamba2.md to Korean (#33629)
* docs: ko: model_doc/mamba2.md

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestion

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:39:54 -07:00
Sungmin Oh
03e6fa0061
🌐 [i18n-KO] Translated main_classes/keras_callbacks.md to Korean (#33955)
* docs: ko: main_classes/keras_callbacks.md

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:34:01 -07:00
Sungmin Oh
13929a0ec6
🌐 [i18n-KO] Translated model_doc/deberta.md to Korean (#33967)
* docs: ko: model_doc/deberta.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
2024-10-09 10:33:34 -07:00
Sungmin Oh
41794e6098
🌐 [i18n-KO] Translated model_doc/bart.md to Korean (#33893)
* docs: ko: model_doc/bart.md

* fix: anchor edits

* feat: nmt draft

* Update docs/source/ko/model_doc/bart.md

* Update docs/source/ko/model_doc/bart.md

* fix: manual edits

* Update docs/source/ko/model_doc/bart.md

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 10:33:14 -07:00
Mohamed Mekkouri
36d410dab6
FEAT : Adding BitNet quantization method to HFQuantizer (#33410)
* rebasing changes

* fixing style

* adding some doc to functions

* remove bitblas

* change dtype

* fixing check_code_quality

* fixing import order

* adding doc to tree

* Small update on BitLinear

* adding some tests

* sorting imports

* small update

* reformatting

* reformatting

* reformatting with ruff

* adding assert

* changes after review

* update disk offloading

* adapting after review

* Update after review

* add is_serializable back

* fixing style

* adding serialization test

* make style

* small updates after review
2024-10-09 17:51:41 +02:00
Vladislav Bronzov
faa0f63b93
Add gguf support for StableLM (#33793)
* add stablelm gguf architecture support

* add additional quantization tests

* resolve merge conflict, add weight conversion tests for fp16
2024-10-09 12:16:13 +02:00
Yijun Lee
698b36da72
🌐 [i18n-KO] Translated modular_transformers.md to Korean (#33772)
* docs: ko: modular_transformers.md

* feat: nmt draft

* fix inline TOC

* fix: manual edits

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* fix: resolve suggestions

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-08 18:30:41 -07:00
Yijun Lee
6151bc47ba
🌐 [i18n-KO] Translated image_processing_utils.md to Korean (#33804)
* docs: ko: image_processing_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-08 18:19:37 -07:00
YONGSANG
d31d076b53
🌐 [i18n-KO] Translated output.md to Korean (#33607)
* nmt draft

* fix toctree

* minor fix

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Update docs/source/ko/main_classes/output.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-08 18:19:21 -07:00
Chulhwa (Evan) Han
109b1e7591
🌐 [i18n-KO] Translated blip.md to Korean (#33515)
* docs: ko:  model_doc/blip

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/model_doc/blip.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2024-10-08 17:59:31 -07:00
Yijun Lee
5809b43a62
🌐 [i18n-KO] Translated biogpt.md to Korean (#33773)
* docs: ko: biogpt.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestion

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-08 17:57:51 -07:00
Yijun Lee
c674f2e313
🌐 [i18n-KO] Translated openai-gpt.md to Korean (#33801)
* docs: ko: openai-gpt.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-08 17:57:33 -07:00
Yijun Lee
c15d01fa1d
🌐 [i18n-KO] Translated file_utils.md to Korean (#33803)
* docs: ko: file_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
2024-10-08 17:57:17 -07:00
Jiwook Han
f0f8077025
🌐 [i18n-KO] Translated swin.md to Korean (#33510)
* ko: doc: model_doc/swin.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/model_doc/swin.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* resolve conflicts

* resolve conflicts - 2

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2024-10-08 17:57:03 -07:00
Yijun Lee
0d0ec1dbfb
🌐 [i18n-KO] Translated tokenization_utils.md to Korean (#33813)
* docs: ko: tokenization_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-08 17:56:30 -07:00
Sungmin Oh
386401eca0
🌐 [i18n-KO] Translated main_classes/onnx.md to Korean (#33601)
* docs: ko: main_classes/onnx.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
2024-10-08 17:15:46 -07:00
Sungmin Oh
db5f117b8a
🌐 [i18n-KO] Translated model_doc/deberta-v2.md to Korean (#33968)
* docs: ko: model_doc/deberta-v2.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
2024-10-08 17:15:33 -07:00
Sungmin Oh
cd9a3c49b8
🌐 [i18n-KO] Translated model_doc/dbrx.md to Korean (#33951)
* docs: ko: model_doc/dbrx.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
2024-10-08 17:14:42 -07:00
Sungmin Oh
d6d07f9c77
🌐 [i18n-KO] Translated model_doc/cohere.md to Korean (#33885)
* docs: ko: model_doc/cohere.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
2024-10-08 17:14:25 -07:00
Sungmin Oh
48e80284fa
🌐 [i18n-KO] Translated model_doc/mistral.md to Korean (#33648)
* docs: ko: model_doc/mistral.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:14:12 -07:00
Sungmin Oh
adb14b93f4
🌐 [i18n-KO] Translated model_doc/llama3.md to Korean (#33635)
* docs: ko: model_doc/llama3.md

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:13:57 -07:00
Sungmin Oh
291e707868
🌐 [i18n-KO] Translated model_doc/paligemma.md to Korean (#33612)
* docs: ko: model_doc/paligemma.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:13:25 -07:00
Sungmin Oh
dd43dafa39
🌐 [i18n-KO] Translated model_doc/clip.md to Korean (#33610)
* docs: ko: model_doc/clip.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:13:07 -07:00
Sungmin Oh
acde6c7d9d
🌐 [i18n-KO] Translated model_doc/patchtsmixer.md to Korean (#33587)
* docs: ko: model_doc/patchtsmixer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:11:48 -07:00
Sungmin Oh
bb825dde73
🌐 [i18n-KO] Translated model_doc/autoformer.md to Korean (#33574)
* docs: ko: model_doc/autoformer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions
2024-10-08 17:11:19 -07:00
Sungmin Oh
1d458437dd
🌐 [i18n-KO] Translated model_doc/mamba.md to Korean (#33626)
* docs: ko: model_doc/mamba.md

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:11:11 -07:00
Sungmin Oh
47da2c528b
🌐 [i18n-KO] Translated main_classes/configuration.md to Korean (#33952)
* docs: ko: main_classes/configuration.md

* feat: nmt draft
2024-10-08 17:11:02 -07:00
Sungmin Oh
2e8de976bd
🌐 [i18n-KO] Translated main_classes/quantization.md to Korean (#33959)
* docs: ko: main_classes/quantization.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:10:41 -07:00
Chaewon Song
2fe77783c3
🌐 [i18n-KO] Translated rag.md to Korean (#33989)
* fix: toctree edits

* feat: nmt-draft

* fix: edit Inline TOC
2024-10-08 17:10:26 -07:00
Ahnjj_DEV
1ed98773e5
🌐 [i18n-KO] Translated gpt_neox_japanese.md to Korean (#33894)
* docs: ko: gpt_neox_japanese.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/model_doc/gpt_neox_japanese.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/model_doc/gpt_neox_japanese.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/model_doc/gpt_neox_japanese.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

---------

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
2024-10-08 17:08:06 -07:00
Ahnjj_DEV
79af52ad9a
🌐 [i18n-KO] Translated bertweet.md to Korean (#33891)
* docs: ko: bertweet.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/model_doc/bertweet.md

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:07:13 -07:00
Yijun Lee
d49999ce11
🌐 [i18n-KO] Translated feature_extractor.md to Korean (#33775)
* docs: ko: feature_extractor.md

* feat: nmt draft

* fix: manual edits
2024-10-08 17:06:56 -07:00
Cyril Vallez
17806d11ba
Improve modular converter (#33991)
* improve modular

* style

* Update modular_model_converter.py

* pretty print warning

* style

* Support to remove unused classes as part of added dependencies as well

* nits

* correct bug

* add example

* style

* Add documentation
2024-10-08 14:53:58 +02:00
Yoni Gozlan
e2001c3413
Add auto model for image-text-to-text (#32472)
* Add Auto model for image-text-to-text

* Remove donut from processing auto, add chameleon to image text to text models

* add qwen2_vl and llava_onevision

* add pixtral to auto model for image-text-to-text

* add mllama and idefics3

* remove models in IGNORE_NON_AUTO_CONFIGURED

* add AutoModelForImageTextToText to tests and doc
2024-10-08 14:26:43 +02:00
Arthur
a3add29097
Add support for __all__ and potentially deleting functions (#33859)
* Add support for __all__ and potentially deleting functions

* updates

* update

* nits

* remove dummies

* fix warning

* fixup

* style

* update

* fixup

* skip copied from when # skip

* remove log

* bring dummies back

* fixup

* remove copied from

* fixup

* remove warnings from `make fix-copies`

* fix doc issues

* nits

* Better error message !

* add support for more flexible naming!

* style

* breaking style?

* fix super() renaming issues

* del not needed when you don't call super().__init__()

* style

* no more fmt on :)

* properly remove `self`

* fixup

* fix

* doc nits

* add some doc 🫡
2024-10-08 10:19:17 +02:00
Yijun Lee
d6ba1ac041
🌐 [i18n-KO] Translated gemma.md to Korean (#33936)
* docs: ko: gemma.md

* feat: nmt draft

* fix: manual edits
2024-10-07 15:59:14 -07:00
Jiwook Han
46f146a2b5
🌐 [i18n-KO] Translated vit.md to Korean (#33884)
* docs: ko: model_doc/vit.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/model_doc/vit.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/model_doc/vit.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 15:35:11 -07:00
Jiwook Han
1ecca92f03
🌐 [i18n-KO] Translated swin2sr.md to Korean (#33795)
* ko: doc: model_doc/swin2sr.md

* feat: nmt draft

* Update docs/source/ko/model_doc/swin2sr.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2024-10-07 15:34:56 -07:00
boyunJang
8258219c4c
🌐 [i18n-KO] Translated auto.md to Korean (#33590)
* docs: ko: model_doc/auto.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2024-10-07 15:34:45 -07:00
Chaewon Song
253a9a9d6f
🌐 [i18n-KO] Translated logging.md to Korean (#33543)
* docs: ko: main_classes/logging.md

* feat: nmt-draft

* fix: update toctree.yml

* Update docs/source/ko/main_classes/logging.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/main_classes/logging.md

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-07 15:34:34 -07:00
Yijun Lee
178d707b7e
🌐 [i18n-KO] Translated chameleon.md to Korean (#33799)
* docs: ko: chameleon.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 15:06:13 -07:00
Yijun Lee
13432f8409
🌐 [i18n-KO] Translated trainer.md to Korean (#33797)
* docs: ko: trainer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 15:05:57 -07:00
Yijun Lee
e9fbe62965
🌐 [i18n-KO] Translated pipelines_utils.md to Korean (#33809)
* docs: ko: pipelines_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-07 15:05:17 -07:00
Yijun Lee
9c61ba2f25
🌐 [i18n-KO] Translated time_series_utils.md to Korean (#33806)
* docs: ko: time_series_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-07 15:05:00 -07:00
Yijun Lee
9c8bd3fc1b
🌐 [i18n-KO] Translated esm.md to Korean (#33796)
* docs: ko: esm.md

* feat: nmt draft

* fix: manual edits
2024-10-07 13:39:22 -07:00
Yijun Lee
6996f2186a
🌐 [i18n-KO] Translated audio_utils.md to Korean (#33802)
* docs: ko: audio_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-07 13:39:10 -07:00
Jiwook Han
410c73af1d
🌐 [i18n-KO] Translated swinv2.md to Korean (#33566)
* docs: ko: model_doc/swinv2.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits
2024-10-07 12:50:43 -07:00
Yijun Lee
6c18cefed0
🌐 [i18n-KO] Translated gguf.md to Korean (#33764)
* docs: ko: gguf.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 12:49:08 -07:00
Magnus
ad1a250719
[Docs] Add Developer Guide: How to Hack Any Transformers Model (#33979)
* docs: add example for separating q, k, v projections in SAM

* docs: How to Hack Any Transformers Model

* docs: remove changes from sam model docs

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-07 10:08:20 +02:00
NielsRogge
f5aeb7c1a5
[Docs] Improve VLM docs (#33393)
* Improve docs

* Update docs/source/en/model_doc/llava.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/llava.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Address comment

* Address comment

* Improve pixtral docs

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-07 09:54:07 +02:00
TomLim
1bd604d11c
[WIP] Add Tokenizer for MyT5 Model (#31286)
* Initial commit for MyT5 model

* custom implementation of MyT5 tokenizer, unused files deleted

* unittest for myt5 tokenizer

* update of import structure and style

* removed remnants of MyT5Config

* fixed docstrings

* Updates after review: filled documentation file, new docstrings and tests added

* Fixed code style issues

* fixed copied from to refer to function

* updated loading myt5 tokenizer in tests, added sample byte map file to fixtures

* changes after review

* removed redundant copied from

* removed redundant copied from

* optimization and loading model from hf

* [run_slow] myt5

* [run-slow] myt5

* Updated en documentation for myt5

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-06 10:33:16 +02:00
pglorio
f319ba16fa
Add Zamba (#30950)
* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* Moved mamba init into `_init_weights`

* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Moved mamba init into `_init_weights`

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* make fixup fixes

* quality test fixes

* Fix Zamba model path

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* Update

* circleci fixes

* fix zamba test from merge

* fix ValueError for disabling mamba kernels

* add HF copyright

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* shared_transf --> shared_transformer

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fixes

* Move attention head dim to config

* Fix circle/ci tests

* Update modeling_zamba.py

* apply GenerationMixin inheritance change from upstream

* apply import ordering

* update needed transformers version for zamba

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add contribution author

* add @slow to avoid CI

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Define attention_hidden_size

* Added doc for attention_head_size

* trigger CI

* Fix doc of attention_hidden_size

* [run-slow] zamba

* Fixed shared layer logic, swapped up<->gate in mlp

* shared_transformer -> shared_transf

* reformat HybridLayer __init__

* fix docstrings in zamba config

* added definition of _get_input_ids_and_config

* fixed formatting of _get_input_ids_and_config

---------

Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-10-04 22:28:05 +02:00
Amit Garg
e3775539c8
PhiMoE (#33363)
* onboard phimoe model

* removed debug code

* added unit tests

* updated docs

* formatted

* fixed unit tests

* fixed test case

* fixed format

* refactored code

* fixed expected outputs in the integration tests

* Added a warning msg

* Addressed comments

* Addressed comments

* fixed test cases

* added paper link

* Addressed comments

* Refactored PhimoeForCausalLM forward fn

* Refactored PhimoeRotaryEmbedding class

* fixed test cases

* fixed testcase

* fixed test case

* Addressed comments

* fixed test cases

* fixed testcases

* Used cache position instead to get the seq len
2024-10-04 21:39:45 +02:00
jiqing-feng
b916efcb3c
Enables CPU AWQ model with IPEX version. (#33460)
* enable cpu awq ipex linear

* add doc for cpu awq with ipex kernel

* add tests for cpu awq

* fix code style

* fix doc and tests

* Update docs/source/en/quantization/awq.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/autoawq/test_awq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix comments

* fix log

* fix log

* fix style

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-04 16:25:10 +02:00
Matt
de4112e4d2
Add a section on writing tool templates to the chat template docs (#33924)
* Add a section on writing tool templates to the chat template docs

* Small cleanups
2024-10-04 14:40:44 +01:00
Deepak Saldanha
b6a01df6e9
[Doc]: Broken link in Kubernetes doc (#33879)
* add relative path in .md and redirects to conf.py

* add redirects to conf.py and update .md

* modify links in .md
2024-10-04 11:20:56 +02:00
amyeroberts
b7474f211d
Trainer - deprecate tokenizer for processing_class (#32385)
* Trainer - deprecate tokenizer for processing_class

* Extend change across Seq2Seq trainer and docs

* Add tests

* Update to FutureWarning and add deprecation version
2024-10-02 14:08:46 +01:00
Omar Salman
e7c8af7f33
Add sdpa for DistilBert (#33724)
* Add sdpa for DistilBert

* [run_slow] distilbert

* [run_slow] distilbert

* [run_slow] distilbert

* Try without slow tests

* [run_slow] distilbert

* [run_slow] distilbert
2024-10-02 13:55:19 +01:00
g-prz
fe484726aa
Add falcon gguf (#33437)
* feat(gguf): add falcon q2 k

* fix(gguf): remove useless renaming

* feat(gguf): separate falcon 7b and 40b

* feat(gguf): apply fixup

* fix(test): error rebase

* feat(gguf): add fp16 weight comparison for falcon

* feat(gguf): test weight of all layers

* test(gguf): add falcon 40b under skip decorator

* feat(gguf): quick example for extracting model size
2024-10-02 14:10:39 +02:00
TrickEye
2292be6c1b
Fix: typo (#33880)
Update llm_tutorial.md: typo
2024-10-02 09:12:21 +01:00
pogpog
b77846a6e6
Fix link in gguf.md (#33768)
Change hyphen to underscore for URL in link to convert_hf_to_gguf.py
2024-09-30 20:17:33 +02:00
mobicham
f5247aca01
Hqq serialization (#33141)
* HQQ model serialization attempt

* fix hqq dispatch and unexpected keys

* style

* remove check_old_param

* revert to check HQQLinear in quantizer_hqq.py

* revert to check HQQLinear in quantizer_hqq.py

* update HqqConfig default params

* make ci happy

* make ci happy

* revert to HQQLinear check in quantizer_hqq.py

* check hqq_min version 0.2.0

* set axis=1 as default in quantization_config.py

* validate_env with hqq>=0.2.0 version message

* deprecated hqq kwargs message

* make ci happy

* remove run_expected_keys_check hack + bump to 0.2.1 min hqq version

* fix unexpected_keys hqq update

* add pre_quantized check

* add update_expected_keys to base quantizer

* ci base.py fix?

* ci base.py fix?

* fix "quantization typo" src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix post merge

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-30 14:47:18 +02:00
Jerry Zhang
4bb49d4e00
Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 (#33456)
* Enable non-safetensor serialization and deserialization for TorchAoConfig quantized model

Summary:
After https://github.com/huggingface/huggingface_hub/pull/2440 we added non-safetensor serialization and deserialization
in huggingface, with this we can now add the support in transformers

Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor
see README for more details

Test Plan:
tested locally

Reviewers:

Subscribers:

Tasks:

Tags:

* formatting

* formatting

* minor fix

* formatting

* address comments

* comments

* minor fix

* update doc

* refactor compressed tensor quantizer
2024-09-30 11:30:29 +02:00
Lysandre Debut
4973fc5769
Model addition timeline (#33762)
* Model addition timeline

* Link guide

* Update docs/source/en/add_new_model.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/add_new_model.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Review comments

* Add contact email

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-27 17:15:13 +02:00
Vladislav Bronzov
9d200cfbee
Add gguf support for bloom (#33473)
* add bloom arch support for gguf

* apply format

* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming

* optimize bloom GGUF_TENSOR_MAPPING

* implement reverse reshaping for bloom gguf

* add qkv weights test

* add q_8 test for bloom
2024-09-27 12:13:40 +02:00
Raushan Turganbay
3e039d3827
Paligemma support for multi-image (#33447)
* update

* Update src/transformers/models/paligemma/processing_paligemma.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update docs

* better example in tests

* support image tokens

* read token

* Update tests/models/paligemma/test_processing_paligemma.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* nit: naming

* Update docs/source/en/model_doc/paligemma.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* conflicts after rebasing

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2024-09-27 11:23:14 +02:00
John B Nelson
55b7a0404e
Make siglip examples clearer and error free (#33667)
Update siglip.md

This was already partially fixed relative to the deployed docs. But the partial fix made it inconsistent. Additionally, giving the full text ("This is a photo of...") is likely not the desired output.
2024-09-27 10:33:55 +02:00
Yoni Gozlan
77b47e6645
Fix docs and docstrings Omdet-Turbo (#33726)
Fix weights path in docs
2024-09-26 12:18:23 -04:00
Franz Louis Cesista
0a21381ba3
Uniformize kwargs for chameleon processor (#32181)
* uniformize kwargs of Chameleon

* fix linter nit

* rm stride default

* add tests for chameleon processor

* fix tests

* add comment on get_component

* rm Chameleon's slow tokenizer

* add check order images text + nit

* update docs and tests

* Fix LlamaTokenizer tests

* fix gated repo access

* fix wrong import

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2024-09-26 10:18:07 -04:00
Andrés Marafioti
f2c388e3f9
Add Idefics 3! (#32473)
* Add Idefics 3!

* fixes to make both pipelines identical

* fix for quantized models

* First pass at the review

* remove vocab size from the main config (it's still in the text_config)

* hot fix for merve

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* re-add model_type for text_config

* remove support for old_cache

* remove hidden_size from main config

* rename idefics3 HF repo

* few changes suggested in the PR

* fix to input_data_format computation

* remove overwrite of _autoset_attn_implementation following @zucchini-nlp suggestion

* improve example

* few improvements from amy's review

* big change to enable processing input images as numpy arrays

* Changes to the code to uniformize processor kwargs

* image processing tests

* image processing tests fixes and some bugs they discovered

* addressed review comments from Yoni

* fix modeling tests

* remove special tokens that are not special

* fixes tests

* skip failing tests - they also fail for idefics2

* added paper and readded the tests with multi gpu, who knows

* Update docs/source/en/model_doc/idefics3.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* review amy until image_processing_idefics3

* last comments from Amy

* review amy

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/modeling_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/idefics3.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* doc improvement - amy review

* fix runtime error during fine-tuning

* amy's review

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/modeling_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* ruff

* amy's comment on the order

* ruff ruff

* fix copies

* square images when they are not split

* ruff :(

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics3/test_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix small bug introduced in refactor

* amy's image processing changes

* fixes peft tests and ruff

* modify to_pil_image from transformers. and review from emanuele.

* add modified to_pil_image

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-25 21:28:49 +02:00
Arthur
19d58d31f1
Add MLLama (#33703)
* current changes

* nit

* Add cross_attention_mask to processor

* multi-image fixed

* Add cross_attention_mask to processor

* cross attn works in all cases

* WIP refactoring function for image processor

* WIP refactoring image processor functions

* Refactor preprocess to use global loops instead of list nested list comps

* Docstrings

* Add channels unification

* fix dtype issues

* Update docstrings and format

* Consistent max_image_tiles

* current script

* updates

* Add convert to rgb

* Add image processor tests

* updates!

* update

* god damn it I am dumb sometimes

* Precompute aspect ratios

* now this works, full match

* fix 😉

* nits

* style

* fix model and conversion

* nit

* nit

* kinda works

* hack for sdpa non-contiguous bias

* nits here and there

* latest changes

* merge?

* run forward

* Add aspect_ratio_mask

* vision attention mask

* update script and config variable names

* nit

* nits

* be able to load

* style

* nits

* there

* nits

* make forward run

* small update

* enable generation multi-turn

* nit

* nit

* Clean up a bit for errors and typos

* A bit more constant fixes

* 90B keys and shapes match

* Fix for 11B model

* Fixup, remove debug part

* Docs

* Make max_aspect_ratio_id to be minimal

* Update image processing code to match new implementation

* Adjust conversion for final checkpoint state

* Change dim in repeat_interleave (according to meta code)

* tmp fix for num_tiles

* Fix for conversion (gate<->up, q/k_proj rope permute)

* nits

* codestyle

* Vision encoder fixes

* pass cross attn mask further

* Refactor aspect ratio mask

* Disable text-only generation

* Fix cross attention layers order, remove q/k norm rotation for cross attention layers

* Refactor gated position embeddings

* fix bugs but needs test with new weights

* rope scaling should be llama3

* Fix rope scaling name

* Remove debug for linear layer

* fix copies

* Make mask prepare private func

* Remove linear patch embed

* Make precomputed embeddings as nn.Embedding module

* MllamaPrecomputedAspectRatioEmbedding with config init

* Remove unused self.output_dim

* nit, intermediate layers

* Rename ln and pos_embed

* vision_chunk_size -> image_size

* return_intermediate -> intermediate_layers_indices

* vision_input_dim -> hidden_size

* Fix copied from statements

* fix most tests

* Fix more copied from

* layer_id->layer_idx

* Comment

* Fix tests for processor

* Copied from for _prepare_4d_causal_attention_mask_with_cache_position

* Style fix

* Add MllamaForCausalLM

* WIP fixing tests

* Remove duplicated layers

* Remove dummy file

* Fix style

* Fix consistency

* Fix some TODOs

* fix language_model instantiation, add docstring

* Move docstring, remove todos for precomputed embeds (we cannot init them properly)

* Add initial docstrings

* Fix

* fix some tests

* lets skip these

* nits, remove print, style

* Add one more copied from

* Improve test message

* Make validate func private

* Fix dummy objects

* Refactor `data_format` a bit + add comment

* typos/nits

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* fix dummy objects and imports

* Add chat template config json

* remove num_kv_heads from vision attention

* fix

* move some commits and add more tests

* fix test

* Remove `update_key_name` from modeling utils

* remove num-kv-heads again

* some preliminary docs

* Update chat template + tests

* nit, conversion script max_num_tiles from params

* Fix warning for text-only generation

* Update conversion script for instruct models

* Update chat template in conversion + test

* add tests for CausalLM model

* model_max_length, avoid null chat_template

* Refactor conversion script

* Fix forward

* Fix integration tests

* Refactor vision config + docs

* Fix default

* Refactor text config

* Doc fixes

* Remove unused args, fix docs example

* Squashed commit of the following:

commit b51ce5a2efffbecdefbf6fc92ee87372ec9d8830
Author: qubvel <qubvel@gmail.com>
Date:   Wed Sep 18 13:39:15 2024 +0000

    Move model + add output hidden states and output attentions

* Fix num_channels

* Add mllama text and mllama vision models

* Fixing repo consistency

* Style fix

* Fixing repo consistency

* Fixing unused config params

* Fix failed tests after refactoring

* hidden_activation -> hidden_act for text mlp

* Remove from_pretrained from sub-configs

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/mllama/convert_mllama_weights_to_hf.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Reuse lambda in conversion script

* Remove run.py

* Update docs/source/en/model_doc/mllama.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/mllama/processing_mllama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove unused LlamaTokenizerFast

* Fix logging

* Refactor gating

* Remove cycle for collecting intermediate states

* Refactor text-only check, add integration test for text-only

* Revert from pretrained to configs

* Fix example

* Add auto `bos_token` adding in processor

* Fix tips

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Enable supports_gradient_checkpointing model flag

* add eager/sdpa options

* don't skip attn tests and bring back GC skips (did i really remove those?)

* Fix signature, but get error with None gradient

* Fix output attention tests

* Disable GC back

* Change no split modules

* Fix dropout

* Style

* Add Mllama to sdpa list

* Add post init for vision model

* Refine config for MllamaForCausalLMModelTest and skipped tests for CausalLM model

* if skipped, say it, don't pass

* Clean vision tester config

* Doc for args

* Update tests/models/mllama/test_modeling_mllama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add cross_attention_mask to test

* typehint

* Remove todo

* Enable gradient checkpointing

* Docstring

* Style

* Fixing and skipping some tests for new cache

* Mark flaky test

* Skip `test_sdpa_can_compile_dynamic` test

* Fixing some offload tests

* Add direct GenerationMixin inheritance

* Remove unused code

* Add initializer_range to vision config

* update the test to make sure we show if split

* fix gc?

* Fix repo consistency

* Undo modeling utils debug changes

* Fix link

* mllama -> Mllama

* [mllama] -> [Mllama]

* Enable compile test for CausalLM model (text-only)

* Fix TextModel prefix

* Update doc

* Docs for forward, type hints, and vision model prefix

* make sure to reset

* fix init

* small script refactor and styling

* nit

* updates!

* some nits

* Interpolate embeddings for 560 size and update integration tests

* nit

* does not support static cache!

* update

* fix

* nit2

* this?

* Fix conversion

* Style

* 4x memory improvement with image cache AFAIK

* Token decorator for tests

* Skip failing tests

* update processor errors

* fix split issues

* style

* weird

* style

* fix failing tests

* update

* nit fixing the whisper tests

* fix path

* update

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: pavel <ubuntu@ip-10-90-0-11.ec2.internal>
Co-authored-by: qubvel <qubvel@gmail.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-09-25 19:56:25 +02:00
Yoni Gozlan
94f18cf23c
Add OmDet-Turbo (#31843)
* Add template with add-new-model-like

* Add rough OmDetTurboEncoder and OmDetTurboDecoder

* Add working OmDetTurbo convert to hf

* Change OmDetTurbo encoder to RT-DETR encoder

* Add swin timm backbone as default, add always partition fix for swin timm

* Add labels and tasks caching

* Fix make fix-copies

* Format omdet_turbo

* fix Tokenizer tests

* Fix style and quality

* Reformat omdet_turbo

* Fix quality, style, copies

* Standardize processor kwargs

* Fix style

* Add output_hidden_states and output_attentions

* Add personalize multi-head attention, improve docstrings

* Add integrated test and fix copy, style, quality

* Fix unprotected import

* Cleanup comments and fix unprotected imports

* Add fix different prompts in batch (key_padding_mask)

* Add key_padding_mask to custom multi-head attention module

* Replace attention_mask by key_padding_mask

* Remove OmDetTurboModel and refactor

* Refactor processing of classes and abstract use of timm backbone

* Add testing, fix output attentions and hidden states, add cache for anchors generation

* Fix copies, style, quality

* Add documentation, convert key_padding_mask to attention_mask

* revert changes to backbone_utils

* Fix docstrings rst

* Fix unused argument in config

* Fix image link documentation

* Reorder config and cleanup

* Add tokenizer_init_kwargs in merge_kwargs of the processor

* Change AutoTokenizer to CLIPTokenizer in convert

* Fix init_weights

* Add ProcessorMixin tests, Fix convert while waiting on uniform kwargs

* change processor kwargs and make task input optional

* Fix omdet docs

* Remove unnecessary tests for processor kwargs

* Replace nested BatchEncoding output of the processor by a flattened BatchFeature

* Make modifications from Pavel review

* Add changes Amy review

* Remove unused param

* Remove normalize_before param, Modify processor call docstring

* Remove redundant decoder class, add gradient checkpointing for decoder

* Remove commented out code

* Fix inference in fp16 and add fp16 integrated test

* update omdet md doc

* Add OmdetTurboModel

* fix caching and nit

* add OmDetTurboModel to tests

* nit change repeated key test

* Improve inference speed in eager mode

* fix copies

* Fix nit

* remove OmdetTurboModel

* [run-slow] omdet_turbo

* [run-slow] omdet_turbo

* skip dataparallel test

* [run-slow] omdet_turbo

* update weights to new path

* remove unnecessary config in class

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-91-248.ec2.internal>
2024-09-25 13:26:28 -04:00
Alan Kashkash
ade9e0fe41
Corrected max number for bf16 in transformer/docs (#33658)
Update perf_train_gpu_one.md

per issue https://github.com/huggingface/hub-docs/issues/1425 max number for bf16 should be 65,504 not 65,535
2024-09-25 19:20:51 +02:00
Isaac Schifferer
61e98cb957
Add SDPA support for M2M100 (#33309)
* Add SDPA support for M2M100

* [run_slow] m2m_100, nllb
2024-09-25 18:04:42 +01:00
Benjamin Fineran
574a9e12bb
HFQuantizer implementation for compressed-tensors library (#31704)
* Add compressed-tensors HFQuantizer implementation

* flag serializable as False

* run

* revive lines deleted by ruff

* fixes to load+save from sparseml, edit config to quantization_config, and load back

* address satrat comment

* compressed_tensors to compressed-tensors and revert back is_serializable

* rename quant_method from sparseml to compressed-tensors

* tests

* edit tests

* clean up tests

* make style

* cleanup

* cleanup

* add test skip for when compressed tensors is not installed

* remove pydantic import + style

* delay torch import in test

* initial docs

* update main init for compressed tensors config

* make fix-copies

* docstring

* remove fill_docstring

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* review comments

* review comments

* comments - suppress warnings on state dict load, tests, fixes

* bug-fix - remove unnecessary call to apply quant lifecycle

* run_compressed compatibility

* revert changes not needed for compression

* no longer need unexpected keys fn

* unexpected keys not needed either

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add to_diff_dict

* update docs and expand testing

* Update _toctree.yml with compressed-tensors

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update doc

* add note about saving a loaded model

---------

Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
2024-09-25 14:31:38 +02:00
NielsRogge
06e27e3dc0
[Pixtral] Improve docs, rename model (#33491)
* Improve docs, rename model

* Fix style

* Update repo id
2024-09-25 13:53:12 +02:00
Yoni Gozlan
5f0c181f4e
Uniformize kwargs for image-text-to-text processors (#32544)
* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* Fix imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino
2024-09-24 21:28:19 -04:00
Arthur
317e069ee7
Modular transformers: modularity and inheritance for new model additions (#33248)
* update example

* update

* push the converted diff files for testing and ci

* correct one example

* fix class attributes and docstring

* nits

* oups

* fixed config!

* update

* nitd

* class attributes are not matched against the other, this is missing

* fixed overwriting self.xxx now onto the attributes I think

* partial fix, now order with docstring

* fix docstring order?

* more fixes

* update

* fix missing docstrings!

* examples don't all work yet

* fixup

* nit

* updated

* hick

* update

* delete

* update

* update

* update

* fix

* all default

* no local import

* fix more diff

* some fix related to "safe imports"

* push fixed

* add helper!

* style

* add a check

* all by default

* add the

* update

* FINALLY!

* nit

* fix config dependencies

* man that is it

* fix fix

* update diffs

* fix the last issue

* re-default to all

* all the fixes

* nice

* fix properties vs setter

* fixup

* updates

* update dependencies

* make sure to install what needs to be installed

* fixup

* quick fix for now

* fix!

* fixup

* update

* update

* updates

* whitespaces

* nit

* fix

* simplify everything, and make it file agnostic (should work for image processors)

* style

* finish fixing all import issues

* fixup

* empty modeling should not be written!

* Add logic to find who depends on what

* update

* cleanup

* update

* update gemma to support positions

* some small nits

* this is the correct docstring for gemma2

* fix merging of docstrings

* update

* fixup

* update

* take doc into account

* styling

* update

* fix hidden activation

* more fixes

* final fixes!

* fixup

* fixup instruct blip video

* update

* fix bugs

* align gemma2 with the rest as well

* updates

* revert

* update

* more reversion

* grind

* more

* arf

* update

* order will matter

* finish del stuff

* update

* rename to modular

* fixup

* nits

* update makefile

* fixup

* update order of the checks!

* fix

* fix docstring that has a call inside

* fix conversion check

* style

* add some initial documentation

* update

* update doc

* some fixup

* updates

* yups

* Mostly todo gimme a minute

* update

* fixup

* revert some stuff

* Review docs for the modular transformers (#33472)

Docs

* good update

* fixup

* mmm current updates lead to this code

* okay, this fixes it

* cool

* fixes

* update

* nit

* updates

* nits

* fix doc

* update

* revert bad changes

* update

* updates

* proper update

* update

* update?

* up

* update

* cool

* nits

* nits

* bon bon

* fix

* ?

* minimise changes

* update

* update

* update

* updates?

* fixed gemma2

* kind of a hack

* nits

* update

* remove `diffs` in favor of `modular`

* fix make fix copies

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-09-24 15:54:07 +02:00
jiqing-feng
11c27dd331
Enable BNB multi-backend support (#31098)
* enable cpu bnb path

* fix style

* fix code style

* fix 4 bit path

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* add multi backend refactor tests

* fix style

* tweak 4bit quantizer + fix corresponding tests

* tweak 8bit quantizer + *try* fixing corresponding tests

* fix dequant bnb 8bit

* account for Intel CPU in variability of expected outputs

* enable cpu and xpu device map

* further tweaks to account for Intel CPU

* fix autocast to work with both cpu + cuda

* fix comments

* fix comments

* switch to testing_utils.torch_device

* allow for xpu in multi-gpu tests

* fix tests 4bit for CPU NF4

* fix bug with is_torch_xpu_available needing to be called as func

* avoid issue where test reports attr err due to other failure

* fix formatting

* fix typo from resolving of merge conflict

* polish based on last PR review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix CI

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix error log

* fix error msg

* add \n in error log

* make quality

* rm bnb cuda restriction in doc

* cpu models don't need dispatch

* fix doc

* fix style

* check cuda available in testing

* fix tests

* Update docs/source/en/model_doc/chameleon.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix doc

* fix check multibackends

* fix import sort

* remove check torch in bnb

* docs: update bitsandbytes references with multi-backend info

* docs: fix small mistakes in bnb paragraph

* run formatting

* revert bnb check

* move bnb multi-backend check to import_utils

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix bnb check

* minor fix for bnb

* check lib first

* fix code style

* Revert "run formatting"

This reverts commit ac108c6d6b.

* fix format

* give warning when bnb version is low and no cuda found

* fix device assignment check to be multi-device capable

* address akx feedback on get_avlbl_dev fn

* revert partially, as we don't want the function that public, as docs would be too much (enforced)

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-24 03:40:56 -06:00
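
One bullet in the commit above mentions a bug where `is_torch_xpu_available` needed to be called as a function. The snippet below is a minimal sketch of that class of bug, not the actual patch: `is_xpu_available`, `pick_device_buggy` and `pick_device_fixed` are hypothetical stand-ins invented for illustration, and no real library helper is used.

```python
def is_xpu_available() -> bool:
    """Hypothetical stand-in for an availability check; always reports no device."""
    return False


def pick_device_buggy() -> str:
    # Bug: the function object itself is truthy, so this branch is always taken
    # regardless of what the check would return.
    if is_xpu_available:
        return "xpu"
    return "cpu"


def pick_device_fixed() -> str:
    # Fix: call the function so the condition reflects its return value.
    if is_xpu_available():
        return "xpu"
    return "cpu"


print(pick_device_buggy())  # xpu
print(pick_device_fixed())  # cpu
```

Referencing the function without parentheses never raises an error, which is why this kind of mistake tends to surface only on machines where the device is actually absent.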
Avishai Elmakies
78b2929c05
Sdpa dino v2 (#33403)
* add sdpa to dinov2

* fixup

* add dinov2 to sdpa doc

* update doc order

* [run-slow] dinov2

* common to eager

* [run-slow] dinov2

* update attn implementation in common

* update test_modeling_dinov2 to have mask_ratio, num_masks and mask_length similar to vit

* [run-slow] dinov2

---------

Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
2024-09-21 01:58:00 +01:00
amyeroberts
e71bf70e33
Pixtral update example checkpoint (#33633)
* Update pixtral example checkpoint

* Fix typo
2024-09-21 01:01:16 +01:00
Mayank Mishra
e472e077c2
Granitemoe (#33207)
* first commit

* drop tokenizer

* drop tokenizer

* drop tokenizer

* drop convert

* granite

* drop tokenization test

* mup

* fix

* reformat

* reformat

* reformat

* fix docs

* stop checking for checkpoint

* update support

* attention multiplier

* update model

* tiny drop

* saibo drop

* skip test

* fix test

* fix test

* drop

* drop useless imports

* update docs

* drop flash function

* copied from

* drop pretraining tp

* drop pretraining tp

* drop pretraining tp

* drop unused import

* drop code path

* change name

* softmax scale

* head dim

* drop legacy cache

* rename params

* cleanup

* fix copies

* comments

* add back legacy cache

* multipliers

* multipliers

* multipliers

* text fix

* fix copies

* merge

* multipliers

* attention multiplier

* drop unused imports

* add granitemoe

* add decoration

* remove moe from sequenceclassification

* fix test

* fix

* fix

* fix

* move rope?

* merge

* drop bias

* drop bias

* Update src/transformers/models/granite/configuration_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* Update src/transformers/models/granite/modeling_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix

* fix

* fix

* drop

* drop

* fix

* fix

* cleanup

* cleanup

* fix

* fix granite tests

* fp32 test

* fix

* drop jitter

* fix

* rename

* rename

* fix config

* add gen test

---------

Co-authored-by: Yikang Shen <yikang.shn@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-21 01:43:50 +02:00
Omar Salman
653eb40425
Add sdpa for BioGpt (#33592)
* Add sdpa for BioGpt

* Updates

* Add the docs

* [run_slow] biogpt

* Use the copy mechanism to ensure consistency

* [run_slow] biogpt
2024-09-20 14:27:32 +01:00
Yoni Gozlan
f111d5b783
Uniformize kwargs for Paligemma processor and update docs (#33571)
* Uniformize paligemma processor

* nit
2024-09-19 14:14:06 -04:00
Joao Gante
80b774eb29
Cache: don't show warning in forward passes when past_key_values is None (#33541) 2024-09-19 12:02:46 +01:00
Yoach Lacombe
5af7d41e49
Codec integration (#33565)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* rename repo id + change readme

* Update docs/source/en/model_doc/mimi.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add flaky flag to batching equivalence due to audio_codes failing sometimes

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-18 19:23:44 +02:00
Aymeric Roucher
e6d9f39dd7
Decorator for easier tool building (#33439)
* Decorator for tool building
2024-09-18 11:07:51 +02:00
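
The commit above adds a decorator for building tools. As a rough illustration of the general idea only, the sketch below shows how a decorator can turn a plain function into a tool object by reading its signature and docstring. `SimpleTool` and `simple_tool` are hypothetical names made up for this example and are not the API introduced by the PR.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class SimpleTool:
    """Hypothetical container for a tool; not the class used by the library."""

    name: str
    description: str
    inputs: Dict[str, str]
    func: Callable[..., Any]

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Calling the tool simply forwards to the wrapped function.
        return self.func(*args, **kwargs)


def simple_tool(func: Callable[..., Any]) -> SimpleTool:
    """Hypothetical decorator: derive tool metadata from the function itself."""
    inputs: Dict[str, str] = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            inputs[name] = "any"
        else:
            inputs[name] = getattr(annotation, "__name__", str(annotation))
    return SimpleTool(
        name=func.__name__,
        description=inspect.getdoc(func) or "",
        inputs=inputs,
        func=func,
    )


@simple_tool
def add(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b


print(add.name, add.inputs, add(2, 3))
```

Deriving the name, description and input types from the function keeps the metadata next to the code it describes, so the two are less likely to drift apart.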
Yoni Gozlan
d8500cd229
Uniformize kwargs for Pixtral processor (#33521)
* add uniformized pixtral and kwargs

* update doc

* fix _validate_images_text_input_order

* nit
2024-09-17 14:44:27 -04:00
Antoine Dussolle
763548427d
Add explicit example for RAG chat templating (#33503)
* Add explicit example for RAG chat templating

* Add Tip box and reformulate

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-09-17 16:08:05 +01:00
Max Buckley
ac5a0556f1
Update chameleon.md — fix runtime type error (#33494)
Update chameleon.md

Fix error

RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
2024-09-17 13:32:49 +02:00
Ahmed Almaghz
c2d05897bf
[i18n-ar] Add File : docs/source/ar/_toctree.yml (#32696)
* Update ar lang build_documentation.yml

* Update ar lang build_pr_documentation.yml

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/pipeline_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/autoclass_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/preprocessing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/training.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/run_scripts.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/run_scripts.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/run_scripts.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/run_scripts.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/run_scripts.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/run_scripts.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/run_scripts.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/accelerate.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/accelerate.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/accelerate.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/accelerate.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/accelerate.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/accelerate.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Create _config.py

* Update _toctree.yml

* Update _toctree.yml

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/peft.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/model_sharing.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/conversations.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/agents.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/llm_tutorial.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update llm_tutorial.md

* Update _toctree.yml

* Update autoclass_tutorial.md

* Update autoclass_tutorial.md

* Update preprocessing.md

* Update glossary.md

* Update run_scripts.md

* Update run_scripts.md

* Update run_scripts.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-09-16 10:02:03 -07:00
Sergio Paniego Blanco
c7a91f5adf
Agents, supercharged - Multi-agents, External tools, and more docs typo fixed (#33478)
* Typo fixed in Agents, supercharged
2024-09-16 18:52:27 +02:00
Merve Noyan
ce62a41880
Add keypoint-detection task guide (#33274)
---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-09-16 13:08:31 +02:00
Arthur
8bd2b1e8c2
Add support for Pixtral (#33449)
* initial commit

* gloups

* updates

* work

* weights match

* nits

* nits

* updates to support the tokenizer :)

* updates

* Pixtral processor (#33454)

* rough outline

* Add in image break and end tokens

* Fix

* Undo some formatting changes

* Set patch_size default

* Fix

* Fix token expansion

* nit in conversion script

* Fix image token list creation

* done

* add expected results

* Process list of list of images (#33465)

* updates

* working image and processor

* this is the expected format

* some fixes

* push current updated

* working mult images!

* add a small integration test

* Update configuration docstring

* Formatting

* Config docstring fix

* simplify model test

* fixup modeling and tests

* Return BatchMixFeature in image processor

* fix some copies

* update

* nits

* Update model docstring

* Apply suggestions from code review

* Fix up

* updates

* revert modeling changes

* update

* update

* fix load safe

* add license

* update

* use pixel_values as required by the model

* skip some tests and refactor

* Add pixtral image processing tests (#33476)

* Image processing tests

* Add processing tests

* woops

* defaults reflect pixtral image processor

* fixup post merge

* images -> pixel values

* oups sorry Mr docbuilder

* isort

* fix

* fix processor tests

* small fixes

* nit

* update

* last nits

* oups this was really breaking!

* nits

* is composition needs to be true

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-14 12:28:39 +02:00
Sergio Paniego Blanco
e39b6c1c7c
Corrected Agents and tools documentation links typos (#33471)
* Corrected agents task link typo

* Corrected chat templating link

* Corrected chat templating link 2
2024-09-13 17:15:20 +02:00
Fanli Lin
a05ce550bf
[docs] refine the doc for train with a script (#33423)
* add xpu note

* add one more case

* add more

* Update docs/source/en/run_scripts.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-09-12 10:16:12 -07:00
Raushan Turganbay
2f611d30d9
Qwen2-VL: clean-up and add more tests (#33354)
* clean-up on qwen2-vl and add generation tests

* add video tests

* Update tests/models/qwen2_vl/test_processing_qwen2_vl.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix and add better tests

* Update src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update docs and address comments

* Update docs/source/en/model_doc/qwen2_vl.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_vl.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update

* remove size at all

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-12 18:24:04 +02:00
Sergio Paniego Blanco
516ee6adc2
Fix incomplete sentence in Zero-shot object detection documentation (#33430)
Rephrase sentence in zero-shot object detection docs
2024-09-12 11:25:44 +02:00
Michael Currin
e0ff4321d1
Docs - update formatting of llama3 model card (#33438)
update formatting of llama3 content
2024-09-12 11:24:56 +02:00
Fanli Lin
cea9ec086a
[docs] add the missing tokenizer when pushing models to huggingface hub (#33428)
* add tokenizer

* typo
2024-09-11 09:56:55 -07:00
Fanli Lin
c403441339
[docs] add the missing huggingface hub username (#33431)
* add username

* update username

* add username
2024-09-11 09:56:40 -07:00
Guang Yang
f38590dade
Make StaticCache configurable at model construct time (#32830)
* Make StaticCache configurable at model construct time

* integrations import structure

* add new doc file to toc

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-09-10 16:35:57 +01:00
Alazar
96429e74a8
Add support for GGUF Phi-3 (#31844)
* Update docs for GGUF supported models

* Add tensor mappings and define class GGUFPhi3Converter

* Fix tokenizer

* Working version

* Attempt to fix some CI failures

* Run ruff format

* Add vocab, merges, decoder methods like LlamaConverter

* Resolve conflicts since Qwen2Moe was added to gguf

- I missed one place when resolving conflict
- I also made a mistake with tests_ggml.py, which has now been fixed to reflect
its master version.
2024-09-10 13:32:38 +02:00
Nilay Bhatnagar
eedd21b9e7
Fixed Majority of the Typos in transformers[en] Documentation (#33350)
* Fixed typo: insted to instead

* Fixed typo: relase to release

* Fixed typo: nighlty to nightly

* Fixed typos: versatible, benchamarks, becnhmark to versatile, benchmark, benchmarks

* Fixed typo in comment: quantizd to quantized

* Fixed typo: architecutre to architecture

* Fixed typo: contibution to contribution

* Fixed typo: Presequities to Prerequisites

* Fixed typo: faste to faster

* Fixed typo: extendeding to extending

* Fixed typo: segmetantion_maps to segmentation_maps

* Fixed typo: Alternativelly to Alternatively

* Fixed incorrectly defined variable: output to output_disabled

* Fixed typo in library name: tranformers.onnx to transformers.onnx

* Fixed missing import: import tensorflow as tf

* Fixed incorrectly defined variable: token_tensor to tokens_tensor

* Fixed missing import: import torch

* Fixed incorrectly defined variable and typo: uromaize to uromanize

* Fixed incorrectly defined variable and typo: uromaize to uromanize

* Fixed typo in function args: numpy.ndarry to numpy.ndarray

* Fixed Inconsistent Library Name: Torchscript to TorchScript

* Fixed Inconsistent Class Name: OneformerProcessor to OneFormerProcessor

* Fixed Inconsistent Class Named Typo: TFLNetForMultipleChoice to TFXLNetForMultipleChoice

* Fixed Inconsistent Library Name Typo: Pytorch to PyTorch

* Fixed Inconsistent Function Name Typo: captureWarning to captureWarnings

* Fixed Inconsistent Library Name Typo: Pytorch to PyTorch

* Fixed Inconsistent Class Name Typo: TrainingArgument to TrainingArguments

* Fixed Inconsistent Model Name Typo: Swin2R to Swin2SR

* Fixed Inconsistent Model Name Typo: EART to BERT

* Fixed Inconsistent Library Name Typo: TensorFLow to TensorFlow

* Fixed Broken Link for Speech Emotion Classification with Wav2Vec2

* Fixed minor missing word Typo

* Fixed minor missing word Typo

* Fixed minor missing word Typo

* Fixed minor missing word Typo

* Fixed minor missing word Typo

* Fixed minor missing word Typo

* Fixed minor missing word Typo

* Fixed minor missing word Typo

* Fixed Punctuation: Two commas

* Fixed Punctuation: No Space between XLM-R and is

* Fixed Punctuation: No Space between [~accelerate.Accelerator.backward] and method

* Added backticks to display model.fit() in codeblock

* Added backticks to display openai-community/gpt2 in codeblock

* Fixed Minor Typo: will to with

* Fixed Minor Typo: is to are

* Fixed Minor Typo: in to on

* Fixed Minor Typo: inhibits to exhibits

* Fixed Minor Typo: they need to it needs

* Fixed Minor Typo: cast the load the checkpoints To load the checkpoints

* Fixed Inconsistent Class Name Typo: TFCamembertForCasualLM to TFCamembertForCausalLM

* Fixed typo in attribute name: outputs.last_hidden_states to outputs.last_hidden_state

* Added missing verbosity level: fatal

* Fixed Minor Typo: take To takes

* Fixed Minor Typo: heuristic To heuristics

* Fixed Minor Typo: setting To settings

* Fixed Minor Typo: Content To Contents

* Fixed Minor Typo: millions To million

* Fixed Minor Typo: difference To differences

* Fixed Minor Typo: while extract To which extracts

* Fixed Minor Typo: Hereby To Here

* Fixed Minor Typo: addition To additional

* Fixed Minor Typo: supports To supported

* Fixed Minor Typo: so that benchmark results TO as a consequence, benchmark

* Fixed Minor Typo: a To an

* Fixed Minor Typo: a To an

* Fixed Minor Typo: Chain-of-though To Chain-of-thought
2024-09-09 10:47:24 +02:00
Aymeric Roucher
489cbfd6d3
Add visit webpage tool (#33353)
* Add VisitWebpageTool
2024-09-09 10:32:42 +02:00
Wing Lian
62aecd85ff
schedulefree optimizers (#30079)
* schedulefree optimizers

* fix train instead of eval for optimizer

* fixes and update docs

* chore: lint

* add tests and drop overly-verbose _32bit suffix

* chore: lint

* fix for docs

* fix code review issues

* use duck-typing to avoid per-optimizer patches

* fixup style

* fixup style

* warn if incorrect accelerate version with schedule free

Co-authored-by: Aman Gupta Karmani <aman@tmm1.net>

---------

Co-authored-by: Aman Karmani <aman@tmm1.net>
2024-09-09 09:51:39 +02:00
Nicholas Broad
66bc4def95
add sdpa mbart (#32033)
* add sdpa mbart

useful for donut

* update sdpa docs

* formatting

* add self._use_sdpa in mbartencoder

* use self.config to check attn

* retrigger checks

* [run-slow] mbart
2024-09-06 17:31:24 -07:00
Daniel Lok
a70286f827
Update author for QLoRA/PEFT community notebook (#33338)
update author

Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
2024-09-06 22:50:26 +02:00
Matt
d7b04ea14d
Fix Prefill docs (#33352)
last -> final
2024-09-06 17:57:54 +01:00
Ita Zaporozhets
e48e5f1f13
Support reading tiktoken tokenizer.model file (#31656)
* use existing TikTokenConverter to read tiktoken tokenizer.model file

* del test file

* create tiktoken integration file

* adding tiktoken llama test

* ALTERNATIVE IMPLEMENTATION: supports llama 405B

* fix one char

* remove redundant line

* small fix

* rm unused import

* flag for converting from tiktoken

* remove unneeded file

* ruff

* remove llamatiktokenconverter, stick to general converter

* tiktoken support v2

* update test

* remove stale changes

* update doc

* protect import

* use is_protobuf_available

* add templateprocessor in tiktokenconverter

* reverting templateprocessor from tiktoken support

* update test

* add require_tiktoken

* dev-ci

* trigger build

* trigger build again

* dev-ci

* [build-ci-image] tiktoken

* dev-ci

* dev-ci

* dev-ci

* dev-ci

* change tiktoken file name

* feedback review

* feedback rev

* applying feedback, removing tiktoken converters

* conform test

* adding docs for review

* add doc file for review

* add doc file for review

* add doc file for review

* support loading model without config.json file

* Revert "support loading model without config.json file"

This reverts commit 2753602e51c34cef2f184eb11f36d2ad1b02babb.

* remove dev var

* updating docs

* safely import protobuf

* fix protobuf import error

* fix protobuf import error

* trying isort to fix ruff error

* fix ruff error

* try to fix ruff again

* try to fix ruff again

* try to fix ruff again

* doc table of contents

* add fix for consistency.dockerfile torchaudio

* ruff

* applying feedback

* minor typo

* merging with push-ci-image

* clean up imports

* revert dockerfile consistency
2024-09-06 14:24:02 +02:00
Joao Gante
2b789f27f3
Docs: add more cross-references to the KV cache docs (#33323)
* add more cross-references

* nit

* import guard

* more import guards

* nit

* Update src/transformers/generation/configuration_utils.py
2024-09-06 10:22:00 +01:00
Daniel Lok
5792c459ed
Add a community notebook for fine-tuning with QLoRA, PEFT, and MLflow (#33319)
add notebook for finetuning with mlflow

Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
2024-09-06 09:35:01 +02:00
Vladislav Bronzov
5d11de4a2f
Add Qwen2Moe GGUF loading support (#33264)
* update gguf doc, config and tensor mapping

* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests

* apply code style fixes

* reformat files

* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
Niklas Muennighoff
03164ba14e
Add paper link (#33305)
2024-09-05 15:49:28 +02:00
Raushan Turganbay
43df47d8e7
Llava Onevision: add model (#32673)
* working version

* fix copies

* update

* tests

* update docs

* codestyle

* add more tests

* add returns for docs

* clean up

* Update src/transformers/models/llava_onevision/processing_llava_onevision.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* updates

* codestyle

* style

* shouldn't be reversed

* [run-slow] llava_onevision

* [run-slow] llava_onevision

* add pooling in videos

* [run-slow] llava_onevision

* num-logits-to-keep

* [run-slow] llava_onevision

* [run-slow] llava_onevision

* Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* video matched orig impl

* fix tests

* chat template was modified

* Update docs/source/en/model_doc/llava_onevision.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add more info in the doc page

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-05 14:43:20 +05:00
Aymeric Roucher
cfd92c64f5
Add new documentation page for advanced agent usage (#33265)
* Add new documentation page for advanced agent usage
2024-09-04 18:19:54 +02:00
Matt
01c8c6c419
Add a warning to the chat template docs about the tool_calls format (#33277)
* Add a warning to the chat template docs

* Add a warning to the chat template docs

* Add a warning to the chat template docs
2024-09-04 17:13:34 +01:00
Raushan Turganbay
ebbe8d8014
Cache docs: update (#32929)
* some changes

* more updates

* fix cache copy

* nits

* nits

* add tests
2024-09-04 15:05:31 +05:00
Niklas Muennighoff
ecd61c6286
Add OLMoE (#32406)
* Add OLMoE

* Add OLMoE

* Updates

* Make norm optional; add keys

* Add output

* Add

* Fix dtype

* Fix eos config

* Update

* Add OLMoE

* Fix OLMoE path

* Format

* Format

* Rmv copy statement

* Rmv copy statement

* Format

* Add copies

* Cp rotary

* Fix naming

* Fix naming

* Update RoPE integration; num_logits_to_keep; Add copy statements

* Add eps to config

* Format

* Add aux loss

* Adapt router_aux_loss_coef

* Update md

* Adapt

* adapt tests
2024-09-03 18:43:12 +02:00
Omar Salman
03c12d0d63
Add sdpa support for Albert (#32092)
* Add sdpa support for Albert

* [run_slow] albert

* Add benchmarks and PR suggestion

* Fix quality

* Fix

* [run_slow] albert
2024-09-03 14:01:00 +01:00
Matt
0d86727354
Update chat template docs to remove Blenderbot (#33254)
* Update docs to remove obsolete Blenderbot

* Remove another reference to Blenderbot
2024-09-03 12:18:04 +01:00
Isotr0py
edeca4387c
🚨 Support dequantization for most GGML types (#32625)
* use gguf internal dequantize

* add Q5_0 test

* add iq1 test

* add remained test

* remove duplicated test

* update docs

* add gguf version limit

* make style

* update gguf import catch

* revert vocab_size patch

* make style

* use GGUF_MIN_VERSION everywhere
2024-09-03 12:58:14 +02:00
Sergio Paniego Blanco
28952248b1
Fixed typo repeated word in DETR docs (#33250)
2024-09-02 17:19:18 +02:00
Matt
52a0213755
Add assistant prefill for chat templates and TextGenerationPipeline (#33198)
* Add assistant prefill to chat templates

* Add assistant prefill to pipeline

* Add assistant prefill to pipeline

* Tweak another test that ended in assistant message

* Update tests that ended in assistant messages

* Update tests that ended in assistant messages

* Replace assistant_prefill with continue_final_message

* Allow passing continue_final_message to pipeline

* Small fixup

* Add continue_final_message as a pipeline kwarg

* Update docstrings

* Move repos to hf-internal-testing!

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Add explanatory comment

* make fixup

* Update chat templating docs to explain continue_last_message

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-09-02 13:23:47 +01:00
Aymeric Roucher
1ca9ff5c91
Add duckduckgo search tool (#32882)
* Add duckduckgo search tool
2024-09-02 09:56:20 +02:00
Merve Noyan
2e3f8f7474
Add video text to text docs (#33164)
---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-09-01 12:06:31 +03:00
Yijun Lee
db70426854
🌐 [i18n-KO] Translated llm_optims.md to Korean (#32325)
* docs: ko: llm_optims.md

* feat: nmt draft

* fix toc title

* fix: manual edits

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com>

* Update docs/source/ko/llm_optims.md

Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com>

* Update llm_optims.md

* fix: resolve suggestions

* fix: resolve suggestions

* Apply suggestions from code review

fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com>
2024-08-30 09:52:41 -07:00
Aymeric Roucher
c79bfc71b8
Create local Transformers Engine (#33218)
* Create local Transformers Engine
2024-08-30 18:22:27 +02:00
Gerben van V
5129671290
Add a static cache that offloads to the CPU or other device (#32161)
* Add a static cache that offloads to the CPU or other device

* Fix PR comments, add unit-tests
2024-08-29 11:51:09 +02:00
JB (Don)
f1a385b1de
[RoBERTa-based] Add support for sdpa (#30510)
* Adding SDPA support for RoBERTa-based models

* add not is_cross_attention

* fix copies

* fix test

* add minimal test for camembert and xlm_roberta as their test class does not inherit from ModelTesterMixin

* address some review comments

* use copied from

* style

* consistency

* fix lists

---------

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-28 10:26:00 +02:00
Mayank Mishra
c35d2ccf5a
Granite language models (#31502)
* first commit

* drop tokenizer

* drop tokenizer

* drop tokenizer

* drop convert

* granite

* drop tokenization test

* mup

* fix

* reformat

* reformat

* reformat

* fix docs

* stop checking for checkpoint

* update support

* attention multiplier

* update model

* tiny drop

* saibo drop

* skip test

* fix test

* fix test

* drop

* drop useless imports

* update docs

* drop flash function

* copied from

* drop pretraining tp

* drop pretraining tp

* drop pretraining tp

* drop unused import

* drop code path

* change name

* softmax scale

* head dim

* drop legacy cache

* rename params

* cleanup

* fix copies

* comments

* add back legacy cache

* multipliers

* multipliers

* multipliers

* text fix

* fix copies

* merge

* multipliers

* attention multiplier

* drop unused imports

* fix

* fix

* fix

* move rope?

* Update src/transformers/models/granite/configuration_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* Update src/transformers/models/granite/modeling_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix

* fix

* fix

* fix-copies

* torch rmsnorm

* add authors

* change model path

* fix

* test

* drop static cache test

* update readme

* drop non-causal

* readme

* drop useless imports

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-27 21:27:21 +02:00
Juan Pizarro
7591ca5bc5
🚨 Add Blip2ForImageTextRetrieval (#29261)
* add Blip2ForImageTextRetrieval

* use one line and remove unnecessary space in tests

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* use  value from the config, rather than hardcoded

* change order of params in Blip2QFormerModel.forward

* update docstring

* fix style

* update test_inference_opt

* move embeddings out of Blip2QFormerModel

* remove from_vision_qformer_configs

* remove autocast float16 in Blip2QFormerModel

* rename fields into vision_projection, text_projection, use_image_text_matching_head

* use CLIPOutput for  Blip2ImageTextMatchingModelOutput

* remove past_key_values_length from Blip2TextEmbeddings

* fix small typo in the CLIPOutput docstring

* add Blip2ForImageTextRetrieval to Zero Shot Image Classification mapping

* update docstring and add require_torch_fp16

* rollback test_inference_opt

* use use_image_text_matching_head=True in convert

* skip test_model_get_set_embeddings

* fix create_rename_keys error on new itm fields

* revert to do  scale after dot product between "query" and "key"

* fix ValueError on convert script for blip2-opt-2.7b

* update org of paths to Salesforce

* add is_pipeline_test_to_skip for VisualQuestionAnsweringPipelineTests

* [run_slow] blip_2

* removed Blip2ForImageTextRetrieval from IGNORE_NON_AUTO_CONFIGURED

* fix docstring of Blip2ImageTextMatchingModelOutput

* [run_slow] blip_2

* fix multi-gpu tests

* [run_slow] blip_2

* [run_slow] blip_2

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-27 18:50:27 +01:00
Ali Salamatian
27903de7ec
Very small change to one of the function parameters (#32548)
Very small change to one of the parameters

The second parameter of np.random.randint (the upper bound) is exclusive, so it is never returned. Therefore, we want the upper bound to be 2, so that we have some 1 labels in our classification as well.
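
A minimal sketch of the exclusive upper bound described above, assuming NumPy is installed. The variable names, the seed, and the sample size of 1000 are illustrative and not taken from the changed file:

```python
import numpy as np

# Fix the seed so the draws are repeatable (illustrative choice).
np.random.seed(0)

# The upper bound is exclusive: randint(0, 1) can only ever return 0.
only_zeros = np.random.randint(0, 1, size=1000)

# With an upper bound of 2, values are drawn from {0, 1},
# so a binary classification example gets both labels.
binary_labels = np.random.randint(0, 2, size=1000)

print(sorted(set(only_zeros.tolist())))
print(sorted(set(binary_labels.tolist())))
```

With an upper bound of 1 every generated label is 0, which is why the parameter was raised to 2.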
2024-08-27 09:29:05 -07:00
Sae_Chan_Oh
6101d934a1
🌐 [i18n-KO] Translated conversations.md to Korean (#32468)
* docs: ko: conversations.md

* feat: hand-crafted translate docs

* fix: modify typo after Grammar Check

* Update docs/source/ko/conversations.md

Thank you

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* fix: accept suggestions about anchor and spacing

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/conversations.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* fix: remove the question mark from the anchor 'what happened inside pipeline?'

* fix: translate the comments in the code block

---------

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
2024-08-27 09:25:41 -07:00
Vaibhav Srivastav
6f0ecf1049
[docs] add quick usage snippet to Whisper. (#31289)
* [docs] add quick usage snippet to Whisper.

* Apply suggestions from review.

* 💉 Fix the device for pipeline.
2024-08-27 14:11:52 +02:00
Pablo Montalvo
26f043bd4d
quickfix documentation (#32566)
* fix documentation

* update config
2024-08-26 17:49:44 +02:00
Ritik Nandwal
a378a54a57
Add changes for uroman package to handle non-Roman characters (#32404)
* Add changes for uroman package to handle non-Roman characters

* Update docs for uroman changes

* Modifying error message to warning, for backward compatibility

* Update instruction for user to install uroman

* Update docs for uroman python version dependency and backward compatibility

* Update warning message for python version compatibility with uroman

* Refine docs
2024-08-26 17:07:01 +02:00
Shijie
19e6e80e10
support qwen2-vl (#32318)
* support-qwen2-vl

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* tidy

* hyphen->underscore

* make style

* add-flash2-tipd

* delete-tokenize=False

* remove-image_processor-in-init-file

* add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES

* format-doct

* support-Qwen2VLVisionConfig

* remove-standardize_cache_format

* fix-letter-variables

* remove-torch-in-image-processor

* remove-useless-docstring

* fix-one-letter-variable-name

* change-block-name

* default-quick-gelu-in-vision

* remove-useless-doc

* use-preimplemented-flash-forward

* fix-doc

* fix-image-processing-doc

* fix-apply-rotary-embed

* fix-flash-attn-sliding-window

* refactor

* remove-default_template

* remove-reorder_cache

* simple-get-rope_deltas

* update-prepare_inputs_for_generation

* update-attention-mask

* update-rotary_seq_len

* remove-state

* kv_seq_length

* remove-warning

* _supports_static_cache

* remove-legacy-cache

* refactor

* fix-replace

* mrope-section-doc

* code-quality

* code-quality

* polish-doc

* fix-image-processing-test

* update readme

* Update qwen2_vl.md

* fix-test

* Update qwen2_vl.md

* nit

* processor-kwargs

* hard-code-norm_layer

* code-quality

* discard-pixel-values-in-gen

* fix-inconsistent-error-msg

* unify-image-video

* hidden_act

* add-docstring

* vision-encode-as-PreTrainedModel

* pixel-to-target-dtype

* update doc and low memoryvit

* format

* format

* channel-format

* fix vit_flashatt

* format

* inherit-Qwen2VLPreTrainedModel

* simplify

* format-test

* remove-one-line-func-in-image-processing

* avoid-one-line-reshape

* simplify-rotary_seq_len

* avoid-single-letter-variable

* no-for-loop-sdpa

* avoid-single-letter-variable

* remove-one-line-reshape

* remove-one-line-reshape

* remove-no-rope-in-vit-logic

* default-mrope

* add-copied-from

* more-docs-for-mrope

* polish-doc

* comment-and-link

* polish-doc

* single-letter-variables

* simplify-image-processing

* video->images

* kv_seq_len-update

* vision-rope-on-the-fly

* vision-eager-attention

* change-processor-order

---------

Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com>
Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
2024-08-26 15:16:44 +02:00
S M Jishanul Islam
8defc95df3
Updated custom_models.md: changed cross_entropy code (#33118) 2024-08-26 13:15:43 +02:00
Matt
0a7af19f4d
Update Jinja docs with new functions and general cleanup (#33097) 2024-08-23 17:40:06 +01:00
Jason (Siyu) Zhu
adb91179b9
Integrate Liger (Linkedin GPU Efficient Runtime) Kernel to Trainer (#32860)
* add liger integration

* fix syntax

* fix import issue

* add trainer.md

* Use _apply_liger_kernel()

* Fixed log message

* Update docs/source/en/trainer.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/trainer.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>

* Update docs/source/en/trainer.md

Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>

* Fixed checkstyle and updated readme

* Added test

* Fixed checkstyle

* fix docstring

* rename use_liger to use_liger_kernel

* Trigger Build

* Added test

* add fix-copies

* Fixed copy inconsistencies

---------

Co-authored-by: shimizust <sshimizu@linkedin.com>
Co-authored-by: Steven Shimizu <shimizust@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
2024-08-23 13:20:49 +02:00
Joao Gante
970a16ec7f
Forbid PretrainedConfig from saving generate parameters; Update deprecations in generate-related code 🧹 (#32659)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-23 11:12:53 +01:00
Jinuk
09e6579d2d
🌐 [i18n-KO] Translated `knowledge_distillation_for_image_classification.md` to Korean (#32334)
* docs: ko: tasks/knowledge_distillation_for_image_classification.md

* feat: nmt draft

* fix: manual edits

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

---------

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-08-22 10:42:39 -07:00
Shubham Ugare
9282413611
Add SynCode to llm_tutorial (#32884) 2024-08-22 15:30:22 +02:00
Matt
85345bb439
Add tip to clarify tool calling (#32883) 2024-08-19 18:37:35 +01:00
Sai-Suraj-27
37204848f1
Docs: Fixed whisper-large-v2 model link in docs (#32871)
Fixed whisper-large-v2 model link in docs.
2024-08-19 09:50:35 -07:00
Kamil Akesbi
8260cb311e
Add Descript-Audio-Codec model (#31494)
* dac model

* original dac works

* add dac model

* dac can be instantiated

* add forward pass

* load weights

* all weights are used

* convert checkpoint script ready

* test

* add feature extractor

* up

* make style

* apply cookiecutter

* fix tests

* iterate on FeatureExtractor

* nit

* update dac doc

* replace nn.Sequential with nn.ModuleList

* nit

* apply review suggestions 1/2

* Update src/transformers/models/dac/modeling_dac.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* up

* apply review suggestions 2/2

* update padding in FeatureExtractor

* apply review suggestions

* iterate on design and tests

* add integration tests

* feature extractor tests

* make style

* all tests pass

* make style

* fixup

* apply review suggestions

* fix-copies

* apply review suggestions

* apply review suggestions

* Update docs/source/en/model_doc/dac.md

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* Update docs/source/en/model_doc/dac.md

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* anticipate transfer weights to descript

* up

* make style

* apply review suggestions

* update slow test values

* update slow tests

* update test values

* update with CI values

* update with vorace values

* update test with slice

* make style

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-08-19 10:21:51 +01:00
MAHIR DAIYAN
843e5e20ca
Add Flax Dinov2 (#31960)
* tfmsenv restored in main

* installed flax

* forward pass done and all tests passed

* make fix-copies and cleaning the scripts

* fixup attempt 1

* fixup attempt 2

* fixup third attempt

* fixup attempt 4

* fixup attempt 5

* dinov2 doc fixed

* FlaxDinov2Model + ForImageClassification added to OBJECTS_TO_IGNORE

* external pos_encoding layer removed

* fixup attempt 6

* fixed integration test values

* fixup attempt 7

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* comments removed

* comment removed from the test

* fixup

* Update src/transformers/models/dinov2/modeling_flax_dinov2.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* new fixes 1

* interpolate_pos_encoding function removed

* droppath rng fixed, pretrained beit copied-from still not working

* modeling_flax_dinov2.py reformatted

* Update tests/models/dinov2/test_modeling_flax_dinov2.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* added Copied from, to the tests

* copied from statements removed from tests

* fixed copied from statements in the tests

* [run_slow] dinov2

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-08-19 09:28:13 +01:00
Yangshen⚡Deng
a27182b7fc
Fix AutoConfig and AutoModel support for Llava-Next-Video (#32844)
* Fix: fix all model_type of Llava-Next-Video to llava_next_video

* Fix doc for llava_next_video

* Fix formatting issues

* Rename llava-next-video.md to llava_next_video.md to make it compatible with the implementation

* Fix docs TOC for llava-next-video
2024-08-16 12:41:05 +01:00
Joao Gante
cf32ee1753
Cache: use batch_size instead of max_batch_size (#32657)
* more precise name

* better docstrings

* Update src/transformers/cache_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-16 11:48:45 +01:00
Joao Gante
70d5df6107
Generate: unify LogitsWarper and LogitsProcessor (#32626) 2024-08-16 11:20:41 +01:00
Dina Suehiro Jones
6577c77d93
Update the distributed CPU training on Kubernetes documentation (#32669)
* Update the Kubernetes CPU training example

* Add namespace arg

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>

---------

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
2024-08-14 09:36:43 -07:00
Jerry Zhang
78d78cdf8a
Add TorchAOHfQuantizer (#32306)
* Add TorchAOHfQuantizer

Summary:
Enable loading torchao quantized model in huggingface.

Test Plan:
local test

Reviewers:

Subscribers:

Tasks:

Tags:

* Fix a few issues

* style

* Added tests and addressed some comments about dtype conversion

* fix torch_dtype warning message

* fix tests

* style

* TorchAOConfig -> TorchAoConfig

* enable offload + fix memory with multi-gpu

* update torchao version requirement to 0.4.0

* better comments

* add torch.compile to torchao README, add perf number link

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
2024-08-14 16:14:24 +02:00
Eric Hartford
481e15604a
Add support for GrokAdamW optimizer (#32521)
* add grokadamw

* reformat

* code review feedback, unit test

* reformat

* reformat
2024-08-13 13:20:28 +01:00
Quentin Gallouédec
f1c8542ff7
"to be not" -> "not to be" (#32636)
* "to be not" -> "not to be"

* Update sam.md

* Update trainer.py

* Update modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py
2024-08-12 20:20:17 +01:00
Ahnjj_DEV
7f777ab7d9
🌐 [i18n-KO] Translated awq.md to Korean (#32324)
* fix: manual edits

* Apply suggestions from code review

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* fix:manual edits

- Moved the translation file, which had been created in the wrong path

* Delete docs/source/ko/tasks/awq.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-08-12 10:12:48 -07:00
YONGSANG
4996990d61
🌐 [i18n-KO] Translated deepspeed.md to Korean (#32431)
* Update _toctree.yml

* docs: ko: deepspeed.md

* Apply suggestions from code review

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/deepspeed.md

* Update docs/source/ko/deepspeed.md

Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* Apply suggestions from code review

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>

* Update docs/source/ko/_toctree.yml

---------

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
2024-08-12 10:07:31 -07:00
Matt
b7ea171403
Cleanup tool calling documentation and rename doc (#32337)
* Rename "Templates for Chat Models" doc to "Chat Templates"

* Small formatting fix

* Small formatting fix

* Small formatting fix

* Cleanup tool calling docs as well

* Remove unneeded 'revision'

* Move tip to below main code example

* Little bonus section on template editing
2024-08-12 16:20:14 +01:00
Younes Belkada
7c11491208
Add new model (#32615)
* v1 - working version

* fix

* fix

* fix

* fix

* rename to correct name

* fix title

* fixup

* rename files

* fix

* add copied from on tests

* rename to `FalconMamba` everywhere and fix bugs

* fix quantization + accelerate

* fix copies

* add `torch.compile` support

* fix tests

* fix tests and add slow tests

* copies on config

* merge the latest changes

* fix tests

* add few lines about instruct

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix tests

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-12 08:22:47 +02:00
wony617
48101cf8d1
🌐 [i18n-KO] Translated agent.md to Korean (#32351)
* docs: ko: main_classes/agent

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com>
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>

* fix: resolve suggestions

* fix: resolve code line number

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: thsamaji <60818655+thsamajiki@users.noreply.github.com>
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
2024-08-09 09:58:52 -07:00
Steven Liu
85817d98fb
[docs] Translation guide (#32547)
clarify
2024-08-08 13:43:14 -07:00