Aymeric Roucher
489cbfd6d3
Add visit webpage tool ( #33353 )
...
* Add VisitWebpageTool
2024-09-09 10:32:42 +02:00
Wing Lian
62aecd85ff
schedulefree optimizers ( #30079 )
...
* schedulefree optimizers
* fix train instead of eval for optimizer
* fixes and update docs
* chore: lint
* add tests and drop overly-verbose _32bit suffix
* chore: lint
* fix for docs
* fix code review issues
* use duck-typing to avoid per-optimizer patches
* fixup style
* fixup style
* warn if incorrect accelerate version with schedule free
Co-authored-by: Aman Gupta Karmani <aman@tmm1.net>
---------
Co-authored-by: Aman Karmani <aman@tmm1.net>
2024-09-09 09:51:39 +02:00
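A minimal sketch of how the schedule-free optimizers above might be selected through the Trainer. The `"schedule_free_adamw"` option string is an assumption based on this entry, and the external `schedulefree` package (plus a recent `accelerate`) is assumed to be installed.

```python
# Sketch only: choosing a schedule-free optimizer via TrainingArguments.
# The optim string is assumed from this changelog entry, not verified API.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="schedule_free_adamw",   # assumed option name added by this PR
    lr_scheduler_type="constant",  # schedule-free optimizers manage the schedule themselves
)
# `args` would then be passed to Trainer(model=..., args=args, train_dataset=...) as usual.
```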
Nicholas Broad
66bc4def95
add sdpa mbart ( #32033 )
...
* add sdpa mbart
useful for donut
* update sdpa docs
* formatting
* add self._use_sdpa in mbartencoder
* use self.config to check attn
* retrigger checks
* [run-slow] mbart
2024-09-06 17:31:24 -07:00
Daniel Lok
a70286f827
Update author for QLoRA/PEFT community notebook ( #33338 )
...
update author
Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
2024-09-06 22:50:26 +02:00
Matt
d7b04ea14d
Fix Prefill docs ( #33352 )
...
last -> final
2024-09-06 17:57:54 +01:00
Ita Zaporozhets
e48e5f1f13
Support reading tiktoken tokenizer.model file ( #31656 )
...
* use existing TikTokenConverter to read tiktoken tokenizer.model file
* del test file
* create tiktoken integration file
* adding tiktoken llama test
* ALTERNATIVE IMPLEMENTATION: supports llama 405B
* fix one char
* remove redundant line
* small fix
* rm unused import
* flag for converting from tiktoken
* remove unneeded file
* ruff
* remove llamatiktokenconverter, stick to general converter
* tiktoken support v2
* update test
* remove stale changes
* update doc
* protect import
* use is_protobuf_available
* add templateprocessor in tiktokenconverter
* reverting templateprocessor from tiktoken support
* update test
* add require_tiktoken
* dev-ci
* trigger build
* trigger build again
* dev-ci
* [build-ci-image] tiktoken
* dev-ci
* dev-ci
* dev-ci
* dev-ci
* change tiktoken file name
* feedback review
* feedback rev
* applying feedback, removing tiktoken converters
* conform test
* adding docs for review
* add doc file for review
* add doc file for review
* add doc file for review
* support loading model without config.json file
* Revert "support loading model without config.json file"
This reverts commit 2753602e51c34cef2f184eb11f36d2ad1b02babb.
* remove dev var
* updating docs
* safely import protobuf
* fix protobuf import error
* fix protobuf import error
* trying isort to fix ruff error
* fix ruff error
* try to fix ruff again
* try to fix ruff again
* try to fix ruff again
* doc table of contents
* add fix for consistency.dockerfile torchaudio
* ruff
* applying feedback
* minor typo
* merging with push-ci-image
* clean up imports
* revert dockerfile consistency
2024-09-06 14:24:02 +02:00
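A rough sketch, with assumptions, of converting a tiktoken-format tokenizer.model into a fast tokenizer using the general TikTokenConverter mentioned in the bullets above; the constructor arguments and the converted() call are inferred from the entry, not quoted from the PR.

```python
# Rough sketch (assumptions noted above): tokenizer.model path is a placeholder.
from transformers.convert_slow_tokenizer import TikTokenConverter
from transformers import PreTrainedTokenizerFast

backend = TikTokenConverter(vocab_file="tokenizer.model").converted()  # assumed call pattern
fast_tokenizer = PreTrainedTokenizerFast(tokenizer_object=backend)
print(fast_tokenizer.tokenize("hello tiktoken"))
```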
Joao Gante
2b789f27f3
Docs: add more cross-references to the KV cache docs ( #33323 )
...
* add more cross-references
* nit
* import guard
* more import guards
* nit
* Update src/transformers/generation/configuration_utils.py
2024-09-06 10:22:00 +01:00
Daniel Lok
5792c459ed
Add a community notebook for fine-tuning with QLoRA, PEFT, and MLflow ( #33319 )
...
add notebook for finetuning with mlflow
Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
2024-09-06 09:35:01 +02:00
Vladislav Bronzov
5d11de4a2f
Add Qwen2Moe GGUF loading support ( #33264 )
...
* update gguf doc, config and tensor mapping
* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests
* apply code style fixes
* reformat files
* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
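A short sketch of loading a GGUF-quantized Qwen2MoE checkpoint through the existing `gguf_file` entry point; the repository id and file name below are placeholders, not real paths.

```python
# Sketch: GGUF loading for Qwen2MoE. Repo id and file name are placeholders.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "some-org/Qwen2MoE-GGUF"     # placeholder repository
gguf_file = "qwen2moe-q4_k_m.gguf"     # placeholder GGUF file name

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```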
Niklas Muennighoff
03164ba14e
Add paper link ( #33305 )
2024-09-05 15:49:28 +02:00
Raushan Turganbay
43df47d8e7
Llava Onevision: add model ( #32673 )
...
* working version
* fix copies
* update
* tests
* update docs
* codestyle
* add more tests
* add returns for docs
* clean up
* Update src/transformers/models/llava_onevision/processing_llava_onevision.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* updates
* codestyle
* style
* shouldn't be reversed
* [run-slow] llava_onevision
* [run-slow] llava_onevision
* add pooling in videos
* [run-slow] llava_onevision
* num-logits-to-keep
* [run-slow] llava_onevision
* [run-slow] llava_onevision
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* video matched orig impl
* fix tests
* chat template was modified
* Update docs/source/en/model_doc/llava_onevision.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add morer info in the doc page
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-05 14:43:20 +05:00
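An illustrative sketch of using the new LLaVA-OneVision model on a single image. The checkpoint id, placeholder image URL, and chat-template usage follow the usual llava-style pattern and are assumptions, not quoted from this PR.

```python
# Sketch with assumed checkpoint name and prompt handling.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(model_id)

image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)  # placeholder URL
conversation = [{"role": "user", "content": [{"type": "image"},
                                             {"type": "text", "text": "Describe this image."}]}]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(out[0], skip_special_tokens=True))
```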
Aymeric Roucher
cfd92c64f5
Add new documentation page for advanced agent usage ( #33265 )
...
* Add new documentation page for advanced agent usage
2024-09-04 18:19:54 +02:00
Matt
01c8c6c419
Add a warning to the chat template docs about the tool_calls format ( #33277 )
...
* Add a warning to the chat template docs
* Add a warning to the chat template docs
* Add a warning to the chat template docs
2024-09-04 17:13:34 +01:00
Raushan Turganbay
ebbe8d8014
Cache docs: update ( #32929 )
...
* some changes
* more updates
* fix cache copy
* nits
* nits
* add tests
2024-09-04 15:05:31 +05:00
Niklas Muennighoff
ecd61c6286
Add OLMoE ( #32406 )
...
* Add OLMoE
* Add OLMoE
* Updates
* Make norm optional; add keys
* Add output
* Add
* Fix dtype
* Fix eos config
* Update
* Add OLMoE
* Fix OLMoE path
* Format
* Format
* Rmv copy statement
* Rmv copy statement
* Format
* Add copies
* Cp rotary
* Fix aming
* Fix naming
* Update RoPE integration; num_logits_to_keep; Add copy statements
* Add eps to config
* Format
* Add aux loss
* Adapt router_aux_loss_coef
* Update md
* Adapt
* adapt tests
2024-09-03 18:43:12 +02:00
Omar Salman
03c12d0d63
Add sdpa support for Albert ( #32092 )
...
* Add sdpa support for Albert
* [run_slow] albert
* Add benchmarks and PR suggestion
* Fix quality
* Fix
* [run_slow] albert
2024-09-03 14:01:00 +01:00
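With SDPA support added, ALBERT can opt into the PyTorch scaled-dot-product-attention backend through the standard `attn_implementation` flag; a minimal sketch (requires PyTorch 2.x):

```python
# Sketch: enabling the SDPA attention backend for ALBERT.
from transformers import AlbertModel

model = AlbertModel.from_pretrained("albert-base-v2", attn_implementation="sdpa")
```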
Matt
0d86727354
Update chat template docs to remove Blenderbot ( #33254 )
...
* Update docs to remove obsolete Blenderbot
* Remove another reference to Blenderbot
2024-09-03 12:18:04 +01:00
Isotr0py
edeca4387c
🚨 Support dequantization for most GGML types ( #32625 )
...
* use gguf internal dequantize
* add Q5_0 test
* add iq1 test
* add remained test
* remove duplicated test
* update docs
* add gguf version limit
* make style
* update gguf import catch
* revert vocab_size patch
* make style
* use GGUF_MIN_VERSION everywhere
2024-09-03 12:58:14 +02:00
Sergio Paniego Blanco
28952248b1
Fixed typo repeated word in DETR docs ( #33250 )
2024-09-02 17:19:18 +02:00
Matt
52a0213755
Add assistant prefill for chat templates and TextGenerationPipeline ( #33198 )
...
* Add assistant prefill to chat templates
* Add assistant prefill to pipeline
* Add assistant prefill to pipeline
* Tweak another test that ended in assistant message
* Update tests that ended in assistant messages
* Update tests that ended in assistant messages
* Replace assistant_prefill with continue_final_message
* Allow passing continue_final_message to pipeline
* Small fixup
* Add continue_final_message as a pipeline kwarg
* Update docstrings
* Move repos to hf-internal-testing!
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Add explanatory comment
* make fixup
* Update chat templating docs to explain continue_final_message
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-09-02 13:23:47 +01:00
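A sketch of the assistant-prefill behavior added above: the chat ends with a partial assistant message and `continue_final_message=True` asks the pipeline to continue it rather than start a new turn. The model id is just an example.

```python
# Sketch: prefilling the assistant's reply via continue_final_message.
from transformers import pipeline

chat = [
    {"role": "user", "content": "Write a haiku about autumn."},
    {"role": "assistant", "content": "Golden leaves drift down"},  # partial reply to continue
]
generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM-135M-Instruct")  # example model
out = generator(chat, continue_final_message=True, max_new_tokens=30)
print(out[0]["generated_text"])
```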
Aymeric Roucher
1ca9ff5c91
Add duckduckgo search tool ( #32882 )
...
* Add duckduckgo search tool
2024-09-02 09:56:20 +02:00
Merve Noyan
2e3f8f7474
Add video text to text docs ( #33164 )
...
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-09-01 12:06:31 +03:00
Yijun Lee
db70426854
🌐 [i18n-KO] Translated llm_optims.md to Korean ( #32325 )
...
* docs: ko: llm_optims.md
* feat: nmt draft
* fix toc title
* fix: manual edits
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com>
* Update docs/source/ko/llm_optims.md
Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com>
* Update llm_optims.md
* fix: resolve suggestions
* fix: resolve suggestions
* Apply suggestions from code review
fix: resolve suggestions
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
---------
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: HyunJi Shin <74661937+shinhyunji36@users.noreply.github.com>
2024-08-30 09:52:41 -07:00
Aymeric Roucher
c79bfc71b8
Create local Transformers Engine ( #33218 )
...
* Create local Transformers Engine
2024-08-30 18:22:27 +02:00
Gerben van V
5129671290
Add a static cache that offloads to the CPU or other device ( #32161 )
...
* Add a static cache that offloads to the CPU or other device
* Fix PR comments, add unit-tests
2024-08-29 11:51:09 +02:00
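A rough sketch of using the offloaded static cache at generation time. The `"offloaded_static"` cache_implementation string is inferred from this entry and the model id is only an example of a static-cache-capable model; both are assumptions.

```python
# Rough sketch (assumptions noted above).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-135M"  # example Llama-style model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tok("Static caches can be offloaded", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, cache_implementation="offloaded_static")
print(tok.decode(out[0], skip_special_tokens=True))
```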
JB (Don)
f1a385b1de
[RoBERTa-based] Add support for sdpa ( #30510 )
...
* Adding SDPA support for RoBERTa-based models
* add not is_cross_attention
* fix copies
* fix test
* add minimal test for camembert and xlm_roberta as their test class does not inherit from ModelTesterMixin
* address some review comments
* use copied from
* style
* consistency
* fix lists
---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-28 10:26:00 +02:00
Mayank Mishra
c35d2ccf5a
Granite language models ( #31502 )
...
* first commit
* drop tokenizer
* drop tokenizer
* drop tokenizer
* drop convert
* granite
* drop tokenization test
* mup
* fix
* reformat
* reformat
* reformat
* fix docs
* stop checking for checkpoint
* update support
* attention multiplier
* update model
* tiny drop
* saibo drop
* skip test
* fix test
* fix test
* drop
* drop useless imports
* update docs
* drop flash function
* copied from
* drop pretraining tp
* drop pretraining tp
* drop pretraining tp
* drop unused import
* drop code path
* change name
* softmax scale
* head dim
* drop legacy cache
* rename params
* cleanup
* fix copies
* comments
* add back legacy cache
* multipliers
* multipliers
* multipliers
* text fix
* fix copies
* merge
* multipliers
* attention multiplier
* drop unused imports
* fix
* fix
* fix
* move rope?
* Update src/transformers/models/granite/configuration_granite.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* Update src/transformers/models/granite/modeling_granite.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix
* fix
* fix
* fix-copies
* torch rmsnorm
* add authors
* change model path
* fix
* test
* drop static cache test
* update readme
* drop non-causal
* readme
* drop useless imports
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-27 21:27:21 +02:00
Juan Pizarro
7591ca5bc5
🚨 Add Blip2ForImageTextRetrieval ( #29261 )
...
* add Blip2ForImageTextRetrieval
* use one line and remove unnecessary space in tests
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* use value from the config, rather than hardcoded
* change order of params in Blip2QFormerModel.forward
* update docstring
* fix style
* update test_inference_opt
* move embeddings out of Blip2QFormerModel
* remove from_vision_qformer_configs
* remove autocast float16 in Blip2QFormerModel
* rename fields to vision_projection, text_projection, use_image_text_matching_head
* use CLIPOutput for Blip2ImageTextMatchingModelOutput
* remove past_key_values_length from Blip2TextEmbeddings
* fix small typo in the CLIPOutput docstring
* add Blip2ForImageTextRetrieval to Zero Shot Image Classification mapping
* update docstring and add require_torch_fp16
* rollback test_inference_opt
* use use_image_text_matching_head=True in convert
* skip test_model_get_set_embeddings
* fix create_rename_keys error on new itm fields
* revert to do scale after dot product between "query" and "key"
* fix ValueError on convert script for blip2-opt-2.7b
* update org of paths to Salesforce
* add is_pipeline_test_to_skip for VisualQuestionAnsweringPipelineTests
* [run_slow] blip_2
* removed Blip2ForImageTextRetrieval from IGNORE_NON_AUTO_CONFIGURED
* fix docstring of Blip2ImageTextMatchingModelOutput
* [run_slow] blip_2
* fix multi-gpu tests
* [run_slow] blip_2
* [run_slow] blip_2
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-27 18:50:27 +01:00
Ali Salamatian
27903de7ec
Very small change to one of the function parameters ( #32548 )
...
Very small change to one of the parameters
np.random.randint's second argument is an exclusive upper bound and is never drawn. Setting the upper range to 2 therefore ensures the classification labels include some 1s as well (see the sketch below).
2024-08-27 09:29:05 -07:00
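A tiny demonstration of the point above: because the second argument is exclusive, an upper bound of 1 would only ever produce 0 labels.

```python
# np.random.randint excludes its upper bound.
import numpy as np

only_zeros = np.random.randint(0, 1, size=10)      # upper bound 1 -> always 0
zeros_and_ones = np.random.randint(0, 2, size=10)  # upper bound 2 -> labels 0 and 1
print(only_zeros, zeros_and_ones)
```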
Sae_Chan_Oh
6101d934a1
🌐 [i18n-KO] Translated conversations.md to Korean ( #32468 )
...
* docs: ko: conversations.md
* feat: hand-crafted translate docs
* fix: modify typo after Grammar Check
* Update docs/source/ko/conversations.md
Thank you (감사합니다)
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
* fix: accept suggestions about anchor and spacing
* Update docs/source/ko/conversations.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
* Update docs/source/ko/conversations.md
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
* fix: remove the question mark from the 'what happened inside pipeline?' anchor
* fix: translate the comments in the code block
---------
Co-authored-by: SeungAhSon <gongsoonyee@gmail.com>
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
2024-08-27 09:25:41 -07:00
Vaibhav Srivastav
6f0ecf1049
[docs] add quick usage snippet to Whisper. ( #31289 )
...
* [docs] add quick usage snippet to Whisper.
* Apply suggestions from review.
* 💉 Fix the device for pipeline.
2024-08-27 14:11:52 +02:00
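A sketch of the kind of quick-start snippet this entry refers to: transcribing audio with the ASR pipeline. The checkpoint and device handling are ordinary examples, not necessarily the exact snippet added to the docs.

```python
# Sketch: quick Whisper usage via the ASR pipeline.
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v2", device=device)
print(asr("sample.flac")["text"])  # any local audio file or URL
```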
Pablo Montalvo
26f043bd4d
quickfix documentation ( #32566 )
...
* fix documentation
* update config
2024-08-26 17:49:44 +02:00
Ritik Nandwal
a378a54a57
Add changes for uroman package to handle non-Roman characters ( #32404 )
...
* Add changes for uroman package to handle non-Roman characters
* Update docs for uroman changes
* Modifying error message to warning, for backward compatibility
* Update instruction for user to install uroman
* Update docs for uroman python version dependency and backward compatibility
* Update warning message for python version compatibility with uroman
* Refine docs
2024-08-26 17:07:01 +02:00
Shijie
19e6e80e10
support qwen2-vl ( #32318 )
...
* support-qwen2-vl
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* hyphen->underscore
* make style
* add-flash2-tipd
* delete-tokenize=False
* remove-image_processor-in-init-file
* add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES
* format-doct
* support-Qwen2VLVisionConfig
* remove-standardize_cache_format
* fix-letter-variables
* remove-torch-in-image-processor
* remove-useless-docstring
* fix-one-letter-variable-name
* change-block-name
* default-quick-gelu-in-vision
* remove-useless-doc
* use-preimplemented-flash-forward
* fix-doc
* fix-image-processing-doc
* fix-apply-rotary-embed
* fix-flash-attn-sliding-window
* refactor
* remove-default_template
* remove-reorder_cache
* simple-get-rope_deltas
* update-prepare_inputs_for_generation
* update-attention-mask
* update-rotary_seq_len
* remove-state
* kv_seq_length
* remove-warning
* _supports_static_cache
* remove-legacy-cache
* refactor
* fix-replace
* mrope-section-doc
* code-quality
* code-quality
* polish-doc
* fix-image-processing-test
* update readme
* Update qwen2_vl.md
* fix-test
* Update qwen2_vl.md
* nit
* processor-kwargs
* hard-code-norm_layer
* code-quality
* discard-pixel-values-in-gen
* fix-inconsistent-error-msg
* unify-image-video
* hidden_act
* add-docstring
* vision-encode-as-PreTrainedModel
* pixel-to-target-dtype
* update doc and low memoryvit
* format
* format
* channel-format
* fix vit_flashatt
* format
* inherit-Qwen2VLPreTrainedModel
* simplify
* format-test
* remove-one-line-func-in-image-processing
* avoid-one-line-reshape
* simplify-rotary_seq_len
* avoid-single-letter-variable
* no-for-loop-sdpa
* avoid-single-letter-variable
* remove-one-line-reshape
* remove-one-line-reshape
* remove-no-rope-in-vit-logic
* default-mrope
* add-copied-from
* more-docs-for-mrope
* polish-doc
* comment-and-link
* polish-doc
* single-letter-variables
* simplify-image-processing
* video->images
* kv_seq_len-update
* vision-rope-on-the-fly
* vision-eager-attention
* change-processor-order
---------
Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com>
Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
2024-08-26 15:16:44 +02:00
S M Jishanul Islam
8defc95df3
Updated the custom_models.md changed cross_entropy code ( #33118 )
2024-08-26 13:15:43 +02:00
Matt
0a7af19f4d
Update Jinja docs with new functions and general cleanup ( #33097 )
2024-08-23 17:40:06 +01:00
Jason (Siyu) Zhu
adb91179b9
Integrate Liger (Linkedin GPU Efficient Runtime) Kernel to Trainer ( #32860 )
...
* add liger integration
* fix syntax
* fix import issue
* add trainer.md
* Use _apply_liger_kernel()
* Fixed log message
* Update docs/source/en/trainer.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update docs/source/en/trainer.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
* Update docs/source/en/trainer.md
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
* Fixed checkstyle and updated readme
* Added test
* Fixed checkstyle
* fix docstring
* rename use_liger to use_liger_kernel
* Trigger Build
* Added test
* add fix-copies
* Fixed copy inconsistencies
---------
Co-authored-by: shimizust <sshimizu@linkedin.com>
Co-authored-by: Steven Shimizu <shimizust@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
2024-08-23 13:20:49 +02:00
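A minimal sketch of turning on the Liger kernels through the flag named in this entry (`use_liger_kernel`); it assumes the external `liger-kernel` package is installed and the model is one of the supported architectures.

```python
# Sketch: enabling Liger kernels via TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    use_liger_kernel=True,  # patches supported models (e.g. Llama) with Liger kernels at train time
)
```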
Joao Gante
970a16ec7f
Forbid PretrainedConfig from saving generate parameters; Update deprecations in generate-related code 🧹 ( #32659 )
...
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-23 11:12:53 +01:00
Jinuk
09e6579d2d
🌐 [i18n-KO] Translated knowledge_distillation_for_image_classification.md to Korean ( #32334 )
...
* docs: ko: tasks/knowledge_distillation_for_image_classification.md
* feat: nmt draft
* fix: manual edits
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
* Apply suggestions from code review
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
* Apply suggestions from code review
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
---------
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-08-22 10:42:39 -07:00
Shubham Ugare
9282413611
Add SynCode to llm_tutorial ( #32884 )
2024-08-22 15:30:22 +02:00
Matt
85345bb439
Add tip to clarify tool calling ( #32883 )
2024-08-19 18:37:35 +01:00
Sai-Suraj-27
37204848f1
Docs: Fixed whisper-large-v2 model link in docs ( #32871 )
...
Fixed whisper-large-v2 model link in docs.
2024-08-19 09:50:35 -07:00
Kamil Akesbi
8260cb311e
Add Descript-Audio-Codec model ( #31494 )
...
* dac model
* original dac works
* add dac model
* dac can be instantiated
* add forward pass
* load weights
* all weights are used
* convert checkpoint script ready
* test
* add feature extractor
* up
* make style
* apply cookiecutter
* fix tests
* iterate on FeatureExtractor
* nit
* update dac doc
* replace nn.Sequential with nn.ModuleList
* nit
* apply review suggestions 1/2
* Update src/transformers/models/dac/modeling_dac.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* up
* apply review suggestions 2/2
* update padding in FeatureExtractor
* apply review suggestions
* iterate on design and tests
* add integration tests
* feature extractor tests
* make style
* all tests pass
* make style
* fixup
* apply review suggestions
* fix-copies
* apply review suggestions
* apply review suggestions
* Update docs/source/en/model_doc/dac.md
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update docs/source/en/model_doc/dac.md
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* anticipate transfer weights to descript
* up
* make style
* apply review suggestions
* update slow test values
* update slow tests
* update test values
* update with CI values
* update with vorace values
* update test with slice
* make style
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-08-19 10:21:51 +01:00
MAHIR DAIYAN
843e5e20ca
Add Flax Dinov2 ( #31960 )
...
* tfmsenv restored in main
* installed flax
* forward pass done and all tests passed
* make fix-copies and cleaning the scripts
* fixup attempt 1
* fixup attempt 2
* fixup third attempt
* fixup attempt 4
* fixup attempt 5
* dinov2 doc fixed
* FlaxDinov2Model + ForImageClassification added to OBJECTS_TO_IGNORE
* external pos_encoding layer removed
* fixup attempt 6
* fixed integration test values
* fixup attempt 7
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* comments removed
* comment removed from the test
* fixup
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* new fixes 1
* interpolate_pos_encoding function removed
* droppath rng fixed, pretrained beit copied-from still not working
* modeling_flax_dinov2.py reformatted
* Update tests/models/dinov2/test_modeling_flax_dinov2.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* added Copied from, to the tests
* copied from statements removed from tests
* fixed copied from statements in the tests
* [run_slow] dinov2
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-08-19 09:28:13 +01:00
Yangshen⚡Deng
a27182b7fc
Fix AutoConfig and AutoModel support for Llava-Next-Video ( #32844 )
...
* Fix: fix all model_type of Llava-Next-Video to llava_next_video
* Fix doc for llava_next_video
* Fix formatting issues
* Change llava-next-video.md file name into llava_next_video.md to make it compatible with implementation
* Fix docs TOC for llava-next-video
2024-08-16 12:41:05 +01:00
Joao Gante
cf32ee1753
Cache: use batch_size instead of max_batch_size ( #32657 )
...
* more precise name
* better docstrings
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-16 11:48:45 +01:00
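A short sketch of constructing a StaticCache with the renamed `batch_size` argument (formerly `max_batch_size`); the config source is just an example of a compatible model config.

```python
# Sketch: StaticCache after the batch_size rename.
from transformers import AutoConfig, StaticCache

config = AutoConfig.from_pretrained("HuggingFaceTB/SmolLM-135M")  # example config
cache = StaticCache(config=config, batch_size=1, max_cache_len=256, device="cpu")
```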
Joao Gante
70d5df6107
Generate: unify LogitsWarper and LogitsProcessor ( #32626 )
2024-08-16 11:20:41 +01:00
Dina Suehiro Jones
6577c77d93
Update the distributed CPU training on Kubernetes documentation ( #32669 )
...
* Update the Kubernetes CPU training example
* Add namespace arg
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
---------
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
2024-08-14 09:36:43 -07:00
Jerry Zhang
78d78cdf8a
Add TorchAOHfQuantizer ( #32306 )
...
* Add TorchAOHfQuantizer
Summary: Enable loading torchao quantized models in Hugging Face.
Test plan: local test
* Fix a few issues
* style
* Added tests and addressed some comments about dtype conversion
* fix torch_dtype warning message
* fix tests
* style
* TorchAOConfig -> TorchAoConfig
* enable offload + fix memory with multi-gpu
* update torchao version requirement to 0.4.0
* better comments
* add torch.compile to torchao README, add perf number link
---------
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-08-14 16:14:24 +02:00
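A sketch of loading a model with torchao weight-only int4 quantization via the TorchAoConfig named above. The quant type string and group_size are typical values assumed for illustration, and the checkpoint id is a placeholder; torchao >= 0.4.0 is required per the entry.

```python
# Sketch: torchao quantization through quantization_config (values are illustrative).
import torch
from transformers import AutoModelForCausalLM, TorchAoConfig

quant_config = TorchAoConfig("int4_weight_only", group_size=128)  # assumed typical settings
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",   # illustrative checkpoint
    torch_dtype=torch.bfloat16,
    quantization_config=quant_config,
)
```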
Eric Hartford
481e15604a
Add support for GrokAdamW optimizer ( #32521 )
...
* add grokadamw
* reformat
* code review feedback, unit test
* reformat
* reformat
2024-08-13 13:20:28 +01:00
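A minimal sketch of selecting GrokAdamW through the Trainer; the `"grokadamw"` optim string is an assumption based on this entry and requires the external `grokadamw` package.

```python
# Sketch only: assumed optim value for the GrokAdamW integration.
from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", optim="grokadamw")
```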