Anton Vlasjuk
46df859975
[GPTNeoX] Flex Attention + Refactor ( #34896 )
...
* gpt neox flex attention + refactor
* some formatting
* small fix on dropout
* add assertion on flex attn test
* flaky ci :(
* add head mask support
* style
* handle dtype, replace torch where
* fixup flex with output attns
* code review and several other fixes
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* style
* remove unnecessary comment
* remove incorrect comment
* make flex attn check more agnostic to versions and centralized
* change peft input dtype check to value since q and k could be affected by other stuff like RoPE
* I forgot
* flaky
* code review and small fixes
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:48:28 +01:00
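For readers who want to try the backend added in the entry above, a minimal sketch of loading a GPT-NeoX checkpoint with FlexAttention; the checkpoint name is illustrative and the snippet assumes a PyTorch build (2.5+) that ships torch.nn.attention.flex_attention.

```python
# Sketch: opting a GPT-NeoX model into the FlexAttention backend.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-410m"  # illustrative GPT-NeoX checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flex_attention",
)

inputs = tokenizer("FlexAttention lets you", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```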
Vladislav Bronzov
accb7204f9
Add Pytorch Tensor Parallel support for Qwen2, Qwen2Moe, Starcoder2 ( #35007 )
...
* add base tp plan for qwen2 and qwen2moe
* add parallel tp for starcoder2
* fix modular conversion
* add infer dim for qkv states
* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:43:36 +01:00
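A sketch of what the base tensor-parallel plan enables, modeled on the transformers TP example: launch one process per GPU with torchrun and pass tp_plan="auto". The checkpoint name is illustrative; Qwen2Moe and Starcoder2 follow the same pattern.

```python
# Sketch: run with `torchrun --nproc-per-node 2 tp_demo.py`.
# tp_plan="auto" shards the weights according to the model's base TP plan.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-7B-Instruct"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, tp_plan="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

input_ids = tokenizer("Tensor parallelism shards each layer", return_tensors="pt").input_ids.to(model.device)
outputs = model(input_ids)
print(outputs.logits.shape)
```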
Tianshu Wang
c7a109ec81
Fix pad_token_tensor is None in warning ( #34005 )
...
Fix pad_token_tensor is None in warning
2024-12-04 11:15:25 +01:00
Fanli Lin
329f5dbf97
[docs] use device-agnostic API instead of hard-coded cuda ( #35048 )
...
replace cuda
2024-12-03 10:54:15 -08:00
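As a rough illustration of the device-agnostic replacement made in this and the following docs entries, a sketch that picks an accelerator at runtime instead of hard-coding "cuda"; the exact pattern used in the docs may differ.

```python
# Sketch: picking an accelerator at runtime instead of hard-coding "cuda".
# The xpu branch is guarded so the snippet also runs on older PyTorch builds.
import torch

if torch.cuda.is_available():
    device = "cuda"
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    device = "xpu"
else:
    device = "cpu"

tensor = torch.randn(2, 2, device=device)
print(tensor.device)
```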
Fanli Lin
b8cdc262d5
[docs] use device-agnostic API instead of cuda ( #35047 )
...
* fix on xpu
* [run_all]
* add the missing import for Image lib
* add more devices in comment
* bug fix
* replace cuda
2024-12-03 10:53:45 -08:00
wwwbai
346597b644
Translate community.md into Chinese ( #35013 )
...
* community translation
* Update docs/source/zh/community.md
Co-authored-by: Isotr0py <2037008807@qq.com>
---------
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-03 10:22:02 -08:00
Fanli Lin
3deaa8179d
[docs] fix example code bug ( #35054 )
...
fix code bug
2024-12-03 09:18:39 -08:00
Wang, Yi
125de41643
fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable ( #34454 )
...
* fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
* [run-slow] speecht5
---------
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-12-03 13:58:54 +00:00
Yih-Dar
7a7f27697a
Fix BertGeneration ( #35043 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-03 13:56:59 +01:00
Aymeric Roucher
901f504580
Add token cost + runtime monitoring to Agent and HfEngine children ( #34548 )
...
* Add monitoring to Agent and HfEngine children
2024-12-03 13:14:52 +01:00
Cyril Vallez
ee37bf0d95
Automatic compilation in generate: do not rely on inner function ( #34923 )
...
* compiled forward in PreTrainedModel
* update
* style
* update name
* trigger CIs
* Add way to use custom compile args
* style
* switch parameterization to generation_config
* Add to inits
* Update configuration_utils.py
* inits
* style
* docs
* style
* Update configuration_utils.py
* back without dataclass for repo consistency
* Update configuration_utils.py
* style
* style
* style once again
* add config serialization
* update
* true dataclass
* trigger CIs
* merge compile methods + remove serialization of compile config
2024-12-03 11:20:31 +01:00
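A hedged sketch of how custom compile arguments can be routed through the generation config after this change; the CompileConfig name and its fields are inferred from the commit messages above and should be checked against the generation docs.

```python
# Hedged sketch: custom torch.compile arguments via the generation config.
# CompileConfig and its fields are assumptions drawn from the commit messages above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, CompileConfig, GenerationConfig

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

generation_config = GenerationConfig(
    max_new_tokens=32,
    cache_implementation="static",  # the compiled forward is tied to the static cache path
    compile_config=CompileConfig(fullgraph=True, mode="reduce-overhead"),
)

inputs = tokenizer("Compiled decoding is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, generation_config=generation_config)[0]))
```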
wwwbai
f9c7e6021e
Translate bertology.md into Chinese ( #34908 )
...
* bertology translation
* Update docs/source/zh/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/zh/bertology.md
Co-authored-by: blueingman <15329507600@163.com>
* Update docs/source/zh/bertology.md
Co-authored-by: blueingman <15329507600@163.com>
* Update docs/source/zh/bertology.md
Co-authored-by: Isotr0py <2037008807@qq.com>
* Update docs/source/zh/bertology.md
Co-authored-by: Isotr0py <2037008807@qq.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: blueingman <15329507600@163.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-02 11:42:40 -08:00
Fanli Lin
527dc04e46
[docs] add the missing import for Image and bug fix ( #34776 )
...
* add the missing import for Image lib
* add more devices in comment
* bug fix
2024-12-02 11:40:20 -08:00
Ahmed Almaghz
4955e4e638
[i18n-ar] Translated file: docs/source/ar/notebooks.md into Arabic ( #33049 )
...
* Add docs/source/ar/notebooks.md to Add_docs_source_ar_notebooks.md
* Update notebooks.md
* Update _toctree.yml
2024-12-02 11:40:04 -08:00
secrettoad
f0dec874f0
add docstring example for compute_loss_func ( #35020 )
2024-12-02 11:39:09 -08:00
Henry Hyeonmok Ko
31299670cd
Multiple typo fixes in Tutorials docs ( #35035 )
...
* Fixed typo in multi gpu docs and OLMoE version
* Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction
* Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs
2024-12-02 15:26:34 +00:00
Dmitry Rogozhkin
31830474bf
Fix test_eager_matches_sdpa_inference for XPU backend ( #34889 )
...
* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* Fix test_eager_matches_sdpa_inference for XPU backend
As of PyTorch 2.5, the XPU backend supports only torch.nn.attention.SDPBackend.MATH,
which is implemented at the PyTorch level using aten operators and is device-agnostic
with respect to the implementation of each aten operator. Thus, we can reuse the
CUDA (or CPU) MATH weights for XPU.
Fixes : #34888
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
---------
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-12-02 16:21:04 +01:00
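For context, a minimal sketch of the two replacements described in the entry above: the non-deprecated SDPA kernel selector and the device-scoped autocast API. This is illustrative, not the test code itself.

```python
# Sketch: use torch.nn.attention.sdpa_kernel instead of the deprecated
# torch.backends.cuda.sdp_kernel, and torch.amp.autocast instead of
# the deprecated torch.cuda.amp.autocast.
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

q = k = v = torch.randn(1, 4, 16, 8)

# Force the MATH backend (the only SDPA backend XPU supports as of PyTorch 2.5).
with sdpa_kernel(SDPBackend.MATH):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)

# Device-scoped autocast replaces torch.cuda.amp.autocast.
device_type = "cuda" if torch.cuda.is_available() else "cpu"
with torch.amp.autocast(device_type, dtype=torch.bfloat16):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
```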
Jacky Lee
f41d5d8f74
Add type hints for forward functions in Gemma2 ( #35034 )
...
* feat: add gemma2 type hints
* fix: mask is optional
2024-12-02 14:03:36 +00:00
Bojun Feng
7b5f76e32e
Typo in warning switching to optimum-quanto ( #35028 )
...
fix typos
2024-12-02 13:47:05 +00:00
milesial
c24c79ebf9
Optimize memory usage of mllama encoder ( #34930 )
...
mllama encoder memory optimization
2024-12-02 11:46:45 +01:00
Weize Chen
9ab8c5b503
fix variable undefined bug when return_tensors is not specified in llava processing ( #34953 )
...
* fix variable undefined bug when return_tensors is not specified in llava processor
* improve readability
2024-12-02 11:44:42 +01:00
Joshua Lochner
3480cbb97e
Only cast cu_seqlens when tracing ( #35016 )
...
* Only cast `cu_seqlens` when tracing
* Formatting
2024-12-02 11:39:39 +01:00
Alvaro Bartolome
19dabe9636
Update FillMaskPipeline.__call__ signature and docstring ( #35006 )
...
Update `FillMaskPipeline.__call__`
- Remove unused `*args`
- Update docstring with `inputs` over `args`
2024-11-29 13:44:56 +00:00
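For reference, a small usage sketch of the updated call: the masked text is passed as `inputs` (a string or list of strings), with no stray *args. The model name is illustrative.

```python
# Sketch: calling the fill-mask pipeline with `inputs` only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="google-bert/bert-base-uncased")  # illustrative model
for prediction in fill_mask("Paris is the [MASK] of France.", top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```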
Samuel Larkin
f7427f58ed
fix: double verbs ( #35008 )
2024-11-29 13:19:57 +00:00
Pavel Iakubovskii
737f4dc4b6
Update timm version ( #35005 )
...
* Bump timm
* dev-ci
2024-11-29 12:46:59 +00:00
Tibor Reiss
89d7bf584f
🚨 🚨 🚨 Uniformize kwargs for TrOCR Processor ( #34587 )
...
* Make kwargs uniform for TrOCR
* Add tests
* Put back current_processor
* Remove args
* Add todo comment
* Code review - breaking change
2024-11-29 11:58:11 +00:00
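Since the TrOCR change is flagged as breaking, a sketch of the uniformized keyword-argument call style; the checkpoint and the specific kwargs shown are illustrative.

```python
# Sketch: calling TrOCRProcessor with explicit keyword arguments instead of
# positional *args (illustrative checkpoint and kwargs).
from PIL import Image
from transformers import TrOCRProcessor

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
image = Image.new("RGB", (384, 384))  # stand-in for a real text-line image

inputs = processor(images=image, text="hello world", return_tensors="pt", padding=True)
print(inputs.keys())
```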
Lucain
0b5b5e6a70
Let server decide default repo visibility ( #34999 )
...
* Let server decide default repo visibility
* code style
2024-11-28 17:05:08 +01:00
Mohamed Mekkouri
f491096f7d
Fix docker CI : install autogptq from source ( #35000 )
...
* Fixed Docker
* Test ci
* Finally
* add comment
2024-11-28 16:31:36 +01:00
Pavel Iakubovskii
01ad80f820
Improve .from_pretrained type annotations ( #34973 )
...
* Fix from_pretrained type annotations
* Better typing for image processor's `from_pretrained`
2024-11-28 15:05:19 +00:00
Michael Goin
9d6f0ddcec
Add optimized PixtralImageProcessorFast ( #34836 )
...
* Add optimized PixtralImageProcessorFast
* make style
* Add dummy_vision_object
* Review comments
* Format
* Fix dummy
* Format
* np.ceil for math.ceil
2024-11-28 16:04:05 +01:00
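A short sketch of opting into the fast processor; `use_fast=True` requires torchvision, and the checkpoint name is illustrative.

```python
# Sketch: selecting the torchvision-backed fast Pixtral image processor.
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("mistral-community/pixtral-12b", use_fast=True)
print(type(processor).__name__)  # expected: PixtralImageProcessorFast
```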
Yih-Dar
6300212946
Fix utils/check_bad_commit.py (for auto ping in CI) ( #34943 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-28 15:34:38 +01:00
Raushan Turganbay
5e8c1d713d
Offloaded cache: fix generate ( #34921 )
...
* fix cache impl
* require_torch_gpu
* fix mamba
* fix copies
2024-11-28 15:05:56 +01:00
George
57ca9e6d2f
Allow compressed-tensors quantized model to be trained ( #34520 )
...
* populate quantization_config for kv-cache-scheme only configs
* make compressed-tensors quantized models trainable
* populate versions on quant config
* pass oneshot then finetune
* remove breakpoint
* SunMarc comments and fix to_dict logic
* lint
* lint
* test
* comment
* comments
2024-11-28 15:05:16 +01:00
xinpengzz
44af935ec5
Refine the code of Universal Assisted Generation ( #34823 )
...
* removed the useless attributes
* add configs for window size
* fixed the wrong kwargs
* added docstring
2024-11-28 15:04:24 +01:00
Oscar Skean
2b053fdf1a
🚨 🚨 🚨 Changed DINOv2Config default patch size to 14 ( #34568 )
...
Changed DINOv2Config default patch size to 14
2024-11-28 14:48:06 +01:00
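Because this entry changes a default, a sketch of what it means for configs built from scratch; pretrained checkpoints store their own patch_size in config.json and are not affected. The class is exported as Dinov2Config.

```python
# Sketch: the default patch size of a from-scratch Dinov2Config is now 14;
# pass patch_size=16 explicitly to keep the previous default.
from transformers import Dinov2Config

config = Dinov2Config()
legacy_config = Dinov2Config(patch_size=16)
print(config.patch_size, legacy_config.patch_size)  # 14 16
```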
Kyle Sayers
4f0bf9864c
Fix save_pretrained for partially offloaded models ( #34890 )
...
* delete unnecessary reference
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* update comment, explicit delete state_dict
* Update src/transformers/modeling_utils.py
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* fix style
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
---------
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-11-28 14:46:56 +01:00
Benjamin Bossan
f4b674f269
[PEFT] Set eval mode when loading PEFT adapter ( #34509 )
...
* [PEFT] Set eval mode when loading PEFT adapter
Resolves #34469
When calling model.load_adapter to load a PEFT adapter, by default the
adapter should be set to eval mode. This is now correctly done. Users
can still pass is_trainable=True to load the adapter in training mode.
* Linter
2024-11-28 13:56:25 +01:00
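A sketch of the behavior described in the entry above; the adapter repo id is illustrative and `is_trainable=True` is taken from the commit message, so the exact plumbing may differ.

```python
# Sketch: adapters loaded via load_adapter land in eval mode by default;
# pass is_trainable=True (per the commit message) to keep them trainable.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model.load_adapter("ybelkada/opt-350m-lora")  # eval mode by default
model.load_adapter("ybelkada/opt-350m-lora", adapter_name="train_me", is_trainable=True)
```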
Sergio Paniego Blanco
5523e38b55
Fixed typo in VisitWebpageTool ( #34978 )
...
Fixed typo in VisitWebpageTool
2024-11-27 12:49:21 -08:00
Xiao Yuan
4120cb257f
Fix typo in code block in vipllava.md ( #34957 )
...
fix typo in code block in vipllava.md
2024-11-27 08:19:34 -08:00
blueingman
2910015d6d
[i18n-zh] Translated perf_train_special.md into Chinese ( #34948 )
...
* Add translation for perf_train_special documentation
* Update docs/source/zh/perf_train_special.md
Co-authored-by: Isotr0py <2037008807@qq.com>
* Update docs/source/zh/perf_train_special.md
Co-authored-by: Isotr0py <2037008807@qq.com>
* Update _toctree.yml
* Update _toctree.yml
* Update perf_train_special.md
* Update perf_train_special.md
---------
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-11-27 07:57:43 -08:00
Fanli Lin
637225508f
[docs] add explanation to release_memory() ( #34911 )
...
* explain release_memory
* Update docs/source/en/llm_tutorial_optimization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-27 07:47:28 -08:00
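For context, a hedged stand-in for the kind of cleanup release_memory() refers to in that tutorial: drop the large references, run the garbage collector, and clear the CUDA caching allocator. The helper below is illustrative, not the documented function.

```python
# Hedged sketch of a release_memory-style helper (illustrative, not the tutorial's code).
import gc
import torch

def release_memory(*objects):
    """Drop the given references, collect garbage, and clear the CUDA cache."""
    objects = [None for _ in objects]
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return objects

# Usage: reassign so the caller's references are actually dropped, e.g.
# model, optimizer = release_memory(model, optimizer)
```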
MaCAT
0600f46353
🌐 [i18n-KO] Translated encoder-decoder.md to Korean ( #34880 )
...
* Initial version of translation, English still remaining
* Revised translation, removed English. _toctree not updated
* updated _toctree.yml && 3rd ver translation
* updated _toctree.yml && 3rd ver translation
* Update encoder-decoder.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
* Update encoder-decoder.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
* Update encoder-decoder.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
* Update encoder-decoder.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
* Update encoder-decoder.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
* Update encoder-decoder.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
---------
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2024-11-27 07:47:14 -08:00
Yih-Dar
5f8b24ee12
Fix flaky test execution caused by Thread ( #34966 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 16:32:50 +01:00
Yih-Dar
0d99a938aa
Avoid calling get_max_length ( #34971 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 15:15:35 +01:00
Mohamed Mekkouri
8f48ccf548
Fix : Add PEFT from source to CI docker ( #34969 )
...
* Docker fix peft
* Test new docker
* uncomment
2024-11-27 14:10:47 +01:00
Arthur
4c1388f48e
[FlexAttention] Update gemma2 ( #34942 )
...
* update tests
* now maybe this fixes the previously failing tests!
* nit default
* Update src/transformers/models/gemma2/modular_gemma2.py
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
* fix-copies
---------
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2024-11-27 11:50:48 +01:00
blueingman
6c3f168b36
[i18n-zh] Translated tiktoken.md into Chinese ( #34936 )
...
* Add translation for tiktoken documentation
* Update tiktoken.md
* Update tiktoken.md
2024-11-26 10:09:52 -08:00
谭九鼎
5bfb40bc8e
docs: HUGGINGFACE_HUB_CACHE -> HF_HUB_CACHE ( #34904 )
2024-11-26 09:37:18 -08:00
Fanli Lin
784d22078a
[doc] use full path for run_qa.py ( #34914 )
...
use full path for run_qa.py
2024-11-26 09:23:44 -08:00
Fanli Lin
6bc0c219c1
[docs] use device-agnostic API instead of cuda ( #34913 )
...
add device-agnostic API
Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-11-26 09:23:34 -08:00