Commit Graph

15231 Commits

Author SHA1 Message Date
Yih-Dar
44fe1a1cc4
Avoid using uncessary get_values(MODEL_MAPPING) (#29362)
* more fixes

* more fixes

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-29 17:19:17 +08:00
Younes Belkada
b647acdb53
FIX [CI] require_read_token in the llama FA2 test (#29361)
Update test_modeling_llama.py
2024-02-29 04:49:01 +01:00
Younes Belkada
8d8ac9c2df
FIX [CI]: Fix failing tests for peft integration (#29330)
fix failing tests for peft integration
2024-02-29 03:56:16 +01:00
Younes Belkada
1aee9afd1c
FIX [CI / starcoder2] Change starcoder2 path to correct one for slow tests (#29359)
change starcoder2 path to correct one
2024-02-29 03:52:13 +01:00
Michael
2209b7afa0
[i18n-zh] Sync source/zh/index.md (#29331)
* [i18n-zh] Sync source/zh/index.md

* apply review comments
2024-02-28 09:41:18 -08:00
fxmarty
49204c1d37
Better SDPA unmasking implementation (#29318)
* better unmask imple

* comment

* typo

* bug report pytorch

* cleanup

* fix import

* add back example

* retrigger ci

* come on
2024-02-28 16:36:47 +01:00
Marc Sun
f54d82cace
[CI] Quantization workflow (#29046)
* [CI] Quantization workflow

* build dockerfile

* fix dockerfile

* update self-cheduled.yml

* test build dockerfile on push

* fix torch install

* udapte to python 3.10

* update aqlm version

* uncomment build dockerfile

* tests if the scheduler works

* fix docker

* do not trigger on psuh again

* add additional runs

* test again

* all good

* style

* Update .github/workflows/self-scheduled.yml

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* test build dockerfile with torch 2.2.0

* fix extra

* clean

* revert changes

* Revert "revert changes"

This reverts commit 4cb52b8822.

* revert correct change

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-28 10:09:25 -05:00
jiqing-feng
554e7ada89
check if position_ids exists before using it (#29306)
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-02-28 14:56:25 +00:00
Daniel Han
d3a4b47544
RoPE loses precision for Llama / Gemma + Gemma logits.float() (#29285)
* Update modeling_llama.py

Llama - Force float32 since bfloat16 loses precision on long contexts

* Update modeling_llama.py

* Update modeling_gemma.py

Fix RoPE and logits.float()

* @torch.no_grad()

* @torch.no_grad()

* Cos, Sin to float32

* cos, sin to float32

* Update src/transformers/models/gemma/modeling_gemma.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Resolve PR conflicts

* Fix RoPE for llama

* Revert "Fix RoPE for llama"

This reverts commit b860a22dab.

* Fix RoPE for llama

* RoPE device

* Autocast device type

* RoPE

* RoPE isinstance

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-28 15:16:53 +01:00
Joao Gante
7628b3a0f4
Idefics: generate fix (#29320) 2024-02-28 11:34:54 +00:00
Leonardo Emili
2ce56d35f6
Disable Mixtral output_router_logits during inference (#29249)
* Set output_router_logits=False in prepare_inputs_for_generation for mixtral

* Add output_router_logits=False to prepare_inputs_for_generation for mixtral

* Fix style
2024-02-28 11:16:15 +01:00
Arthur
8a8a0a4ae0
[Llama ROPE] Fix torch export but also slow downs in forward (#29198)
* remove control flow

* update gptneox

* update ....

* nits

* Actually let's just break. Otherwise we are silently failing which imo is not optimal

* version BC

* fix tests

* fix eager causal

* nit

* add a test

* style

* nits

* nits

* more nits for the test

* update and fix

* make sure cuda graphs are not skipped

* read token is needed for meta llama

* update!

* fiixup

* compile test should be slow

* fix thet fix copies

* stle 🫠
2024-02-28 10:45:53 +01:00
Arthur
7c87f3577e
[T5 and Llama Tokenizer] remove warning (#29346)
* remove warning

* add co-author

* update

---------

Co-authored-by: hiaoxui <hiaoxui@users.noreply.github.com>
2024-02-28 10:41:58 +01:00
Arthur
a52888524d
[require_read_token] fix typo (#29345)
fix wrapper
2024-02-28 10:13:57 +01:00
fxmarty
e715c78c66
Remove numpy usage from owlvit (#29326)
* remove numpy usage from owlvit

* fix init owlv2

* style
2024-02-28 09:38:44 +01:00
Younes Belkada
ad00c482c7
FIX [Gemma / CI] Make sure our runners have access to the model (#29242)
* pu hf token in gemma tests

* update suggestion

* add to flax

* revert

* fix

* fixup

* forward contrib credits from discussion

---------

Co-authored-by: ArthurZucker <ArthurZucker@users.noreply.github.com>
2024-02-28 06:25:23 +01:00
Jared Van Bortel
bd5b986306
simplify get_class_in_module and fix for paths containing a dot (#29262) 2024-02-28 03:10:36 +01:00
RaymondLi0
63caa370e6
Starcoder2 model - bis (#29215)
* Copy model

* changes

* misc

* fixes

* add embed and residual dropout (#30)

* misc

* remove rms norm and gated MLP

* remove copied mentions where its not a copy anymore

* remove unused _shape

* copied from mistral instead

* fix copies

* fix copies

* add not doctested

* fix

* fix copyright

* Update docs/source/en/model_doc/starcoder2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/starcoder2/configuration_starcoder2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/starcoder2/configuration_starcoder2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix doc

* revert some changes

* add fa2 tests

* fix styling nit

* fix

* push dummy docs

---------

Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-28 01:24:34 +01:00
Michael
83ab0115d1
[i18n-zh] Translate fsdp.md into Chinese (#29305)
* [i18n-zh] Translate fsdp.md into Chinese

Signed-off-by: windsonsea <haifeng.yao@daocloud.io>

* apply suggestions from Fan-Lin

---------

Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2024-02-27 11:26:57 -08:00
Sadra Barikbin
227cd54aa5
Fix a few typos in GenerationMixin's docstring (#29277)
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-02-27 18:15:43 +00:00
Raushan Turganbay
ddf7ac4237
Token level timestamps for long-form generation in Whisper (#29148) 2024-02-27 18:15:26 +00:00
Marc Sun
8a1faf2803
Add compatibility with skip_memory_metrics for mps device (#29264)
* Add compatibility with mps device

* fix

* typo and style
2024-02-27 09:58:43 -05:00
Yih-Dar
5c341d4555
Use torch 2.2 for deepspeed CI (#29246)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-27 17:51:37 +08:00
Fanli Lin
63a0c8f1cb
[tests] enable benchmark unit tests on XPU (#29284)
* add xpu for benchmark

* no auto_map

* use require_torch_gpu

* use gpu

* revert

* revert

* fix style
2024-02-27 09:44:48 +00:00
fxmarty
6d3b643e2a
Fix attn_implementation documentation (#29295)
fix
2024-02-27 10:43:01 +01:00
Merve Noyan
83e366bfd4
Image Feature Extraction docs (#28973)
* Image Feature Extraction docs

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update image_feature_extraction.md

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Address comments

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update image_feature_extraction.md

* Update image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
2024-02-27 09:39:58 +00:00
Andrei Panferov
e3fc90ae68
Cleaner Cache dtype and device extraction for CUDA graph generation for quantizers compatibility (#29079)
* input_layernorm as the beacon of hope

* cleaner dtype extraction

* AQLM + CUDA graph test

* is available check

* shorter text test
2024-02-27 09:32:39 +01:00
regisss
a3f9221a44
Add generate kwargs to VQA pipeline (#29134) 2024-02-27 03:03:00 +01:00
FredericOdermatt
871ba71dfa
GenerationConfig validate both constraints and force_words_ids (#29163)
GenerationConfig validate both options for constrained decoding: constraints and force_words_ids
2024-02-27 01:43:52 +01:00
Eduardo Pacheco
3fcfbe7549
Adding SegGPT (#27735)
* First commit

* Improvements

* More improvements

* Converted original checkpoint to HF checkpoint

* Fix style

* Fixed forward

* More improvements

* More improvements

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove asserts

* Remove unnecessary attributes

* Changed model name to camel case

* Improve forward doc

* Improve tests

* More improvements

* Fix copies

* Fix doc

* Make SegGptImageProcessor more flexible

* Added few-shot test

* Fix style

* Update READMEs and docs

* Update READMEs

* Make inputs required

* Add SegGptForImageSegmentation

* Make tests pass

* Rename to out_indicies

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fixed naming convention

* Copying SegGptMlp from modeling_sam.py

* Some minor improvements

* Remove mlp_ratio

* Fix docstrings

* Fixed docstring match

* Objects defined before use

* Storing only patch_size and beta for SegGptLoss

* removed _prepare_inputs method

* Removed modified from headers

* Renamed to output_indicies

* Removed unnecessary einsums

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fixing issues

* Raise error as soon as possible

* More fixes

* Fix merge

* Added palette to SegGptImageProcessor

* Fixed typo

* Fixed shape typo

* Added permute before doing palette to class mapping

* Fixed style

* Fixed and added tests

* Fixed docstrings

* Matching SegFormer API for post_processing_semantic_segmentation

* Fixed copies

* Fixed SegGptImageProcessor to handle both binary and RGB masks

* Updated docstrings of SegGptImageProcessor

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/seggpt.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/convert_seggpt_to_hf.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Object definitions above & fix style

* Renamed output_indices to intermediate_feature_indices

* Removed unnecessary check on bool_masked_pos

* Loss first in the outputs

* Added validation for do_normalize

* Improved SegGptImageProcessor and added new tests

* Added comment

* Added docstrings to SegGptLoss

* Reimplemented ensemble condition logic in SegGptEncoder

* Update src/transformers/models/seggpt/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/convert_seggpt_to_hf.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Updated docstrings to use post_process_semantic_segmentation

* Fixed typo on docstrings

* moved pixel values test to test_image_processing_seggpt

* Addressed comments

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Updated docstrings for SegGptLoss

* Address comments

* Added SegGpt example to model docs

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* moved patchify and unpatchify

* Rename checkpoint

* Renamed intermediate_features to intermediate_hidden_states for consistency

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Replaced post_process_masks for post_process_semantic_segmentation in the docs

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-26 18:17:19 +00:00
Eduardo Pacheco
3b8c053631
Fixed Deformable Detr typo when loading cuda kernels for MSDA (#29294) 2024-02-26 17:24:30 +00:00
Michael
a44d2dc3a9
[i18n-zh] Translated task/asr.md into Chinese (#29233)
* [zh] Translate a task: asr.md

Signed-off-by: windsonsea <haifeng.yao@daocloud.io>

* apply suggestions from Fan-Lin

---------

Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2024-02-26 08:53:05 -08:00
David Nguyen
c29135046a
[i18n-vi] Translate README.md to Vietnamese (#29229)
* Add Tiếng Việt language support

* Add Vietnamese translation link to README.md

* update README_vi.md
2024-02-26 08:42:46 -08:00
Ming Xu (徐明)
734eb25476
🌐 [i18n-ZH] Translate chat_templating.md into Chinese (#28790)
* [Pix2struct] Simplify generation (#22527)

* Add model to doc tests

* Remove generate and replace by prepare_inputs_for_generation

* More fixes

* Remove print statements

* Update integration tests

* Fix generate

* Remove model from auto mapping

* Use auto processor

* Fix integration tests

* Fix test

* Add inference code snippet

* Remove is_encoder_decoder

* Update docs

* Remove notebook link

* Release: v4.28.0

* Revert (for now) the change on `Deta` in #22437 (#22750)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Patch release: v4.28.1

* update zh chat template.

* Update docs/source/zh/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/zh/_toctree.yml

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Michael <haifeng.yao@daocloud.io>
2024-02-26 08:42:24 -08:00
Michael
b43340455d
[i18n-zh] Translated torchscript.md into Chinese (#29234)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2024-02-26 08:27:47 -08:00
Aaron Jimenez
9f7535bda8
[docs] Spanish translation of tasks_explained.md (#29224)
* Add tasks_explained.md to es/

* Fix little typo in en/ version

* translate speach/audio section

* translate part of vision computer section | fix little typo in en/

* Fix little typo in en/

* Translate vision computer section | remove ** ** to * * in both files

* Translate NLP section | fix link to task/translation in en/

* Updete link in es/tasks_summary.md

* Fix task_summary title link
2024-02-26 08:18:15 -08:00
Raushan Turganbay
8f2f0f0f85
Track each row separately for stopping criteria (#29116) 2024-02-26 16:06:16 +00:00
Joao Gante
ece1b62b93
Generate: v4.38 removals and related updates (#29171) 2024-02-26 13:36:12 +00:00
fxmarty
24d59c7969
Use torch.bool instead of torch.int64 for non-persistant causal mask buffer (#29241)
use torch.bool instead of torch.int64
2024-02-26 14:06:43 +01:00
Merve Noyan
7c4995f93d
Add feature extraction mapping for automatic metadata update (#28944)
* add feature extraction mapping

* added prefix

* ruff check

* minor fix

* Update modeling_auto.py

* fix typo

* remove prefix to make variable public/importable

* Update src/transformers/models/auto/modeling_auto.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fixes

* addressed comments

* nit

* fix-copies

* remove from tests

* this should fix

* Update tests/models/convnextv2/test_modeling_convnextv2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* nits

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-26 10:35:37 +00:00
fxmarty
2a7746c4d1
Add non_device_test pytest mark to filter out non-device tests (#29213)
* add conftest

* fix

* remove deselected
2024-02-26 11:05:49 +01:00
Yih-Dar
93f8617afd
Use DS_DISABLE_NINJA=1 (#29290)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-26 17:41:01 +08:00
Benjamin Muskalla
9fe360883e
Cache is_vision_available result (#29280)
Cache `is_vision_available`

This check is used quite often during process in image models and can take up a serious amount of time compared to the other processing steps.
2024-02-26 09:01:45 +00:00
Yih-Dar
c8d98405a8
Use torch 2.2 for daily CI (model tests) (#29208)
* Use torch 2.2 for daily CI (model tests)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-23 21:37:08 +08:00
Matt
371b572e55
Allow remote code repo names to contain "." (#29175)
* stash commit

* stash commit

* It works!

* Remove unnecessary change

* We don't actually need the cache_dir!

* Update docstring

* Add test

* Add test with custom cache dir too

* Update model repo path
2024-02-23 12:46:31 +00:00
Arthur
89c64817ce
[Doc] update model doc qwen2 (#29238)
* update model doc qwen2

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-23 10:43:31 +01:00
Alessandro Palla
3f60d11a87
Improve _update_causal_mask performance (#29210)
* Fix issue 29206

* Fix style
2024-02-23 10:40:44 +01:00
Amin
75ed76ecea
Fix missing translation in README_ru (#29054)
* Fix missing translation in README_ru

* Update README_ru.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

---------

Co-authored-by: Maria Khalusova <kafooster@gmail.com>
2024-02-23 09:26:21 +01:00
cchen-dialpad
4524494072
fix(mlflow): check mlflow version to use the synchronous flag (#29195)
* fix(mlflow): check mlflow version to use the  flag

* fix indent

* add log_params async and fix quality
2024-02-23 09:19:51 +01:00
fxmarty
2cc8cf6ce7
Fix torch.compile with fullgraph=True when attention_mask input is used (#29211)
* fix torch.export.export for llama

* do not change doc title

* make fix copies
2024-02-22 16:40:06 +01:00