transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Yih-Dar	44fe1a1cc4	Avoid using uncessary `get_values(MODEL_MAPPING)` (#29362 ) * more fixes * more fixes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-29 17:19:17 +08:00
Younes Belkada	b647acdb53	FIX [`CI`] `require_read_token` in the llama FA2 test (#29361 ) Update test_modeling_llama.py	2024-02-29 04:49:01 +01:00
Younes Belkada	8d8ac9c2df	FIX [`CI`]: Fix failing tests for peft integration (#29330 ) fix failing tests for peft integration	2024-02-29 03:56:16 +01:00
Younes Belkada	1aee9afd1c	FIX [`CI` / `starcoder2`] Change starcoder2 path to correct one for slow tests (#29359 ) change starcoder2 path to correct one	2024-02-29 03:52:13 +01:00
Michael	2209b7afa0	[i18n-zh] Sync source/zh/index.md (#29331 ) * [i18n-zh] Sync source/zh/index.md * apply review comments	2024-02-28 09:41:18 -08:00
fxmarty	49204c1d37	Better SDPA unmasking implementation (#29318 ) * better unmask imple * comment * typo * bug report pytorch * cleanup * fix import * add back example * retrigger ci * come on	2024-02-28 16:36:47 +01:00
Marc Sun	f54d82cace	[CI] Quantization workflow (#29046 ) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit `4cb52b8822`. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-28 10:09:25 -05:00
jiqing-feng	554e7ada89	check if position_ids exists before using it (#29306 ) Co-authored-by: Joao Gante <joao@huggingface.co>	2024-02-28 14:56:25 +00:00
Daniel Han	d3a4b47544	RoPE loses precision for Llama / Gemma + Gemma logits.float() (#29285 ) * Update modeling_llama.py Llama - Force float32 since bfloat16 loses precision on long contexts * Update modeling_llama.py * Update modeling_gemma.py Fix RoPE and logits.float() * @torch.no_grad() * @torch.no_grad() * Cos, Sin to float32 * cos, sin to float32 * Update src/transformers/models/gemma/modeling_gemma.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Resolve PR conflicts * Fix RoPE for llama * Revert "Fix RoPE for llama" This reverts commit `b860a22dab`. * Fix RoPE for llama * RoPE device * Autocast device type * RoPE * RoPE isinstance --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-28 15:16:53 +01:00
Joao Gante	7628b3a0f4	Idefics: generate fix (#29320 )	2024-02-28 11:34:54 +00:00
Leonardo Emili	2ce56d35f6	Disable Mixtral `output_router_logits` during inference (#29249 ) * Set output_router_logits=False in prepare_inputs_for_generation for mixtral * Add output_router_logits=False to prepare_inputs_for_generation for mixtral * Fix style	2024-02-28 11:16:15 +01:00
Arthur	8a8a0a4ae0	[`Llama ROPE`] Fix torch export but also slow downs in forward (#29198 ) * remove control flow * update gptneox * update .... * nits * Actually let's just break. Otherwise we are silently failing which imo is not optimal * version BC * fix tests * fix eager causal * nit * add a test * style * nits * nits * more nits for the test * update and fix * make sure cuda graphs are not skipped * read token is needed for meta llama * update! * fiixup * compile test should be slow * fix thet fix copies * stle 🫠	2024-02-28 10:45:53 +01:00
Arthur	7c87f3577e	[`T5 and Llama Tokenizer`] remove warning (#29346 ) * remove warning * add co-author * update --------- Co-authored-by: hiaoxui <hiaoxui@users.noreply.github.com>	2024-02-28 10:41:58 +01:00
Arthur	a52888524d	[`require_read_token`] fix typo (#29345 ) fix wrapper	2024-02-28 10:13:57 +01:00
fxmarty	e715c78c66	Remove numpy usage from owlvit (#29326 ) * remove numpy usage from owlvit * fix init owlv2 * style	2024-02-28 09:38:44 +01:00
Younes Belkada	ad00c482c7	FIX [`Gemma` / `CI`] Make sure our runners have access to the model (#29242 ) * pu hf token in gemma tests * update suggestion * add to flax * revert * fix * fixup * forward contrib credits from discussion --------- Co-authored-by: ArthurZucker <ArthurZucker@users.noreply.github.com>	2024-02-28 06:25:23 +01:00
Jared Van Bortel	bd5b986306	simplify get_class_in_module and fix for paths containing a dot (#29262 )	2024-02-28 03:10:36 +01:00
RaymondLi0	63caa370e6	Starcoder2 model - bis (#29215 ) * Copy model * changes * misc * fixes * add embed and residual dropout (#30) * misc * remove rms norm and gated MLP * remove copied mentions where its not a copy anymore * remove unused _shape * copied from mistral instead * fix copies * fix copies * add not doctested * fix * fix copyright * Update docs/source/en/model_doc/starcoder2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix doc * revert some changes * add fa2 tests * fix styling nit * fix * push dummy docs --------- Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-28 01:24:34 +01:00
Michael	83ab0115d1	[i18n-zh] Translate fsdp.md into Chinese (#29305 ) * [i18n-zh] Translate fsdp.md into Chinese Signed-off-by: windsonsea <haifeng.yao@daocloud.io> * apply suggestions from Fan-Lin --------- Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2024-02-27 11:26:57 -08:00
Sadra Barikbin	227cd54aa5	Fix a few typos in `GenerationMixin`'s docstring (#29277 ) Co-authored-by: Joao Gante <joao@huggingface.co>	2024-02-27 18:15:43 +00:00
Raushan Turganbay	ddf7ac4237	Token level timestamps for long-form generation in Whisper (#29148 )	2024-02-27 18:15:26 +00:00
Marc Sun	8a1faf2803	Add compatibility with skip_memory_metrics for mps device (#29264 ) * Add compatibility with mps device * fix * typo and style	2024-02-27 09:58:43 -05:00
Yih-Dar	5c341d4555	Use torch 2.2 for deepspeed CI (#29246 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-27 17:51:37 +08:00
Fanli Lin	63a0c8f1cb	[tests] enable benchmark unit tests on XPU (#29284 ) * add xpu for benchmark * no auto_map * use require_torch_gpu * use gpu * revert * revert * fix style	2024-02-27 09:44:48 +00:00
fxmarty	6d3b643e2a	Fix `attn_implementation` documentation (#29295 ) fix	2024-02-27 10:43:01 +01:00
Merve Noyan	83e366bfd4	Image Feature Extraction docs (#28973 ) * Image Feature Extraction docs * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update image_feature_extraction.md * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address comments * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update image_feature_extraction.md * Update image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Maria Khalusova <kafooster@gmail.com>	2024-02-27 09:39:58 +00:00
Andrei Panferov	e3fc90ae68	Cleaner Cache `dtype` and `device` extraction for CUDA graph generation for quantizers compatibility (#29079 ) * input_layernorm as the beacon of hope * cleaner dtype extraction * AQLM + CUDA graph test * is available check * shorter text test	2024-02-27 09:32:39 +01:00
regisss	a3f9221a44	Add generate kwargs to VQA pipeline (#29134 )	2024-02-27 03:03:00 +01:00
FredericOdermatt	871ba71dfa	GenerationConfig validate both constraints and force_words_ids (#29163 ) GenerationConfig validate both options for constrained decoding: constraints and force_words_ids	2024-02-27 01:43:52 +01:00
Eduardo Pacheco	3fcfbe7549	Adding SegGPT (#27735 ) * First commit * Improvements * More improvements * Converted original checkpoint to HF checkpoint * Fix style * Fixed forward * More improvements * More improvements * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove asserts * Remove unnecessary attributes * Changed model name to camel case * Improve forward doc * Improve tests * More improvements * Fix copies * Fix doc * Make SegGptImageProcessor more flexible * Added few-shot test * Fix style * Update READMEs and docs * Update READMEs * Make inputs required * Add SegGptForImageSegmentation * Make tests pass * Rename to out_indicies * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fixed naming convention * Copying SegGptMlp from modeling_sam.py * Some minor improvements * Remove mlp_ratio * Fix docstrings * Fixed docstring match * Objects defined before use * Storing only patch_size and beta for SegGptLoss * removed _prepare_inputs method * Removed modified from headers * Renamed to output_indicies * Removed unnecessary einsums * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixing issues * Raise error as soon as possible * More fixes * Fix merge * Added palette to SegGptImageProcessor * Fixed typo * Fixed shape typo * Added permute before doing palette to class mapping * Fixed style * Fixed and added tests * Fixed docstrings * Matching SegFormer API for post_processing_semantic_segmentation * Fixed copies * Fixed SegGptImageProcessor to handle both binary and RGB masks * Updated docstrings of SegGptImageProcessor * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/seggpt.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Object definitions above & fix style * Renamed output_indices to intermediate_feature_indices * Removed unnecessary check on bool_masked_pos * Loss first in the outputs * Added validation for do_normalize * Improved SegGptImageProcessor and added new tests * Added comment * Added docstrings to SegGptLoss * Reimplemented ensemble condition logic in SegGptEncoder * Update src/transformers/models/seggpt/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updated docstrings to use post_process_semantic_segmentation * Fixed typo on docstrings * moved 
pixel values test to test_image_processing_seggpt * Addressed comments * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated docstrings for SegGptLoss * Address comments * Added SegGpt example to model docs * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * moved patchify and unpatchify * Rename checkpoint * Renamed intermediate_features to intermediate_hidden_states for consistency * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Replaced post_process_masks for post_process_semantic_segmentation in the docs --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels <niels.rogge1@gmail.com> Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-26 18:17:19 +00:00
Eduardo Pacheco	3b8c053631	Fixed Deformable Detr typo when loading cuda kernels for MSDA (#29294 )	2024-02-26 17:24:30 +00:00
Michael	a44d2dc3a9	[i18n-zh] Translated task/asr.md into Chinese (#29233 ) * [zh] Translate a task: asr.md Signed-off-by: windsonsea <haifeng.yao@daocloud.io> * apply suggestions from Fan-Lin --------- Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2024-02-26 08:53:05 -08:00
David Nguyen	c29135046a	[i18n-vi] Translate README.md to Vietnamese (#29229 ) * Add Tiếng Việt language support * Add Vietnamese translation link to README.md * update README_vi.md	2024-02-26 08:42:46 -08:00
Ming Xu (徐明)	734eb25476	🌐 [i18n-ZH] Translate chat_templating.md into Chinese (#28790 ) * [Pix2struct] Simplify generation (#22527) * Add model to doc tests * Remove generate and replace by prepare_inputs_for_generation * More fixes * Remove print statements * Update integration tests * Fix generate * Remove model from auto mapping * Use auto processor * Fix integration tests * Fix test * Add inference code snippet * Remove is_encoder_decoder * Update docs * Remove notebook link * Release: v4.28.0 * Revert (for now) the change on `Deta` in #22437 (#22750) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Patch release: v4.28.1 * update zh chat template. * Update docs/source/zh/chat_templating.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/zh/_toctree.yml Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Michael <haifeng.yao@daocloud.io>	2024-02-26 08:42:24 -08:00
Michael	b43340455d	[i18n-zh] Translated torchscript.md into Chinese (#29234 ) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2024-02-26 08:27:47 -08:00
Aaron Jimenez	9f7535bda8	[docs] Spanish translation of tasks_explained.md (#29224 ) * Add tasks_explained.md to es/ * Fix little typo in en/ version * translate speach/audio section * translate part of vision computer section \| fix little typo in en/ * Fix little typo in en/ * Translate vision computer section \| remove to * * in both files * Translate NLP section \| fix link to task/translation in en/ * Updete link in es/tasks_summary.md * Fix task_summary title link	2024-02-26 08:18:15 -08:00
Raushan Turganbay	8f2f0f0f85	Track each row separately for stopping criteria (#29116 )	2024-02-26 16:06:16 +00:00
Joao Gante	ece1b62b93	Generate: v4.38 removals and related updates (#29171 )	2024-02-26 13:36:12 +00:00
fxmarty	24d59c7969	Use `torch.bool` instead of `torch.int64` for non-persistant causal mask buffer (#29241 ) use torch.bool instead of torch.int64	2024-02-26 14:06:43 +01:00
Merve Noyan	7c4995f93d	Add feature extraction mapping for automatic metadata update (#28944 ) * add feature extraction mapping * added prefix * ruff check * minor fix * Update modeling_auto.py * fix typo * remove prefix to make variable public/importable * Update src/transformers/models/auto/modeling_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixes * addressed comments * nit * fix-copies * remove from tests * this should fix * Update tests/models/convnextv2/test_modeling_convnextv2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-26 10:35:37 +00:00
fxmarty	2a7746c4d1	Add `non_device_test` pytest mark to filter out non-device tests (#29213 ) * add conftest * fix * remove deselected	2024-02-26 11:05:49 +01:00
Yih-Dar	93f8617afd	Use `DS_DISABLE_NINJA=1` (#29290 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-26 17:41:01 +08:00
Benjamin Muskalla	9fe360883e	Cache `is_vision_available` result (#29280 ) Cache `is_vision_available` This check is used quite often during process in image models and can take up a serious amount of time compared to the other processing steps.	2024-02-26 09:01:45 +00:00
Yih-Dar	c8d98405a8	Use torch 2.2 for daily CI (model tests) (#29208 ) * Use torch 2.2 for daily CI (model tests) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-23 21:37:08 +08:00
Matt	371b572e55	Allow remote code repo names to contain "." (#29175 ) * stash commit * stash commit * It works! * Remove unnecessary change * We don't actually need the cache_dir! * Update docstring * Add test * Add test with custom cache dir too * Update model repo path	2024-02-23 12:46:31 +00:00
Arthur	89c64817ce	[`Doc`] update model doc qwen2 (#29238 ) * update model doc qwen2 * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-23 10:43:31 +01:00
Alessandro Palla	3f60d11a87	Improve _update_causal_mask performance (#29210 ) * Fix issue 29206 * Fix style	2024-02-23 10:40:44 +01:00
Amin	75ed76ecea	Fix missing translation in README_ru (#29054 ) * Fix missing translation in README_ru * Update README_ru.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> --------- Co-authored-by: Maria Khalusova <kafooster@gmail.com>	2024-02-23 09:26:21 +01:00
cchen-dialpad	4524494072	fix(mlflow): check mlflow version to use the synchronous flag (#29195 ) * fix(mlflow): check mlflow version to use the flag * fix indent * add log_params async and fix quality	2024-02-23 09:19:51 +01:00
fxmarty	2cc8cf6ce7	Fix `torch.compile` with `fullgraph=True` when `attention_mask` input is used (#29211 ) * fix torch.export.export for llama * do not change doc title * make fix copies	2024-02-22 16:40:06 +01:00

15231 Commits