transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-19 12:38:23 +06:00

Author	SHA1	Message	Date
Fanli Lin	63a0c8f1cb	[tests] enable benchmark unit tests on XPU (#29284) * add xpu for benchmark * no auto_map * use require_torch_gpu * use gpu * revert * revert * fix style	2024-02-27 09:44:48 +00:00
fxmarty	6d3b643e2a	Fix `attn_implementation` documentation (#29295) fix	2024-02-27 10:43:01 +01:00
Merve Noyan	83e366bfd4	Image Feature Extraction docs (#28973 ) * Image Feature Extraction docs * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update image_feature_extraction.md * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address comments * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update image_feature_extraction.md * Update image_feature_extraction.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Maria Khalusova <kafooster@gmail.com>	2024-02-27 09:39:58 +00:00
Andrei Panferov	e3fc90ae68	Cleaner Cache `dtype` and `device` extraction for CUDA graph generation for quantizers compatibility (#29079) * input_layernorm as the beacon of hope * cleaner dtype extraction * AQLM + CUDA graph test * is available check * shorter text test	2024-02-27 09:32:39 +01:00
regisss	a3f9221a44	Add generate kwargs to VQA pipeline (#29134)	2024-02-27 03:03:00 +01:00
FredericOdermatt	871ba71dfa	GenerationConfig validate both constraints and force_words_ids (#29163) GenerationConfig validate both options for constrained decoding: constraints and force_words_ids	2024-02-27 01:43:52 +01:00
Eduardo Pacheco	3fcfbe7549	Adding SegGPT (#27735 ) * First commit * Improvements * More improvements * Converted original checkpoint to HF checkpoint * Fix style * Fixed forward * More improvements * More improvements * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove asserts * Remove unnecessary attributes * Changed model name to camel case * Improve forward doc * Improve tests * More improvements * Fix copies * Fix doc * Make SegGptImageProcessor more flexible * Added few-shot test * Fix style * Update READMEs and docs * Update READMEs * Make inputs required * Add SegGptForImageSegmentation * Make tests pass * Rename to out_indicies * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fixed naming convention * Copying SegGptMlp from modeling_sam.py * Some minor improvements * Remove mlp_ratio * Fix docstrings * Fixed docstring match * Objects defined before use * Storing only patch_size and beta for SegGptLoss * removed _prepare_inputs method * Removed modified from headers * Renamed to output_indicies * Removed unnecessary einsums * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixing issues * Raise error as soon as possible * More fixes * Fix merge * Added palette to SegGptImageProcessor * Fixed typo * Fixed shape typo * Added permute before doing palette to class mapping * Fixed style * Fixed and added tests * Fixed docstrings * Matching SegFormer API for post_processing_semantic_segmentation * Fixed copies * Fixed SegGptImageProcessor to handle both binary and RGB masks * Updated docstrings of SegGptImageProcessor * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/seggpt.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Object definitions above & fix style * Renamed output_indices to intermediate_feature_indices * Removed unnecessary check on bool_masked_pos * Loss first in the outputs * Added validation for do_normalize * Improved SegGptImageProcessor and added new tests * Added comment * Added docstrings to SegGptLoss * Reimplemented ensemble condition logic in SegGptEncoder * Update src/transformers/models/seggpt/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updated docstrings to use post_process_semantic_segmentation * Fixed typo on docstrings * moved 
pixel values test to test_image_processing_seggpt * Addressed comments * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated docstrings for SegGptLoss * Address comments * Added SegGpt example to model docs * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * moved patchify and unpatchify * Rename checkpoint * Renamed intermediate_features to intermediate_hidden_states for consistency * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Replaced post_process_masks for post_process_semantic_segmentation in the docs --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels <niels.rogge1@gmail.com> Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-26 18:17:19 +00:00
Eduardo Pacheco	3b8c053631	Fixed Deformable Detr typo when loading cuda kernels for MSDA (#29294)	2024-02-26 17:24:30 +00:00
Michael	a44d2dc3a9	[i18n-zh] Translated task/asr.md into Chinese (#29233) * [zh] Translate a task: asr.md Signed-off-by: windsonsea <haifeng.yao@daocloud.io> * apply suggestions from Fan-Lin --------- Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2024-02-26 08:53:05 -08:00
David Nguyen	c29135046a	[i18n-vi] Translate README.md to Vietnamese (#29229) * Add Tiếng Việt language support * Add Vietnamese translation link to README.md * update README_vi.md	2024-02-26 08:42:46 -08:00
Ming Xu (徐明)	734eb25476	🌐 [i18n-ZH] Translate chat_templating.md into Chinese (#28790 ) * [Pix2struct] Simplify generation (#22527) * Add model to doc tests * Remove generate and replace by prepare_inputs_for_generation * More fixes * Remove print statements * Update integration tests * Fix generate * Remove model from auto mapping * Use auto processor * Fix integration tests * Fix test * Add inference code snippet * Remove is_encoder_decoder * Update docs * Remove notebook link * Release: v4.28.0 * Revert (for now) the change on `Deta` in #22437 (#22750) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Patch release: v4.28.1 * update zh chat template. * Update docs/source/zh/chat_templating.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/zh/_toctree.yml Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by: Michael <haifeng.yao@daocloud.io> --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Michael <haifeng.yao@daocloud.io>	2024-02-26 08:42:24 -08:00
Michael	b43340455d	[i18n-zh] Translated torchscript.md into Chinese (#29234) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2024-02-26 08:27:47 -08:00
Aaron Jimenez	9f7535bda8	[docs] Spanish translation of tasks_explained.md (#29224) * Add tasks_explained.md to es/ * Fix little typo in en/ version * translate speach/audio section * translate part of vision computer section | fix little typo in en/ * Fix little typo in en/ * Translate vision computer section | remove to * * in both files * Translate NLP section | fix link to task/translation in en/ * Updete link in es/tasks_summary.md * Fix task_summary title link	2024-02-26 08:18:15 -08:00
Raushan Turganbay	8f2f0f0f85	Track each row separately for stopping criteria (#29116)	2024-02-26 16:06:16 +00:00
Joao Gante	ece1b62b93	Generate: v4.38 removals and related updates (#29171)	2024-02-26 13:36:12 +00:00
fxmarty	24d59c7969	Use `torch.bool` instead of `torch.int64` for non-persistant causal mask buffer (#29241) use torch.bool instead of torch.int64	2024-02-26 14:06:43 +01:00
Merve Noyan	7c4995f93d	Add feature extraction mapping for automatic metadata update (#28944) * add feature extraction mapping * added prefix * ruff check * minor fix * Update modeling_auto.py * fix typo * remove prefix to make variable public/importable * Update src/transformers/models/auto/modeling_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixes * addressed comments * nit * fix-copies * remove from tests * this should fix * Update tests/models/convnextv2/test_modeling_convnextv2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-26 10:35:37 +00:00
fxmarty	2a7746c4d1	Add `non_device_test` pytest mark to filter out non-device tests (#29213) * add conftest * fix * remove deselected	2024-02-26 11:05:49 +01:00
Yih-Dar	93f8617afd	Use `DS_DISABLE_NINJA=1` (#29290) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-26 17:41:01 +08:00
Benjamin Muskalla	9fe360883e	Cache `is_vision_available` result (#29280) Cache `is_vision_available` This check is used quite often during process in image models and can take up a serious amount of time compared to the other processing steps.	2024-02-26 09:01:45 +00:00
Yih-Dar	c8d98405a8	Use torch 2.2 for daily CI (model tests) (#29208) * Use torch 2.2 for daily CI (model tests) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-23 21:37:08 +08:00
Matt	371b572e55	Allow remote code repo names to contain "." (#29175) * stash commit * stash commit * It works! * Remove unnecessary change * We don't actually need the cache_dir! * Update docstring * Add test * Add test with custom cache dir too * Update model repo path	2024-02-23 12:46:31 +00:00
Arthur	89c64817ce	[`Doc`] update model doc qwen2 (#29238) * update model doc qwen2 * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-23 10:43:31 +01:00
Alessandro Palla	3f60d11a87	Improve _update_causal_mask performance (#29210) * Fix issue 29206 * Fix style	2024-02-23 10:40:44 +01:00
Amin	75ed76ecea	Fix missing translation in README_ru (#29054) * Fix missing translation in README_ru * Update README_ru.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> --------- Co-authored-by: Maria Khalusova <kafooster@gmail.com>	2024-02-23 09:26:21 +01:00
cchen-dialpad	4524494072	fix(mlflow): check mlflow version to use the synchronous flag (#29195) * fix(mlflow): check mlflow version to use the flag * fix indent * add log_params async and fix quality	2024-02-23 09:19:51 +01:00
fxmarty	2cc8cf6ce7	Fix `torch.compile` with `fullgraph=True` when `attention_mask` input is used (#29211) * fix torch.export.export for llama * do not change doc title * make fix copies	2024-02-22 16:40:06 +01:00
NielsRogge	dabe855668	[Mistral, Mixtral] Improve docs (#29084) * Improve docs * Improve chat template	2024-02-22 11:48:01 +01:00
Sanchit Gandhi	2a9b1f80c4	[Gemma] Fix eager attention (#29187) * fix modelling code * add tests * fix tests * add some logit tests * style * fix fix	2024-02-22 01:07:52 +01:00
Andrei Panferov	fc37f38915	Add training version check for AQLM quantizer. (#29142) * training version check * warn old aqlm * aqlm 1.0.2 real * docs	2024-02-21 17:09:36 +01:00
Younes Belkada	ae49b218c3	FIX [`Gemma`] Fix bad rebase with transformers main (#29170) fix bad rebase	2024-02-21 14:56:34 +01:00
Arthur	594c1277b2	[ `gemma`] Adds support for Gemma 💎 (#29167 ) * inital commit * update * update conversion checkpoint * update conversion script * nits * some fixes * nits * merge * fix permute * nits * fix * nits * nits * nits * fix rope * fix both rope * nites * style * make sure flax works * fix flax init code * fix foward * nits * print flax generation out * current code * nits * SIIIIIIIIIIIIIIIIIII * update * add new tokenizer * correct fast tokenizer * fix conversion * more comments * fix modeling and conversion * nits and nits * nits testing * add some tokenization tests * add some edge cases * add slow tests and fix them * fixup * fix copies for modeling * fix copies * add 7B slow tests * fix * fix * fix tests * make tokenizer cis go green * styling * last tokenizer nits * update jax tests * fix flax for 7b * add jit testing 🤗 * cleanups * isolated nit, inv_freq for rotary_emb.inv_freq * propagate to jax * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adjust test * fix conversion script * change name * correct file names * update conversion script * Fix bos and eos token ids in the model configuration (#3) * update modelling * update conversion script * add static cache for gemma * fix sdpa generate * fix batched * multiple fixes * fix FA2 * final fix * Rename a few missing strings and filenames (#4) * merge with upstream main * fix copies * fix copies * fix fixup * fix fixup * fix * fix * final tests * fix fx gemma tests * fix fx bf16/fp16 tests * update slow fx tests * fx slow tests: one logits, one generation * move jit test standalone * Apply suggestions from code review * nits * tokenizer updates * more tokenization updates: custom GemmaSentencepieceExtrator * style * Update src/transformers/cache_utils.py * Update src/transformers/models/gemma/__init__.py * Update tests/models/gemma/test_modeling_flax_gemma.py * small nits * style * update tokenization test * fix the rotary 
embedding * with style * fix slow tests * WARNING this commit might be very important for precisions * Update tests/models/gemma/test_modeling_flax_gemma.py * Update src/transformers/models/gemma/configuration_gemma.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Update src/transformers/models/gemma/modeling_flax_gemma.py Co-authored-by: Lysandre Debut <hi@lysand.re> * small nits here and there! * forgotten nit * remove on the fly computation of inv_freq * revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_flax_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py 
Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * nit conversion script link * fix some tests * add not doctest and pr doctest * repo consistency * fix last CIs 🚀 * update all readmes --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-02-21 14:21:28 +01:00
amyeroberts	58245ba6fb	[`Maskformer`] safely get backbone config (#29166) Safe getattr	2024-02-21 13:51:15 +01:00
Ekaterina Aidova	1d0ea7abe0	support SDPA Attention in stablelm (#29106) * support SDPA Attention in stablelm * add integration test * add fallback for output_attentions * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/stablelm/test_modeling_stablelm.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * handle non-contiguous states --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-21 13:12:49 +01:00
fxmarty	cc4a664baa	`torch.compile` compatibility with `generate` + static cache (#29114) * fix compatibility * working version * cleanup * sanity checks * more sanity * working version WITH refactor * working without API change * cleanup & tests pass * more cleaning * fix test * fix tests * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * smaller comment * update comment * update comment --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-21 12:19:30 +01:00
Joao Gante	3994fa5baf	🚨 Llama: update rope scaling to match static cache changes (#29143)	2024-02-21 09:47:41 +00:00
Arthur Zucker	1a77f07f65	v4.39.dev.0	2024-02-21 15:23:22 +09:00
amyeroberts	e770f0316d	[`pipeline`] Add pool option to image feature extraction pipeline (#28985) * Add pool option * PR comments - error message and exact outputs check	2024-02-20 20:22:08 +00:00
Fernando Pérez-García	c47576ca6e	Fix drop path being ignored in DINOv2 (#29147) Fix drop path not being used	2024-02-20 17:31:59 +00:00
Gustavo Isturiz	3c00b885b9	Added image_captioning version in es and included in toctree file (#29104) added image_captioning version in es and included in toctree file	2024-02-20 09:13:15 -08:00
Joao Gante	857fd8eaab	Generate: missing generation config eos token setting in encoder-decoder tests (#29146)	2024-02-20 16:17:51 +00:00
Pablo Montalvo	1c81132e80	Raise unused kwargs image processor (#29063) * draft processor arg capture * add missing vivit model * add new common test for image preprocess signature * fix quality * fix up * add back missing validations * quality * move info level to warning for unused kwargs	2024-02-20 16:20:20 +01:00
JB (Don)	b8b16475d4	[Phi] Add support for sdpa (#29108)	2024-02-20 14:33:12 +01:00
Yih-Dar	7688d8df84	Save (circleci) cache at the end of a job (#29141) nice job Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-20 21:31:36 +08:00
Taylor Jackle Spriggs	ee3af60be0	Add support for fine-tuning CLIP-like models using contrastive-image-text example (#29070) * add support for siglip and chinese-clip model training with contrastive-image-text example * codebase fixups	2024-02-20 12:08:31 +00:00
amyeroberts	0996a10077	Revert low cpu mem tie weights (#29135) * Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)" This reverts commit `725f4ad1cc`. * Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)" This reverts commit `4156f517ce`.	2024-02-20 12:06:46 +00:00
Arthur	15cfe38942	[`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues (#28010) * add add_dummy_prefix_space option to slow * checking kwargs might be better. Should be there for all spm tokenizer IMO * nits * fix copies * more copied * nits * add prefix space * nit * nits * Update src/transformers/convert_slow_tokenizer.py * fix inti * revert wrong styling * fix * nits * style * updates * make sure we use slow tokenizer for conversion instead of looking for the decoder * support llama ast well * update llama tokenizer fast * nits * nits nits nits * update the doc * update * update to fix tests * skip unrelated tailing test * Update src/transformers/convert_slow_tokenizer.py * add proper testing * test decode as well * more testing * format * fix llama test * Apply suggestions from code review	2024-02-20 12:50:31 +01:00
Younes Belkada	efdd436663	FIX [`PEFT` / `Trainer` ] Handle better peft + quantized compiled models (#29055) * handle peft + compiled models * add tests * fixup * adapt from suggestions * clarify comment	2024-02-20 12:45:08 +01:00
Arthur	5e95dcabe1	[`cuda kernels`] only compile them when initializing (#29133) * only compile when needed * fix mra as well * fix yoso as well * update * rempve comment * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py * opps * Update src/transformers/models/deta/modeling_deta.py * nit	2024-02-20 12:38:59 +01:00
Joao Gante	a7755d2409	Generate: unset GenerationConfig parameters do not raise warning (#29119)	2024-02-20 11:34:31 +00:00

16108 Commits