transformers/src/transformers
Name | Last commit | Last commit date
commands | Fix chat (#39128) | 2025-06-30 13:47:48 +00:00
data | [HPU][Critical Issue Fix] ThreadPool instead of Pool for parallel pre-processing (#39002) | 2025-06-24 20:24:50 +02:00
generation | holy shit it was just graph breaks | 2025-07-02 12:17:30 +02:00
integrations | feat: support indivisible shards for TP model loading and TPlizing. (#37220) | 2025-07-01 10:03:22 +00:00
kernels | Use deformable_detr kernel from the Hub (#36853) | 2025-03-21 13:08:47 +01:00
loss | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
models | fixes | 2025-07-03 15:36:52 +02:00
onnx | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
pipelines | guard torch distributed check (#39057) | 2025-06-27 14:49:47 +00:00
quantizers | Correctly raise error for awq quantization (#38945) | 2025-06-20 17:18:06 +02:00
sagemaker | [Refactor] Relative imports wherever we can (#21880) | 2023-03-02 09:45:42 +01:00
utils | current updates | 2025-07-03 15:28:32 +02:00
__init__.py | Dev version | 2025-06-26 18:04:36 +02:00
activations_tf.py | Use HF papers (#38184) | 2025-06-13 11:07:09 +00:00
activations.py | Use HF papers (#38184) | 2025-06-13 11:07:09 +00:00
audio_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
cache_utils.py | Fix bugs in DynamicCache (#37880) | 2025-06-24 19:43:40 +02:00
configuration_utils.py | Add Dia model (#38405) | 2025-06-26 11:04:23 +00:00
convert_graph_to_onnx.py | Update ruff to 0.11.2 (#36962) | 2025-03-25 16:00:11 +01:00
convert_pytorch_checkpoint_to_tf2.py | Set weights_only in torch.load (#36991) | 2025-03-27 14:55:50 +00:00
convert_slow_tokenizer.py | Add optional RMSNorm support to BitNet quantization (config + layers) (#38087) | 2025-05-16 12:38:06 +02:00
convert_slow_tokenizers_checkpoints_to_fast.py | Use pyupgrade --py39-plus to improve code (#36843) | 2025-03-20 14:39:44 +00:00
convert_tf_hub_seq_to_seq_bert_to_pytorch.py | Use pyupgrade --py39-plus to improve code (#36843) | 2025-03-20 14:39:44 +00:00
debug_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
dependency_versions_check.py | ⚠️ Time to say goodbye to py37 (#24091) | 2023-06-28 07:22:39 +02:00
dependency_versions_table.py | Split transformers chat and transformers serve (#38443) | 2025-06-30 15:10:53 +02:00
dynamic_module_utils.py | Fix custom generate from local directory (#38916) | 2025-06-20 17:36:57 +01:00
feature_extraction_sequence_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
feature_extraction_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
file_utils.py | [core] Large/full refactor of from_pretrained (#36033) | 2025-03-12 13:39:25 +01:00
hf_argparser.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
hyperparameter_search.py | Fix Optional type annotation (#36841) | 2025-03-26 13:53:44 +00:00
image_processing_base.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
image_processing_utils_fast.py | Internvl fix (#38946) | 2025-06-26 13:44:59 +02:00
image_processing_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
image_transforms.py | Add Idefics2/3 and SmolVLM Fast image processors + improvements for fast image processors (#38157) | 2025-06-23 14:17:25 +00:00
image_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
keras_callbacks.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
masking_utils.py | [Flex Attn] Fix torch 2.5.1 incompatibilities (#37406) | 2025-06-26 18:23:55 +02:00
model_debugging_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
modelcard.py | Use HF papers (#38184) | 2025-06-13 11:07:09 +00:00
modeling_attn_mask_utils.py | Fix attention mask expansion when converting to executorch (#38637) | 2025-06-09 15:00:55 +00:00
modeling_flash_attention_utils.py | [qwen2-vl] fix FA2 inference (#39121) | 2025-07-01 10:18:37 +00:00
modeling_flax_outputs.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
modeling_flax_pytorch_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
modeling_flax_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
modeling_gguf_pytorch_utils.py | Support loading Gemma3 QAT GGUF models (#37649) | 2025-04-22 11:23:17 +02:00
modeling_layers.py | Apply GradientCheckpointingLayer to the whole repo (#38913) | 2025-06-23 14:24:48 +02:00
modeling_outputs.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
modeling_rope_utils.py | Use HF papers (#38184) | 2025-06-13 11:07:09 +00:00
modeling_tf_outputs.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
modeling_tf_pytorch_utils.py | Stop TF weight rename reDOS (#38325) | 2025-05-26 16:58:51 +01:00
modeling_tf_utils.py | More PYUP fixes (#38883) | 2025-06-18 14:38:08 +01:00
modeling_utils.py | update | 2025-07-01 14:39:58 +02:00
optimization_tf.py | Two ReDOS fixes (#39013) | 2025-06-25 17:31:26 +01:00
optimization.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
processing_utils.py | Add Dia model (#38405) | 2025-06-26 11:04:23 +00:00
py.typed | Add py.typed (#37022) | 2025-04-02 14:17:27 +01:00
pytorch_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
safetensors_conversion.py | Change back to Thread for SF conversion (#35236) | 2024-12-12 16:05:04 +01:00
testing_utils.py | Several fixes for Gemma3n (#39135) | 2025-07-01 10:34:53 +02:00
tf_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
time_series_utils.py | Use pyupgrade --py39-plus to improve code (#36843) | 2025-03-20 14:39:44 +00:00
tokenization_utils_base.py | Fixed markdown for BertTokenizer's '[CLS]' token. (#38506) | 2025-06-18 13:09:58 +00:00
tokenization_utils_fast.py | fix: add __bool__ operator to tokenizer to avoid bloated asserts (#38899) | 2025-06-23 14:32:16 +00:00
tokenization_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
trainer_callback.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
trainer_pt_utils.py | Remove ALL_LAYERNORM_LAYERS (#38922) | 2025-06-20 12:06:48 +02:00
trainer_seq2seq.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
trainer_utils.py | No more Tuple, List, Dict (#38797) | 2025-06-17 19:37:18 +01:00
trainer.py | small fixes | 2025-07-01 16:05:29 +02:00
training_args_seq2seq.py | [docs] Remove sortish_sampler (#35539) | 2025-01-07 12:06:19 -08:00
training_args_tf.py | Use pyupgrade --py39-plus to improve code (#36843) | 2025-03-20 14:39:44 +00:00
training_args.py | feat: add flexible Liger Kernel configuration to TrainingArguments (#38911) | 2025-06-19 15:54:08 +00:00
video_processing_utils.py | [video processor] support torchcodec and decrease cuda memory usage (#38880) | 2025-06-25 08:23:37 +00:00
video_utils.py | [video processor] support torchcodec and decrease cuda memory usage (#38880) | 2025-06-25 08:23:37 +00:00