transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Fanli Lin	f0ae65c198	[tests] further fix `Tester object has no attribute '_testMethodName'` (#35781 ) * bug fix * update with more cases * more entries * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 16:05:33 +01:00
Yih-Dar	ec7790f0d3	update docker file `transformers-pytorch-deepspeed-latest-gpu` (#35940 ) update docker file for deepspeed Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 16:01:27 +01:00
Zach Mueller	5d257111c1	Trainer Refactor: Part 1 (#35567 ) * start * So far: 30% * Small fix * Continuing update * Continuing * Forgot to check if not None * Continuing refactor * Fix if else * Fix ref * Should make tests pass * Keep grad norm same * Document * Apply suggestions from code review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Err instead of info for logging RNG state error * Seperate out to func --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-01-29 09:50:54 -05:00
Jonas Rohw	23d782ead2	Output dicts support in text generation pipeline (#35092 ) * Support for generate_argument: return_dict_in_generate=True, instead of returning a error * fix: call test with return_dict_in_generate=True * fix: Only import torch if it is present * update: Encapsulate output_dict changes * fix: added back original comments --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-01-29 14:44:46 +00:00
Yih-Dar	cf90404807	Fix flaky `test_assisted_decoding_matches_greedy_search` (#35951 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:50:07 +01:00
Yih-Dar	692afa102d	Update `squad_convert_example_to_features` to work with numpy v2 (#35955 ) * Fix * Fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:33:06 +01:00
Yih-Dar	c600e89f5c	Update `unwrap_and_save_reload_schedule` to use `weights_only=False` (#35952 ) * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:30:57 +01:00
Nadav Timor	42c8ccfd4c	fix `test_generated_length_assisted_generation` (#34935 ) fix test_generated_length_assisted_generation	2025-01-29 12:03:45 +00:00
Mohamed Abu El-Nasr	ec7afad609	use torch constraints to check if covariance is positive definite during mean resizing. (#35693 ) * use torch constraints to check for psd * small nit * Small change * Small change for the ci * nit	2025-01-28 17:33:42 +01:00
Ella Charlaix	61cbb723fc	Remove INC notebook reference in documentation (#35936 ) remove INC notebook in documentation	2025-01-28 17:10:02 +01:00
NanoCode012	478c4f2d0d	fix(FA): QKV not being casted to target_dtype for FA with dpo lora (#35834 ) fix(FA): QKV not being casted to target_dtype due to dtype check	2025-01-28 17:06:56 +01:00
Joao Gante	ece8c42488	Test: generate with `torch.compile(model.forward)` as a fast test (#34544 )	2025-01-28 14:10:38 +00:00
Cyril Vallez	f48ecd7608	Fix TP initialization (#35860 ) * fix tp * Update modeling_utils.py * style * style * Update test_tp.py * Update test_tp.py * style * Update test_tp.py * Update test_tp.py * Update test_tp.py * Update test_tp.py	2025-01-28 15:07:37 +01:00
Raushan Turganbay	f85ba20449	Qwen-2-5-VL: fix CI (#35935 ) fix	2025-01-28 14:51:57 +01:00
Cyril Vallez	3f860dba55	Fix mask slicing for models with HybridCache (#35681 ) * correctly slice * check mask * Update modular_gemma2.py * fix * add tests * fix typo * finally fix mask slicing * Finally correctly slice in all cases!! * add test for all attention functions * small fix in tests * trick around dynamo tracing issue * last update * more robust * kwargs propagation * make it explicit for checkpointing * apply modular	2025-01-28 14:35:00 +01:00
Raushan Turganbay	b764c20b09	Fix: loading DBRX back from saved path (#35728 ) * fix dtype as dict for some models + add test * add comment in tests	2025-01-28 11:38:45 +01:00
Cyril Vallez	3613f568cd	Add default TP plan for all models with backend support (#35870 ) * Add some tp plans! * More tp plans! * Add it in the comment * style * Update configuration_mixtral.py * Update configuration_phi.py * update the layout according to special archs * fix mixtral * style * trigger CIs * trigger CIs * CIs * olmo2 --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-28 11:20:58 +01:00
ivarflakstad	96625d85fd	Use rocm6.2 for AMD images (#35930 ) * Use rocm6.2 as rocm6.3 only has nightly pytorch wheels atm * Use stable wheel index for torch libs	2025-01-28 11:10:28 +01:00
Yih-Dar	bf16a182ba	Remove `_supports_static_cache = True` for some model classes (#34975 ) * use mask_fill * remove comment --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-28 10:42:10 +01:00
Steven Liu	86d7564611	[docs] Fix Zamba2 (#35916 ) fix code block	2025-01-27 11:44:10 -08:00
Matt	414658f94f	Close Zamba2Config code block (#35914 ) * close zamba2 code block * Add Zamba2 to toctree	2025-01-27 19:09:42 +00:00
Matt	63e9c941eb	Fix the config class comparison for remote code models (#35592 ) * Fix the config class comparison when repeatedly saving and loading remote code models * once again you have committed your debug breakpoint	2025-01-27 18:37:30 +00:00
Steven Liu	c550a1c640	[docs] uv install (#35821 ) uv install	2025-01-27 08:49:28 -08:00
CalOmnie	cd6591bfb2	Fix typing in audio_utils.chroma_filter_bank (#35888 ) * Fix typing in audio_utils.chroma_filter_bank * Apply make style --------- Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>	2025-01-27 16:06:03 +00:00
Isotr0py	e57b459997	Split and clean up GGUF quantization tests (#35502 ) * clean up ggml test Signed-off-by: Isotr0py <2037008807@qq.com> * port remaining tests Signed-off-by: Isotr0py <2037008807@qq.com> * further cleanup Signed-off-by: Isotr0py <2037008807@qq.com> * format Signed-off-by: Isotr0py <2037008807@qq.com> * fix broken tests Signed-off-by: Isotr0py <2037008807@qq.com> * update comment Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> * reorganize tests Signed-off-by: Isotr0py <2037008807@qq.com> * k-quants use qwen2.5-0.5B Signed-off-by: Isotr0py <2037008807@qq.com> * move ggml tokenization test Signed-off-by: Isotr0py <2037008807@qq.com> * remove dead code Signed-off-by: Isotr0py <2037008807@qq.com> * add assert for serilization test Signed-off-by: Isotr0py <2037008807@qq.com> * use str for parameterize Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-27 15:46:57 +01:00
Ross Wightman	5c576f5a66	🚨🚨🚨 image-classification pipeline single-label and multi-label prob type squashing fns (sigmoid vs softmax) are backwards (#35848 ) single-label and multi-label prob type squashing fns (sigmoid vs softmax) were backwards for image-classification pipeline	2025-01-27 15:34:57 +01:00
Mikhail Moskovchenko	5450e7c84a	🔴 🔴 🔴 Added `segmentation maps` support for DPT image processor (#34345 ) * Added `segmentation_maps` support for DPT image processor * Added tests for dpt image processor * Moved preprocessing into separate functions * Added # Copied from statements * Fixed # Copied from statements * Added `segmentation_maps` support for DPT image processor * Added tests for dpt image processor * Moved preprocessing into separate functions * Added # Copied from statements * Fixed # Copied from statements	2025-01-27 15:14:00 +01:00
ivarflakstad	a50befa9b9	Update deepspeed amd image (#35906 )	2025-01-27 14:32:36 +01:00
pglorio	33cb1f7b61	Add Zamba2 (#34517 ) * First commit * Finish model implementation * First commit * Finish model implementation * Register zamba2 * generated modeling and configuration * generated modeling and configuration * added hybrid cache * fix attention_mask in mamba * dropped unused loras * fix flash2 * config docstrings * fix config and fwd pass * make fixup fixes * text_modeling_zamba2 * small fixes * make fixup fixes * Fix modular model converter * added inheritances in modular, renamed zamba cache * modular rebase * new modular conversion * fix generated modeling file * fixed import for Zamba2RMSNormGated * modular file cleanup * make fixup and model tests * dropped inheritance for Zamba2PreTrainedModel * make fixup and unit tests * Add inheritance of rope from GemmaRotaryEmbedding * moved rope to model init * drop del self.self_attn and del self.feed_forward * fix tests * renamed lora -> adapter * rewrote adapter implementation * fixed tests * Fix torch_forward in mamba2 layer * Fix torch_forward in mamba2 layer * Fix torch_forward in mamba2 layer * Dropped adapter in-place sum * removed rope from attention init * updated rope * created get_layers method * make fixup fix * make fixup fixes * make fixup fixes * update to new attention standard * update to new attention standard * make fixup fixes * minor fixes * cache_position * removed cache_position postion_ids use_cache * remove config from modular * removed config from modular (2) * import apply_rotary_pos_emb from llama * fixed rope_kwargs * Instantiate cache in Zamba2Model * fix cache * fix @slow decorator * small fix in modular file * Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * several minor fixes * inherit mamba2decoder fwd and drop position_ids in mamba * removed docstrings from modular * reinstate zamba2 attention decoder fwd * use regex for tied keys * Revert "use regex for tied keys" This reverts commit `9007a522b1`. * use regex for tied keys * add cpu to slow forward tests * dropped config.use_shared_mlp_adapter * Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * re-convert from modular --------- Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-27 10:51:23 +01:00
Sugendran Ganess	14a9bb520e	Fix fast image processor warnings in object detection examples (#35892 ) Have the DETR examples default to using the fast image processor	2025-01-27 08:32:44 +00:00
Steven Liu	f11f57c925	[doctest] Fixes (#35863 ) doctest fixes	2025-01-26 15:26:38 -08:00
Yih-Dar	fc269f77da	Add `Rocketknight1` to `self-comment-ci.yml` (#35881 ) my bad Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-24 19:07:07 +00:00
Fanli Lin	bcb841f007	add xpu device check in device_placement (#35865 ) add xpu device	2025-01-24 19:13:07 +01:00
Arthur	b912f5ee43	use torch.testing.assertclose instead to get more details about error in cis (#35659 ) * use torch.testing.assertclose instead to get more details about error in cis * fix * style * test_all * revert for I bert * fixes and updates * more image processing fixes * more image processors * fix mamba and co * style * less strick * ok I won't be strict * skip and be done * up	2025-01-24 16:55:28 +01:00
Suyuchen Wang	72d1a4cd53	Fix Llava-NeXT / Llava-NeXT Video / Llava-OneVision's token unpadding mismatch (#35779 ) * Fix Llava OneVision's token padding * Fix Llava next and Llava next video's token unpadding for consistency	2025-01-24 09:10:27 +01:00
CalOmnie	b5aaf87509	Fix `test_pipelines_video_classification` that was always failing (#35842 ) * Fix test_pipelines_video_classification that was always failing * Update video pipeline docstring to reflect actual return type --------- Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>	2025-01-23 19:22:32 +01:00
baoyf4244	328e2ae4c0	fix apply_chat_template() padding choice (#35828 ) fix apply_chat_template() padding choice to bool, str, PaddingStrategy and the docstring of pad()	2025-01-23 17:32:32 +00:00
SilverSoldier	d2a424b550	Fix typo (#35854 )	2025-01-23 17:32:18 +00:00
Yosshi999	045c02f209	[DOC] Fix contamination and missing paragraph in translation (#35851 ) Fix contamination and missing paragraph in translation	2025-01-23 08:33:44 -08:00
Alex Brooks	71cc8161b2	Granite Vision Support (#35579 ) * Add multimodal granite support Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Support multiple image feature layres Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Remove failing validation for visual encoders with no cls Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Update llava based models / configs to support list of feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Add tests for multiple feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Use conditional instead of except for misaligned feature shapes Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * crop cls from each hidden state Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Fix formatting Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Support single vision feature int in vipllava Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Fix typo in vision feature selection strategy validation Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add tentative integration test for granite vision models Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add granite vision docs Replace multimodal granite refs with granite vision Add granite vision / llava next alias Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Use image url in granitevision example Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-01-23 17:15:52 +01:00
Arthur	8f1509a96c	Fix more CI tests (#35661 ) add tooslow for the fat ones	2025-01-23 14:45:42 +01:00
Jack Roberts	0a950e0bbe	Fix uploading processors/tokenizers to WandB on train end (#35701 ) * rename tokenizer to processing_class in WandbCallback.on_train_end * rename tokenizer to processing_class in ClearMLCallback and DVCLiveCallback	2025-01-23 13:32:15 +01:00
張庭瑜	4ec425ffad	Fix GA loss for Deepspeed (#35808 ) * Fix GA loss for Deepspeed * Turn off loss scaling in DeepSpeed engine by scale_wrt_gas * Add comment linking to PR	2025-01-23 11:45:02 +01:00
ShuaiBai623	f3f6c86582	add qwen2.5vl (#35569 ) * add qwen2.5vl * fix * pass check table * add modular file * fix style * Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> * Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> * Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> * padd copy check * use modular * fix * fix * fix * update flashatt2&sdpa support_list * Update docs/source/en/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_5_vl.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update config * update * fix hf path * rename Qwen2_5_VLVideosKwargs * fix * fix * update * excuted modular * rollback init * fix * formated * simpler init * fix * fix * fix * fix * fix * update docs * fix * fix * update Qwen2VLRotaryEmbedding for yarn * fix --------- Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: gewenbin0992 <gewenbin292@163.com> Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>	2025-01-23 11:23:00 +01:00
Cyril Vallez	d3af76df58	[Backend support] Allow `num_logits_to_keep` as Tensor + add flag (#35757 ) * support * Update modeling_utils.py * style * most models * Other models * fix-copies * tests + generation utils	2025-01-23 09:47:54 +01:00
Arthur	8736e91ad6	[ `tests`] remove some flash attention class tests (#35817 ) remove class from tests	2025-01-23 09:44:21 +01:00
Marc Sun	2c3a44f9a7	Fix NoneType type as it requires py>=3.10 (#35843 ) fix type	2025-01-22 15:56:53 +00:00
Mohit Sharma	fdcc62c855	Add PyTorch version check for FA backend on AMD GPUs (#35813 ) Disable FA backend for SDPA on AMD GPUs (PyTorch < 2.4.1)	2025-01-22 16:09:23 +01:00
LRL-ModelCloud	3b9770581e	Fix compatibility issues when using auto_gptq with these older versions (#35830 ) convert_model method of optimum only accepts a single nn.Module type model parameter for versions less than 1.23.99.	2025-01-22 15:46:47 +01:00
Joao Gante	62bd83947a	[chat] docs fix (#35840 ) docs fix	2025-01-22 14:32:27 +00:00

17908 Commits