transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-03 03:31:05 +06:00

Author	SHA1	Message	Date
RUFFY-369	29f56e2230	chore:init files for sam2	2024-08-01 12:24:49 +05:30
sangbumchoi	a898595938	initial comment	2024-07-30 00:25:42 +00:00
Kamil Akesbi	3fbaaaa64d	Whisper tokenizer word level timestamps (#32197) * fix _fix_key in PreTrainedModel * fix _find_longest_common_sequence * add test * remove result.json * nit * update test	2024-07-29 11:19:52 +01:00
Joao Gante	7ffe25f2b9	Generate: end-to-end compilation (#30788) * mvp * added test (a few models need fixes) * fix a few test cases * test nits * harder test 😈 * revert changes in stablelm * test with improved condition * add todo * tmp commit * merged with main * nits * add todo * final corrections * add docs for generation compilation * docs nits * add tip * PR suggestions * add more details to the compilation docs * fix cache positions * cache is now init in generate; update docs * tag test as flaky * docs * post rebase make fixup and other nits * remove unintended changes * whisper (encoder-decoder) not supported * move token default updates to ; add tests for token defaults * push changes * manual rebase * chameleon doesn't support this * fix test_static_cache_mha_mqa_gqa (broken in another PR) * docs: dynamic is better with end-to-end compilation	2024-07-29 10:52:13 +01:00
Sai-Suraj-27	49928892d6	fix(docs): Fixed a link in docs (#32274) Fixed a link in docs.	2024-07-29 10:50:43 +01:00
Fanli Lin	6494479f1d	make `p_mask` a numpy array before passing to `select_starts_ends` (#32076) * fix * bug fix * refine * fix	2024-07-29 10:29:11 +01:00
Joao Gante	535fe78b9f	Repo: remove exceptions in `check_docstrings` (#32259) remove exceptions	2024-07-29 11:06:05 +02:00
Sai-Suraj-27	a2ad9d5ad5	fix: Fixed wrong argument passed to `convert_blip_checkpoint` function call (#32262) Removed one wrong argument passed to convert_blip_checkpoint function call.	2024-07-29 10:43:09 +02:00
leejet	5019aabfac	Optimize t5 tokenize logic to avoid redundant calls (#32270) * Optimize t5 tokenize logic to avoid redundant calls * fix and overwrite copies	2024-07-29 09:51:43 +02:00
Yih-Dar	f2122cc6eb	Upload new model failure report to Hub (#32264) upload Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-29 09:42:54 +02:00
Raushan Turganbay	f739687684	🚨 Bloom support for cache class (#31445) * bloom dynamic cache * bloom follows standard cache format * no skips for bloom anymore * use cache position when possible * clean up * codestyle * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * pr comments * isinstance fix * address comments * make musicgen test happy * [run-slow] bloom --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-29 10:58:59 +05:00
Joao Gante	44f6fdd74f	Llama 3.1: replace for loop by tensor ops at inv_freq initialization (#32244) * replace for loop by tensor ops * rm assert; readability	2024-07-27 10:19:46 +01:00
Yih-Dar	8da9068730	More flexible trigger condition (#32251) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-26 20:52:45 +02:00
Raushan Turganbay	81233c069c	Flash-Attn: fix generation when no attention mask or no padding (#32241) * fix * fix prev test (half of failures) * [run-slow] llama, gemma2 * [run-slow] llama, gemma2	2024-07-26 14:45:55 +05:00
Fanli Lin	27c7f971c0	[tests] fix `static` cache implementation is not compatible with `attn_implementation==flash_attention_2` (#32039) * add flash attention check * fix * fix	2024-07-26 11:41:27 +02:00
Connor Anderson	5f841c74b6	Add check for `target_sizes is None` in `post_process_image_guided_detection` for owlv2 (#31934) * Add check for target_sizes is None in post_process_image_guided_detection * Make sure Owlvit and Owlv2 in sync * Fix incorrect indentation; add check for correct size of target_sizes	2024-07-26 10:05:46 +01:00
Rohit Dwivedula	f9756d9edb	Adds: extra_repr for RMSNorm layers in most models (#32204) * adds: extra_repr() to RMSNorm layers in multiple models * adds: extra_repr for deprecated models as well * formatting as per style guide	2024-07-26 11:05:38 +02:00
Sai-Suraj-27	b8e5cd5396	Refactor: Removed un-necessary `object` base class (#32230) * Refactored to remove un-necessary object base class. * small fix.	2024-07-26 10:33:02 +02:00
João Nadkarni	1c7ebf1d6e	don't log base model architecture in wandb if log model is false (#32143) * don't log base model architecture in wandb if log model is false * Update src/transformers/integrations/integration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * convert log model setting into an enum * fix formatting --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-26 09:38:59 +02:00
Raushan Turganbay	c46edfb823	Resize embeds with DeepSpeed (#32214) * fix resize when deepspeed * deepspeed uses new embeds * we needed this	2024-07-26 10:52:06 +05:00
Raushan Turganbay	fad15fba78	Llava: generate without images (#32183) * llava w/o images * tests	2024-07-26 10:17:27 +05:00
Raushan Turganbay	4ab33c2d81	Generation: stop at `eos` for assisted decoding (#31301) * fix * move changes to prompt lookup * add test * set eos in assistant model * style * fix flakiness * changes for new `main` * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add comment to explain --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-26 10:16:06 +05:00
Pavel Iakubovskii	9d6c0641c4	Fix code snippet for Grounding DINO (#32229) Fix code snippet for grounding-dino	2024-07-25 19:20:47 +01:00
jrhe	3a83ec48a6	Allow a specific microphone to be used by the ffmpeg audio pipeline utility functions. Default to using the currently active microphone on Mac (#31846) * use currently active microphone on mac for ffmpeg_microphone * Allow ffmpeg_microphone device to be specified Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-25 17:16:13 +01:00
Huazhong Ji	6ed0bf1e85	translate philosophy.md to chinese (#32177) * translate philosophy.md to chinese * add the missing link	2024-07-25 09:01:06 -07:00
Yih-Dar	df6eee9201	Follow up for #31973 (#32025) * fix * [test_all] trigger full CI --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-25 16:12:23 +02:00
Kashif Rasul	de2318894e	[warnings] fix E721 warnings (#32223) fix E721 warnings	2024-07-25 15:12:23 +02:00
Kashif Rasul	9b9a54e61b	[BigBird Pegasus] set _supports_param_buffer_assignment to False (#32222) set _supports_param_buffer_assignment to False	2024-07-25 15:11:43 +02:00
Austin	1ecedf1d9e	Update question_answering.py (#32208)	2024-07-25 13:20:27 +01:00
Huazhong Ji	f53a5dec7b	remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1.7.0 (#32210) remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1.7.0	2024-07-25 11:04:04 +02:00
Sanchit Gandhi	5658e749ad	[whisper] fix short-form output type (#32178) * [whisper] fix short-form output type * add test * make style * update long-form tests * fixes * last fix * finalise test	2024-07-25 16:58:02 +08:00
Sai-Suraj-27	85a1269e19	fix: Replaced deprecated `unittest method` with the correct one (#32198) Replaced deprecated unittest method with the correct one.	2024-07-24 18:00:21 +01:00
Matt	edd68f4ed8	🚨 No more default chat templates (#31733) * No more default chat templates * Add the template to the GPT-SW3 tests since it's not available by default now * Fix GPT2 test * Fix Bloom test * Fix Bloom test * Remove default templates again	2024-07-24 17:36:32 +01:00
Penut Chen	1c122a46dc	Support dequantizing GGUF FP16 format (#31783) * support gguf fp16 * support gguf bf16 with pytorch * add gguf f16 test * remove bf16	2024-07-24 17:59:59 +02:00
Marc Sun	af0e4b7b37	Fix float8_e4m3fn in modeling_utils (#32193) * Fix float8_e4m3fn in modeling_utils * style * fix * comment	2024-07-24 17:14:05 +02:00
Raushan Turganbay	1392a6867f	Fix resize embedding with Deepspeed (#32192) fix resize when deepspeed	2024-07-24 19:26:20 +05:00
Arthur	8d2534c4d0	let's not warn when someone is running a forward (#32176) * let's not warn when someone is running a forward without cache + self.training * more models * fixup	2024-07-24 16:06:39 +02:00
Joao Gante	e0182f3bd7	RoPE: relaxed rope validation (#32182) * relaxed rope check * lets also accept rope_type=None, defaulting to the original implementation * type and rope_type can coexist	2024-07-24 15:00:48 +01:00
amyeroberts	165116bc14	Remove conversational pipeline tests (#32099) Remove conversation pipeline tests	2024-07-24 14:03:40 +01:00
Dr. Artificial曾小健	5f4ee98a7a	Update qwen2.md (#32108) * Update qwen2.md outdated description * Update qwen2.md amended * Update qwen2.md Update * Update qwen2.md fix wrong version code, now good to go	2024-07-24 11:54:41 +01:00
조준래	8678879f1d	fix: default value reflects the runtime environment variables rather than the ones present at import time. (#32153) * fix: default value reflects the runtime environment variables rather than the ones present at import time. * Fix: Change `deterministic` to None by default; use env var if None	2024-07-24 11:38:49 +01:00
Rohit Dwivedula	01be5b4879	adds: extra_repr() to MambaRMSNorm to include hidden size / size of weights in the layer (#32171) * adds: extra_repr() to MambaRMSNorm to include the hidden size of the layer * style fix with ruff:	2024-07-24 09:09:59 +02:00
Fanli Lin	c85510f958	[docs] change temperature to a positive value (#32077) fix	2024-07-23 17:47:51 +01:00
Sai-Suraj-27	bc2adb0112	fix: Fixed an if condition that is always evaluating to true (#32160) Fixed an if condition always evaluating to true.	2024-07-23 16:52:41 +01:00
Joao Gante	23f6a43f82	fix (#32162)	2024-07-23 16:48:16 +01:00
Lysandre	d5a99dfcee	Llama 3.1 conversion Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2024-07-23 17:13:25 +02:00
Lysandre	ff0d708fe6	Dev version: v4.44.0.dev0	2024-07-23 17:12:47 +02:00
Sai-Suraj-27	d2c687b3f1	Updated `ruff` to the latest version (#31926) * Updated ruff version and fixed the required code according to the latest version. * Updated ruff version and fixed the required code according to the latest version. * Added noqa directive to ignore 1 error shown by ruff	2024-07-23 17:07:31 +02:00
RhuiDih	9cf4f2aa9a	Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629) * add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formatting for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py	2024-07-23 15:56:41 +02:00
Deep Gandhi	7d92009af6	Added additional kwarg for successful running of optuna hyperparameter search (#31924) Update integration_utils.py Added additional kwarg	2024-07-23 14:41:52 +01:00

16467 Commits