Commit Graph

19564 Commits

Author SHA1 Message Date
Arthur  9968c85e4b  fixes  2025-07-03 15:36:52 +02:00
Arthur  5af5bccd56  current updates  2025-07-03 15:28:32 +02:00
Arthur  3cba8ac3f3  fix stupid kosmos2  2025-07-03 13:17:18 +02:00
Arthur  0f3c368384  nits  2025-07-03 11:44:07 +02:00
Arthur  d462a8ea38  fix csm!  2025-07-03 11:41:30 +02:00
Arthur  a9690f43fd  fix cross attention outputs!  2025-07-03 11:32:48 +02:00
Arthur  6eb5e53e75  more fixes to moonshine!  2025-07-03 11:12:29 +02:00
Arthur  cfe62b6b95  generic needs to support more  2025-07-03 11:10:11 +02:00
Arthur  b81df9bd56  nits  2025-07-03 10:54:31 +02:00
Arthur  b3c8641f24  more moonshine fixes, 3 failures left!  2025-07-03 10:51:45 +02:00
Arthur  499ae87ef7  fix moonshine  2025-07-03 10:48:50 +02:00
Arthur  4fc83fa3a2  fix samhq?  2025-07-03 10:20:02 +02:00
Arthur  c4d43c5324  updates  2025-07-03 10:14:28 +02:00
Arthur  17cf5424b0  protect torch  2025-07-02 13:35:02 +02:00
Arthur  a267d8d472  holy shit it was just graph breaks  2025-07-02 12:17:30 +02:00
Arthur  253307a305  update  2025-07-01 17:47:17 +02:00
Arthur  501aead20b  dose this fix it?  2025-07-01 17:28:48 +02:00
Arthur  0c9f6de0fd  more fixes?  2025-07-01 16:29:15 +02:00
Arthur  e2973440d1  phix phi3  2025-07-01 16:23:22 +02:00
Arthur  d8ee27e495  fixup  2025-07-01 16:20:27 +02:00
Arthur  4834aeca61  only for some models  2025-07-01 16:12:54 +02:00
Arthur  6a5f410d26  fix janusss  2025-07-01 16:09:58 +02:00
Arthur  5065b9a285  small fixes  2025-07-01 16:05:29 +02:00
Arthur  d04c2b1ab6  fix mistral now  2025-07-01 15:56:37 +02:00
Arthur  075bd0c2f3  fux csm and mistral  2025-07-01 15:53:44 +02:00
Arthur  5e5ae84a05  fix csm now  2025-07-01 15:41:52 +02:00
Arthur  aaae861fc8  fix another one  2025-07-01 15:37:33 +02:00
Arthur  9fa5f266a1  fix small lm3  2025-07-01 15:32:17 +02:00
Arthur  6a132a0799  finish fixing gemma3n  2025-07-01 15:22:52 +02:00
Arthur  f7a1f0da3d  some fixes, loss_kwargs should never had been  2025-07-01 15:19:32 +02:00
Arthur  0b119ffb1f  quel enfer ("what hell")  2025-07-01 15:06:54 +02:00
Arthur  3ac6c52f34  move the fix a bit  2025-07-01 15:00:38 +02:00
Arthur  00afce9837  fix emu3  2025-07-01 14:58:12 +02:00
Arthur  10fb88ae84  fix emu3  2025-07-01 14:53:05 +02:00
Arthur  209d5022ac  update  2025-07-01 14:47:38 +02:00
Arthur  da50ccc549  fix conflicts  2025-07-01 14:42:33 +02:00
Arthur  2748b99388  update  2025-07-01 14:39:58 +02:00
Arthur  22423738c4  update  2025-07-01 14:27:21 +02:00
Arthur  15a8ff4fe9  update  2025-07-01 14:20:56 +02:00
Arthur  a13a98c6da  more fix  2025-07-01 14:19:38 +02:00
Arthur  7a0512a1f5  fixes  2025-07-01 14:16:22 +02:00
StevenBucaille  1283877571  [superglue] fix wrong concatenation which made batching results wrong (#38850)  2025-07-01 12:14:44 +00:00
Raushan Turganbay  f8b88866f5  [VLMs] support passing embeds along with pixels (#38467)  2025-07-01 11:33:20 +00:00
* VLMs can work with embeds now
* update more models
* fix tests
* fix copies
* fixup
* fix
* style
* unskip tests
* fix copies
* fix tests
* style
* omni modality models
* qwen models had extra indentation
* fix some other tests
* fix copies
* fix test last time
* unrelated changes revert
* we can't rely only on embeds
* delete file
* de-flake mistral3
* fix qwen models
* fix style
* fix tests
* fix copies
* deflake the test
* modular reverted by fixes, fix again
* flaky test, overwritten
* fix copies
* style
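The PR above lets VLMs accept precomputed embeddings alongside pixel values. A minimal sketch of the idea, with `prepare_vlm_inputs` and `embed_tokens` as hypothetical stand-ins (not the actual transformers API): the embedding lookup is skipped when the caller already supplies embeds, while `pixel_values` still flows to the vision tower either way.

```python
def prepare_vlm_inputs(input_ids=None, inputs_embeds=None, pixel_values=None,
                       embed_tokens=None):
    """Accept either token ids or caller-supplied embeddings.

    Before a change like this, many models assumed input_ids were present
    whenever pixel_values was set, so embeds-only calls with images failed.
    """
    if (input_ids is None) == (inputs_embeds is None):
        raise ValueError("pass exactly one of input_ids or inputs_embeds")
    if inputs_embeds is None:
        # Only run the embedding lookup when ids were given.
        inputs_embeds = embed_tokens(input_ids)
    # pixel_values goes to the vision tower regardless of the text path.
    return inputs_embeds, pixel_values
```

The real models then splice the vision features into `inputs_embeds` at the image placeholder positions; this sketch only shows the dispatch.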
Ayush Singh  20901f1d68  [typing] LlamaAttention return typehint (#38998)  2025-07-01 11:29:52 +01:00
* helo llama
* helo llama
* helo llama
* apply modular
* fix dia
Co-authored-by: qubvel <qubvel@gmail.com>
Raushan Turganbay  7a25f8dfdb  [qwen2-vl] fix FA2 inference (#39121)  2025-07-01 10:18:37 +00:00
* fix FA2
* update is causal flag and remove mask for FA2
* update for FA2 with varlen path
* how the tests were passing with different devices?
* add comment and ref to the PR
* move mask preparation to base pretrained model
* seq len is the first dim, not second
* fix copies to fix GLM4V
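The varlen path mentioned above packs all sequences of a batch into one tensor and delimits them with cumulative-length boundaries instead of a padding mask. A toy sketch of that boundary computation (`cu_seqlens_from_mask` is a hypothetical helper, not the repo's function; real code builds a tensor for the flash-attention kernel):

```python
def cu_seqlens_from_mask(attention_mask):
    """Prefix-sum sequence boundaries for a varlen attention call.

    attention_mask: batch of 0/1 padding rows (1 = real token).
    Returns [0, len_0, len_0 + len_1, ...]; varlen kernels use these
    offsets to find where each packed sequence starts and ends.
    """
    cu = [0]
    for row in attention_mask:
        cu.append(cu[-1] + sum(row))
    return cu
```

With these offsets the dense mask can be dropped entirely, which is why the PR removes it on the FA2 path and relies on an `is_causal` flag instead.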
Mehant Kammakomati  def9663239  feat: support indivisible shards for TP model loading and TPlizing. (#37220)  2025-07-01 10:03:22 +00:00
* feat: support uneven loading and sharding (resolve merge conflicts)
* fix: allow for empty tensor computations
* test: add llama1b test case
* due to q_proj colwise it has to be multi of 2
* refactor: use slice API
* refactor: use slice API
* refactor: use slice API
* refactor: use slice API
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
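"Indivisible shards" means a weight dimension need not be a multiple of the tensor-parallel world size. A sketch of the per-rank arithmetic such a slice-based loader needs (`shard_bounds` is a hypothetical helper, assuming the common convention that the first `dim % world_size` ranks take one extra row):

```python
def shard_bounds(dim_size: int, world_size: int, rank: int) -> tuple[int, int]:
    """Start/stop indices of this rank's shard for an uneven dimension.

    Every element is owned by exactly one rank and shard sizes differ
    by at most one, e.g. dim 10 over 3 ranks -> sizes 4, 3, 3.
    """
    base, rem = divmod(dim_size, world_size)
    start = rank * base + min(rank, rem)
    stop = start + base + (1 if rank < rem else 0)
    return start, stop
```

A loader can then take `weight[start:stop]` per rank via the slice API, which also naturally yields empty tensors when a rank gets no rows.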
jiqing-feng  06c4a4d499  fix caching_allocator_warmup with tie weights (#39070)  2025-07-01 11:32:20 +02:00
* fix caching_allocator_warmup with tie weights
* fix comment
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
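The warmup fix above concerns tied weights (e.g. an embedding table shared with the LM head): counting them twice over-allocates memory during the warmup pass. A toy sketch of the deduplication idea, where `warmup_bytes` and the `(storage_key, numel)` pairs are illustrative stand-ins for the real data-pointer check:

```python
def warmup_bytes(params, bytes_per_elem=2):
    """Total bytes to pre-allocate, counting shared (tied) storage once.

    params: iterable of (storage_key, numel) pairs; tied weights share
    the same storage_key, mimicking tensors with the same data pointer.
    """
    seen = set()
    total = 0
    for key, numel in params:
        if key in seen:
            continue  # tied weight: storage already counted
        seen.add(key)
        total += numel * bytes_per_elem
    return total
```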
Raushan Turganbay  e435574721  🚨 Don't use cache in non-generative models (#38751)  2025-07-01 09:08:21 +00:00
* deprecate for 1 version
* style
* fix some tests
* fix esm
* skip for now, GC requires positional args but we have keyword args
* remove transpose for scores in modified models only
* skip fx trace tests
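Per the "deprecate for 1 version" note above, a flag like this is typically warned about for one release before removal. A hypothetical sketch of that pattern (`check_use_cache` is not the repo's actual function):

```python
import warnings

def check_use_cache(use_cache, is_generative):
    """Ignore `use_cache` on models that cannot use a KV cache.

    Emits a FutureWarning for one release instead of failing outright,
    giving callers time to stop passing the flag.
    """
    if use_cache and not is_generative:
        warnings.warn(
            "`use_cache` has no effect on non-generative models and will be removed",
            FutureWarning,
        )
        return False
    return bool(use_cache)
```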
Arthur  3c0c56b84d  test this  2025-07-01 10:58:16 +02:00
Arthur  780141ca52  same  2025-07-01 10:56:29 +02:00