transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 12:50:06 +06:00

Author	SHA1	Message	Date
Matt	53fb245eb6	🚨 🚨 Inherited CausalLM Tests (#37590) * stash commit * Experiment 1: Try just Gemma * Experiment 1: Just try Gemma * make fixup * Trigger tests * stash commit * Try adding Gemma3 as well * make fixup * Correct attrib names * Correct pipeline model mapping * Add in all_model_classes for Gemma1 again * Move the pipeline model mapping around again * make fixup * Revert Gemma3 changes since it's a VLM * Let's try Falcon * Correct attributes * Correct attributes * Let's try just overriding get_config() for now * Do Nemotron too * And Llama! * Do llama/persimmon * Correctly skip tests * Fix Persimmon * Include Phimoe * Fix Gemma2 * Set model_tester_class correctly * Add GLM * More models! * models models models * make fixup * Add Qwen3 + Qwen3MoE * Correct import * make fixup * Add the QuestionAnswering classes * Add the QuestionAnswering classes * Move pipeline mapping to the right place * Jetmoe too * Stop RoPE testing models with no RoPE * Fix up JetMOE a bit * Fix up JetMOE a bit * Can we just force pad_token_id all the time? * make fixup * fix starcoder2 * Move pipeline mapping * Fix RoPE skipping * Fix RecurrentGemma tests * Fix Falcon tests * Add MoE attributes * Fix values for RoPE testing * Make sure we set bos_token_id and eos_token_id in an appropriate range * make fixup * Fix GLM4 * Add mamba attributes * Revert bits of JetMOE * Re-add the JetMOE skips * Update tests/causal_lm_tester.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add licence --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-05-23 18:29:31 +01:00
cyyever	1e6b546ea6	Use Python 3.9 syntax in tests (#37343) Signed-off-by: cyy <cyyever@outlook.com>	2025-04-08 14:12:08 +02:00
Matt	2d46a08b63	Purge unused ModelTester code (#37085) * Purge correctly this time * Remove more methods from recent PRs * make fixup	2025-04-03 17:48:35 +01:00
Afanti	26c83490d2	chore: fix typos in the tests directory (#36813) * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * chore: fix typos in the tests * fix: format codes * chore: fix copy mismatch issue * fix: format codes * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: fix copy mismatch issue * chore: restore previous words * chore: revert unexpected changes	2025-03-21 10:20:05 +01:00
Joao Gante	62c7ea0201	CI: avoid human error, automatically infer generative models (#33212) * tmp commit * move tests to the right class * remove ALL all_generative_model_classes = ... * skip tf roberta * skip InstructBlipForConditionalGenerationDecoderOnlyTest * videollava * reduce diff * reduce diff * remove on vlms * fix a few more * manual rebase bits * more manual rebase * remove all manual generative model class test entries * fix up to ernie * a few more removals * handle remaining cases * recurrent gemma * it's better here * make fixup * tf idefics is broken * tf bert + generate is broken * don't touch tf :() * don't touch tf :( * make fixup * better comments for test skips * revert tf changes * remove empty line removal * one more * missing one	2025-02-13 16:27:11 +01:00
Arthur	b912f5ee43	use torch.testing.assertclose instead to get more details about error in cis (#35659) * use torch.testing.assertclose instead to get more details about error in cis * fix * style * test_all * revert for I bert * fixes and updates * more image processing fixes * more image processors * fix mamba and co * style * less strick * ok I won't be strict * skip and be done * up	2025-01-24 16:55:28 +01:00
Arthur	2c47618c1a	🚨All attention refactor🚨 (#35235) * refactor LlamaAttention * minimal changes * fix llama * update * modular gemmas * modular nits * modular updates * nits * simplify * gpt2 * more modualr and fixes * granite * modular modular modular * nits * update * qwen2 + starcoder2 * mostly gemma2 * Update image_processing_auto.py * fix * Update modular_starcoder2.py * fix * remove all copied from attentions * remove gcv * make fix-copies * oups * oups2.0 * fix some modulars + all copied from * should be good now * revert unwanted changes * Update modeling_decision_transformer.py * finish cleanup * Update modeling_olmo.py * consistency * re-add gradient checkpointing attribute * fix * style * make config necessary * bis * bis * Update modeling_my_new_model2.py * is_causal attr * fix * remove past kv return from decoder layer * fix * default rope config * correctly fix rope config * fix bias * fix gpt2 attention output * fix test * fix inits * fix default sdpa * fix default sdpa implementation * harmonize classes * fix mistral * fix sliding window models * mixtral * be more explicit * style * fix * several fixes * Update modeling_dbrx.py * fix test * olmo + phi * rotary * syle * phi * phi again * again * kwargs * Update test_modeling_common.py * skip fx tracing tests * Update modeling_utils.py * gemma 2 * again * Update modeling_recurrent_gemma.py * gemma2 * granite * style * starcoder * Update sdpa_attention.py * switch args * Update modeling_mllama.py * fix * cache type tests * gpt2 * Update test_modeling_common.py * fix * consistency * fix shape with encoder * should be the last one * tests non model * most comments * small oupsi * be more explicit in modulars * more explicit modulars * CIs! it works locally * add kwargs to _flash_attention_forward --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2024-12-18 16:53:39 +01:00
Cyril Vallez	d363e71d0e	🧹 Remove deprecated RotaryEmbedding parts in the Attention layers (#34858) * update * style * fix missing args * remove last trace of old rope classes * remove deprecated copied from * fix copies * trigger CIs * post rebase clean-up * reverse mistral * cleanup after dropping commits * Add comment	2024-12-11 11:16:52 +01:00
Joao Gante	186b8dc190	Tests: upgrade `test_eager_matches_sdpa_generate` (#34386)	2024-10-25 11:55:07 +01:00
Joao Gante	e878eaa9fc	Tests: upcast `logits` to `float()` (#34042) upcast	2024-10-11 11:51:49 +01:00
Joao Gante	d29738f5b4	Generate tests: modality-agnostic input preparation (#33685)	2024-10-03 14:01:24 +01:00
Raushan Turganbay	65bb284448	Compile compatibilty for decoder-only models (#32617) * squash into one commit * add qwen2-vl for rope standardization * fix mistral compile * fix qwen2-vl * fix-copies	2024-09-09 10:59:04 +02:00
Fanli Lin	e85d86398a	add the missing flash attention test marker (#32419) * add flash attention check * fix * fix * add the missing marker * bug fix * add one more * remove order * add one more	2024-08-06 11:18:58 +01:00
Arthur	673440d073	update ruff version (#30932) * update ruff version * fix research projects * Empty * Fix errors --------- Co-authored-by: Lysandre <lysandre@huggingface.co>	2024-05-22 06:40:15 +02:00
Joseph Enguehard	07bf2dff78	Add TokenClassification for Mistral, Mixtral and Qwen2 (#29878) * Add MistralForTokenClassification * Add tests and docs * Add token classification for Mixtral and Qwen2 * Save llma for token classification draft * Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2 * Formatting * Add token classification support for Qwen2Moe model * Add dropout layer to each ForTokenClassification model * Add copied from in tests * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Propagate suggested changes * Style --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-05-20 10:06:57 +02:00
Jonathan Tow	2f12e40822	[`StableLm`] Add QK normalization and Parallel Residual Support (#29745) * init: add StableLm 2 support * add integration test for parallel residual and qk layernorm * update(modeling): match qk norm naming for consistency with phi/persimmon * fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity * `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward` * refactor: rename head states var in `StableLmLayerNormPerHead` * tests: update test model and add generate check	2024-04-08 23:51:58 +02:00
Yih-Dar	43d17c1836	Mark `test_eager_matches_sdpa_generate` flaky for some models (#29479) * fix * revert for qwen2 * revert for qwen2 * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-03-29 11:51:20 +01:00
Joao Gante	441de62f49	RoPE models: add numerical sanity-check test for RoPE scaling (#29808) * add hard rope scaling test * make fixup * quick rope scaling tests * add copy statements	2024-03-28 11:25:50 +00:00
Ekaterina Aidova	1d0ea7abe0	support SDPA Attention in stablelm (#29106) * support SDPA Attention in stablelm * add integration test * add fallback for output_attentions * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/stablelm/test_modeling_stablelm.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * handle non-contiguous states --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-21 13:12:49 +01:00
Jonathan Tow	de6029a059	Add `StableLM` (#28810) * Add `StableLM` * fix(model): re-create from `huggingface-cli add-new-model-like persimmon` * fix: re-add changes to address comments * fix(readme): add links to paper * fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref * fix(tests): re-add `@slow` decorator to integration tests * fix(tests): import slow... * fix(readme_hd): remove whitespace edit * fix(tokenizer): auto tokenizer tuple * skip doctests for `modeling_stablelm`	2024-02-14 07:15:18 +01:00

20 Commits