Jonathan Tow
2f12e40822
[StableLm
] Add QK normalization and Parallel Residual Support ( #29745 )
...
* init: add StableLm 2 support
* add integration test for parallel residual and qk layernorm
* update(modeling): match qk norm naming for consistency with phi/persimmon
* fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity
* `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward`
* refactor: rename head states var in `StableLmLayerNormPerHead`
* tests: update test model and add generate check
2024-04-08 23:51:58 +02:00
Yih-Dar
43d17c1836
Mark test_eager_matches_sdpa_generate
flaky for some models ( #29479 )
...
* fix
* revert for qwen2
* revert for qwen2
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-03-29 11:51:20 +01:00
Joao Gante
441de62f49
RoPE models: add numerical sanity-check test for RoPE scaling ( #29808 )
...
* add hard rope scaling test
* make fixup
* quick rope scaling tests
* add copy statements
2024-03-28 11:25:50 +00:00
Ekaterina Aidova
1d0ea7abe0
support SDPA Attention in stablelm ( #29106 )
...
* support SDPA Attention in stablelm
* add integration test
* add fallback for output_attentions
* Update src/transformers/models/stablelm/modeling_stablelm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/stablelm/test_modeling_stablelm.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/stablelm/modeling_stablelm.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* handle non-contiguous states
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-21 13:12:49 +01:00
Jonathan Tow
de6029a059
Add StableLM
( #28810 )
...
* Add `StableLM`
* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
* fix: re-add changes to address comments
* fix(readme): add links to paper
* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
* fix(tests): re-add `@slow` decorator to integration tests
* fix(tests): import slow...
* fix(readme_hd): remove whitespace edit
* fix(tokenizer): auto tokenizer tuple
* skip doctests for `modeling_stablelm`
2024-02-14 07:15:18 +01:00