mirror of https://github.com/huggingface/transformers.git, synced 2025-07-18 03:58:25 +06:00
* init: add StableLm 2 support
* add integration test for parallel residual and qk layernorm
* update(modeling): match qk norm naming for consistency with phi/persimmon
* fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity
* `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward`
* refactor: rename head states var in `StableLmLayerNormPerHead`
* tests: update test model and add generate check

| File |
|---|
| `..` |
| `__init__.py` |
| `test_modeling_stablelm.py` |