# OLMo2

## Overview
The OLMo2 model is the successor to the OLMo model, which was proposed in *OLMo: Accelerating the Science of Language Models*.
The architectural changes from the original OLMo model to this model are as follows (see the sketch after this list):
- RMSNorm is used instead of standard layer norm.
- Norm is applied to attention queries and keys.
- Norm is applied after attention/feedforward layers rather than before.
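
The following is a minimal PyTorch sketch of this post-norm layout, not the actual `transformers` implementation; the class and argument names (`Olmo2StyleBlock`, `attn`, `mlp`) are hypothetical, and the attention and feedforward modules are assumed to be defined elsewhere:

```python
import torch
import torch.nn as nn


class Olmo2StyleBlock(nn.Module):
    """Illustrative decoder block showing OLMo2-style norm placement."""

    def __init__(self, hidden_size: int, attn: nn.Module, mlp: nn.Module):
        super().__init__()
        self.attn = attn
        self.mlp = mlp
        # RMSNorm replaces the standard LayerNorm (requires PyTorch >= 2.4).
        self.post_attention_norm = nn.RMSNorm(hidden_size)
        self.post_feedforward_norm = nn.RMSNorm(hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Norm is applied *after* each sublayer (post-norm), inside the
        # residual branch, rather than before it as in the original OLMo.
        x = x + self.post_attention_norm(self.attn(x))
        x = x + self.post_feedforward_norm(self.mlp(x))
        return x
```

Query/key normalization happens inside the attention module itself: the projected query and key tensors each pass through their own RMSNorm before attention scores are computed.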
This model was contributed by shanearora. The original code can be found here.
## Olmo2Config

[[autodoc]] Olmo2Config
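
As with other `transformers` configurations, you can instantiate a config and build a randomly initialized model from it, for example:

```python
from transformers import Olmo2Config, Olmo2Model

# Initialize an OLMo2 configuration with default values
configuration = Olmo2Config()

# Build a model (with random weights) from that configuration
model = Olmo2Model(configuration)

# The configuration is accessible from the model afterwards
configuration = model.config
```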
## Olmo2Model

[[autodoc]] Olmo2Model
    - forward
## Olmo2ForCausalLM

[[autodoc]] Olmo2ForCausalLM
    - forward
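
A typical generation example, using `allenai/OLMo-2-1124-7B` here only as an example checkpoint (substitute whichever OLMo2 checkpoint you intend to use):

```python
from transformers import AutoTokenizer, Olmo2ForCausalLM

tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B")
model = Olmo2ForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B")

# Tokenize a prompt and generate a continuation
inputs = tokenizer("Language modeling is ", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```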