Update modeling_gpt_neox.py (#17575)
I'm guessing that the intention was for the `_no_split_modules` class attribute of `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how it's set to `["GPTJBlock"]` for `GPTJPreTrainedModel`. If this is incorrect, please feel free to just close the PR. Thanks!
parent a1344dbfb9
commit 5483388631
@@ -53,6 +53,7 @@ class GPTNeoXPreTrainedModel(PreTrainedModel):
     config_class = GPTNeoXConfig
     base_model_prefix = "gpt_neox"
     supports_gradient_checkpointing = True
+    _no_split_modules = ["GPTNeoXLayer"]

     def _init_weights(self, module):
         """Initialize the weights"""
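For context, here is a minimal sketch (not part of this commit) of how `_no_split_modules` gets used: when a model is loaded with `device_map="auto"`, the Accelerate-based placement logic consults this attribute to keep each listed module, here a whole `GPTNeoXLayer`, on a single device instead of splitting its weights across devices. The checkpoint name below is only an illustrative example.

    from transformers import GPTNeoXForCausalLM

    # Sketch only: "EleutherAI/gpt-neox-20b" is an example checkpoint.
    # device_map="auto" shards the model across available devices; the
    # _no_split_modules = ["GPTNeoXLayer"] attribute added in this commit
    # tells the placement logic never to split one GPTNeoXLayer's weights
    # between two devices (which would break the layer's forward pass).
    model = GPTNeoXForCausalLM.from_pretrained(
        "EleutherAI/gpt-neox-20b",
        device_map="auto",
    )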