mirror of
https://github.com/huggingface/transformers.git
synced 2025-08-03 03:31:05 +06:00
changed "ot" to "to" (#21488)
parent fa0ae17958
commit 8581fbaa6d
@@ -176,7 +176,7 @@ class AdamWeightDecay(Adam):
     with the m and v parameters in strange ways as shown in [Decoupled Weight Decay
     Regularization](https://arxiv.org/abs/1711.05101).
 
-    Instead we want ot decay the weights in a manner that doesn't interact with the m/v parameters. This is equivalent
+    Instead we want to decay the weights in a manner that doesn't interact with the m/v parameters. This is equivalent
     to adding the square of the weights to the loss with plain (non-momentum) SGD.
 
     Args:
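The docstring in this hunk contrasts two ways of decaying weights. The scalar sketch below may help illustrate the difference. It is not the code of `AdamWeightDecay`; the function names, the hyperparameter defaults, and the factor of 1/2 on the penalty are assumptions made for the example. It shows that with plain SGD an L2 penalty and a direct weight shrink give the same step, while in an Adam-style update the penalty gradient passes through the m/v statistics and a decoupled shrink does not.

```python
import math


def sgd_l2_step(w, grad, lr, wd):
    # Plain SGD on loss + (wd / 2) * w**2: the penalty adds wd * w to the gradient.
    return w - lr * (grad + wd * w)


def sgd_decoupled_step(w, grad, lr, wd):
    # Plain SGD on the unpenalized loss, then shrink the weight directly.
    return w - lr * grad - lr * wd * w


def adam_step(w, grad, m, v, t, lr, l2=0.0, decoupled_wd=0.0,
              beta1=0.9, beta2=0.999, eps=1e-7):
    # Hypothetical scalar Adam step (t counts from 1).
    # l2 > 0: the penalty gradient is folded into grad, so it reaches m and v.
    # decoupled_wd > 0: the weight is shrunk outside the m/v machinery.
    g = grad + l2 * w
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    w_new = w - lr * m_hat / (math.sqrt(v_hat) + eps) - lr * decoupled_wd * w
    return w_new, m, v


w, grad, lr, wd = 1.0, 0.5, 0.1, 0.01

# With plain SGD the two formulations coincide.
print(sgd_l2_step(w, grad, lr, wd), sgd_decoupled_step(w, grad, lr, wd))

# With the Adam-style step they do not.
w_l2, _, _ = adam_step(w, grad, m=0.0, v=0.0, t=1, lr=lr, l2=wd)
w_dec, _, _ = adam_step(w, grad, m=0.0, v=0.0, t=1, lr=lr, decoupled_wd=wd)
print(w_l2, w_dec)
```

On the first step, the bias-corrected Adam update is close to `lr * sign(g)`, so in this sketch the L2 term is normalized away almost entirely, whereas the decoupled term still shrinks the weight by `lr * wd * w`.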