Fix max_steps documentation regarding the end-of-training condition (#27624)

* fix max_steps doc

* Update src/transformers/training_args.py [ci skip]

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* propagate suggested change

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Quentin Gallouédec 2023-11-22 12:10:11 +01:00 committed by GitHub
parent c651eb23c3
commit b2c63c79c3
2 changed files with 10 additions and 8 deletions

src/transformers/training_args.py

@@ -234,8 +234,8 @@ class TrainingArguments:
            the last epoch before stopping training).
        max_steps (`int`, *optional*, defaults to -1):
            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
-           In case of using a finite iterable dataset the training may stop before reaching the set number of steps
-           when all data is exhausted
+           For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+           `max_steps` is reached.
        lr_scheduler_type (`str` or [`SchedulerType`], *optional*, defaults to `"linear"`):
            The scheduler type to use. See the documentation of [`SchedulerType`] for all possible values.
        lr_scheduler_kwargs ('dict', *optional*, defaults to {}):
@@ -2181,9 +2181,9 @@ class TrainingArguments:
            Total number of training epochs to perform (if not an integer, will perform the decimal part percents
            of the last epoch before stopping training).
        max_steps (`int`, *optional*, defaults to -1):
-           If set to a positive number, the total number of training steps to perform. Overrides
-           `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-           the set number of steps when all data is exhausted.
+           If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+           For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+           `max_steps` is reached.
        gradient_accumulation_steps (`int`, *optional*, defaults to 1):
            Number of updates steps to accumulate the gradients for, before performing a backward/update pass.
@@ -2588,9 +2588,9 @@ class TrainingArguments:
            Total number of training epochs to perform (if not an integer, will perform the decimal part percents
            of the last epoch before stopping training).
        max_steps (`int`, *optional*, defaults to -1):
-           If set to a positive number, the total number of training steps to perform. Overrides
-           `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-           the set number of steps when all data is exhausted.
+           If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+           For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+           `max_steps` is reached.
        warmup_ratio (`float`, *optional*, defaults to 0.0):
            Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
        warmup_steps (`int`, *optional*, defaults to 0):

src/transformers/training_args_tf.py

@@ -92,6 +92,8 @@ class TFTrainingArguments(TrainingArguments):
            Total number of training epochs to perform.
        max_steps (`int`, *optional*, defaults to -1):
            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+           For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+           `max_steps` is reached.
        warmup_ratio (`float`, *optional*, defaults to 0.0):
            Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
        warmup_steps (`int`, *optional*, defaults to 0):
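The behavior the corrected wording describes can be sketched as a plain-Python loop. This is a simplified illustration of the end-of-training condition, not the actual `Trainer` implementation; `train_steps` is a hypothetical helper name:

```python
# Simplified sketch (not the real Trainer code): with max_steps set, a
# finite dataset is re-iterated from the start whenever it is exhausted,
# and training only stops once `max_steps` optimizer steps have been done.
def train_steps(dataset, max_steps):
    steps = 0
    epochs = 0
    while steps < max_steps:
        for batch in dataset:  # restart the dataset once it is exhausted
            steps += 1
            if steps >= max_steps:
                break
        epochs += 1
    return steps, epochs

# A 4-batch dataset trained for 10 steps is cycled through 3 times
# (4 + 4 + 2 batches) rather than stopping after one pass.
print(train_steps([0, 1, 2, 3], 10))  # -> (10, 3)
```

This is exactly the distinction the old docstring got wrong: training does not stop early when the data runs out, it wraps around until `max_steps` is reached.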