TFTrainer dataset doc & fix evaluation bug (#6618)

* TFTrainer dataset doc & fix evaluation bug

discussed in #6551

* add docstring to test/eval datasets
Joe Davison 2020-08-20 12:11:36 -04:00 committed by GitHub
parent 573bdb0a5d
commit f9d280a959


@@ -38,9 +38,17 @@ class TFTrainer:
         args (:class:`~transformers.TFTrainingArguments`):
             The arguments to tweak training.
         train_dataset (:class:`~tf.data.Dataset`, `optional`):
-            The dataset to use for training.
+            The dataset to use for training. The dataset should yield tuples of ``(features, labels)`` where
+            ``features`` is a dict of input features and ``labels`` is the labels. If ``labels`` is a tensor, the loss is
+            calculated by the model by calling ``model(features, labels=labels)``. If ``labels`` is a dict, such as when
+            using a QuestionAnswering head model with multiple targets, the loss is instead calculated by calling
+            ``model(features, **labels)``.
         eval_dataset (:class:`~tf.data.Dataset`, `optional`):
-            The dataset to use for evaluation.
+            The dataset to use for evaluation. The dataset should yield tuples of ``(features, labels)`` where
+            ``features`` is a dict of input features and ``labels`` is the labels. If ``labels`` is a tensor, the loss is
+            calculated by the model by calling ``model(features, labels=labels)``. If ``labels`` is a dict, such as when
+            using a QuestionAnswering head model with multiple targets, the loss is instead calculated by calling
+            ``model(features, **labels)``.
         compute_metrics (:obj:`Callable[[EvalPrediction], Dict]`, `optional`):
             The function that will be used to compute metrics at evaluation. Must take a
             :class:`~transformers.EvalPrediction` and return a dictionary string to metric values.
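
(For context: the contract documented above is easy to satisfy with ``tf.data.Dataset.from_tensor_slices``. A minimal sketch; the feature and label names below are illustrative, not taken from the commit.)

    import tensorflow as tf

    # Tensor labels: TFTrainer computes the loss via model(features, labels=labels).
    features = {
        "input_ids": tf.constant([[101, 2023, 102], [101, 2003, 102]], dtype=tf.int32),
        "attention_mask": tf.constant([[1, 1, 1], [1, 1, 1]], dtype=tf.int32),
    }
    labels = tf.constant([0, 1], dtype=tf.int32)
    train_dataset = tf.data.Dataset.from_tensor_slices((features, labels))

    # Dict labels (e.g. a QuestionAnswering head with two targets):
    # TFTrainer computes the loss via model(features, **labels).
    qa_labels = {
        "start_positions": tf.constant([0, 1], dtype=tf.int32),
        "end_positions": tf.constant([2, 2], dtype=tf.int32),
    }
    qa_train_dataset = tf.data.Dataset.from_tensor_slices((features, qa_labels))
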
@@ -145,7 +153,11 @@ class TFTrainer:

         Args:
             eval_dataset (:class:`~tf.data.Dataset`, `optional`):
-                If provided, will override `self.eval_dataset`.
+                If provided, will override `self.eval_dataset`. The dataset should yield tuples of ``(features,
+                labels)`` where ``features`` is a dict of input features and ``labels`` is the labels. If ``labels``
+                is a tensor, the loss is calculated by the model by calling ``model(features, labels=labels)``. If
+                ``labels`` is a dict, such as when using a QuestionAnswering head model with multiple targets, the
+                loss is instead calculated by calling ``model(features, **labels)``.

         Subclass and override this method if you want to inject some custom behavior.
         """
@@ -173,7 +185,12 @@ class TFTrainer:
         Returns a test :class:`~tf.data.Dataset`.

         Args:
-            test_dataset (:class:`~tf.data.Dataset`): The dataset to use.
+            test_dataset (:class:`~tf.data.Dataset`):
+                The dataset to use. The dataset should yield tuples of ``(features, labels)`` where ``features`` is
+                a dict of input features and ``labels`` is the labels. If ``labels`` is a tensor, the loss is
+                calculated by the model by calling ``model(features, labels=labels)``. If ``labels`` is a dict, such
+                as when using a QuestionAnswering head model with multiple targets, the loss is instead calculated
+                by calling ``model(features, **labels)``.

         Subclass and override this method if you want to inject some custom behavior.
         """
@@ -405,14 +422,18 @@ class TFTrainer:

         Args:
             eval_dataset (:class:`~tf.data.Dataset`, `optional`):
-                Pass a dataset if you wish to override :obj:`self.eval_dataset`.
+                Pass a dataset if you wish to override :obj:`self.eval_dataset`. The dataset should yield tuples of
+                ``(features, labels)`` where ``features`` is a dict of input features and ``labels`` is the labels.
+                If ``labels`` is a tensor, the loss is calculated by the model by calling ``model(features,
+                labels=labels)``. If ``labels`` is a dict, such as when using a QuestionAnswering head model with
+                multiple targets, the loss is instead calculated by calling ``model(features, **labels)``.

         Returns:
             A dictionary containing the evaluation loss and the potential metrics computed from the predictions.
         """
         eval_ds, steps, num_examples = self.get_eval_tfdataset(eval_dataset)

-        output = self._prediction_loop(eval_ds, steps, num_examples, description="Evaluation")
+        output = self.prediction_loop(eval_ds, steps, num_examples, description="Evaluation")

         logs = {**output.metrics}
         logs["epoch"] = self.epoch_logging
@@ -666,7 +687,11 @@ class TFTrainer:
         Args:
             test_dataset (:class:`~tf.data.Dataset`):
-                Dataset to run the predictions on.
+                Dataset to run the predictions on. The dataset should yield tuples of ``(features, labels)`` where
+                ``features`` is a dict of input features and ``labels`` is the labels. If ``labels`` is a tensor,
+                the loss is calculated by the model by calling ``model(features, labels=labels)``. If ``labels`` is
+                a dict, such as when using a QuestionAnswering head model with multiple targets, the loss is instead
+                calculated by calling ``model(features, **labels)``.

         Returns:
             `NamedTuple`:
                 predictions (:obj:`np.ndarray`):
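
(End-to-end, the documented behavior looks roughly like the sketch below. The model checkpoint, feature values, and argument values are illustrative assumptions; ``TFTrainer``, ``TFTrainingArguments``, and ``TFAutoModelForQuestionAnswering`` are the real APIs.)

    import tensorflow as tf
    from transformers import TFAutoModelForQuestionAnswering, TFTrainer, TFTrainingArguments

    model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")

    # Dict labels, so TFTrainer computes the loss as model(features, **labels).
    features = {
        "input_ids": tf.constant([[101, 7592, 2088, 102]], dtype=tf.int32),
        "attention_mask": tf.constant([[1, 1, 1, 1]], dtype=tf.int32),
    }
    labels = {
        "start_positions": tf.constant([1], dtype=tf.int32),
        "end_positions": tf.constant([2], dtype=tf.int32),
    }
    eval_dataset = tf.data.Dataset.from_tensor_slices((features, labels))

    args = TFTrainingArguments(output_dir="out", per_device_eval_batch_size=1)
    trainer = TFTrainer(model=model, args=args, eval_dataset=eval_dataset)

    metrics = trainer.evaluate()            # dict with the eval loss (and any computed metrics)
    output = trainer.predict(eval_dataset)  # NamedTuple: predictions, label_ids, metrics
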