Multiple Choice
Based on the script run_swag.py.
PyTorch script: fine-tuning on SWAG
run_swag allows you to fine-tune any model from our hub (as long as its architecture has a ForMultipleChoice version in the library) on the SWAG dataset or on your own csv/jsonlines files, as long as they are structured the same way. To make it work on another dataset, you will need to tweak the preprocess_function inside the script.
python examples/multiple-choice/run_swag.py \
--model_name_or_path roberta-base \
--do_train \
--do_eval \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/swag_base \
--per_device_eval_batch_size=16 \
--per_device_train_batch_size=16 \
--overwrite_output_dir
Training with the defined hyper-parameters yields the following results:
***** Eval results *****
eval_acc = 0.8338998300509847
eval_loss = 0.44457291918821606
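When adapting the script to another dataset, it helps to see the kind of reshaping preprocess_function has to perform. The sketch below is not the script's code: it assumes SWAG-style columns (sent1, sent2, ending0 to ending3, label), uses a made-up example row, and the helper name flatten_choices is hypothetical. It only shows how each example becomes one (context, continuation) pair per candidate ending, which is the step that would need to change for a dataset with different column names or a different number of choices.

```python
from typing import Dict, List

# Assumed SWAG-style column names; adjust these for another dataset.
ENDING_COLUMNS = ["ending0", "ending1", "ending2", "ending3"]


def flatten_choices(batch: Dict[str, List]) -> Dict[str, List]:
    """Build one (context, continuation) pair per candidate ending.

    Hypothetical helper for illustration only. A real preprocessing
    function would go on to tokenize these pairs and keep them grouped
    per example before returning them.
    """
    first_sentences = []
    second_sentences = []
    for i, context in enumerate(batch["sent1"]):
        header = batch["sent2"][i]
        # The same context is repeated once per candidate ending.
        first_sentences.append([context] * len(ENDING_COLUMNS))
        # Each candidate is the start of the second sentence plus one ending.
        second_sentences.append(
            [f"{header} {batch[col][i]}" for col in ENDING_COLUMNS]
        )
    return {
        "first": first_sentences,
        "second": second_sentences,
        "label": list(batch["label"]),
    }


# Made-up row, structured like a SWAG example.
batch = {
    "sent1": ["A man walks into a kitchen."],
    "sent2": ["He"],
    "ending0": ["opens the fridge."],
    "ending1": ["flies to the moon."],
    "ending2": ["melts into the floor."],
    "ending3": ["turns into a chair."],
    "label": [0],
}

pairs = flatten_choices(batch)
for context, candidate in zip(pairs["first"][0], pairs["second"][0]):
    print(context, "|", candidate)
```

A csv or jsonlines file with these same column names should need no change to this step; a dataset with other names or fewer choices would need the column list and the pairing logic adjusted.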
TensorFlow
export SWAG_DIR=/path/to/swag_data_dir
python ./examples/multiple-choice/run_tf_multiple_choice.py \
--task_name swag \
--model_name_or_path bert-base-cased \
--do_train \
--do_eval \
--data_dir $SWAG_DIR \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--max_seq_length 80 \
--output_dir models_bert/swag_base \
--per_device_eval_batch_size=16 \
--per_device_train_batch_size=16 \
--logging_dir logs \
--gradient_accumulation_steps 2 \
--overwrite_output_dir
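The command above combines a per-device batch size of 16 with 2 gradient accumulation steps. As a rough illustration of what that means per optimizer update, the arithmetic can be sketched as below. The function name is hypothetical, a single device is assumed, and the factor of four reflects SWAG's four candidate endings per example.

```python
def examples_per_optimizer_step(per_device_batch_size: int,
                                accumulation_steps: int,
                                num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer update.

    Hypothetical helper for illustration; num_devices=1 is an assumption.
    """
    return per_device_batch_size * accumulation_steps * num_devices


# Values taken from the command above.
per_step = examples_per_optimizer_step(per_device_batch_size=16,
                                       accumulation_steps=2)

# Each SWAG example expands into four (context, ending) sequences.
sequences_per_step = per_step * 4

print(per_step, sequences_per_step)
```

With more devices, the number of examples per update scales by the device count, so the learning rate may need revisiting when moving between setups.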