transformers

MM-IMDb

Based on the script run_mmimdb.py.

MM-IMDb is a multimodal dataset of around 26,000 movies, each with images, plots, and other metadata.

Training on MM-IMDb

python run_mmimdb.py \
    --data_dir /path/to/mmimdb/dataset/ \
    --model_type bert \
    --model_name_or_path bert-base-uncased \
    --output_dir /path/to/save/dir/ \
    --do_train \
    --do_eval \
    --max_seq_len 512 \
    --gradient_accumulation_steps 20 \
    --num_image_embeds 3 \
    --num_train_epochs 100 \
    --patience 5
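
The command above pairs a large `--num_train_epochs` with a small `--patience`, which suggests early stopping: training ends once the evaluation metric stops improving, well before the epoch limit. The sketch below illustrates that interaction under stated assumptions. It is not the logic of run_mmimdb.py itself: the function name `train_with_patience`, the list of per-epoch scores, and the exact stopping rule (stop after `patience` consecutive evaluations without improvement) are all hypothetical, and the script may count evaluations differently.

```python
def train_with_patience(eval_scores, num_train_epochs=100, patience=5):
    """Simulate a training loop with patience-based early stopping.

    eval_scores is a hypothetical list of per-epoch validation scores
    (higher is better) standing in for real evaluation results.
    Returns (epochs_run, best_score).
    """
    best_score = float("-inf")
    epochs_without_improvement = 0
    epochs_run = 0

    for epoch in range(min(num_train_epochs, len(eval_scores))):
        epochs_run = epoch + 1
        score = eval_scores[epoch]

        if score > best_score:
            # New best: remember it and reset the counter.
            best_score = score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1

        # Assumed rule: stop once `patience` evaluations in a row
        # have failed to beat the best score seen so far.
        if epochs_without_improvement >= patience:
            break

    return epochs_run, best_score


if __name__ == "__main__":
    scores = [0.5, 0.6, 0.6, 0.59, 0.58, 0.6, 0.57, 0.7]
    print(train_with_patience(scores, num_train_epochs=100, patience=5))
```

In this simulated run the score peaks at the second epoch, so the loop stops before reaching the final, higher score. This is the usual trade-off of a small patience value: less wasted compute, at the risk of stopping before a late improvement.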