transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-29 17:22:25 +06:00

Author	SHA1	Message	Date
Patrick von Platen	96f57c9ccb	[Benchmark] Memory benchmark utils (#4198 ) * improve memory benchmarking * correct typo * fix current memory * check torch memory allocated * better pytorch function * add total cached gpu memory * add total gpu required * improve torch gpu usage * update memory usage * finalize memory tracing * save intermediate benchmark class * fix conflict * improve benchmark * improve benchmark * finalize * make style * improve benchmarking * correct typo * make train function more flexible * fix csv save * better repr of bytes * better print * fix __repr__ bug * finish plot script * rename plot file * delete csv and small improvements * fix in plot * fix in plot * correct usage of timeit * remove redundant line * remove redundant line * fix bug * add hf parser tests * add versioning and platform info * make style * add gpu information * ensure backward compatibility * finish adding all tests * Update src/transformers/benchmark/benchmark_args.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/benchmark/benchmark_args_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * delete csv files * fix isort ordering * add out of memory handling * add better train memory handling Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-05-27 23:22:16 +02:00
Lysandre Debut	6a17688021	per_device instead of per_gpu/error thrown when argument unknown (#4618 ) * per_device instead of per_gpu/error thrown when argument unknown * [docs] Restore examples.md symlink * Correct absolute links so that symlink to the doc works correctly * Update src/transformers/hf_argparser.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Warning + reorder * Docs * Style * not for squad Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-27 11:36:55 -04:00
Hao Tan	a9aa7456ac	Add back --do_lower_case to uncased models (#4245 ) The option `--do_lower_case` is currently required by the uncased models (i.e., bert-base-uncased, bert-large-uncased). Results: BERT-BASE without --do_lower_case: 'exact': 73.83, 'f1': 82.22 BERT-BASE with --do_lower_case: 'exact': 81.02, 'f1': 88.34	2020-05-26 21:13:07 -04:00
Antonis Maronikolakis	50d1ce411f	add DistilBERT to supported models (#4558 )	2020-05-25 14:50:45 -04:00
Zhangyx	49296533ca	Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463 ) * Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website. * Use Split enum + always output the label name Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-21 09:17:44 -04:00
Tobias Lee	271bedb485	[examples] fix no grad in second pruning in run_bertology (#4479 ) * fix no grad in second pruning and typo * fix prune heads attention mismatch problem * fix * fix * fix * run make style * run make style	2020-05-21 09:17:03 -04:00
Patrick von Platen	aa925a52fa	[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468 ) * fix gpu slow tests in pytorch * change model to device syntax	2020-05-19 21:35:04 +02:00
Julien Chaumond	5e7fe8b585	Distributed eval: SequentialDistributedSampler + gather all results (#4243 ) * Distributed eval: SequentialDistributedSampler + gather all results * For consistency only write to disk from world_master Close https://github.com/huggingface/transformers/issues/4272 * Working distributed eval * Hook into scripts * Fix #3721 again * TPU.mesh_reduce: stay in tensor space Thanks @jysohn23 * Just a small comment * whitespace * torch.hub: pip install packaging * Add test scenarii	2020-05-18 22:02:39 -04:00
Boris Dayma	d9ece8233d	fix(run_language_modeling): use arg overwrite_cache (#4407 )	2020-05-18 11:37:35 -04:00
Julien Chaumond	757baee846	Fix un-prefixed f-string see https://github.com/huggingface/transformers/pull/4367#discussion_r426356693 Hat/tip @girishponkiya	2020-05-18 11:20:46 -04:00
Julien Chaumond	15550ce0d1	[skip ci] remove local rank	2020-05-15 17:08:38 -04:00
Lysandre Debut	edf9ac11d4	Should return overflowing information for the log (#4385 )	2020-05-15 09:49:11 -04:00
Julien Chaumond	af2e6bf87c	[examples] Streamline doc	2020-05-14 20:34:31 -04:00
Julien Chaumond	448c467256	Fix: unpin flake8 and fix cs errors (#4367 ) * Fix: unpin flake8 and fix cs errors * Ok we still need to quote those	2020-05-14 13:14:26 -04:00
Julien Chaumond	c547f15a17	Use Filelock to ensure distributed barriers see context in https://github.com/huggingface/transformers/pull/4223	2020-05-14 11:58:32 -04:00
Julien Plu	ca13618681	Question Answering for TF trainer (#4320 ) * Add QA trainer example for TF * Make data_dir optional * Fix parameter logic * Fix feature convert * Update the READMEs to add the question-answering task * Apply style * Change 'sequence-classification' to 'text-classification' and prefix with 'eval' all the metric names * Apply style * Apply style	2020-05-13 09:22:31 -04:00
Julien Chaumond	241759101e	(v2) Improvements to the wandb integration (#4324 ) * Improvements to the wandb integration * small reorg + no global necessary * feat(trainer): log epoch and final metrics * Simplify logging a bit * Fixup * Fix crash when just running eval Co-authored-by: Chris Van Pelt <vanpelt@gmail.com> Co-authored-by: Boris Dayma <boris.dayma@gmail.com>	2020-05-12 21:52:01 -04:00
Viktor Alm	e4512aab3b	Add MultipleChoice to TFTrainer [WIP] (#4270 ) * catch gpu len 1 set to gpu0 * Add mpc to trainer * Add MPC for TF * fix TF automodel for MPC and add Albert * Apply style * Fix import * Note to self: double check * Make shape None, None for datasetgenerator output shapes * Add from_pt bool which doesnt seem to work * Original checkpoint dir * Fix docstrings for automodel * Update readme and apply style * Colab should probably not be from users * Colabs should probably not be from users * Add colab * Update README.md * Update README.md * Cleanup __intit__ * Cleanup flake8 trailing comma * Update src/transformers/training_args_tf.py * Update src/transformers/modeling_tf_auto.py Co-authored-by: Viktor Alm <viktoralm@pop-os.localdomain> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-12 08:48:48 -04:00
Stefan Schweter	3f42eb979f	Documentation: fix links to NER examples (#4279 ) * docs: fix link to token classification (NER) example * examples: fix links to NER scripts	2020-05-11 12:48:21 -04:00
Julien Chaumond	7b75aa9fa5	[TPU] Doc, fix xla_spawn.py, only preprocess dataset once (#4223 ) * [TPU] Doc, fix xla_spawn.py, only preprocess dataset once * Update examples/README.md * [xla_spawn] Add `_mp_fn` to other Trainer scripts * [TPU] Fix: eval dataloader was None	2020-05-08 14:10:05 -04:00
Julien Chaumond	c99fe0386b	[doc] Fix broken links + remove crazy big notebook	2020-05-07 18:44:18 -04:00
Julien Chaumond	6669915b65	[examples] Add column for pytorch-lightning support	2020-05-07 15:26:58 -04:00
Julien Chaumond	612fa1b10b	Examples readme.md (#4215 ) * README * Update README.md	2020-05-07 15:00:06 -04:00
Julien Chaumond	0ae96ff8a7	BIG Reorganize examples (#4213 ) * Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around	2020-05-07 13:48:44 -04:00
Lysandre Debut	ebf80e2e70	Tpu trainer (#4146 ) * wip * wip * a last wip * Better logging when using TPUs * Correct argument name * Tests * fix * Metrics in evaluation * Update src/transformers/training_args.py * [tpu] Use launcher script instead * [tpu] lots of tweaks * Fix formatting Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-07 10:34:04 -04:00
Julien Plu	aad50151f3	TF version of the trainer (#4017 ) * First commit to add a TF version of the trainer. * Make the TF trainer closer to what looks the PT trainer * Refactoring common code between the PT and TF trainer into an util file. * Some bugfix + better similarity with the PT trainer * Add missing class in transformers init * Bugfix over prediction + use classification report instead of simple metrics * Fix name error * Fix optimization tests + style * Apply style * Several bugfix for multi-gpu training * Apply style * Apply style * Add glue example for the TF trainer * Several bugix + address the reviews * Fix on the TF training args file * Add a debug mode * Bugfix in utils_ner.py when segment_ids is None * Apply style * Apply style * Add TPU strategy * Fix selection strategy	2020-05-06 12:56:52 -04:00
Simone Primarosa	25296b12aa	Fix overwrite_cache behaviour for pytorch lightning examples (#4093 )	2020-05-06 12:24:49 -04:00
William Falcon	4c5bd92183	Update run_pl_glue.py (#4117 )	2020-05-02 10:38:30 -04:00
William Falcon	5282b31df4	Update run_pl_ner.py (#4118 )	2020-05-02 10:38:21 -04:00
Stefan Schweter	1e616c0af3	NER: parse args from .args file or JSON (#4110 ) * ner: parse args from .args file or JSON * examples: mention json-based configuration file support for run_ner script	2020-05-02 10:29:17 -04:00
Julien Chaumond	b8686174be	Merge pull request #3934 from huggingface/examples_args_from_files [qol] example scripts: parse args from .args file or JSON	2020-04-30 22:40:13 -04:00
Julien Chaumond	455c639093	CDN urls (#4030 ) * [file_utils] use_cdn + documentation * Move to cdn. urls for weights * [urls] Hotfix for bert-base-japanese	2020-04-28 20:27:14 -04:00
Sam Shleifer	d714dfeaa8	[isort] add known 3rd party to setup.cfg (#4053 ) * add known 3rd party to setup.cfg * comment * Update CONTRIBUTING.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-04-28 17:12:00 -04:00
Patrick von Platen	180585741c	[Generation] Generation should allow to start with empty prompt (#3993 ) * fix empty prompt * fix length in generation pipeline	2020-04-28 14:33:15 +02:00
Julien Chaumond	c811526004	[examples] For convenience, also save the tokenizer Close #3921	2020-04-24 09:52:42 -04:00
Cola	b0167632ce	Shuffle train subset for summarization example (#3909 ) * Shuffle train subset * Cleaner shuffle	2020-04-24 07:55:34 -04:00
Julien Chaumond	1dc9b3c784	Fixes #3877	2020-04-22 01:15:10 +00:00
Julien Chaumond	dd9d483d03	Trainer (#3800 ) * doc * [tests] Add sample files for a regression task * [HUGE] Trainer * Feedback from @sshleifer * Feedback from @thomwolf + logging tweak * [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes * [glue] Use default max_seq_length of 128 like before * [glue] move DataTrainingArguments around * [ner] Change interface of InputExample, and align run_{tf,pl} * Re-align the pl scripts a little bit * ner * [ner] Add integration test * Fix language_modeling with API tweak * [ci] Tweak loss target * Don't break console output * amp.initialize: model must be on right device before * [multiple-choice] update for Trainer * Re-align to `827d6d6ef0`	2020-04-21 20:11:56 -04:00
Andrey Kulagin	b1ff0b2ae7	Fix bug in examples: double wrap into DataParallel during eval	2020-04-20 19:37:44 -04:00
Jared T Nielsen	c79b550dd0	Add `qas_id` to SquadResult and SquadExample (#3745 ) * Add qas_id * Fix incorrect name in squad.py * Make output files optional for squad eval	2020-04-20 16:08:57 -04:00
Sam Shleifer	a504cb49ec	[examples] fix summarization do_predict (#3866 )	2020-04-20 10:49:56 -04:00
Thomas Wolf	827d6d6ef0	Cleanup fast tokenizers integration (#3706 ) * First pass on utility classes and python tokenizers * finishing cleanup pass * style and quality * Fix tests * Updating following @mfuntowicz comment * style and quality * Fix Roberta * fix batch_size/seq_length inBatchEncoding * add alignement methods + tests * Fix OpenAI and Transfo-XL tokenizers * adding trim_offsets=True default for GPT2 et RoBERTa * style and quality * fix tests * add_prefix_space in roberta * bump up tokenizers to rc7 * style * unfortunately tensorfow does like these - removing shape/seq_len for now * Update src/transformers/tokenization_utils.py Co-Authored-By: Stefan Schweter <stefan@schweter.it> * Adding doc and docstrings * making flake8 happy Co-authored-by: Stefan Schweter <stefan@schweter.it>	2020-04-18 13:43:57 +02:00
Sam Shleifer	f0c96fafd1	[examples] summarization/bart/finetune.py supports t5 (#3824 ) renames `run_bart_sum.py` to `finetune.py`	2020-04-16 15:15:19 -04:00
Patrick von Platen	80a1694514	[Examples, T5] Change newstest2013 to newstest2014 and clean up (#3817 ) * Refactored use of newstest2013 to newstest2014. Fixed bug where argparse consumed first command line argument as model_size argument rather than using default model_size by forcing explicit --model_size flag inclusion * More pythonic file handling through 'with' context * COSMETIC - ran Black and isort * Fixed reference to number of lines in newstest2014 * Fixed failing test. More pythonic file handling * finish PR from tholiao * remove outcommented lines * make style * make isort happy Co-authored-by: Thomas Liao <tholiao@gmail.com>	2020-04-16 20:00:41 +02:00
Davide Fiocco	b1e2368b32	Typo fix (#3821 )	2020-04-16 11:04:32 -04:00
Sam Shleifer	c59b1e682d	[examples] unit test for run_bart_sum (#3544 ) - adds pytorch-lightning dependency	2020-04-15 18:35:01 -04:00
Patrick von Platen	01c37dcdb5	[Config, Caching] Remove `output_past` everywhere and replace by `use_cache` argument (#3734 ) * remove output_past from pt * make style * add optional input length for gpt2 * add use cache to prepare input * save memory in gpt2 * correct gpt2 test inputs * make past input optional for gpt2 * finish use_cache for all models * make style * delete modeling_gpt2 change in test file * correct docstring * correct is true statements for gpt2	2020-04-14 14:40:28 -04:00
elk-cloner	5ebd898953	fix dataset shuffling for Distributed training (#huggingface#3721) (#3766 )	2020-04-13 10:11:18 -04:00
Jin Young Sohn	700ccf6e35	Fix glue_convert_examples_to_features API breakage (#3742 )	2020-04-10 16:03:27 -04:00
Jin Young Sohn	551b450527	Add `run_glue_tpu.py` that trains models on TPUs (#3702 ) * Initial commit to get BERT + run_glue.py on TPU * Add README section for TPU and address comments. * Cleanup TPU bits from run_glue.py (#3) TPU runner is currently implemented in: https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py. We plan to upstream this directly into `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested. * Cleanup TPU bits from run_glue.py TPU runner is currently implemented in: https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py. We plan to upstream this directly into `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested. * No need to call `xm.mark_step()` explicitly (#4) Since for gradient accumulation we're accumulating on batches from `ParallelLoader` instance which on next() marks the step itself. * Resolve R/W conflicts from multiprocessing (#5) * Add XLNet in list of models for `run_glue_tpu.py` (#6) * Add RoBERTa to list of models in TPU GLUE (#7) * Add RoBERTa and DistilBert to list of models in TPU GLUE (#8) * Use barriers to reduce duplicate work/resources (#9) * Shard eval dataset and aggregate eval metrics (#10) * Shard eval dataset and aggregate eval metrics Also, instead of calling `eval_loss.item()` every time do summation with tensors on device. * Change defaultdict to float * Reduce the pred, label tensors instead of metrics As brought up during review some metrics like f1 cannot be aggregated via averaging. GLUE task metrics depends largely on the dataset, so instead we sync the prediction and label tensors so that the metrics can be computed accurately on those instead. * Only use tb_writer from master (#11) * Apply huggingface black code formatting * Style * Remove `--do_lower_case` as example uses cased * Add option to specify tensorboard logdir This is needed for our testing framework which checks regressions against key metrics writtern by the summary writer. * Using configuration for `xla_device` * Prefix TPU specific comments. * num_cores clarification and namespace eval metrics * Cache features file under `args.cache_dir` Instead of under `args.data_dir`. This is needed as our test infra uses data_dir with a read-only filesystem. * Rename `run_glue_tpu` to `run_tpu_glue` Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-04-10 12:53:54 -04:00

975 Commits