transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-25 15:28:59 +06:00

Author	SHA1	Message	Date
Julien Chaumond	60a42ef1c0	[model_cards] Fix CamemBERT table markdown see https://github.com/huggingface/transformers/pull/3836	2020-04-17 20:21:15 -04:00
Julien Chaumond	88aecee6a2	[ci] GitHub-hosted runner has no space left on device	2020-04-17 20:16:00 -04:00
Benjamin Muller	73efa694e6	Update camembert-base-README.md (#3836 )	2020-04-17 20:08:13 -04:00
Patrick von Platen	e9d0bc027a	[Config, Serialization] more readable config serialization (#3797 ) * better config serialization * finish configuration utils	2020-04-17 20:07:18 -04:00
Lysandre Debut	8b63a01d95	XLM tokenizer should encode with bos token (#3791 ) * XLM tokenizer should encode with bos token * Update tests	2020-04-17 11:28:55 -04:00
Patrick von Platen	1d4a35b396	Higher tolerance for past testing in TF T5 (#3844 )	2020-04-17 11:26:16 -04:00
Patrick von Platen	d13eca11e2	Higher tolerance for past testing in T5 (#3843 )	2020-04-17 11:25:14 -04:00
Harutaka Kawamura	b0c9fbb293	Add workflow to build docs (#3763 )	2020-04-17 11:23:18 -04:00
Santiago Castro	c19727fd38	Add support for the null answer in `QuestionAnsweringPipeline` (#3441 ) * Add support for the null answer in `QuestionAnsweringPipeline` * black * Fix min null score computation * Fix a PR comment	2020-04-17 11:17:21 -04:00
Simon Böhm	edf0582c0b	Fix token_type_id in BERT question-answering example (#3790 ) token_type_id is converted into the segment embedding. For question answering, this needs to highlight whether a token belongs to sequence 0 or 1. encode_plus takes care of correctly setting this parameter automatically.	2020-04-17 11:14:12 -04:00
Pierric Cistac	6d00033e97	Question Answering support for Albert and Roberta in TF (#3812 ) * Add TFAlbertForQuestionAnswering * Add TFRobertaForQuestionAnswering * Update TFAutoModel with Roberta/Albert for QA * Clean `super` TF Albert calls	2020-04-17 10:45:30 -04:00
Patrick von Platen	f399c00610	Update README	2020-04-17 09:42:22 +02:00
Sam Shleifer	f0c96fafd1	[examples] summarization/bart/finetune.py supports t5 (#3824 ) renames `run_bart_sum.py` to `finetune.py`	2020-04-16 15:15:19 -04:00
Jonathan Sum	0cec4fab7d	typo: fine-grained token-leven Changing from "fine-grained token-leven" to "fine-grained token-level"	2020-04-16 15:11:23 -04:00
Aryansh Omray	14cdeee75a	Tanh torch warnings	2020-04-16 15:10:35 -04:00
Sam Shleifer	16469fedbd	[PretrainedTokenizer] Factor out tensor conversion method (#3777 )	2020-04-16 15:02:43 -04:00
Patrick von Platen	80a1694514	[Examples, T5] Change newstest2013 to newstest2014 and clean up (#3817 ) * Refactored use of newstest2013 to newstest2014. Fixed bug where argparse consumed first command line argument as model_size argument rather than using default model_size by forcing explicit --model_size flag inclusion * More pythonic file handling through 'with' context * COSMETIC - ran Black and isort * Fixed reference to number of lines in newstest2014 * Fixed failing test. More pythonic file handling * finish PR from tholiao * remove outcommented lines * make style * make isort happy Co-authored-by: Thomas Liao <tholiao@gmail.com>	2020-04-16 20:00:41 +02:00
Lysandre Debut	d486795158	JIT not compatible with PyTorch/XLA (#3743 )	2020-04-16 11:19:24 -04:00
Davide Fiocco	b1e2368b32	Typo fix (#3821 )	2020-04-16 11:04:32 -04:00
Patrick von Platen	baca8fa8e6	clean pipelines (#3795 )	2020-04-16 10:21:34 -04:00
Patrick von Platen	38f7461df3	[TFT5, Cache] Add cache to TFT5 (#3772 ) * correct gpt2 test inputs * make style * delete modeling_gpt2 change in test file * translate from pytorch * correct tests * fix conflicts * fix conflicts * fix conflicts * fix conflicts * make tensorflow t5 caching work * make style * clean reorder cache * remove unnecessary spaces * fix test	2020-04-16 16:14:52 +02:00
Patrick von Platen	a5b249472e	change pad token id to config pad token id (#3793 )	2020-04-16 15:58:57 +02:00
Sam Shleifer	dbd041243d	[cleanup] factor out get_head_mask, invert_attn_mask, get_exten… (#3806 ) * Delete some copy pasted code	2020-04-16 09:55:25 -04:00
Patrick von Platen	d22894dfd4	[Docs] Add DialoGPT (#3755 ) * add dialoGPT * update README.md * fix conflict * update readme * add code links to docs * Update README.md * Update dialo_gpt2.rst * Update pretrained_models.rst * Update docs/source/model_doc/dialo_gpt2.rst Co-Authored-By: Julien Chaumond <chaumond@gmail.com> * change filename of dialogpt Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-04-16 09:04:32 +02:00
Sam Shleifer	c59b1e682d	[examples] unit test for run_bart_sum (#3544 ) - adds pytorch-lightning dependency	2020-04-15 18:35:01 -04:00
Patrick von Platen	301bf8d1b4	Create Modelcard for Reformer Model	2020-04-15 16:26:24 +02:00
Patrick von Platen	01c37dcdb5	[Config, Caching] Remove `output_past` everywhere and replace by `use_cache` argument (#3734 ) * remove output_past from pt * make style * add optional input length for gpt2 * add use cache to prepare input * save memory in gpt2 * correct gpt2 test inputs * make past input optional for gpt2 * finish use_cache for all models * make style * delete modeling_gpt2 change in test file * correct docstring * correct is true statements for gpt2	2020-04-14 14:40:28 -04:00
Patrick von Platen	092cf881a5	[Generation, EncoderDecoder] Apply Encoder Decoder 1.5GB memory… (#3778 )	2020-04-13 22:29:28 -04:00
Teven	352d5472b0	Shift labels internally within TransfoXLLMHeadModel when called with labels (#3716 ) * Shifting labels inside TransfoXLLMHead * Changed doc to reflect change * Updated pytorch test * removed IDE whitespace changes * black reformat Co-authored-by: TevenLeScao <teven.lescao@gmail.com>	2020-04-13 18:11:23 +02:00
elk-cloner	5ebd898953	fix dataset shuffling for Distributed training (#huggingface#3721) (#3766 )	2020-04-13 10:11:18 -04:00
HenrykBorzymowski	7972a4019f	updated dutch squad model card (#3736 ) * added model_cards for polish squad models * corrected mistake in polish design cards * updated model_cards for squad2_dutch model * added links to benchmark models Co-authored-by: Henryk Borzymowski <henryk.borzymowski@pwc.com>	2020-04-11 06:44:59 -04:00
HUSEIN ZOLKEPLI	f8c1071c51	Added README huseinzol05/albert-tiny-bahasa-cased (#3746 ) * add bert bahasa readme * update readme * update readme * added xlnet * added tiny-bert and fix xlnet readme * added albert base * added albert tiny	2020-04-11 06:42:06 -04:00
Jin Young Sohn	700ccf6e35	Fix glue_convert_examples_to_features API breakage (#3742 )	2020-04-10 16:03:27 -04:00
Anthony MOI	b7cf9f43d2	Update tokenizers to 0.7.0-rc5 (#3705 )	2020-04-10 14:23:49 -04:00
Jin Young Sohn	551b450527	Add `run_glue_tpu.py` that trains models on TPUs (#3702 ) * Initial commit to get BERT + run_glue.py on TPU * Add README section for TPU and address comments. * Cleanup TPU bits from run_glue.py (#3) TPU runner is currently implemented in: https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py. We plan to upstream this directly into `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested. * Cleanup TPU bits from run_glue.py TPU runner is currently implemented in: https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py. We plan to upstream this directly into `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested. * No need to call `xm.mark_step()` explicitly (#4) Since for gradient accumulation we're accumulating on batches from `ParallelLoader` instance which on next() marks the step itself. * Resolve R/W conflicts from multiprocessing (#5) * Add XLNet in list of models for `run_glue_tpu.py` (#6) * Add RoBERTa to list of models in TPU GLUE (#7) * Add RoBERTa and DistilBert to list of models in TPU GLUE (#8) * Use barriers to reduce duplicate work/resources (#9) * Shard eval dataset and aggregate eval metrics (#10) * Shard eval dataset and aggregate eval metrics Also, instead of calling `eval_loss.item()` every time do summation with tensors on device. * Change defaultdict to float * Reduce the pred, label tensors instead of metrics As brought up during review some metrics like f1 cannot be aggregated via averaging. GLUE task metrics depends largely on the dataset, so instead we sync the prediction and label tensors so that the metrics can be computed accurately on those instead. * Only use tb_writer from master (#11) * Apply huggingface black code formatting * Style * Remove `--do_lower_case` as example uses cased * Add option to specify tensorboard logdir This is needed for our testing framework which checks regressions against key metrics writtern by the summary writer. * Using configuration for `xla_device` * Prefix TPU specific comments. * num_cores clarification and namespace eval metrics * Cache features file under `args.cache_dir` Instead of under `args.data_dir`. This is needed as our test infra uses data_dir with a read-only filesystem. * Rename `run_glue_tpu` to `run_tpu_glue` Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-04-10 12:53:54 -04:00
Julien Chaumond	cbad305ce6	[docs] The use of `do_lower_case` in scripts is on its way to deprecation (#3738 )	2020-04-10 12:34:04 -04:00
Julien Chaumond	b169ac9c2b	[examples] Generate argparsers from type hints on dataclasses (#3669 ) * [examples] Generate argparsers from type hints on dataclasses * [HfArgumentParser] way simpler API * Restore run_language_modeling.py for easier diff * [HfArgumentParser] final tweaks from code review	2020-04-10 12:21:58 -04:00
Sam Shleifer	7a7fdf71f8	Multilingual BART - (#3602 ) - support mbart-en-ro weights - add MBartTokenizer	2020-04-10 11:25:39 -04:00
Julien Chaumond	f98d0ef2a2	Big cleanup of `glue_convert_examples_to_features` (#3688 ) * Big cleanup of `glue_convert_examples_to_features` * Use batch_encode_plus * Cleaner wrapping of glue_convert_examples_to_features for TF @lysandrejik * Cleanup syntax, thanks to @mfuntowicz * Raise explicit error in case of user error	2020-04-10 10:20:18 -04:00
Patrick von Platen	ce2298fb5f	[T5, generation] Add decoder caching for T5 (#3682 ) * initial commit to add decoder caching for T5 * better naming for caching * finish T5 decoder caching * correct test * added extensive past testing for T5 * clean files * make tests cleaner * improve docstring * improve docstring * better reorder cache * make style * Update src/transformers/modeling_t5.py Co-Authored-By: Yacine Jernite <yjernite@users.noreply.github.com> * make set output past work for all layers * improve docstring * improve docstring Co-authored-by: Yacine Jernite <yjernite@users.noreply.github.com>	2020-04-10 01:02:50 +02:00
calpt	9384e5f6de	Fix force_download of files on Windows (#3697 )	2020-04-09 14:44:57 -04:00
Julien Chaumond	bc65afc4df	[Exbert] Change style of button	2020-04-09 10:44:42 -04:00
LysandreJik	31baeed614	Update quotes cc @julien-c	2020-04-09 09:09:00 -04:00
Teven	f8208fa456	Correct transformers-cli env call	2020-04-09 09:03:19 +02:00
Lysandre Debut	6435b9f908	Updating the TensorFlow models to work as expected with tokenizers v3.0.0 (#3684 ) * Updating modeling tf files; adding tests * Merge `encode_plus` and `batch_encode_plus`	2020-04-08 16:22:44 -04:00
LysandreJik	500aa12318	close #3699	2020-04-08 14:32:47 -04:00
Julien Chaumond	a594ee9c84	More doc for model cards (#3698 ) see https://github.com/huggingface/transformers/pull/3679#pullrequestreview-389368270	2020-04-08 12:12:52 -04:00
Julien Chaumond	83703cd077	Update doc for {Summarization,Translation}Pipeline and other tweaks	2020-04-08 09:45:00 -04:00
Seyone Chithrananda	a1b3b4167e	Created README.md for model card ChemBERTa (#3666 ) * created readme.md * update readme with fixes Fixes from PR comments	2020-04-08 09:10:20 -04:00
Lorenzo Ampil	747907dc5e	Fix typo in FeatureExtractionPipeline docstring	2020-04-08 09:08:56 -04:00

15053 Commits