Commit Graph

1163 Commits

Author SHA1 Message Date
Ethan Perez
e5c393dceb
[Bug fix] Using loaded checkpoint with --do_predict (instead of… (#3437)
* Using loaded checkpoint with --do_predict

Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs across validation runs. I think this happens because the `model` variable isn't set to the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).

* Update checkpoint loading

* Fixing model loading
2020-03-30 17:06:08 -04:00
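
A minimal sketch of the kind of fix described above (the helper name and *.ckpt layout are assumptions, not the exact example code): reload the saved weights into `model` before prediction instead of predicting with a freshly initialized one.

    import glob
    import os

    import torch


    def load_trained_checkpoint(model, output_dir):
        # illustrative sketch: find the newest checkpoint written during training
        checkpoints = sorted(glob.glob(os.path.join(output_dir, "*.ckpt")))
        assert checkpoints, f"no checkpoint found in {output_dir}"
        state = torch.load(checkpoints[-1], map_location="cpu")
        # pytorch-lightning checkpoints keep the weights under "state_dict"
        model.load_state_dict(state["state_dict"])
        return model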
Sam Shleifer
8deff3acf2
[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… (#3488) 2020-03-30 12:28:27 -04:00
Julien Plu
d38bbb225f
Update the NER TF script (#3511)
* Update the NER TF script to remove the softmax and set the pad token label id to -1

* Reformat the quality and style

Co-authored-by: Julien Plu <julien.plu@adevinta.com>
2020-03-30 09:50:12 -04:00
Sam Shleifer
33ef7002e1
[Docs] examples/summarization/bart: Simplify CNN/DM preprocessi… (#3516) 2020-03-29 13:25:42 -04:00
Patrick von Platen
17dceae7a1
Fix circle ci flaky fail of wmt example (#3485)
* force bleu

* fix wrong file name

* rename file

* different filenames for each example test

* test files should clean up after themselves

* test files should clean up after themselves

* do not force bleu

* correct typo

* fix isort
2020-03-27 13:01:28 -04:00
Funtowicz Morgan
b08259a120
run_ner.py / bert-base-multilingual-cased can output empty tokens (#2991)
* Use tokenizer.num_added_tokens to count number of added special_tokens instead of hardcoded numbers.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* run_ner.py - Do not add a label to the labels_ids if word_tokens is empty.

This can happen when using bert-base-multilingual-cased with an input containing a lone space.
In this case, the tokenizer outputs an empty word_tokens list, leading to inconsistent
behavior: labels_ids ends up with one more entry than the tokens vector.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-27 10:59:55 -04:00
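
A hedged sketch of the guard described above (label scheme simplified): a bare space produces no sub-tokens with bert-base-multilingual-cased, so tokens and label_ids would otherwise drift out of sync.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    pad_token_label_id = -100  # ignored by PyTorch's CrossEntropyLoss

    words, labels = ["Hello", " ", "world"], [0, 0, 1]
    tokens, label_ids = [], []
    for word, label in zip(words, labels):
        word_tokens = tokenizer.tokenize(word)
        if not word_tokens:  # the lone space tokenizes to nothing -> skip it
            continue
        tokens.extend(word_tokens)
        # label only the first sub-token; mask the rest
        label_ids.extend([label] + [pad_token_label_id] * (len(word_tokens) - 1))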
Patrick von Platen
f4f4946836
Rename t5-large to t5-base in README.md 2020-03-27 15:57:58 +01:00
Lysandre Debut
ff80b73157
Add option to choose T5 model size. (#3480)
T5-small in test


isort
2020-03-27 15:56:59 +01:00
Patrick von Platen
5ad2ea06af
Add wmt translation example (#3428)
* add translation example

* make style

* adapt docstring

* add gpu device as input for example

* small renaming

* better README
2020-03-26 19:07:59 +01:00
Patrick von Platen
e703e923ca
Add t5 summarization example (#3411)
* rebase to master

* change tf to pytorch

* change to pytorch

* small fix

* renaming

* add gpu training possibility

* renaming

* improve README

* incorporate Colin's feedback

* better Readme

* better README.md
2020-03-26 18:17:55 +01:00
Lysandre Debut
ffcffebe85
Force the return of token type IDs (#3439) 2020-03-26 09:41:36 +01:00
Andre Carrera
3d76df3a12
BART for summarization training with CNN/DM using pytorch-lightning 2020-03-24 21:00:24 -04:00
Julien Chaumond
eaabaaf750 [run_language_modeling] Fix: initialize a new model from a config object 2020-03-24 17:56:40 -04:00
Julien Chaumond
f8823bad9a Expose missing mappings (see #3415) 2020-03-24 17:46:25 -04:00
Julien Chaumond
a8e3336a85 [examples] Use AutoModels in more examples 2020-03-23 20:11:14 -04:00
Julien Chaumond
f7dcf8fcea [BertAbs] Move files around for more consistent naming 2020-03-23 13:58:49 -04:00
Julien Chaumond
cf72479bf1 One last reorder of {scheduler,optimizer}.step() 2020-03-20 18:05:50 -04:00
Elijah Rippeth
634bf6cf7e fixes lr_scheduler warning
For more details, see https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
2020-03-20 18:03:50 -04:00
Patrick von Platen
95e00d0808
Clean special token init in modeling_....py (#3264)
* make style

* fix conflicts
2020-03-20 21:41:04 +01:00
Nitish Shirish Keskar
8becb73293
removing torch.cuda.empty_cache() from TF function (#3267)
torch.cuda.empty_cache() was being called from a TF function (even when torch is unavailable)
not sure any replacement is needed if TF OOMs
2020-03-19 23:25:30 +01:00
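
A sketch of how such a call could be guarded if one were ever needed again (a hypothetical pattern, not the committed change): only touch torch when it is importable and CUDA is up.

    import importlib.util

    # hypothetical guard; the commit simply removes the call from the TF path
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            torch.cuda.empty_cache()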
Julien Chaumond
656e1386a2 Fix #3305: run_ner only possible on ModelForTokenClassification models 2020-03-19 16:41:28 -04:00
mataney
c44a17db1b
[FIX] not training when epoch is small (#3006)
* fix bug where, with short epochs and large gradient_accumulation_steps, we never train

* black formatting

* no need to change these files
2020-03-19 11:21:21 -04:00
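
A toy repro of the bug being fixed, under assumed numbers: with 10 batches per epoch and gradient_accumulation_steps=16, `(step + 1) % 16 == 0` never fires, so the optimizer never steps. Also stepping on the final batch of the epoch (a hypothetical sketch of the fix) guarantees at least one update.

    num_batches, accum_steps = 10, 16  # assumed numbers for illustration
    for step in range(num_batches):
        # loss.backward() would accumulate gradients here
        last_batch = (step + 1) == num_batches
        if (step + 1) % accum_steps == 0 or last_batch:
            print(f"optimizer.step() at batch {step + 1}")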
J.P Lee
2b60a26b46
Update examples/ner/run_ner.py to use AutoModel (#3305)
* Update examples/ner/run_ner.py to use AutoModel

* Fix missing code and apply `make style` command
2020-03-17 12:30:10 -04:00
Nathan Raw
930c9412b4
[WIP] Lightning glue example (#3290)
*  Alter base pl transformer to use automodels

* 🐛 Add batch size env variable to function call

* 💄 Apply black code style from Makefile

* 🚚 Move lightning base out of ner directory

*  Add lightning glue example

* 💄 self

* move _feature_file to base class

*  Move eval logging to custom callback

* 💄 Apply black code style

* 🐛 Add parent to pythonpath, remove copy command

* 🐛 Add missing max_length kwarg
2020-03-17 11:46:42 -04:00
Patrick von Platen
e8f44af5bf
[generate] do_sample default back to False (#3298)
* change do_samples back

* None better default as boolean

* adapt do_sample to True in test example

* make style
2020-03-17 10:52:37 -04:00
Thomas Wolf
2187c49f5c
CPU/GPU memory benchmarking utilities - Remove support for python 3.5 (now only 3.6+) (#3186)
* memory benchmark rss

* have both forward pass and line-by-line mem tracing

* cleaned up tracing

* refactored and cleaning up API

* no f-strings yet...

* add GPU mem logging

* fix GPU memory monitoring

* style and quality

* clean up and doc

* update with comments

* Switching to python 3.6+

* fix quality
2020-03-17 10:17:11 -04:00
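
In the spirit of the RSS tracing these utilities add, a rough psutil-based sketch (the helper name is hypothetical):

    import os

    import psutil


    def rss_mb() -> float:
        # resident set size of the current process, in megabytes
        return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2


    before = rss_mb()
    buffer = [0] * 10_000_000  # stand-in for a forward pass
    print(f"used ~{rss_mb() - before:.1f} MB")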
Sam Shleifer
5ea8ba67b4
[BART] Remove unused kwargs (#3279)
* Remove unused kwargs
* dont call forward in tests
2020-03-15 23:00:44 -04:00
Thomas Wolf
3814e167d9
Merge pull request #3225 from patrickvonplaten/finalize_merge_bart_generate_into_default_generate
Complete merge Seq-2-Seq generation into default generation
2020-03-14 15:08:59 +01:00
Patrick von Platen
4f75d380a4 make style 2020-03-13 16:35:52 +01:00
Patrick von Platen
c2ee3840ae update file to new starting token logic 2020-03-13 16:34:44 +01:00
dependabot[bot]
afea70c01c Bump psutil from 5.6.3 to 5.6.6 in /examples/distillation
Bumps [psutil](https://github.com/giampaolo/psutil) from 5.6.3 to 5.6.6.
- [Release notes](https://github.com/giampaolo/psutil/releases)
- [Changelog](https://github.com/giampaolo/psutil/blob/master/HISTORY.rst)
- [Commits](https://github.com/giampaolo/psutil/compare/release-5.6.3...release-5.6.6)

Signed-off-by: dependabot[bot] <support@github.com>
2020-03-12 21:14:56 -04:00
Sam Shleifer
2e81b9d8d7
Bart: update example for #3140 compatibility (#3233)
* Update bart example docs
2020-03-12 10:36:37 -04:00
Patrick von Platen
5b3000d933 renamed min_len to min_length 2020-03-11 11:06:56 +01:00
Shubham Agarwal
5ca356a464
NER - pl example (#3180)
* 1. seqeval is required by the NER pl example; install it from examples/requirements. 2. fix unrecognized argument: save_steps

* pl checkpoint callback filenotfound error: make directory and pass

* #3159 pl checkpoint path difference

* 1. Updated README for pl 2. pl script now also correctly displays logs 3. pass gpu ids instead of the number of gpus

* Updated results in readme

* 1. updated readme 2. removing deprecated pl methods 3. finalizing scripts

* comment length check

* using deprecated validation_end for stable results

* style related changes
2020-03-09 20:43:38 -04:00
Sam Shleifer
3aca02efb3
Bart example: model.to(device) (#3194) 2020-03-09 15:09:35 -04:00
Lysandre
eb3e6cb04f cased -> uncased in BERT SQuAD example
closes #3183
2020-03-09 10:54:18 -04:00
Sam Shleifer
857e0a0d3b
Rename BartForMaskedLM -> BartForConditionalGeneration (#3114)
* improved documentation
2020-03-05 17:41:18 -05:00
Sam Shleifer
5b396457e5
Summarization Examples: add Bart CNN Evaluation (#3082)
* Rename and improve example

* Add test

* slightly faster test

* style

* This breaks remy prolly

* shorter test string

* no slow

* newdir structure

* New tree

* Style

* shorter

* docs

* clean

* Attempt future import

* more import hax
2020-03-03 15:29:59 -05:00
Davide Fiocco
c0c7ec3458
Don't crash if fine-tuned model doesn't end with a number (#3099)
That's the same fix applied in https://github.com/huggingface/transformers/issues/2258 , but for the GLUE example
2020-03-03 08:59:47 -05:00
Victor SANH
6b1ff25084
fix n_gpu count when no_cuda flag is activated (#3077)
* fix n_gpu count when no_cuda flag is activated

* someone was left behind
2020-03-02 10:20:21 -05:00
Julien Chaumond
298bed16a8 make style 2020-03-01 14:08:01 -05:00
VictorSanh
852e032ca6 include roberta in run_squad_w_distillation - cc @graviraja 2020-03-01 01:56:50 +00:00
VictorSanh
b5509abb36 --do_lower_case will always trick me... 2020-03-01 01:39:24 +00:00
srush
908fa43b54
Changes to NER examples for PLT and TPU (#3053)
* changes to allow for tpu training

* black

* tpu

* tpu
2020-02-27 16:45:32 -05:00
Lysandre Debut
8bcb37bfb8
NER support for Albert in run_ner.py and NerPipeline (#2983)
* Added support for Albert when fine-tuning for NER

* Added support for Albert in NER pipeline

* Added command-line options to examples/ner/run_ner.py to better control tokenization

* Added class AlbertForTokenClassification

* Changed output for NerPipeline to use .convert_ids_to_tokens(...) instead of .decode(...) to better reflect tokens

* Added ,

* Now passes style guide enforcement

* Changes from reviews.

* Code now passes style enforcement

* Added test for AlbertForTokenClassification

* Added test for AlbertForTokenClassification
2020-02-27 10:22:55 -05:00
Martin Malmsten
d762d4289c Code now passes style enforcement 2020-02-26 23:50:40 +01:00
Martin Malmsten
9495d38b0d Changes from reviews. 2020-02-26 23:36:39 +01:00
Andrew Walker
5bc99e7f33
fix several typos in Distil* readme (#3034) 2020-02-26 12:39:54 -05:00
Jhuo IH
7a7ee28cb9
missing ner link (#2967) 2020-02-25 14:06:57 -05:00
Patrick von Platen
65d74c4965
Add preprocessing step for transfo-xl tokenization to avoid tokenizing words followed by punction to <unk> (#2987)
* add preprocessing to add space before punctuation for transfo_xl

* improve warning messages

* make style

* compile regex at instantiation of the tokenizer object
2020-02-24 15:11:10 -05:00
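
An illustrative sketch of the preprocessing idea (the pattern here is an assumption; the tokenizer's actual compiled regex may differ): inserting a space before punctuation keeps "word," from mapping to <unk> in transfo-xl's whitespace-tokenized vocabulary.

    import re

    PUNCT_RE = re.compile(r"(\w)([,.!?])")  # assumed simple word+punct pattern


    def add_space_before_punct(text: str) -> str:
        return PUNCT_RE.sub(r"\1 \2", text)


    print(add_space_before_punct("Hello, world!"))  # -> "Hello , world !"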
Martin Malmsten
105dcb4162 Now passes style guide enforcement 2020-02-23 21:47:59 +01:00
Martin Malmsten
33eb8a165d Added , 2020-02-23 21:43:31 +01:00
Martin Malmsten
869b66f6b3 * Added support for Albert when fine-tuning for NER
* Added support for Albert in NER pipeline

* Added command-line options to examples/ner/run_ner.py to better control tokenization

* Added class AlbertForTokenClassification

* Changed output for NerPipeline to use .convert_ids_to_tokens(...) instead of .decode(...) to better reflect tokens
2020-02-23 21:13:03 +01:00
saippuakauppias
cafc4dfc7c fix hardcoded path in examples readme 2020-02-22 11:12:38 -05:00
Patrick von Platen
fc38d4c86f
Improve special_token_id logic in run_generation.py and add tests (#2885)
* improving generation

* finalized special token behaviour for no_beam_search generation

* solved modeling_utils merge conflict

* solve merge conflicts in modeling_utils.py

* add run_generation improvements from PR #2749

* adapted language generation to not use hardcoded -1 if no padding token is available

* remove the -1 removal as hardcoded -1s are not necessary anymore

* add lightweight language generation testing for randomly initialized models - just checking whether no errors are thrown

* add slow language generation tests for pretrained models using hardcoded output with pytorch seed

* delete ipdb

* check that all generated tokens are valid

* renaming

* renaming Generation -> Generate

* make style

* updated so that generate_beam_search has the same token behavior as generate_no_beam_search

* consistent return format for run_generation.py

* deleted pretrain lm generate tests -> will be added in another PR

* cleaning of unused if statements and renaming

* run_generate will always return an iterable

* make style

* consistent renaming

* improve naming, make sure generate function always returns the same tensor, add docstring

* add slow tests for all lmhead models

* make style and improve example comments modeling_utils

* better naming and refactoring in modeling_utils

* changed fast random lm generation testing design to a more general one

* delete old testing design in gpt2

* correct old variable name

* temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed

* adapted all fast random generate tests to new design

* better warning description in modeling_utils

* better comment

* better comment and error message

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-02-21 12:09:59 -05:00
maximeilluin
c749a543fa
Added CamembertForQuestionAnswering (#2746)
* Added CamembertForQuestionAnswering

* fixed camembert tokenizer case
2020-02-21 12:01:02 -05:00
Martin Malmsten
4452b44b90
Labels are now added to model config under id2label and label2id (#2945) 2020-02-21 08:53:05 -05:00
Sam Shleifer
53ce3854a1
New BartModel (#2745)
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
srush
889d3bfdbb
default arg fix (#2937) 2020-02-20 15:31:17 -05:00
srush
b662f0e625
Support for torch-lightning in NER examples (#2890)
* initial pytorch lightning commit

* tested multigpu

* Fix learning rate schedule

* black formatting

* fix flake8

* isort

* isort

* .

Co-authored-by: Check your git settings! <chris@chris-laptop>
2020-02-20 11:50:05 -05:00
VictorSanh
2ae98336d1 fix vocab size in binarized_data (distil): int16 vs int32 2020-02-18 16:17:35 +00:00
VictorSanh
0dbddba6d2 fix typo in hans example call 2020-02-17 20:19:57 +00:00
Manuel Romero
4e597c8e4d Fix typo 2020-02-14 09:07:42 -05:00
Julien Chaumond
4d36472b96 [run_ner] Don't crash if fine-tuning local model that doesn't end with digit 2020-02-14 03:25:29 +00:00
Lysandre
f54a5bd37f Raise error when using an mlm flag for a clm model + correct TextDataset 2020-02-12 13:23:14 -05:00
Lysandre
569897ce2c Fix a few issues regarding the language modeling script 2020-02-12 13:23:14 -05:00
VictorSanh
ee5a6856ca distilbert-base-cased weights + Readmes + omissions 2020-02-07 15:28:13 -05:00
Julien Chaumond
42f08e596f [examples] rename run_lm_finetuning to run_language_modeling 2020-02-07 09:15:28 -05:00
Julien Chaumond
4f7bdb0958 [examples] Fix broken markdown 2020-02-07 09:15:28 -05:00
Peter Izsak
6fc3d34abd Fix multi-gpu evaluation in run_glue.py 2020-02-06 16:38:55 -05:00
Julien Chaumond
ada24def22 [run_lm_finetuning] Tweak fix for non-long tensor, close #2728
see 1ebfeb7946 and #2728

Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2020-02-05 12:49:18 -05:00
Yuval Pinter
d1ab1fab1b
pass langs parameter to certain XLM models (#2734)
* pass langs parameter to certain XLM models

Adding an argument that specifies the language the SQuAD dataset is in so language-sensitive XLMs (e.g. `xlm-mlm-tlm-xnli15-1024`) don't default to language `0`.
Allows resolution of issue #1799.

* fixing from `make style`

* fixing style (again)
2020-02-04 17:12:42 -05:00
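
A minimal sketch of the `langs` input this PR threads through the script (the model call is elided; the checkpoint name is the one from the message):

    import torch
    from transformers import XLMTokenizer

    tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-tlm-xnli15-1024")
    input_ids = tokenizer.encode("Where is the Eiffel Tower?", return_tensors="pt")
    lang_id = tokenizer.lang2id["en"]            # language code -> integer id
    langs = torch.full_like(input_ids, lang_id)  # one id per token, not all 0
    # model(input_ids, langs=langs) would then use English language embeddings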
Lysandre
3bf5417258 Revert erroneous fix 2020-02-04 16:31:07 -05:00
Lysandre
1ebfeb7946 Cast to long when masking tokens 2020-02-04 15:56:16 -05:00
Lysandre
239dd23f64 [Follow up 213]
Masked indices should have -100 and not -1. Updating documentation + scripts that were forgotten
2020-02-03 16:08:05 -05:00
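
A quick sketch of the convention being standardized: PyTorch's CrossEntropyLoss ignores targets equal to -100 by default, so masked positions carry -100 rather than -1.

    import torch
    import torch.nn.functional as F

    logits = torch.randn(4, 10)                # 4 positions, vocab size 10
    labels = torch.tensor([3, -100, 7, -100])  # -100 positions add no loss
    loss = F.cross_entropy(logits, labels)     # ignore_index defaults to -100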
Antonio Carlos Falcão Petri
2ba147ecff Fix typo in examples/utils_ner.py
"%s-%d".format() -> "{}-{}".format()
2020-02-01 11:10:57 -05:00
Lysandre
d18d47be67 run_generation style 2020-01-31 12:05:48 -05:00
Lysandre
7365f01d43 do_sample should be set to True in run_generation.py 2020-01-31 11:49:32 -05:00
Jared Nielsen
71a382319f Correct documentation 2020-01-30 18:41:24 -05:00
Hang Le
f0a4fc6cd6 Add Flaubert 2020-01-30 10:04:18 -05:00
Jared Nielsen
adb8c93134 Remove lines causing a KeyError 2020-01-29 14:01:16 -05:00
Lysandre
335dd5e68a Default save steps 50 to 500 in all scripts 2020-01-28 09:42:11 -05:00
Julien Chaumond
6b4c3ee234 [run_lm_finetuning] GPT2 tokenizer doesn't have a pad_token
ping @lysandrejik
2020-01-27 20:14:02 -05:00
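
The usual workaround, sketched here as an assumption rather than the exact change in this commit: reuse the EOS token for padding, since GPT-2 ships without a pad token.

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    assert tokenizer.pad_token is None         # GPT-2 has no pad token out of the box
    tokenizer.pad_token = tokenizer.eos_token  # reuse EOS for padding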
VictorSanh
1ce3fb5cc7 update correct eval metrics (distilbert & co) 2020-01-24 11:45:22 -05:00
Julien Chaumond
1a8e87be4e Line-by-line text dataset (including padding) 2020-01-21 16:57:38 -05:00
Julien Chaumond
b94cf7faac change order 2020-01-21 16:57:38 -05:00
Julien Chaumond
2eaa8b6e56 Easier to not support this, as it could be confusing
cc @lysandrejik
2020-01-21 16:57:38 -05:00
Julien Chaumond
801aaa5508 make style 2020-01-21 16:57:38 -05:00
Julien Chaumond
56d4ba8ddb [run_lm_finetuning] Train from scratch 2020-01-21 16:57:38 -05:00
jiyeon_baek
6d5049a24d Fix typo in examples/run_squad.py
Rul -> Run
2020-01-17 11:22:51 -05:00
Lysandre
6e2c28a14a Run SQuAD warning when the doc stride may be too high 2020-01-16 13:59:26 -05:00
thomwolf
258ed2eaa8 adding details in readme 2020-01-16 13:21:30 +01:00
thomwolf
50ee59578d update formating - make flake8 happy 2020-01-16 13:21:30 +01:00
thomwolf
1c9333584a formating 2020-01-16 13:21:30 +01:00
thomwolf
e25b6fe354 updating readme 2020-01-16 13:21:30 +01:00
thomwolf
27c7b99015 adding details in readme - moving file 2020-01-16 13:21:30 +01:00
Nafise Sadat Moosavi
99d4515572 HANS evaluation 2020-01-16 13:21:30 +01:00
Julien Chaumond
83a41d39b3 💄 super 2020-01-15 18:33:50 -05:00
Julien Chaumond
715fa638a7 Merge branch 'master' into from_scratch_training 2020-01-14 18:58:21 +00:00
Julien Chaumond
b803b067bf Config to Model mapping 2020-01-13 20:05:20 +00:00
IWillPull
a3085020ed Added repetition penalty to PPLM example (#2436)
* Added repetition penalty

* Default PPLM repetition_penalty to neutral

* Minor modifications to comply with reviewer's suggestions. (j -> token_idx)

* Formatted code with `make style`
2020-01-10 23:00:07 -05:00
VictorSanh
e83d9f1c1d cleaning - change ' to " (black requirements) 2020-01-10 19:34:25 -05:00
VictorSanh
ebba9e929d minor spring cleaning - missing configs + processing 2020-01-10 19:14:58 -05:00
Victor SANH
331065e62d missing import 2020-01-10 11:42:53 +01:00
Victor SANH
414e9e7122 indents test 2020-01-10 11:42:53 +01:00
Victor SANH
3cdb38a7c0 indents 2020-01-10 11:42:53 +01:00
Victor SANH
ebd45980a0 Align with run_squad + fix some errors 2020-01-10 11:42:53 +01:00
Victor SANH
45634f87f8 fix Sampler in distributed training - evaluation 2020-01-10 11:42:53 +01:00
Victor SANH
af1ee9e648 Move torch.nn.utils.clip_grad_norm_ 2020-01-10 11:42:53 +01:00
Lysandre
164c794eb3 New SQuAD API for distillation script 2020-01-10 11:42:53 +01:00
Lysandre
16ce15ed4b DistilBERT token type ids removed from inputs in run_squad 2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b Fix error with global step in run_squad.py 2020-01-08 11:39:00 +01:00
Oren Amsalem
43114b89ba spelling correction (#2434) 2020-01-07 17:25:25 +01:00
Lysandre Debut
27c1b656cc Fix error with global step in run_lm_finetuning.py 2020-01-07 16:16:12 +01:00
Simone Primarosa
176d3b3079 Add support for Albert and XLMRoberta for the Glue example (#2403)
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
alberduris
81d6841b4b GPU text generation: Moved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
karajan1001
f01b3e6680 fix #2399, an ImportError in the official example (#2400)
* fix #2399, an ImportError in the official example

* style

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
629b22adcf [run_lm_finetuning] mask_tokens: document types 2020-01-01 12:55:10 -05:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Aymeric Augustin
a8d34e534e Remove [--editable] in install instructions.
Use -e only in docs targeted at contributors.

If a user copy-pastes a command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forward, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
81422c4e6d Remove unused variables in examples. 2019-12-23 22:29:02 +01:00
Aymeric Augustin
c3783399db Remove redundant requirements with transformers. 2019-12-23 19:17:27 +01:00
Aymeric Augustin
9fc8dcb2a0 Standardize import.
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin
1c62e87b34 Use built-in open().
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
d6eaf4e6d2 Update comments mentioning Python 2. 2019-12-22 18:38:56 +01:00
Aymeric Augustin
75a23d24af Remove import fallbacks. 2019-12-22 18:38:56 +01:00
Aymeric Augustin
798b3b3899 Remove sys.version_info[0] == 2 or 3. 2019-12-22 18:38:42 +01:00
Aymeric Augustin
6b2200fc88 Remove u-prefixes. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
c824d15aa1 Remove __future__ imports. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
7e98e211f0 Remove unittest.main() in test modules.
This construct isn't used anymore.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204 Switch test files to the standard test_*.py scheme. 2019-12-22 14:15:13 +01:00
Aymeric Augustin
c11b3e2926 Sort imports for optional third-party libraries.
These libraries aren't always installed in the virtual environment where
isort is running. Declaring them properly avoids mixing these
third-party imports with local imports.
2019-12-22 11:19:13 +01:00
Aymeric Augustin
939148b050 Fix F401 flake8 warning (x28).
Do manually what autoflake couldn't manage.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
783a616999 Fix F401 flake8 warning (x88 / 116).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
80327a13ea Fix F401 flake8 warning (x152 / 268).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
fa2ccbc081 Fix E266 flake8 warning (x90). 2019-12-22 10:59:08 +01:00
Aymeric Augustin
2ab78325f0 Fix F821 flake8 warning (x47).
Ignore warnings related to Python 2, because it's going away soon.
2019-12-22 10:59:07 +01:00
Aymeric Augustin
631be27078 Fix E722 flake8 warnings (x26). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
b0f7db73cd Fix E741 flake8 warning (x14). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
fd2f17a7a1 Fix E714 flake8 warning (x8). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
5eab3cf6bc Fix W605 flake8 warning (x5). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
7dce8dc7ac Fix E731 flake8 warning (x3). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
357db7098c Fix E712 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
f9c5317db2 Fix E265 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
28e608a2c2 Remove trailing whitespace from all Python files.
Fixes flake8 warning W291 (x224).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
158e82e061 Sort imports with isort.
This is the result of:

    $ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
Aymeric Augustin
fa84ae26d6 Reformat source code with black.
This is the result of:

    $ black --line-length 119 examples templates transformers utils hubconf.py setup.py

There are a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.

This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
2019-12-21 17:52:29 +01:00
Thomas Wolf
73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf
344126fe58 move example to mm-imdb folder 2019-12-21 15:06:52 +01:00
Thomas Wolf
5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf
6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
[BREAKING CHANGE] Setting all ignored index to the PyTorch standard
2019-12-21 14:55:40 +01:00
thomwolf
1ab25c49d3 Merge branch 'master' into pr/2115 2019-12-21 14:54:30 +01:00
thomwolf
b03872aae0 fix merge 2019-12-21 14:49:54 +01:00
Thomas Wolf
518ba748e0
Merge branch 'master' into saving-and-resuming 2019-12-21 14:41:39 +01:00
Thomas Wolf
18601c3b6e
Merge pull request #2173 from erenup/master
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf
eeb70cdd77
Merge branch 'master' into saving-and-resuming 2019-12-21 14:29:59 +01:00
Thomas Wolf
ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00
thomwolf
cfa0380515 Merge branch 'master' into generation_sampler 2019-12-21 14:12:52 +01:00
thomwolf
300ec3003c fixing run_generation example - using torch.no_grad 2019-12-21 14:02:19 +01:00
thomwolf
1c37746892 fixing run_generation 2019-12-21 13:52:49 +01:00
thomwolf
8a2be93b4e fix merge 2019-12-21 13:31:28 +01:00
Thomas Wolf
562f864038
Merge branch 'master' into fix-xlnet-squad2.0 2019-12-21 12:48:10 +01:00
Thomas Wolf
59941c5d1f
Merge pull request #2189 from stefan-it/xlmr
Add support for XLM-RoBERTa
2019-12-20 13:26:38 +01:00
Julien Chaumond
a5a06a851e [doc] Param name consistency 2019-12-19 16:24:20 -05:00
Aidan Kierans
1718fb9e74 Minor/basic text fixes (#2229)
* Small clarification

Matches line 431 to line 435 for additional clarity and consistency.

* Fixed minor typo

The letter "s" was previously omitted from the word "docstrings".
2019-12-19 16:23:18 -05:00
Francesco
62c1fc3c1e Removed duplicate XLMConfig, XLMForQuestionAnswering and XLMTokenizer from import statement of run_squad.py script 2019-12-19 09:50:56 -05:00
Ejar
284572efc0 Fixed typo in the link
Updated documentation due to typo
2019-12-19 09:36:43 -05:00
Stefan Schweter
a26ce4dee1 examples: add XLM-RoBERTa to glue script 2019-12-19 02:23:01 +01:00
thomwolf
3d2096f516 further cleanup 2019-12-18 11:50:54 +01:00
thomwolf
83bc5235cf Merge branch 'master' into pr/2189 2019-12-17 11:47:32 +01:00
Thomas Wolf
f061606277
Merge pull request #2164 from huggingface/cleanup-configs
[SMALL BREAKING CHANGE] Cleaning up configuration classes - Adding Model Cards
2019-12-17 09:10:16 +01:00
Lysandre
18a879f475 fix #2180 2019-12-16 16:44:29 -05:00
Lysandre
d803409215 Fix run squad evaluate during training 2019-12-16 16:31:38 -05:00
Stefan Schweter
71b4750517 examples: add support for XLM-RoBERTa to run_ner script 2019-12-16 16:37:27 +01:00
thomwolf
dc667ce1a7 double check cc @LysandreJik 2019-12-14 09:56:27 +01:00
thomwolf
7140363e09 update bertabs 2019-12-14 09:44:53 +01:00
Thomas Wolf
a52d56c8d9
Merge branch 'master' into cleanup-configs 2019-12-14 09:43:07 +01:00
erenup
c7780700f5 Merge branch 'refs/heads/squad_roberta'
# Conflicts:
#	transformers/data/processors/squad.py
2019-12-14 08:53:59 +08:00
erenup
8e9526b4b5 add multiprocessing 2019-12-14 08:43:58 +08:00
Lysandre
c8ed1c82c8 [SQUAD] Load checkpoint when evaluating without training 2019-12-13 12:13:48 -05:00
Pierric Cistac
5a5c4349e8
Fix summarization to_cpu doc 2019-12-13 10:02:33 -05:00
thomwolf
47f0e3cfb7 cleaning up configuration classes 2019-12-13 14:33:24 +01:00
erenup
9b312f9d41 initial version for roberta squad 2019-12-13 14:51:40 +08:00
LysandreJik
7296f1010b Cleanup squad and allow train_file and predict_file usage 2019-12-12 13:01:04 -05:00
LysandreJik
3fd71c4431 Update example scripts 2019-12-12 12:08:54 -05:00
Alan deLevie
fbf5455a86 Fix typo in examples/run_glue.py args declaration.
deay -> decay
2019-12-12 11:16:19 -05:00
Bilal Khan
6aa919469d Update run_xnli to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
89896fe04f Update run_ner to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
fdc05cd68f Update run_squad to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
854ec5784e Update run_glue to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:30:36 -06:00
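
A sketch of the save/resume pattern these four commits add to the scripts (file names are hypothetical):

    import torch

    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda step: 1.0)

    # alongside each model checkpoint, persist optimizer/scheduler state
    torch.save(optimizer.state_dict(), "optimizer.pt")
    torch.save(scheduler.state_dict(), "scheduler.pt")

    # on restart, rebuild the objects, then restore their state
    optimizer.load_state_dict(torch.load("optimizer.pt"))
    scheduler.load_state_dict(torch.load("scheduler.pt"))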
LysandreJik
b72f9d340e Correct index in script 2019-12-10 18:33:17 -05:00
LysandreJik
6a73382706 Complete warning + cleanup 2019-12-10 14:33:24 -05:00
Lysandre
dc4e9e5cb3 DataParallel for SQuAD + fix XLM 2019-12-10 19:21:20 +00:00
Rémi Louf
07bc8efbc3 add greedy decoding and sampling 2019-12-10 17:27:50 +01:00
Rémi Louf
4b82c485de remove misplaced summarization documentation 2019-12-10 09:13:33 -05:00
Thomas Wolf
e57d00ee10
Merge pull request #1984 from huggingface/squad-refactor
[WIP] Squad refactor
2019-12-10 11:07:26 +01:00
Suvrat Bhooshan
df3961121f Add MMBT Model to Transformers Repo 2019-12-09 18:36:48 -08:00
Julien Chaumond
1d18930462 Harmonize no_cuda flag with other scripts 2019-12-09 20:37:55 -05:00
Rémi Louf
f7eba09007 clean for release 2019-12-09 20:37:55 -05:00
Rémi Louf
2a64107e44 improve device usage 2019-12-09 20:37:55 -05:00
Rémi Louf
c0707a85d2 add README 2019-12-09 20:37:55 -05:00
Rémi Louf
ade3cdf5ad integrate ROUGE 2019-12-09 20:37:55 -05:00
Rémi Louf
076602bdc4 prevent BERT weights from being downloaded twice 2019-12-09 20:37:55 -05:00
Rémi Louf
a1994a71ee simplified model and configuration 2019-12-09 20:37:55 -05:00
Rémi Louf
3a9a9f7861 default output dir to documents dir 2019-12-09 20:37:55 -05:00
Rémi Louf
693606a75c update the docs 2019-12-09 20:37:55 -05:00
Rémi Louf
2403a66598 give transformers API to BertAbs 2019-12-09 20:37:55 -05:00
Rémi Louf
ba089c780b share pretrained embeddings 2019-12-09 20:37:55 -05:00
Rémi Louf
9660ba1cbd Add beam search 2019-12-09 20:37:55 -05:00
Rémi Louf
1c71ecc880 load the pretrained weights for encoder-decoder
We currently save the pretrained_weights of the encoder and decoder in
two separate directories `encoder` and `decoder`. However, for the
`from_pretrained` function to operate with automodels we need to
specify the type of model in the path to the weights.

The path to the encoder/decoder weights is handled by the
`PreTrainedEncoderDecoder` class in the `save_pretrained` function. Since
there is no easy way to infer the type of model that was initialized for
the encoder and decoder, we add a parameter `model_type` to the function.
This is not an ideal solution as it is error prone, and the model type
should be carried by the Model classes somehow.

This is a temporary fix that should be changed before merging.
2019-12-09 20:37:55 -05:00
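
A hedged sketch of the layout the message describes (the `model_type` parameter is the temporary fix it mentions; the function and path scheme here are illustrative):

    import os


    def save_pretrained_encoder_decoder(model, save_dir, model_type="bert"):
        # each sub-model goes to its own directory; the model type appears in
        # the path so automodel-style loading can infer the architecture
        for name, sub_model in (("encoder", model.encoder), ("decoder", model.decoder)):
            path = os.path.join(save_dir, f"{name}-{model_type}")
            os.makedirs(path, exist_ok=True)
            sub_model.save_pretrained(path)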
Rémi Louf
07f4cd73f6 update function to add special tokens
Since I started my PR the `add_special_token_single_sequence` function
has been deprecated in favor of another; I replaced it with the new function.
2019-12-09 20:37:55 -05:00
Bilal Khan
79526f82f5 Remove unnecessary epoch variable 2019-12-09 16:24:35 -05:00
Bilal Khan
9626e0458c Add functionality to continue training from last saved global_step 2019-12-09 16:24:35 -05:00
Bilal Khan
2d73591a18 Stop saving current epoch 2019-12-09 16:24:35 -05:00
Bilal Khan
0eb973b0d9 Use saved optimizer and scheduler states if available 2019-12-09 16:24:35 -05:00
Bilal Khan
a03fcf570d Save tokenizer after each epoch to be able to resume training from a checkpoint 2019-12-09 16:24:35 -05:00
Bilal Khan
f71b1bb05a Save optimizer state, scheduler state and current epoch 2019-12-09 16:24:35 -05:00
LysandreJik
2a4ef098d6 Add ALBERT and XLM to SQuAD script 2019-12-09 10:46:47 -05:00
Lysandre Debut
00c4e39581
Merge branch 'master' into squad-refactor 2019-12-09 10:41:15 -05:00
Thomas Wolf
5482822a2b
Merge pull request #2046 from jplu/tf2-ner-example
Add NER TF2 example.
2019-12-06 12:12:22 +01:00
LysandreJik
e9217da5ff Cleanup
Improve global visibility on the run_squad script, remove unused files, and fix issues related to XLNet.
2019-12-05 16:01:51 -05:00
LysandreJik
9ecd83dace Patch evaluation for impossible values + cleanup 2019-12-05 14:44:57 -05:00
VictorSanh
35ff345fc9 update requirements 2019-12-05 12:07:04 -05:00
VictorSanh
552c44a9b1 release distilm-bert 2019-12-05 10:14:58 -05:00
Rosanne Liu
ee53de7aac Pr for pplm (#2060)
* license

* changes

* ok

* Update paper link and commands to run

* pointer to uber repo
2019-12-05 09:20:07 -05:00
Julien Plu
9200a759d7 Add few tests on the TF optimization file with some info in the documentation. Complete the README. 2019-12-05 12:56:43 +01:00
thomwolf
75a97af6bc fix #1450 - add doc 2019-12-05 11:26:55 +01:00
LysandreJik
f7e4a7cdfa Cleanup 2019-12-04 16:24:15 -05:00
LysandreJik
cca75e7884 Kill the demon spawn 2019-12-04 15:42:29 -05:00
LysandreJik
9ddc3f1a12 Naming update + XLNet/XLM evaluation 2019-12-04 10:37:00 -05:00
thomwolf
5bfcd0485e fix #1991 2019-12-04 14:53:11 +01:00
Julien Plu
ecb923da9c Create a NER example similar to the Pytorch one. It takes the same options, and can be run the same way. 2019-12-04 09:43:15 +01:00
LysandreJik
de276de1c1 Working evaluation 2019-12-03 17:15:51 -05:00
Julien Chaumond
7edb51f3a5 [pplm] split classif head into its own file 2019-12-03 22:07:25 +00:00
VictorSanh
48cbf267c9 Use full dataset for eval (SequentialSampler in Distributed setting) 2019-12-03 11:01:37 -05:00
Julien Chaumond
f434bfc623 [pplm] Update S3 links
Co-Authored-By: Piero Molino <w4nderlust@gmail.com>
2019-12-03 10:53:02 -05:00
Ethan Perez
96e83506d1 Always use SequentialSampler during evaluation
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
2019-12-03 10:15:39 -05:00
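
A sketch of the sampler choice being argued for: evaluation effectively runs on one process, so a SequentialSampler covers the whole set, while a DistributedSampler would see only a 1/N shard.

    import torch
    from torch.utils.data import DataLoader, SequentialSampler, TensorDataset

    eval_dataset = TensorDataset(torch.arange(100))
    eval_dataloader = DataLoader(
        eval_dataset,
        sampler=SequentialSampler(eval_dataset),  # full dataset, in order
        batch_size=8,
    )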
Julien Chaumond
3b48806f75 [pplm] README: add setup + tweaks 2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890 readme
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc [pplm] move scripts under examples/pplm/ 2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627 generate_text_pplm now works with batch_size > 1 2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe Changed order of some parameters to be more consistent. Identical results. 2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc Added additional check for url and path in discriminator model params 2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015 Improvements: model_path renamed pretrained_model, tokenizer loaded from pretrained_model, pretrained_model set to discriminator's when discrim is specified, sample = False by default but cli parameter introduced. To obtain identical samples call the cli with --sample 2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66 Removed global variable device 2019-12-03 10:14:02 -05:00
piero
7fd54b55a3 Added support for generic discriminators 2019-12-03 10:14:02 -05:00
piero
b0eaff36e6 Added a +1 to epoch when saving weights 2019-12-03 10:14:02 -05:00
piero
611961ade7 Added tqdm to preprocessing 2019-12-03 10:14:02 -05:00
piero
afc7dcd94d Now run_pplm works on cpu. Identical output as before (when using gpu). 2019-12-03 10:14:02 -05:00
piero
61399e5afe Cleaned perturb_past. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ffc2935405 Fix for making unconditioned generation work. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
9f693a0c48 Cleaned generate_text_pplm. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
61a12f790d Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ef47b2c03a Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ea12db3f5 Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
08c6e456a3 Cleaned full_text_generation. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
6c9c131780 More cleanup for run_model. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ffe47c888 Improved device specification 2019-12-03 10:14:02 -05:00
piero
4f2164e40e First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
821de121e8 Minor changes 2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c Fixed minor bug when running training on cuda 2019-12-03 10:14:02 -05:00
piero
0b51fba20b Added script for training a discriminator for pplm to use 2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe Let's make PPLM great again 2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6 tokenizer white space: revert to previous behavior 2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d rm extraneous import 2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e fix the loss backward issue
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2 PPLM (squashed)
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
Valohai integration
2019-12-03 16:13:01 +01:00
maxvidal
b0ee7c7df3 Added Camembert to available models 2019-11-29 14:17:02 -05:00
Juha Kiili
41aa0e8003 Refactor logs and fix loss bug 2019-11-29 15:33:25 +02:00
Lysandre
bd41e8292a Cleanup & Evaluation now works 2019-11-28 16:03:56 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german 2019-11-27 18:11:49 +01:00
VictorSanh
d5478b939d add distilbert + update run_xnli wrt run_glue 2019-11-27 11:07:22 -05:00
VictorSanh
73fe2e7385 remove fstrings 2019-11-27 11:07:22 -05:00
VictorSanh
3e7656f7ac update readme 2019-11-27 11:07:22 -05:00
VictorSanh
abd397e954 uniformize w/ the cache_dir update 2019-11-27 11:07:22 -05:00
VictorSanh
d5910b312f move xnli processor (and utils) to transformers/data/processors 2019-11-27 11:07:22 -05:00
VictorSanh
289cf4d2b7 change default for XNLI: dev --> test 2019-11-27 11:07:22 -05:00
VictorSanh
84a0b522cf mbert reproducibility results 2019-11-27 11:07:22 -05:00
VictorSanh
c4336ecbbd xnli - output_mode consistency 2019-11-27 11:07:22 -05:00
VictorSanh
d52e98ff9a add xnli examples/README.md 2019-11-27 11:07:22 -05:00
VictorSanh
71f71ddb3e run_xnli + utils_xnli 2019-11-27 11:07:22 -05:00
Julien Chaumond
b5d884d25c Uniformize #1952 2019-11-27 11:05:55 -05:00
Lysandre
4374eaea78 ALBERT for SQuAD 2019-11-26 13:08:12 -05:00
Lysandre
c110c41fdb Run GLUE and remove LAMB 2019-11-26 13:08:12 -05:00
manansanghi
5d3b8daad2 Minor bug fixes on run_ner.py 2019-11-25 16:48:03 -05:00
İbrahim Ethem Demirci
aa92a184d2 resize model when special tokens are present 2019-11-25 15:06:32 -05:00
Lysandre
7485caefb0 fix #1894 2019-11-25 09:33:39 -05:00
Julien Chaumond
176cd1ce1b [doc] homogenize instructions slightly 2019-11-23 11:18:54 -05:00
Lysandre
c3ba645237 Works for XLNet 2019-11-22 16:27:37 -05:00
Lysandre
72e506b22e wip 2019-11-22 16:26:00 -05:00
Rémi Louf
26db31e0c0 update the documentation 2019-11-21 14:41:19 -05:00
Juha Kiili
2cf3447e0a Glue: log in Valohai-compatible JSON format too 2019-11-21 12:35:25 +02:00
Thomas Wolf
0cdfcca24b
Merge pull request #1860 from stefan-it/camembert-for-token-classification
[WIP] Add support for CamembertForTokenClassification
2019-11-21 10:56:07 +01:00
Jin Young Sohn
e70cdf083d Cleanup TPU bits from run_glue.py
TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.

We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.
2019-11-20 17:54:34 -05:00
Lysandre
454455c695 fix #1879 2019-11-20 09:42:48 -05:00
Stefan Schweter
e7cf2ccd15 distillation: add German distilbert model 2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383 typo "deay" -> "decay" 2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1 camembert: add support for CamemBERT in run_ner example 2019-11-18 17:06:57 +01:00