Stefan Schweter
e7cf2ccd15
distillation: add German distilbert model
2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383
typo "deay" -> "decay"
2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1
camembert: add support for CamemBERT in run_ner example
2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb
[camembert] Upload to s3 + rename script
2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871
Update demo_camembert.py with new classes
2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7
Move demo_camembert.py to examples/contrib
2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c
Add demo_camembert.py
2019-11-16 00:11:07 -05:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
...
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT
2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
...
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
...
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
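For context on the scheduler change above: the class-based LambdaLR wrappers were replaced by plain functions that build and return a configured LambdaLR. A minimal sketch of that pattern, with illustrative names rather than the library's exact API:

```python
# Sketch of the function-based scheduler pattern described above: instead of a
# custom LambdaLR subclass, a plain function builds and returns the scheduler.
# Names here are illustrative, not the library's exact API.
import torch
from torch.optim.lr_scheduler import LambdaLR

def linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps):
    def lr_lambda(current_step):
        if current_step < num_warmup_steps:
            return current_step / max(1, num_warmup_steps)
        return max(
            0.0,
            (num_training_steps - current_step) / max(1, num_training_steps - num_warmup_steps),
        )
    return LambdaLR(optimizer, lr_lambda)

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = linear_schedule_with_warmup(optimizer, num_warmup_steps=100, num_training_steps=1000)
# in a training loop: optimizer.step() then scheduler.step()
```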
Rémi Louf
2276bf69b7
update the examples, docs and template
2019-11-14 20:38:02 +01:00
Lysandre
d7929899da
Specify checkpoint in saved file for run_lm_finetuning.py
2019-11-14 10:49:00 -05:00
ronakice
2e31176557
fix multi-gpu eval
2019-11-12 05:55:11 -05:00
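For context on the multi-GPU evaluation fix above: under torch.nn.DataParallel the forward pass returns one loss per device, so calling .item() on it fails; the usual remedy is to average before accumulating. A small self-contained sketch of the pattern (names are illustrative):

```python
import torch

# Sketch of the multi-GPU evaluation fix described above: under
# torch.nn.DataParallel the forward pass returns one loss per device,
# so the loss must be reduced to a scalar before .item() is called.
def accumulate_eval_loss(eval_loss, tmp_eval_loss, n_gpu):
    if n_gpu > 1:
        tmp_eval_loss = tmp_eval_loss.mean()  # average the per-GPU losses
    return eval_loss + tmp_eval_loss.item()

# single GPU / CPU: loss is already a scalar tensor
print(accumulate_eval_loss(0.0, torch.tensor(0.7), n_gpu=1))
# multi GPU: DataParallel gathers a vector with one loss per device
print(accumulate_eval_loss(0.0, torch.tensor([0.7, 0.9]), n_gpu=2))
```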
Stefan Schweter
2b07b9e5ee
examples: add DistilBert support for NER fine-tuning
2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044
Fix run_bertology.py
...
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f
[run_tf_glue] Add comment for context
2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af
misc doc
2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
...
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
...
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
...
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
...
It takes a considerable amount of time (~10 min) to parse the examples into features, so it is good to have a progress bar to track this
2019-11-05 10:34:27 +02:00
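A progress bar for a loop like this is typically just a tqdm wrapper around the example iterator; a minimal sketch, not the script's exact code:

```python
from tqdm import tqdm

# Illustrative sketch of adding a progress bar to a long-running
# example-to-feature conversion loop, as described above.
def convert_examples_to_features(examples, tokenize):
    features = []
    for example in tqdm(examples, desc="Converting examples to features"):
        features.append(tokenize(example))
    return features

features = convert_examples_to_features(["a b", "c d e"], str.split)
```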
thomwolf
89d6272898
Fix #1623
2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master
2019-11-04 11:41:26 +01:00
thomwolf
ad90868627
Update example readme
2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training
2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53
run_tf_glue MRPC evaluation only for MRPC
2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb
Added mixed precision support to benchmarks.py
2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241
[run_generation] Fix generation with batch_size>1
2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering
2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil*
2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
...
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
...
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
...
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta
2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation
2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f
format utils for summarization
2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466
fix bug with padding mask + add corresponding test
2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e
rename seq2seq to encoder_decoder
2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61
revert renaming of lm_labels to ltr_lm_labels
2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312
update docstrings; rename lm_labels to more explicit ltr_lm_labels
2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691
resolve PR comments
2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb
Evaluation fixed.
2019-10-28 10:18:58 -04:00
altsoph
438f2730a0
Evaluation code fixed.
2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8
here's one big commit
2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e
fix test of truncation function
2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8
extend works in-place
2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe
pad sequence with 0, mask with -1
2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa
run_tf_glue works with all tasks
2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51
Add roberta to doc
2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f
Add Roberta to run_ner.py
2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b
[release] fix table weirdness
2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd
[RELEASE] DistilRoBERTa
2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7
fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when running on multiple GPUs in parallel.
2019-10-23 20:27:13 +08:00
Julien Chaumond
ef1b8b2ae5
[CTRL] warn if generation prompt does not start with a control code
...
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
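The warning amounts to checking whether the prompt begins with one of CTRL's control codes; a sketch with a hypothetical, abridged code list:

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical, abridged list of CTRL control codes, for illustration only.
CONTROL_CODES = {"Links", "Wikipedia", "Books", "Reviews", "News", "Questions"}

def check_ctrl_prompt(prompt):
    # Warn if the prompt does not start with a known control code,
    # since CTRL was trained with such codes prepended to the text.
    if not any(prompt.startswith(code) for code in CONTROL_CODES):
        logger.warning(
            "The prompt does not start with a CTRL control code; "
            "generation quality may suffer."
        )
    return prompt

check_ctrl_prompt("Links My neighbor is a banker")
```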
Lysandre
7d709e55ed
Remove
2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868
Option to benchmark only one of the two libraries
2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21
gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well
2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b
gradient norm clipping should be done right before calling the optimiser
2019-10-20 22:33:56 +01:00
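For reference, the ordering this fix enforces in PyTorch is clip_grad_norm_ after backward() and immediately before optimizer.step(); a small sketch:

```python
import torch

# Sketch of the gradient-clipping ordering described above: clip the norm
# right after backward() and right before optimizer.step(), once all the
# gradients for the step have been accumulated.
model = torch.nn.Linear(8, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
max_grad_norm = 1.0

for _ in range(3):
    inputs = torch.randn(4, 8)
    loss = model(inputs).pow(2).mean()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    optimizer.zero_grad()
```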
LysandreJik
7dd29ed2f1
Benchmarks example script
2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a
Add speed log to examples/run_squad.py
...
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
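The speed estimate is simply total evaluation wall time divided by the number of examples; an illustrative sketch:

```python
import time

# Illustrative sketch of the per-example speed log described above.
def evaluate_with_timing(batches, eval_step, num_examples):
    start = time.time()
    for batch in batches:
        eval_step(batch)
    elapsed = time.time() - start
    print("Evaluation done in %.1f s (%.6f s per example)" % (elapsed, elapsed / num_examples))

evaluate_with_timing(range(10), lambda b: time.sleep(0.01), num_examples=80)
```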
leo-du
ecd15667f3
fix repetition penalty
2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036
fix data processing in script
2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061
add training pipeline (formatting temporary)
2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0
use two different tokenizers for story and summary
2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4
add Model2Model to __init__
2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35
correct the truncation and padding of dataset
2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd
add separator between data import and train
2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e
specify in readme that both datasets are required
2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587
test the full story processing
2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859
truncation function is fully tested
2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8
wip commit, switching computers
2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb
Merge branch 'master' of https://github.com/huggingface/transformers
2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762
install tensorboard
2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
...
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
...
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad
add readme information
2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0
Add NER finetuning details by @stefan-it in example readme
2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622
[ner] Honor args.overwrite_cache
2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43
add seqeval as requirement for examples
2019-10-15 09:17:31 +02:00
thomwolf
66adb71734
update to transformers
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a
Add option to predict on test set
2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1
Add cli argument for configuring labels
2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64
Make file reading more robust
2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757
Add option to predict on test set
2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d
Add cli argument for configuring labels
2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1
Make file reading more robust
2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:14:53 +02:00
hlums
74c5035808
Fix token order in xlnet preprocessing.
2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15
add instructions to fetch the dataset
2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d
delegate the padding with special tokens to the tokenizer
2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f
process the raw CNN/Daily Mail dataset
...
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
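In the raw CNN/DailyMail release each .story file holds the article text followed by "@highlight" blocks containing the summary sentences; a hedged sketch of splitting one file into article and highlights (helper name is illustrative):

```python
# Illustrative sketch of splitting a raw CNN/DailyMail ".story" file into the
# article text and its highlight (summary) sentences, as described above.
# The "@highlight" marker is part of the raw dataset format.
def split_story(raw_text):
    parts = raw_text.split("@highlight")
    article = parts[0].strip()
    highlights = [part.strip() for part in parts[1:] if part.strip()]
    return article, highlights

raw = "Some article text.\n\n@highlight\n\nFirst summary point\n\n@highlight\n\nSecond summary point"
article, highlights = split_story(raw)
print(article)
print(highlights)
```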
Simon Layton
4e6a55751a
Force einsum to fp16
2019-10-14 11:12:41 -04:00
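If this relies on NVIDIA apex's AMP, the usual way to force a single op into half precision is to register it with amp; the exact call below is an assumption on my part and requires apex to be installed:

```python
# Sketch of forcing torch.einsum to run in fp16 under NVIDIA apex AMP.
# amp.register_half_function is the apex call this presumably relies on;
# treat the exact API as an assumption.
import torch

try:
    from apex import amp
    amp.register_half_function(torch, "einsum")  # run einsum in half precision
except ImportError:
    print("apex is not installed; skipping the fp16 einsum registration")
```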
Rémi Louf
67d10960ae
load and prepare CNN/Daily Mail data
...
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674
Added automatic mixed precision and XLA options to run_tf_glue.py
2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944
Minor enhancements to run_tf_glue.py
2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005
Add citation bibtex
2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace
read parameters from CLI, load model & tokenizer
2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b
add base for seq2seq finetuning
2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
...
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd
convert int to str before adding to a str
2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190
Don't save/load when on TPU
2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3
GLUE on TPU
2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670
add test for initialization of Bert2Rnd
2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
...
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
thomwolf
177a721205
move back to simple space splitting
2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a
better error messages
2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
...
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d
Merge branch 'master' into pr/1383
2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
...
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f
Improve naming and only do regex when necessary
2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8
Golden path
2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b
Improve readability and make fewer assumptions about checkpoint format
2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8
Extract method
2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712
Delete older checkpoint after saving new checkpoint
2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb
Add save_total_limit
2019-10-09 08:48:40 -04:00
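A save_total_limit amounts to sorting the checkpoint directories by step and deleting the oldest ones; a hedged sketch of the rotation (the "checkpoint-<step>" directory naming is an assumption for the example):

```python
import os
import re
import shutil

# Illustrative sketch of a save_total_limit-style checkpoint rotation:
# keep the newest `save_total_limit` "checkpoint-<step>" directories and
# delete the rest.
def rotate_checkpoints(output_dir, save_total_limit):
    if save_total_limit is None or save_total_limit <= 0:
        return
    checkpoints = []
    for name in os.listdir(output_dir):
        match = re.match(r"checkpoint-(\d+)$", name)
        if match and os.path.isdir(os.path.join(output_dir, name)):
            checkpoints.append((int(match.group(1)), name))
    checkpoints.sort()  # oldest (smallest step) first
    for _, name in checkpoints[: max(0, len(checkpoints) - save_total_limit)]:
        shutil.rmtree(os.path.join(output_dir, name))
```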
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
...
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe
Change tensorboard imports to use built-in tensorboard if available
2019-10-08 16:29:43 -05:00
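The import change prefers PyTorch's bundled SummaryWriter and falls back to tensorboardX when it is unavailable; a sketch of that fallback:

```python
# Sketch of the import fallback described above: prefer PyTorch's built-in
# TensorBoard writer, fall back to tensorboardX if it is not available.
try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir="runs/example")
writer.add_scalar("loss", 0.5, global_step=1)
writer.close()
```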
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb
Correct device assignment in run_generation
2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930
fixes
2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066
update encode_plus - add truncation strategies
2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555
unnecessary carriage return
2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821
run_squad --> run_squad_w_distillation
2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec
add distillation+finetuning option in run_squad
2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90
Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol
2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c
Honor args.overwrite_cache (h/t @erenup)
2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94
Adding CTRL (squashed commit)
...
adding conversion script
adding first draft of modeling & tokenization
adding placeholder for test files
bunch of changes
registering the tokenizer/model/etc
tests
change link; something is very VERY wrong here
weird end-of-word thingy going on
i think the tokenization works now; wrote the unit tests
overall structure works; load w next
the monster is alive!
works after some cleanup as well
adding emacs autosave to gitignore
currently only supporting the 48 layer one; seems to infer fine on my macbook
cleanup
fixing some documentation
fixing some documentation
tests passing?
now works on CUDA also
adding greedy?
adding greedy sampling
works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master
2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa
XLM use_lang_embedding flag in run_generation
2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb
Added XLM to run_generation, with prompt language selection.
2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f
Update naming + remove f string in run_lm_finetuning example
2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb
LM finetuning won't mask special tokens anymore
2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910
Update run_glue.py
...
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a
prepare release
2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb
incoming release 🔥 update links to arxiv preprint
2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f
update README - TODO update link to paper
2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f
update train.py
2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0
update requirements
2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54
update distiller
2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db
update binarized_data
2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376
update extract
2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079
update extract_distilbert
2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205
update token_counts
2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f
grouped_batch_sampler
2019-10-03 10:27:11 -04:00
VictorSanh
594202a934
lm_seqs_dataset
2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4
add distillation_configs
2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9
Evaluation result.txt path changing #1286
2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
...
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00