Thomas Wolf
a615499076
Merge pull request #797 from yzy5630/fix-examples
...
fix some errors for distributed lm_finetuning
2019-07-18 23:32:33 +02:00
yzy5630
a1fe4ba9c9
use new API for save and load
2019-07-18 15:45:23 +08:00
yzy5630
a7ba27b1b4
add parser for adam
2019-07-18 08:52:51 +08:00
yzy5630
d6522e2873
change loss and optimizer to new API
2019-07-17 21:22:34 +08:00
thomwolf
71d597dad0
fix #800
2019-07-17 13:51:09 +02:00
yzy5630
123da5a2fa
fix errors for lm_finetuning examples
2019-07-17 09:56:07 +08:00
yzy5630
60a1bdcdac
fix some errors for distributed lm_finetuning
2019-07-17 09:16:20 +08:00
thomwolf
e848b54730
fix #792
2019-07-16 21:22:19 +02:00
thomwolf
1849aa7d39
update readme and pretrained model weight files
2019-07-16 15:11:29 +02:00
thomwolf
f31154cb9d
Merge branch 'xlnet'
2019-07-16 11:51:13 +02:00
thomwolf
76da9765b6
fix run_generation test
2019-07-15 17:52:35 +02:00
thomwolf
e691fc0963
update QA models tests + run_generation
2019-07-15 17:45:24 +02:00
thomwolf
15d8b1266c
update tokenizer - update squad example for xlnet
2019-07-15 17:30:42 +02:00
thomwolf
3b469cb422
updating squad for compatibility with XLNet
2019-07-15 15:28:37 +02:00
thomwolf
0e9825e252
small fix to run_glue
2019-07-14 23:43:28 +02:00
thomwolf
2397f958f9
updating examples and doc
2019-07-14 23:20:10 +02:00
thomwolf
c490f5ce87
added generation examples in tests
2019-07-13 15:26:58 +02:00
thomwolf
7d4b200e40
good quality generation example for GPT, GPT-2, Transfo-XL, XLNet
2019-07-13 15:25:03 +02:00
thomwolf
7322c314a6
remove python2 testing for examples
2019-07-12 14:24:08 +02:00
thomwolf
936e813c84
clean up examples - added squad example and test
2019-07-12 14:16:06 +02:00
thomwolf
762ded9b1c
wip examples
2019-07-12 11:28:52 +02:00
LysandreJik
3821ecbf4a
Byte order mark management in TSV glue reading.
2019-07-11 20:16:28 -04:00
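For readers hitting the same byte order mark problem: a minimal sketch of BOM-safe TSV reading, assuming a csv-based reader like the GLUE processors use (the commit's exact diff is not reproduced here).

```python
import csv

def read_tsv(path):
    # "utf-8-sig" transparently strips a leading byte order mark, which
    # Windows tools often prepend and which otherwise sticks to the first
    # field of the first row
    with open(path, "r", encoding="utf-8-sig") as f:
        return list(csv.reader(f, delimiter="\t"))
```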
thomwolf
c6bf1a400d
fix test examples and pretrained model
2019-07-11 22:29:08 +02:00
thomwolf
92a782b108
fix run_glue test
2019-07-11 22:20:10 +02:00
thomwolf
ccb6947dc1
optimization tests
2019-07-11 17:39:47 +02:00
thomwolf
b21d84b027
update examples
2019-07-11 15:37:34 +02:00
thomwolf
ec07cf5a66
revamp optimization
2019-07-11 14:48:22 +02:00
thomwolf
4fef5919a5
updating examples
2019-07-11 12:03:08 +02:00
thomwolf
50b7e52a7f
WIP examples
2019-07-10 15:33:34 +02:00
thomwolf
ed6c8d37f4
fix merge
2019-07-09 17:14:52 +02:00
thomwolf
4ce237c880
update run_glue
2019-07-09 17:00:32 +02:00
thomwolf
3b7cb7bf44
small update to run_glue
2019-07-09 16:12:15 +02:00
thomwolf
d0efbd3cd1
update sequencesummary module
2019-07-09 15:46:43 +02:00
thomwolf
d5481cbe1b
adding tests to examples - updating summary module - coverage update
2019-07-09 15:29:42 +02:00
thomwolf
b19786985d
unified tokenizer api and serialization + tests
2019-07-09 10:25:18 +02:00
thomwolf
3d5f291386
updates to run_glue
2019-07-05 17:22:15 +02:00
thomwolf
99b90edab1
cleaning up run_glue example
2019-07-05 17:09:35 +02:00
thomwolf
1113f97f33
clean up glue example
2019-07-05 16:31:13 +02:00
thomwolf
162ba383b0
fix model loading
2019-07-05 15:57:14 +02:00
thomwolf
36bca545ff
tokenization abstract class - tests for examples
2019-07-05 15:02:59 +02:00
Thomas Wolf
78462aad61
Merge pull request #733 from ceremonious/parallel-generation
...
Added option to use multiple workers to create training data
2019-07-05 12:04:30 +02:00
thomwolf
0bab55d5d5
[BIG] name change
2019-07-05 11:55:36 +02:00
thomwolf
c41f2bad69
WIP XLM + refactoring
2019-07-03 22:54:39 +02:00
Lei Mao
64b2a828c0
fix evaluation bug
2019-07-01 14:56:24 -07:00
thomwolf
2b56e98892
standardizing API across models - XLNetForSeqClass working
2019-06-28 16:35:09 +02:00
thomwolf
3a00674cbf
fix imports
2019-06-27 17:18:46 +02:00
Mayhul Arora
08ff056c43
Added option to use multiple workers to create training data for lm fine tuning
2019-06-26 16:16:12 -07:00
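A rough sketch of the multi-worker idea, assuming the per-document work can be expressed as a standalone function (`create_instances` below is a hypothetical placeholder, not the script's actual helper).

```python
from multiprocessing import Pool

def create_instances(doc):
    # hypothetical per-document work; the real script builds masked-LM
    # training instances from each document here
    return len(doc)

if __name__ == "__main__":
    docs = [["first sentence", "second sentence"], ["another document"]]
    with Pool(processes=4) as pool:
        # each worker processes a slice of the documents in parallel
        results = pool.map(create_instances, docs)
    print(results)
```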
thomwolf
59cefd4f98
fix #726 - get_lr in examples
2019-06-26 11:28:27 +02:00
thomwolf
092dacfd62
changing is_regression to unified API
2019-06-26 09:54:05 +02:00
thomwolf
e55d4c4ede
various updates to conversion, models and examples
2019-06-26 00:57:53 +02:00
thomwolf
7334bf6c21
pad on left for xlnet
2019-06-24 15:05:11 +02:00
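A minimal sketch of what padding on the left means in practice (`pad_left` is an illustrative helper, not a function from the repo): XLNet summarizes from the end of the sequence, so real tokens should occupy the last positions.

```python
def pad_left(input_ids, max_len, pad_id=0):
    # right-align the real tokens and put padding in front, so the final
    # positions (which XLNet's summary attends to) are never padding
    return [pad_id] * (max_len - len(input_ids)) + input_ids

assert pad_left([101, 2023, 102], 6) == [0, 0, 0, 101, 2023, 102]
```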
thomwolf
c888663f18
overwrite output directories if needed
2019-06-24 14:38:24 +02:00
thomwolf
62d78aa37e
updating GLUE utils for compatibility with XLNet
2019-06-24 14:36:11 +02:00
thomwolf
24ed0b9346
updating run_xlnet_classifier
2019-06-24 12:00:09 +02:00
thomwolf
f6081f2255
add xlnetforsequence classif and run_classifier example for xlnet
2019-06-24 10:01:07 +02:00
Rocketknight1
c7b2808ed7
Update LM finetuning README to include a literature reference
2019-06-22 15:04:01 +01:00
thomwolf
181075635d
updating model loading and adding special tokens ids
2019-06-21 23:23:37 +02:00
thomwolf
ebd2cb8d74
update from_pretrained to load XLNetModel as well
2019-06-21 21:08:44 +02:00
thomwolf
edfe91c36e
first version bertology ok
2019-06-19 23:43:04 +02:00
thomwolf
7766ce66dd
update bertology
2019-06-19 22:29:51 +02:00
thomwolf
e4b46d86ce
update head pruning
2019-06-19 22:16:30 +02:00
thomwolf
0f40e8d6a6
debugger
2019-06-19 15:38:46 +02:00
thomwolf
0e1e8128bf
more logging
2019-06-19 15:35:49 +02:00
thomwolf
909d4f1af2
cuda again
2019-06-19 15:32:10 +02:00
thomwolf
14f0e8e557
fix cuda
2019-06-19 15:29:28 +02:00
thomwolf
34d706a0e1
pruning in bertology
2019-06-19 15:25:49 +02:00
thomwolf
dc8e0019b7
updating examples
2019-06-19 13:23:20 +02:00
thomwolf
68ab9599ce
small fix and updates to readme
2019-06-19 09:38:38 +02:00
thomwolf
f7e2ac01ea
update barrier
2019-06-18 22:43:35 +02:00
thomwolf
4d8c4337ae
test barrier in distrib training
2019-06-18 22:41:28 +02:00
thomwolf
3359955622
updating run_classif
2019-06-18 22:23:10 +02:00
thomwolf
29b7b30eaa
updating evaluation on a single gpu
2019-06-18 22:20:21 +02:00
thomwolf
7d2001aa44
overwrite_output_dir
2019-06-18 22:13:30 +02:00
thomwolf
16a1f338c4
fixing
2019-06-18 17:06:31 +02:00
thomwolf
92e0ad5aba
no numpy
2019-06-18 17:00:52 +02:00
thomwolf
4e6edc3274
hop
2019-06-18 16:57:15 +02:00
thomwolf
f55b60b9ee
fixing again
2019-06-18 16:56:52 +02:00
thomwolf
8bd9118294
quick fix
2019-06-18 16:54:41 +02:00
thomwolf
3e847449ad
fix out_label_ids
2019-06-18 16:53:31 +02:00
thomwolf
aad3a54e9c
fix paths
2019-06-18 16:48:04 +02:00
thomwolf
40dbda6871
updating classification example
2019-06-18 16:45:52 +02:00
thomwolf
7388c83b60
update run_classifier for distributed eval
2019-06-18 16:32:49 +02:00
thomwolf
9727723243
fix pickle
2019-06-18 16:02:42 +02:00
thomwolf
9710b68dbc
fix pickles
2019-06-18 16:01:15 +02:00
thomwolf
15ebd67d4e
cache in run_classifier + various fixes to the examples
2019-06-18 15:58:22 +02:00
thomwolf
e6e5f19257
fix
2019-06-18 14:45:14 +02:00
thomwolf
a432b3d466
distributed training t_total
2019-06-18 14:39:09 +02:00
thomwolf
c5407f343f
split squad example in two
2019-06-18 14:29:03 +02:00
thomwolf
335f57baf8
only on main process
2019-06-18 14:03:46 +02:00
thomwolf
326944d627
add tensorboard to run_squad
2019-06-18 14:02:42 +02:00
thomwolf
d82e5deeb1
set find_unused_parameters=True in DDP
2019-06-18 12:13:14 +02:00
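For context, a sketch of the flag this commit sets, using PyTorch's own DistributedDataParallel API (`wrap_ddp` is an illustrative helper).

```python
import torch
from torch.nn.parallel import DistributedDataParallel

def wrap_ddp(model: torch.nn.Module, local_rank: int) -> torch.nn.Module:
    # find_unused_parameters=True lets DDP tolerate parameters that receive
    # no gradient in a given forward pass (e.g. unused heads), at the cost
    # of an extra graph traversal per iteration
    return DistributedDataParallel(model,
                                   device_ids=[local_rank],
                                   output_device=local_rank,
                                   find_unused_parameters=True)
```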
thomwolf
a59abedfb5
DDP update
2019-06-18 12:06:26 +02:00
thomwolf
2ef5e0de87
switch to pytorch DistributedDataParallel
2019-06-18 12:03:13 +02:00
thomwolf
9ce37af99b
oops
2019-06-18 11:47:54 +02:00
thomwolf
a40955f071
no need to duplicate models anymore
2019-06-18 11:46:14 +02:00
thomwolf
382e2d1e50
splitting config and weight files for bert also
2019-06-18 10:37:16 +02:00
Thomas Wolf
cad88e19de
Merge pull request #672 from oliverguhr/master
...
Add vocabulary and model config to the finetune output
2019-06-14 17:02:47 +02:00
Thomas Wolf
460d9afd45
Merge pull request #640 from Barqawiz/master
...
Support latest multi language bert fine tune
2019-06-14 16:57:02 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
...
Update run_squad.py
2019-06-14 16:56:26 +02:00
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
...
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example.
2019-06-11 15:13:53 -07:00
Oliver Guhr
5c08c8c273
adds the tokenizer + model config to the output
2019-06-11 13:46:33 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
...
applies the Whole Word Masking technique, referring to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py)
2019-06-10 12:17:23 +09:00
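A short sketch of the Whole Word Masking idea referenced above, assuming BERT-style WordPiece tokens where a "##" prefix marks a continuation piece (the helper below is illustrative, not the script's exact code).

```python
import random

def whole_word_mask_indices(tokens, mask_prob=0.15):
    # group WordPiece tokens into words: a "##" piece continues the previous
    # word, so all pieces of a chosen word get masked together
    words = []
    for i, token in enumerate(tokens):
        if token in ("[CLS]", "[SEP]"):
            continue
        if words and token.startswith("##"):
            words[-1].append(i)
        else:
            words.append([i])
    random.shuffle(words)
    num_to_mask = max(1, round(len(tokens) * mask_prob))
    masked = []
    for word in words:
        if len(masked) >= num_to_mask:
            break
        masked.extend(word)
    return sorted(masked)

print(whole_word_mask_indices(["[CLS]", "pre", "##train", "##ing", "data", "[SEP]"]))
```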
Ahmad Barqawi
c4fe56dcc0
support latest multi language bert fine tune
...
fix issue of bert-base-multilingual and add support for uncased multilingual
2019-05-27 11:27:41 +02:00
tguens
9e7bc51b95
Update run_squad.py
...
Indentation change so that the output "nbest_predictions.json" is not empty.
2019-05-22 17:27:59 +08:00
samuelbroscheit
94247ad6cb
Make num_train_optimization_steps int
2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f
Clean up a little bit
2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f
Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
...
The reason for the issue was that optimization steps were computed from the example count, which differs from the actual size of the dataloader when an example is chunked into multiple instances.
The solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
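A sketch of the fix described above, with names following the example scripts' conventions: derive the schedule length from the dataloader rather than the raw example count.

```python
def compute_num_train_optimization_steps(train_dataloader,
                                         gradient_accumulation_steps,
                                         num_train_epochs):
    # one optimizer step happens every `gradient_accumulation_steps` batches,
    # and len(dataloader) already reflects chunked examples, unlike the raw
    # example count the buggy version used
    return (len(train_dataloader) // gradient_accumulation_steps) * num_train_epochs
```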
burcturkoglu
00c7fd2b79
The division of global_step by num_train_optimization_steps in lr_this_step is removed.
2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77
Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT
2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0
The division of global_step by num_train_optimization_steps in lr_this_step is removed.
2019-05-09 10:51:38 +03:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
...
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
MottoX
18c8aef9d3
Fix documentation typo
2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc
Prepare optimizer only when args.do_train is True
2019-05-02 19:09:29 +08:00
Aneesh Pappu
365fb34c6c
small fix to remove shifting of lm labels during preprocessing of ROC Stories, as this shifting happens internally in the model
2019-04-30 13:53:04 -07:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
...
Update example files so that tr_loss is not affected by args.gradient_accumulation_step
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843
Fix tr_loss rescaling factor using global_step
2019-04-29 12:58:29 +02:00
Mathieu Prouveur
ed8fad7390
Update example files so that tr_loss is not affected by args.gradient_accumulation_step
2019-04-24 14:07:00 +02:00
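A sketch of the reporting fix (names follow the example scripts; this is illustrative, not the exact diff): since each batch loss is divided by gradient_accumulation_steps before being accumulated, the displayed mean must multiply that factor back.

```python
def mean_displayed_loss(tr_loss, nb_tr_steps, gradient_accumulation_steps):
    # tr_loss sums losses already scaled down by gradient_accumulation_steps,
    # so multiply back before averaging to report a value comparable across
    # different accumulation settings
    return tr_loss * gradient_accumulation_steps / nb_tr_steps
```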
thomwolf
d94c6b0144
fix training schedules in examples to match new API
2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
...
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Matthew Carrigan
b8e2a9c584
Made --reduce_memory actually do something in finetune_on_pregenerated
2019-04-22 14:01:48 +01:00
Sangwhan Moon
14b1f719f4
Fix indentation weirdness in GPT-2 example.
2019-04-22 02:20:22 +09:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
...
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int
2019-04-16 15:23:21 -07:00
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation
2019-04-16 11:11:49 -07:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
...
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6
fix saving models in distributed setting examples
2019-04-15 16:43:56 +02:00
thomwolf
2499b0a5fc
add ptvsd to run_squad
2019-04-15 15:33:04 +02:00
thomwolf
7816f7921f
clean up distributed training logging in run_squad example
2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a
clean up logger in examples for distributed case
2019-04-15 15:22:40 +02:00
thomwolf
60ea6c59d2
added best practices for serialization in README and examples
2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6
update example to work with new serialization semantic
2019-04-15 14:33:23 +02:00
thomwolf
3e65f255dc
add serialization semantics to tokenizers - fix transfo-xl tokenizer
2019-04-15 11:47:25 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master
2019-04-15 10:58:34 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
...
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Matthew Carrigan
dbbd6c7500
Replaced some randints with cleaner randranges, and added a helpful error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
...
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
...
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722
fix tsv read error in Windows
2019-04-11 15:40:19 -04:00
thomwolf
4bc4c69af9
finetuning any BERT model - fixes #455
2019-04-11 16:57:59 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
...
Fix for
```
04/09/2019 21:39:38 - INFO - __main__ - device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0
fix run_gpt2.py
2019-04-08 17:20:35 -07:00
Dhanajit Brahma
6c4c7be282
Merge remote-tracking branch 'upstream/master'
2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602
removing some redundant lines
2019-04-07 16:59:07 +05:30
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
...
fix sample_doc
2019-04-03 11:26:58 +02:00
thomwolf
846b1fd6f8
Fix #419
2019-04-03 10:50:38 +02:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
...
Fix typo in example code
2019-04-02 10:41:56 +02:00
Mike Arpaia
8b5c63e4de
Fixes to the TensorFlow conversion tool
2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
...
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
...
Because randint's upper bound is inclusive, it can return rand_end itself, in which case searchsorted returns a sampled_doc_index equal to current_idx.
example:
cumsum_max = 30
doc_cumsum = [5, 7, 11, 19, 30]
doc_lengths = [5, 2, 4, 8, 11]
With current_idx = 1: rand_start = 7, rand_end = 35, and sentence_index = randint(7, 35) % cumsum_max.
If randint returns 35, sentence_index becomes 5, and np.searchsorted returns 1, equal to current_idx.
2019-03-30 14:50:17 +09:00
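A sketch of the eventual fix (the randrange change noted earlier in this log), assuming the same cumulative-sum layout as the example above.

```python
import random
import numpy as np

def sample_doc_index(doc_cumsum, doc_lengths, cumsum_max, current_idx):
    # randrange excludes its upper bound, so the wrapped position can never
    # fall back inside the current document (the randint off-by-one above)
    rand_start = doc_cumsum[current_idx]
    rand_end = rand_start + cumsum_max - doc_lengths[current_idx]
    sentence_index = random.randrange(rand_start, rand_end) % cumsum_max
    return int(np.searchsorted(doc_cumsum, sentence_index, side="right"))

print(sample_doc_index([5, 7, 11, 19, 30], [5, 2, 4, 8, 11], 30, 1))  # never 1
```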
dhanajitb
f872eb98c2
making unconditional generation work
...
Unconditional generation works now, but if the seed is fixed the sample is the same every time; n_samples > 1 will still give different samples.
The start token for unconditional generation is '<|endoftext|>'.
2019-03-28 22:46:15 +05:30
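A minimal sketch of unconditional sampling as described, written against the pytorch-pretrained-bert API of this period (the (logits, presents) return shape and `enc.encoder` lookup are assumptions about that era's interface; treat the whole snippet as illustrative).

```python
import torch
from pytorch_pretrained_bert import GPT2LMHeadModel, GPT2Tokenizer

torch.manual_seed(42)  # with a fixed seed, the sample is identical each run
enc = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# start from the end-of-text token, as the commit describes
output = torch.tensor([[enc.encoder["<|endoftext|>"]]])
past = None
with torch.no_grad():
    for _ in range(40):
        # feed only the newest token once a past cache exists
        logits, past = model(output[:, -1:] if past is not None else output, past=past)
        probs = torch.softmax(logits[:, -1, :], dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        output = torch.cat([output, next_token], dim=1)

print(enc.decode(output[0].tolist()))
```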
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
...
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
...
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
thomwolf
361aff6de5
typos
2019-03-27 11:54:59 +01:00
thomwolf
cea8ba1d59
adjusted formating and some wording in the readme
2019-03-27 11:53:44 +01:00
Matthew Carrigan
24e67fbf75
Minor README update
2019-03-25 12:33:30 +00:00
Matthew Carrigan
8d1d1ffde2
Corrected the displayed loss when gradient_accumulation_steps > 1
2019-03-25 12:15:19 +00:00
Matthew Carrigan
abb7d1ff6d
Added proper context management to ensure cleanup happens in the right order.
2019-03-21 17:50:03 +00:00
Matthew Carrigan
06a30cfdf3
Added a --reduce_memory option to the training script to keep training data on disc as a memmap rather than in memory
2019-03-21 17:04:12 +00:00
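The idea behind --reduce_memory, sketched with numpy (sizes are illustrative): keep pregenerated instances in a disk-backed memmap so they never all live in RAM.

```python
import numpy as np

num_instances, max_seq_len = 100_000, 128  # illustrative sizes
input_ids = np.memmap("input_ids.memmap", dtype=np.int32, mode="w+",
                      shape=(num_instances, max_seq_len))
# writes go through to disk and reads page in lazily, so the finetuning
# script can index instances without holding the whole array in memory
input_ids[0, :] = 0
```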
Matthew Carrigan
7d1ae644ef
Added a --reduce_memory option to the training script to keep training data on disc as a memmap rather than in memory
2019-03-21 17:02:18 +00:00
Matthew Carrigan
2bba7f810e
Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory.
2019-03-21 16:50:16 +00:00
Matthew Carrigan
8733ffcb5e
Removing a couple of other old unnecessary comments
2019-03-21 14:09:57 +00:00
Matthew Carrigan
8a861048dd
Fixed up the notes on a possible future low-memory path
2019-03-21 14:08:39 +00:00
Matthew Carrigan
a8a577ba93
Reduced memory usage for pregenerating the data a lot by writing it out on the fly without shuffling - the Sampler in the finetuning script will shuffle for us.
2019-03-21 14:05:52 +00:00
Matthew Carrigan
0ae59e662d
Reduced memory usage for pregenerating the data a lot by writing it out on the fly without shuffling - the Sampler in the finetuning script will shuffle for us.
2019-03-21 14:04:17 +00:00
Matthew Carrigan
6a9038ba53
Removed an old irrelevant comment
2019-03-21 13:36:41 +00:00
Yuqiang Xie
77944d1b31
add tqdm to the process of eval
...
Maybe better.
2019-03-21 20:59:33 +08:00
Matthew Carrigan
29a392fbcf
Small README changes
2019-03-20 17:35:17 +00:00
Matthew Carrigan
832b2b0058
Adding README
2019-03-20 17:31:49 +00:00
Matthew Carrigan
934d3f4d2f
Syncing up argument names between the scripts
2019-03-20 17:23:23 +00:00
Matthew Carrigan
f19ba35b2b
Move old finetuning script into the new folder
2019-03-20 16:47:06 +00:00
Matthew Carrigan
7de5c6aa5e
PEP8 and formatting cleanups
2019-03-20 16:44:04 +00:00
Matthew Carrigan
1798e98e5a
Added final TODOs
2019-03-20 16:42:37 +00:00
Matthew Carrigan
c64c2fc4c2
Fixed embarrassing indentation problem
2019-03-20 15:42:57 +00:00
Matthew Carrigan
0540d360f2
Fixed logging
2019-03-20 15:36:51 +00:00
Matthew Carrigan
976554a472
First commit of the new LM finetuning
2019-03-20 14:23:51 +00:00
Ananya Harsh Jha
e5b63fb542
Merge branch 'master' of https://github.com/ananyahjha93/pytorch-pretrained-BERT
...
pull current master to local
2019-03-17 08:30:13 -04:00
Ananya Harsh Jha
8a4e90ff40
corrected folder creation error for MNLI-MM, verified GLUE results
2019-03-17 08:16:50 -04:00
Ananya Harsh Jha
e0bf01d9a9
added hack for mismatched MNLI
2019-03-16 14:10:48 -04:00
Ananya Harsh Jha
4c721c6b6a
added eval time metrics for GLUE tasks
2019-03-15 23:21:24 -04:00
tseretelitornike
83857ffeaa
Added missing imports.
2019-03-15 12:45:48 +01:00
Yongbo Wang
d1e4fa98a9
typo in annotation
...
modify `heruistic` to `heuristic` in line 660, `charcter` to `character` in line 661.
2019-03-14 17:32:15 +08:00
Yongbo Wang
3d6452163d
typo
...
modify `mull` to `null` in line 474 annotation.
2019-03-14 17:03:38 +08:00
thomwolf
a98dfe4ced
fixing #377 (empty nbest_predictions.json)
2019-03-14 09:57:06 +01:00
Ananya Harsh Jha
043c8781ef
added code for all glue task processors
2019-03-14 04:24:04 -04:00
Yongbo Wang
22a465a91f
Simplify code, delete redundant line
...
delete the redundant line `if args.train`, simplify code.
2019-03-13 09:42:06 +08:00
Elon Musk
66d8206809
Update run_gpt2.py
2019-03-08 11:59:08 -05:00
thomwolf
7cc35c3104
fix openai gpt example and updating readme
2019-03-06 11:43:21 +01:00
thomwolf
994d86609b
fixing PYTORCH_PRETRAINED_BERT_CACHE use in examples
2019-03-06 10:21:24 +01:00
thomwolf
5c85fc3977
fix typo - logger info
2019-03-06 10:05:21 +01:00
Thomas Wolf
8e36da7acb
Merge pull request #347 from jplehmann/feature/sst2-processor
...
Processor for SST-2 task
2019-03-06 09:48:27 +01:00
Thomas Wolf
3c01dfb775
Merge pull request #338 from CatalinVoss/patch-3
...
Fix top k generation for k != 0
2019-03-06 09:47:33 +01:00
John Lehmann
0f96d4b1f7
Run classifier processor for SST-2.
2019-03-05 13:38:28 -06:00
Catalin Voss
4b4b079272
Fix top k generation for k != 0
2019-03-02 21:54:44 -08:00
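For context, a sketch along the lines of the example's top-k filtering helper after this fix (details are illustrative).

```python
import torch

def top_k_logits(logits, k):
    # logits: [batch, vocab]; k == 0 means "no truncation" in the example script
    if k == 0:
        return logits
    values, _ = torch.topk(logits, k)
    min_values = values[:, -1].unsqueeze(-1)
    # everything below the k-th largest logit gets a very negative value,
    # so softmax assigns it (effectively) zero probability
    return torch.where(logits < min_values,
                       torch.full_like(logits, -1e10),
                       logits)
```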
Catalin Voss
c0cf0a04d5
Fix typo
2019-02-27 18:01:06 -08:00
Ben Johnson
8607233679
Update run_openai_gpt.py
2019-02-20 13:58:54 -05:00
thomwolf
0202da0271
remove unnecessary example
2019-02-18 13:51:42 +01:00
thomwolf
690a0dbf36
fix example - masking
2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4
examples testing
2019-02-18 01:28:18 +01:00
thomwolf
b65f07d8c0
adding examples
2019-02-18 00:55:33 +01:00
wlhgtc
8efaf8f176
fix 'best_non_null_entry' is None error
2019-02-15 15:57:25 +08:00
Davide Fiocco
65df0d78ed
--do_lower_case is duplicated in parser args
...
Deleting one repetition (please review!)
2019-02-13 15:30:05 +01:00
Thomas Wolf
03cdb2a390
Merge pull request #254 from huggingface/python_2
...
Adding OpenAI GPT and Transformer-XL models, compatibility with Python 2
2019-02-11 14:19:26 +01:00
thomwolf
d38caba169
typo in run_squad
2019-02-11 14:10:27 +01:00
thomwolf
af62cc5f20
fix run_squad example
2019-02-11 14:06:32 +01:00
thomwolf
eebc8abbe2
clarify and unify model saving logic in examples
2019-02-11 14:04:19 +01:00
thomwolf
32fea876bb
add distant debugging to run_transfo_xl
2019-02-11 12:53:32 +01:00
thomwolf
b31ba23913
cuda on in the examples by default
2019-02-11 12:15:43 +01:00
thomwolf
6cd769957e
update transfo xl example
2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c
mc_token_mask => mc_token_ids
2019-02-09 16:58:53 +01:00
thomwolf
f4a07a392c
mems not split
2019-02-09 16:14:31 +01:00
thomwolf
43b9af0cac
mems initialized to None in run_transfo
2019-02-09 16:12:19 +01:00
thomwolf
b80684b23f
fixing run openai gpt example
2019-02-08 22:31:32 +01:00
thomwolf
7b4b0cf966
logging
2019-02-08 11:16:29 +01:00
thomwolf
4bbb9f2d68
log loss - helpers
2019-02-08 11:14:29 +01:00
thomwolf
5d7e845712
fix model on cuda
2019-02-08 11:08:43 +01:00
thomwolf
eccb2f0163
hot fix
2019-02-08 11:05:20 +01:00
thomwolf
5adc20723b
add distant debugging
2019-02-08 11:03:59 +01:00
thomwolf
777459b471
run openai example running
2019-02-08 10:33:14 +01:00
thomwolf
6bc082da0a
updating examples
2019-02-08 00:02:26 +01:00
thomwolf
e77721e4fe
renamed examples
2019-02-07 23:15:15 +01:00
thomwolf
d482e3d79d
adding examples for openai and transformer-xl
2019-02-07 17:06:41 +01:00
tholor
9aebc711c9
adjust error message related to args.do_eval
2019-02-07 11:49:38 +01:00
tholor
4a450b25d5
removing unused argument eval_batch_size from LM finetuning #256
2019-02-07 10:06:38 +01:00
Baoyang Song
7ac3311e48
Fix the undefined variable in squad example
2019-02-06 19:36:08 +01:00
thomwolf
ed47cb6cba
fixing transfo eval script
2019-02-06 16:22:17 +01:00
Thomas Wolf
848aae49e1
Merge branch 'master' into python_2
2019-02-06 00:13:20 +01:00
thomwolf
448937c00d
python 2 compatibility
2019-02-06 00:07:46 +01:00
thomwolf
d609ba24cb
resolving merge conflicts
2019-02-05 16:14:25 +01:00
Thomas Wolf
64ce900974
Merge pull request #248 from JoeDumoulin/squad1.1-fix
...
fix prediction on run-squad.py example
2019-02-05 16:00:51 +01:00
Thomas Wolf
e9e77cd3c4
Merge pull request #218 from matej-svejda/master
...
Fix learning rate problems in run_classifier.py
2019-02-05 15:40:44 +01:00
thomwolf
1579c53635
more explicit notation: num_train_step => num_train_optimization_steps
2019-02-05 15:36:33 +01:00
joe dumoulin
aa90e0c36a
fix prediction on run-squad.py example
2019-02-01 10:15:44 -08:00
Thomas Wolf
8f8bbd4a4c
Merge pull request #244 from deepset-ai/prettify_lm_masking
...
Avoid confusion of inplace LM masking
2019-02-01 12:17:50 +01:00
tholor
ce75b169bd
avoid confusion of inplace masking of tokens_a / tokens_b
2019-01-31 11:42:06 +01:00
Surya Kasturi
9bf528877e
Update run_squad.py
2019-01-30 15:09:31 -05:00
Surya Kasturi
af2b78601b
Update run_squad2.py
2019-01-30 15:08:56 -05:00
Matej Svejda
5169069997
make examples consistent, revert error in num_train_steps calculation
2019-01-30 11:47:25 +01:00
Matej Svejda
9c6a48c8c3
fix learning rate/fp16 and warmup problem for all examples
2019-01-27 14:07:24 +01:00
Matej Svejda
01ff4f82ba
learning rate problems in run_classifier.py
2019-01-22 23:40:06 +01:00
liangtaiwan
be9fa192f0
don't save if do not train
2019-01-18 00:41:55 +08:00
thomwolf
a28dfc8659
fix eval for wt103
2019-01-16 11:18:19 +01:00
thomwolf
8831c68803
fixing various parts of model conversion, loading and weights sharing
2019-01-16 10:31:16 +01:00
thomwolf
bcd4aa8fe0
update evaluation example
2019-01-15 23:32:34 +01:00
thomwolf
a69ec2c722
improved corpus and tokenization conversion - added evaluation script
2019-01-15 23:17:46 +01:00
Thomas Wolf
4e0cba1053
Merge pull request #191 from nhatchan/20190113_py35_finetune
...
lm_finetuning compatibility with Python 3.5
2019-01-14 09:40:07 +01:00
nhatchan
6c65cb2492
lm_finetuning compatibility with Python 3.5
...
dicts are not ordered in Python 3.5 or earlier, which is a cause of #175.
This PR replaces one with a list, to keep its order.
2019-01-13 21:09:13 +09:00
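A small illustration of the underlying pitfall (names are illustrative): dict iteration order is guaranteed only from Python 3.7 (and CPython 3.6 as an implementation detail), so order-sensitive code on 3.5 needs a list.

```python
# on Python 3.5, iterating a dict may yield keys in arbitrary order, so code
# that relies on insertion order must use a list of pairs instead
columns = [("input_ids", 0), ("input_mask", 1), ("segment_ids", 2)]
for name, position in columns:  # list order is always preserved
    print(name, position)
```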
Li Dong
a2da2b4109
[bug fix] args.do_lower_case is always True
...
The "default=True" makes args.do_lower_case always True.
```python
parser.add_argument("--do_lower_case",
                    default=True,
                    action='store_true')
```
2019-01-13 19:51:11 +08:00
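The corrected pattern, for reference: with action='store_true', leave the default alone so the flag is False unless passed.

```python
import argparse

parser = argparse.ArgumentParser()
# no default=True: the flag now defaults to False and becomes True only
# when --do_lower_case is actually given on the command line
parser.add_argument("--do_lower_case", action="store_true")
assert parser.parse_args([]).do_lower_case is False
```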
tholor
506e5bb0c8
add do_lower_case arg and adjust model saving for lm finetuning.
2019-01-11 08:32:46 +01:00
Thomas Wolf
e485829a41
Merge pull request #174 from abeljim/master
...
Added Squad 2.0
2019-01-10 23:40:45 +01:00
Sang-Kil Park
64326dccfb
Fix it to run properly even without the --do_train param.
...
It was modified similarly to `run_classifier.py` and fixed to run properly even without the `--do_train` param.
2019-01-10 21:51:39 +09:00
thomwolf
e5c78c6684
update readme and fix a few typos
2019-01-10 01:40:00 +01:00
thomwolf
fa5222c296
update readme
2019-01-10 01:25:28 +01:00
Unknown
b3628f117e
Added Squad 2.0
2019-01-08 15:13:13 -08:00
thomwolf
ab90d4cddd
adding docs and example for OpenAI GPT
2019-01-09 00:12:43 +01:00
thomwolf
2e4db64cab
add do_lower_case tokenizer loading option in run_squad and fine_tuning examples
2019-01-07 13:06:42 +01:00
thomwolf
c9fd350567
remove default when action is store_true in arguments
2019-01-07 13:01:54 +01:00
Thomas Wolf
d3d56f9a0b
Merge pull request #166 from likejazz/patch-1
...
Fix error when `bert_model` param is path or url.
2019-01-07 12:40:55 +01:00
Thomas Wolf
766c6b2ce3
Merge pull request #159 from jaderabbit/master
...
Allow do_eval to be used without do_train and to use the pretrained model in the output folder
2019-01-07 12:31:06 +01:00
Thomas Wolf
77966a43a4
Merge pull request #156 from rodgzilla/cl_args_doc
...
Adding new pretrained model to the help of the `bert_model` argument.
2019-01-07 12:27:16 +01:00
Thomas Wolf
2e8c5c00ec
Merge pull request #141 from SinghJasdeep/patch-1
...
loading saved model when n_classes != 2
2019-01-07 12:21:13 +01:00
Sang-Kil Park
ca4e7aaa72
Fix error when `bert_model` param is path or url.
...
An error occurs when the `bert_model` param is a path or URL; if it is a path, use only the last path component to prevent the error.
2019-01-05 11:42:54 +09:00
Jade Abbott
193e2df8ba
Remove rogue comment
2019-01-03 13:13:06 +02:00
Jade Abbott
c64de50ea4
nb_tr_steps is not initialized
2019-01-03 12:34:57 +02:00
Jade Abbott
b96149a19b
Training loss is not initialized if only do_eval is specified
2019-01-03 10:32:10 +02:00
Jade Abbott
be3b9bcf4d
Allow one to use the pretrained model in evaluation when do_train is not selected
2019-01-03 09:02:33 +02:00
Grégory Châtel
186f75342e
Adding new pretrained model to the help of the bert_model argument.
2019-01-02 14:00:59 +01:00
Jasdeep Singh
99709ee61d
loading saved model when n_classes != 2
...
Required to avoid: Assertion `t >= 0 && t < n_classes` failed, when your number of classes is not 2.
2018-12-20 13:55:47 -08:00
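A sketch of the loading pattern this fixes, against the era's API (path and label count are illustrative): pass the task's label count when reloading so the classifier head matches the checkpoint.

```python
from pytorch_pretrained_bert import BertForSequenceClassification

# without num_labels, the head is built for 2 classes and reloading a
# 3-class checkpoint trips the `t >= 0 && t < n_classes` assertion
model = BertForSequenceClassification.from_pretrained(
    "/path/to/saved_model", num_labels=3)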
tholor
e5fc98c542
add example training data. update to nvidia apex. refactor 'item -> line in doc' mapping. add warning for unknown words.
2018-12-20 18:30:52 +01:00
deepset
a58361f197
Add example for fine tuning BERT language model ( #1 )
...
Adds an example for loading a pre-trained BERT model and fine-tuning it as a language model (masked tokens & nextSentence) on your target corpus.
2018-12-18 10:32:25 +01:00
thomwolf
ae88eb88a4
set encoding to 'utf-8' in calls to open
2018-12-14 13:48:58 +01:00
thomwolf
e1eab59aac
no fp16 on evaluation
2018-12-13 14:54:02 +01:00
thomwolf
087798b7fa
fix reloading model for evaluation in examples
2018-12-13 14:48:12 +01:00
thomwolf
0f544625f4
fix swag example for work with apex
2018-12-13 13:35:59 +01:00
thomwolf
0cf88ff084
make examples work without apex
2018-12-13 13:28:00 +01:00
thomwolf
d3fcec1a3e
add saving and loading model in examples
2018-12-13 12:50:44 +01:00
thomwolf
b3caec5a56
adding save checkpoint and loading in examples
2018-12-13 12:48:13 +01:00
Thomas Wolf
91aab2a6d3
Merge pull request #116 from FDecaYed/deyuf/fp16_with_apex
...
Change to use apex for better fp16 and multi-gpu support
2018-12-13 12:32:37 +01:00
Thomas Wolf
ffe9075f48
Merge pull request #96 from rodgzilla/multiple-choice-code
...
BertForMultipleChoice and Swag dataset example.
2018-12-13 12:05:11 +01:00
Deyu Fu
c8ea286048
change to apex for better fp16 and multi-gpu support
2018-12-11 17:13:58 -08:00
Thomas Wolf
e622790a93
Merge pull request #91 from rodgzilla/convert-examples-code-improvement
...
run_classifier.py improvements
2018-12-11 05:12:04 -05:00
Grégory Châtel
df34f22854
Removing the dependency to pandas and using the csv module to load data.
2018-12-10 17:45:23 +01:00
Grégory Châtel
d429c15f25
Removing old code from copy-paste.
2018-12-06 19:19:21 +01:00
Grégory Châtel
63c45056aa
Finishing the code for the Swag task.
2018-12-06 18:53:05 +01:00
Grégory Châtel
c45d8ac554
Storing the feature of each choice as a dict for readability.
2018-12-06 16:01:28 +01:00
Grégory Châtel
0812aee2c3
Fixing problems in convert_examples_to_features.
2018-12-06 15:53:07 +01:00
Grégory Châtel
f2b873e995
convert_examples_to_features code and small improvements.
2018-12-06 15:40:47 +01:00
Grégory Châtel
83fdbd6043
Adding read_swag_examples to load the dataset.
2018-12-06 14:02:46 +01:00
Grégory Châtel
7183cded4e
SwagExample class.
2018-12-06 13:39:44 +01:00
Grégory Châtel
fa7daa247d
Fixing the commentary of the SquadExample class.
2018-12-06 13:14:33 +01:00
Grégory Châtel
a994bf4076
Fixing related to issue #83 .
2018-12-05 18:16:30 +01:00
Grégory Châtel
c6d9d5394e
Simplifying code for easier understanding.
2018-12-05 17:53:09 +01:00
Grégory Châtel
793262e8ec
Removing trailing whitespaces.
2018-12-05 17:52:39 +01:00
Davide Fiocco
e60e8a6068
Correct assignment for logits in classifier example
...
I tried to address https://github.com/huggingface/pytorch-pretrained-BERT/issues/76
It should be correct, but there's likely a more efficient way.
2018-12-02 12:38:26 +01:00
Davide Fiocco
dc13e276ee
Point typo fix
2018-12-01 01:02:16 +01:00
thomwolf
89d47230d7
clean up classification model output
2018-11-30 22:54:53 +01:00
thomwolf
c588453a0f
fix run_squad
2018-11-30 14:22:40 +01:00
thomwolf
0541442558
add do_lower_case in examples
2018-11-30 13:47:33 +01:00
Li Li
0aaedcc02f
Bug fix in examples; correct t_total for distributed training; run prediction for full dataset
2018-11-27 01:08:37 -08:00
thomwolf
32167cdf4b
remove convert_to_unicode and printable_text from examples
2018-11-26 23:33:22 +01:00
thomwolf
05053d163c
update cache_dir in readme and examples
2018-11-26 10:45:13 +01:00
thomwolf
6b2136a8a9
fixing weight decay in run_squad example
2018-11-20 10:12:44 +01:00
Thomas Wolf
061eeca84a
Merge pull request #32 from xiaoda99/master
...
Fix ineffective no_decay bug when using BERTAdam
2018-11-20 10:11:46 +01:00
thomwolf
2f21497d3e
fixing param.grad is None in fp16 examples
2018-11-20 10:01:21 +01:00
xiaoda99
6c4789e4e8
Fix ineffective no_decay bug
2018-11-18 16:16:21 +08:00
thomwolf
27ee0fff3c
add no_cuda args in extract_features
2018-11-17 23:04:44 +01:00
thomwolf
aa50fd196f
remove unused arguments in example scripts
2018-11-17 23:01:05 +01:00
thomwolf
47a7d4ec14
update examples from master
2018-11-17 12:21:35 +01:00
thomwolf
c8cba67742
clean up readme and examples
2018-11-17 12:19:16 +01:00
thomwolf
757750d6f6
fix tests
2018-11-17 11:58:14 +01:00
thomwolf
4e46affc34
updating examples
2018-11-17 10:30:54 +01:00
thomwolf
cba85a67b9
fix nan in optimizer_on_cpu
2018-11-15 21:47:41 +01:00
thomwolf
1de35b624b
preparing for first release
2018-11-15 20:56:10 +01:00