Commit Graph

106 Commits

Author SHA1 Message Date
Pasquale Minervini
3775550c4b gradient norm clipping should be done right before calling the optimiser 2019-10-20 22:33:56 +01:00
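An illustrative sketch of the ordering this commit describes (toy model and optimizer, not the script's exact code): clip the gradient norm after backward() and immediately before optimizer.step().

```python
import torch

# Toy setup; the real script uses a transformer model and AdamW.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()

# Clip the gradient norm right before the optimizer step, after backward()
# (and, when fp16 training is used, after the gradients have been unscaled).
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
optimizer.zero_grad()
```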
Simon Layton
4e6a55751a Force einsum to fp16 2019-10-14 11:12:41 -04:00
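A hedged sketch of what forcing einsum to fp16 can look like with NVIDIA apex (assumes apex is installed; the exact call site in the example script is not reproduced here): register torch.einsum so amp casts its inputs to half precision.

```python
import torch
from apex import amp  # assumes NVIDIA apex is available

# Register torch.einsum so that amp always casts its inputs to fp16,
# keeping attention einsums on the half-precision path.
amp.register_half_function(torch, "einsum")
# model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
```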
Bilal Khan
5ce8d29abe Change tensorboard imports to use built-in tensorboard if available 2019-10-08 16:29:43 -05:00
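The import pattern this refers to is roughly the following fallback (sketch; log directory and scalar are illustrative): prefer the SummaryWriter bundled with PyTorch and only fall back to tensorboardX if it is missing.

```python
# Prefer the tensorboard writer built into PyTorch, fall back to tensorboardX.
try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir="runs/example")  # log_dir is illustrative
writer.add_scalar("loss", 0.5, global_step=0)
writer.close()
```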
VictorSanh
0820bb0555 unnecessary carriage return 2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821 run_squad --> run_squad_w_distillation 2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec add distillation+finetuning option in run_squad 2019-10-04 17:23:15 -04:00
thomwolf
31c23bd5ee [BIG] pytorch-transformers => transformers 2019-09-26 10:15:53 +02:00
LysandreJik
de8e14b6c0 Added DistilBERT to run_squad script 2019-09-19 10:55:06 +02:00
Andreas Daiminger
1d15a7f278 swap order of optimizer.step() and scheduler.step() 2019-08-28 19:18:27 +02:00
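Since PyTorch 1.1, optimizer.step() must be called before scheduler.step(); otherwise the first value of the schedule is skipped and PyTorch emits a warning. A minimal sketch of the corrected order (toy model and scheduler, not the script's exact code):

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()

# Correct order: step the optimizer first, then the LR scheduler.
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```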
VictorSanh
b006a7a12f fix for squad 2019-08-22 00:25:42 -04:00
Peng Qi
3bffd2e8e5 more fixes 2019-08-20 10:59:28 -07:00
Peng Qi
a368b87791 Fix #1015 2019-08-19 13:07:00 -07:00
Chi-Liang Liu
40acf6b52a don't save model without training 2019-08-18 05:02:25 -04:00
Thomas Wolf
d43dc48b34
Merge branch 'master' into auto_models 2019-08-05 19:17:35 +02:00
thomwolf
70c10caa06 add option mentioned in #940 2019-08-05 17:09:37 +02:00
thomwolf
b90e29d52c working on automodels 2019-08-05 16:06:34 +02:00
Thomas Wolf
c054b5ee64
Merge pull request #896 from zijunsun/master
fix multi-gpu training bug when using fp16
2019-07-26 19:31:02 +02:00
zijunsun
f0aeb7a814 multi-gpu training should also come after apex fp16 (squad) 2019-07-26 15:23:29 +08:00
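A sketch of the setup order described here (assumes apex and a CUDA GPU are available; names are illustrative): initialize amp first, then wrap the fp16 model in DataParallel, not the other way around.

```python
import torch
from apex import amp  # assumes NVIDIA apex is available

model = torch.nn.Linear(4, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# fp16 conversion first, multi-GPU wrapping second.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)
```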
Chi-Liang Liu
a7fce6d917 fix squad v1 error (na_prob_file should be None) 2019-07-24 16:11:36 +08:00
thomwolf
6070b55443 fix #868 2019-07-23 17:46:01 +02:00
Peiqin Lin
76be189b08 typos 2019-07-21 20:39:42 +08:00
thomwolf
71d597dad0 fix #800 2019-07-17 13:51:09 +02:00
thomwolf
15d8b1266c update tokenizer - update squad example for xlnet 2019-07-15 17:30:42 +02:00
thomwolf
3b469cb422 updating squad for compatibility with XLNet 2019-07-15 15:28:37 +02:00
thomwolf
2397f958f9 updating examples and doc 2019-07-14 23:20:10 +02:00
thomwolf
936e813c84 clean up examples - added squad example and test 2019-07-12 14:16:06 +02:00
thomwolf
762ded9b1c wip examples 2019-07-12 11:28:52 +02:00
thomwolf
50b7e52a7f WIP examples 2019-07-10 15:33:34 +02:00
thomwolf
36bca545ff tokenization abstract class - tests for examples 2019-07-05 15:02:59 +02:00
thomwolf
f6081f2255 add XLNetForSequenceClassification and run_classifier example for XLNet 2019-06-24 10:01:07 +02:00
thomwolf
68ab9599ce small fix and updates to readme 2019-06-19 09:38:38 +02:00
thomwolf
f7e2ac01ea update barrier 2019-06-18 22:43:35 +02:00
thomwolf
7d2001aa44 overwrite_output_dir 2019-06-18 22:13:30 +02:00
thomwolf
15ebd67d4e cache in run_classifier + various fixes to the examples 2019-06-18 15:58:22 +02:00
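The caching referred to here follows the usual preprocess-once pattern; a rough sketch (file name and feature contents are placeholders, not the script's real preprocessing):

```python
import os
import torch

cache_file = "cached_train_features.pt"  # placeholder name
if os.path.exists(cache_file):
    features = torch.load(cache_file)
else:
    # Stand-in for the expensive tokenization / feature-conversion step.
    features = [{"input_ids": [101, 102], "label": 0}]
    torch.save(features, cache_file)
```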
thomwolf
e6e5f19257 fix 2019-06-18 14:45:14 +02:00
thomwolf
a432b3d466 distributed training t_total 2019-06-18 14:39:09 +02:00
thomwolf
c5407f343f split squad example in two 2019-06-18 14:29:03 +02:00
thomwolf
335f57baf8 only on main process 2019-06-18 14:03:46 +02:00
thomwolf
326944d627 add tensorboard to run_squad 2019-06-18 14:02:42 +02:00
thomwolf
d82e5deeb1 set find_unused_parameters=True in DDP 2019-06-18 12:13:14 +02:00
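A sketch of the DDP wrapping with this flag (assumes the process group has already been initialized and a GPU is available; local_rank is illustrative): find_unused_parameters=True lets DDP tolerate parameters that receive no gradient in a given forward pass.

```python
import torch

# torch.distributed.init_process_group(backend="nccl")  # done earlier in the script
local_rank = 0  # normally taken from the distributed launcher
model = torch.nn.Linear(4, 1).to(local_rank)
model = torch.nn.parallel.DistributedDataParallel(
    model,
    device_ids=[local_rank],
    output_device=local_rank,
    find_unused_parameters=True,  # tolerate params unused in some passes
)
```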
thomwolf
a59abedfb5 DDP update 2019-06-18 12:06:26 +02:00
thomwolf
2ef5e0de87 switch to pytorch DistributedDataParallel 2019-06-18 12:03:13 +02:00
thomwolf
9ce37af99b oops 2019-06-18 11:47:54 +02:00
thomwolf
a40955f071 no need to duplicate models anymore 2019-06-18 11:46:14 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
Update run_squad.py
2019-06-14 16:56:26 +02:00
tguens
9e7bc51b95
Update run_squad.py
Indentation change so that the output "nbest_predictions.json" is not empty.
2019-05-22 17:27:59 +08:00
samuelbroscheit
94247ad6cb Make num_train_optimization_steps int 2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f Clean up a little bit 2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
The reason for the issue was that optimization steps were computed from the number of examples, which differs from the actual size of the dataloader when an example is chunked into multiple instances.

The solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
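A sketch of the computation the pull request describes (toy dataset and illustrative hyper-parameters): derive the step count from len(train_dataloader), which already reflects chunked examples, rather than from the raw example count.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

train_dataloader = DataLoader(TensorDataset(torch.randn(1000, 4)), batch_size=32)

gradient_accumulation_steps = 2  # illustrative values
num_train_epochs = 3
num_train_optimization_steps = int(
    len(train_dataloader) / gradient_accumulation_steps
) * num_train_epochs
```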
burcturkoglu
00c7fd2b79 Division of global_step by num_train_optimization_steps in lr_this_step is removed. 2019-05-09 10:57:03 +03:00
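The surrounding code is not in the log; as a purely illustrative sketch of a manual linear-warmup learning-rate update of the kind these scripts used, where the training progress is normalized by the total step count exactly once:

```python
def warmup_linear(progress, warmup=0.1):
    """Linear warmup followed by linear decay (illustrative schedule)."""
    if progress < warmup:
        return progress / warmup
    return max(0.0, (1.0 - progress) / (1.0 - warmup))

learning_rate = 3e-5
num_train_optimization_steps = 1000
for global_step in range(num_train_optimization_steps):
    # Normalize global_step by the total step count once; dividing again
    # inside the schedule would shrink the learning rate.
    lr_this_step = learning_rate * warmup_linear(
        global_step / num_train_optimization_steps, warmup=0.1
    )
    # ...assign lr_this_step to each optimizer parameter group here...
```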