Commit Graph

168 Commits

Author | SHA1 | Message | Date
Nikolay Korolev | 53282b5bd0 | Change attention mask dtype to be bool. Fix #1119 | 2019-08-27 14:19:03 +03:00
LysandreJik | e08c01aa1a | fix #1102 | 2019-08-26 18:13:06 -04:00
Abhishek Rao | c603d099aa | reraise EnvironmentError in from_pretrained functions of Model and Tokenizer | 2019-08-22 15:25:40 -07:00
Abhishek Rao | 14eef67eb2 | Fix at config rather than model | 2019-08-21 15:48:43 -07:00
Abhishek Rao | 296df2b18c | reraise exception | 2019-08-21 15:29:30 -07:00
thomwolf | fdc487d8b3 | Add max length | 2019-08-21 02:35:01 +02:00
thomwolf | aa05dc8935 | adding gpt-2 large | 2019-08-21 02:29:34 +02:00
Thomas Wolf | e4515faf54 | Merge pull request #1057 from huggingface/fixes: Add a few of typos corrections, bugs fixes and small improvements | 2019-08-21 01:54:05 +02:00
Thomas Wolf | 41789c6c3d | Merge pull request #1059 from GuillemGSubies/master: Better use of spacy tokenizer in open ai and xlm tokenizers | 2019-08-21 01:53:48 +02:00
Thomas Wolf | d30cbaf5dc | Merge branch 'master' into iterative_split_on_token | 2019-08-21 01:33:02 +02:00
Thomas Wolf | e753f249e1 | Merge pull request #806 from wschin/fix-a-path: Fix a path so that a test can run on Windows | 2019-08-21 01:14:40 +02:00
thomwolf | 43489756ad | adding proxies options for the from_pretrained methods | 2019-08-20 16:59:11 +02:00
Guillem García Subies | 388e3251fa | Update tokenization_xlm.py | 2019-08-20 14:19:39 +02:00
Guillem García Subies | f5e2ed0fd8 | Update tokenization_openai.py | 2019-08-20 14:19:25 +02:00
Guillem García Subies | 562b998366 | Update tokenization_openai.py | 2019-08-20 14:10:19 +02:00
Guillem García Subies | bb04446285 | Update tokenization_openai.py | 2019-08-20 14:07:40 +02:00
Guillem García Subies | bfd75056b0 | Update tokenization_xlm.py | 2019-08-20 14:06:17 +02:00
thomwolf | 6d0aa73981 | fix #1034 | 2019-08-20 12:20:21 +02:00
Julien Chaumond | b0b9b8091b | minor typo | 2019-08-20 11:33:46 +02:00
thomwolf | 53c8f700f4 | fix #808 | 2019-08-20 11:29:26 +02:00
thomwolf | 901dde0e45 | fix #1014 | 2019-08-20 11:05:51 +02:00
thomwolf | fecaed0ed4 | add force_download option to from_pretrained methods | 2019-08-20 10:56:12 +02:00
Lysandre | c589862b78 | Doc: loading from config alone does not load the model weights | 2019-08-19 10:17:47 -04:00
LysandreJik | ab05280666 | Order of strings in AutoModel/AutoTokenizer updated. | 2019-08-16 09:53:26 -04:00
LysandreJik | 83dba0b67b | Added RoBERTa tokenizer to AutoTokenizer | 2019-08-15 17:07:07 -04:00
LysandreJik | e24e19ce3b | Added RoBERTa to AutoModel/AutoConfig | 2019-08-15 14:02:11 -04:00
LysandreJik | fe02e45e48 | Release: 1.1.0 | 2019-08-15 11:15:08 -04:00
Lysandre Debut | 88efc65bac | Merge pull request #964 from huggingface/RoBERTa: RoBERTa: model conversion, inference, tests 🔥 | 2019-08-15 11:11:10 -04:00
LysandreJik | 8308170156 | Warning for RoBERTa sequences encoded without special tokens. | 2019-08-15 10:29:04 -04:00
LysandreJik | 572dcfd1db | Doc | 2019-08-14 14:56:14 -04:00
samvelyan | 9ce36e3e4b | Re-implemented tokenize() iteratively in PreTrainedTokenizer. | 2019-08-14 08:57:09 +00:00
LysandreJik | 39f426be65 | Added special tokens <pad> and <mask> to RoBERTa. | 2019-08-13 15:19:50 -04:00
LysandreJik | 3d87991f60 | Fixed error with encoding | 2019-08-13 12:00:24 -04:00
LysandreJik | 634a3172d8 | Added integration tests for sequence builders. | 2019-08-12 15:14:15 -04:00
LysandreJik | 22ac004a7c | Added documentation and changed parameters for special_tokens_sentences_pair. | 2019-08-12 15:13:53 -04:00
Julien Chaumond | b3d83d68db | Fixup 9d0603148b | 2019-08-12 12:28:55 -04:00
thomwolf | aaedfc35a8 | Merge branch 'master' of https://github.com/huggingface/pytorch-transformers | 2019-08-10 20:04:37 +02:00
thomwolf | c683c3d5a5 | fix #993 | 2019-08-10 20:04:35 +02:00
Kevin Trebing | 7060766490 | Corrected logger.error info (Signed-off-by: Kevin Trebing <Kevin.Trebing@gmx.net>) | 2019-08-09 19:36:44 -04:00
LysandreJik | 75d5f98fd2 | Roberta tokenization + fixed tests (py3 + py2). | 2019-08-09 15:02:13 -04:00
LysandreJik | 14e970c271 | Tokenization encode/decode class-based sequence handling | 2019-08-09 15:01:38 -04:00
LysandreJik | 3566d27919 | Clarified PreTrainedModel.from_pretrained warning messages in documentation. | 2019-08-08 19:04:34 -04:00
LysandreJik | fbd746bd06 | Updated test architecture | 2019-08-08 18:21:34 -04:00
LysandreJik | 6c41a8f5dc | Encode and Decode are back in the superclass. They now handle sentence pairs special tokens. | 2019-08-08 18:20:32 -04:00
Julien Chaumond | e367ac469c | [RoBERTa] Re-apply 39d72bcc7b (cc @lysandrejik) | 2019-08-08 11:26:11 -04:00
Julien Chaumond | 9d0603148b | [RoBERTa] RobertaForSequenceClassification + conversion | 2019-08-08 11:24:54 -04:00
LysandreJik | f2b300df6b | fix #976 | 2019-08-08 10:38:57 -04:00
LysandreJik | 7df303f5ad | fix #971 | 2019-08-08 10:36:26 -04:00
LysandreJik | d2cc6b101e | Merge branch 'master' into RoBERTa | 2019-08-08 09:42:05 -04:00
LysandreJik | 39d72bcc7b | Fixed the RoBERTa checkpoint conversion script according to the LM head refactoring. | 2019-08-07 14:21:57 -04:00