transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-28 08:42:23 +06:00

Author	SHA1	Message	Date
thomwolf	ce5ef4b35d	python2 doesn't spark joy	2019-08-30 13:22:43 +02:00
thomwolf	5dd7b677ad	clean up all byte-level bpe tests	2019-08-30 12:43:08 +02:00
thomwolf	ca1a00a302	fix for python2	2019-08-30 12:29:31 +02:00
thomwolf	4e6a3172ce	update roberta docstring as well	2019-08-30 12:23:37 +02:00
thomwolf	fd10d79b55	update GPT2 docstring	2019-08-30 12:23:12 +02:00
thomwolf	abe734ca1f	fix GPT-2 and RoBERTa tests to be clean now	2019-08-30 12:20:18 +02:00
thomwolf	0f5a799456	fix GPT2DoubleHeadModel docstring	2019-08-30 11:49:23 +02:00
thomwolf	d51f72d5de	adding shortcut to the ids of all the special tokens	2019-08-30 11:41:11 +02:00
thomwolf	50e6daf83a	fix Roberta tokenizer __init__	2019-08-30 11:27:43 +02:00
thomwolf	0517e7a1cb	Fix GPT2 and RoBERTa tokenizer to begin with a space - update Roberta tokenizer	2019-08-30 11:23:49 +02:00
Lysandre	55f69a11b6	OpenAI GPT tests now extend CommonTests	2019-08-21 18:09:25 -04:00
Lysandre	47267ba556	OpenAI GPT-2 now depends on CommonTests.	2019-08-21 17:50:16 -04:00
Lysandre	034aa0c2d7	Fixed GPT2DoubleHeadsModel example and weight tying	2019-08-21 17:27:38 -04:00
Lysandre	814a3f4e01	Removed `attention_mask` from GPT-2 and GPT documentation. Corrected `multiple_choice_labels` to actual name `mc_labels`	2019-08-21 14:11:14 -04:00
thomwolf	fdc487d8b3	Add max length	2019-08-21 02:35:01 +02:00
thomwolf	aa05dc8935	adding gpt-2 large	2019-08-21 02:29:34 +02:00
Thomas Wolf	e4515faf54	Merge pull request #1057 from huggingface/fixes Add a few of typos corrections, bugs fixes and small improvements	2019-08-21 01:54:05 +02:00
Thomas Wolf	41789c6c3d	Merge pull request #1059 from GuillemGSubies/master Better use of spacy tokenizer in open ai and xlm tokenizers	2019-08-21 01:53:48 +02:00
Thomas Wolf	d30cbaf5dc	Merge branch 'master' into iterative_split_on_token	2019-08-21 01:33:02 +02:00
Thomas Wolf	e753f249e1	Merge pull request #806 from wschin/fix-a-path Fix a path so that a test can run on Windows	2019-08-21 01:14:40 +02:00
thomwolf	43489756ad	adding proxies options for the from_pretrained methods	2019-08-20 16:59:11 +02:00
Guillem García Subies	388e3251fa	Update tokenization_xlm.py	2019-08-20 14:19:39 +02:00
Guillem García Subies	f5e2ed0fd8	Update tokenization_openai.py	2019-08-20 14:19:25 +02:00
Guillem García Subies	562b998366	Update tokenization_openai.py	2019-08-20 14:10:19 +02:00
Guillem García Subies	bb04446285	Update tokenization_openai.py	2019-08-20 14:07:40 +02:00
Guillem García Subies	bfd75056b0	Update tokenization_xlm.py	2019-08-20 14:06:17 +02:00
thomwolf	6d0aa73981	fix #1034	2019-08-20 12:20:21 +02:00
Julien Chaumond	b0b9b8091b	minor typo	2019-08-20 11:33:46 +02:00
thomwolf	53c8f700f4	fix #808	2019-08-20 11:29:26 +02:00
thomwolf	901dde0e45	fix #1014	2019-08-20 11:05:51 +02:00
thomwolf	fecaed0ed4	add force_download option to from_pretrained methods	2019-08-20 10:56:12 +02:00
Lysandre	c589862b78	Doc: loading from config alone does not load the model weights	2019-08-19 10:17:47 -04:00
LysandreJik	ab05280666	Order of strings in AutoModel/AutoTokenizer updated.	2019-08-16 09:53:26 -04:00
LysandreJik	83dba0b67b	Added RoBERTa tokenizer to AutoTokenizer	2019-08-15 17:07:07 -04:00
LysandreJik	e24e19ce3b	Added RoBERTa to AutoModel/AutoConfig	2019-08-15 14:02:11 -04:00
LysandreJik	fe02e45e48	Release: 1.1.0	2019-08-15 11:15:08 -04:00
Lysandre Debut	88efc65bac	Merge pull request #964 from huggingface/RoBERTa RoBERTa: model conversion, inference, tests 🔥	2019-08-15 11:11:10 -04:00
LysandreJik	8308170156	Warning for RoBERTa sequences encoded without special tokens.	2019-08-15 10:29:04 -04:00
LysandreJik	572dcfd1db	Doc	2019-08-14 14:56:14 -04:00
samvelyan	9ce36e3e4b	Re-implemented tokenize() iteratively in PreTrainedTokenizer.	2019-08-14 08:57:09 +00:00
LysandreJik	39f426be65	Added special tokens <pad> and <mask> to RoBERTa.	2019-08-13 15:19:50 -04:00
LysandreJik	3d87991f60	Fixed error with encoding	2019-08-13 12:00:24 -04:00
LysandreJik	634a3172d8	Added integration tests for sequence builders.	2019-08-12 15:14:15 -04:00
LysandreJik	22ac004a7c	Added documentation and changed parameters for special_tokens_sentences_pair.	2019-08-12 15:13:53 -04:00
Julien Chaumond	b3d83d68db	Fixup `9d0603148b`	2019-08-12 12:28:55 -04:00
thomwolf	aaedfc35a8	Merge branch 'master' of https://github.com/huggingface/pytorch-transformers	2019-08-10 20:04:37 +02:00
thomwolf	c683c3d5a5	fix #993	2019-08-10 20:04:35 +02:00
Kevin Trebing	7060766490	Corrected logger.error info Signed-off-by: Kevin Trebing <Kevin.Trebing@gmx.net>	2019-08-09 19:36:44 -04:00
LysandreJik	75d5f98fd2	Roberta tokenization + fixed tests (py3 + py2).	2019-08-09 15:02:13 -04:00
LysandreJik	14e970c271	Tokenization encode/decode class-based sequence handling	2019-08-09 15:01:38 -04:00

177 Commits