transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-24 23:08:57 +06:00

Author	SHA1	Message	Date
Morgan Funtowicz	28e64ad5a4	Raise an exception if the pipeline allocator can't determine the tokenizer from the model.	2019-12-13 14:12:54 +01:00
Morgan Funtowicz	be5bf7b81b	Added NER pipeline.	2019-12-13 14:12:17 +01:00
Morgan Funtowicz	80eacb8f16	Adding labels mapping for classification models in their respective config.	2019-12-13 14:10:22 +01:00
Morgan Funtowicz	f69dbecc38	Expose classification labels mapping (and reverse) in model config.	2019-12-12 10:25:36 +01:00
thomwolf	6709739a05	allowing from_pretrained to load from url directly	2019-12-11 18:15:45 +01:00
Morgan Funtowicz	c28273793e	Add missing DistilBert and Roberta to AutoModelForTokenClassification	2019-12-11 15:31:45 +01:00
Morgan Funtowicz	b040bff6df	Added supported model to AutoModelTokenClassification	2019-12-11 14:13:58 +01:00
Morgan Funtowicz	9a24e0cf76	Refactored qa pipeline argument handling + unittests	2019-12-11 00:33:25 +01:00
Morgan Funtowicz	63e36007ee	Make sure padding, cls and other non-context tokens cannot appear in the answer.	2019-12-10 16:47:35 +01:00
Morgan Funtowicz	40a39ab650	Reuse recent SQuAD refactored data structure inside QA pipelines.	2019-12-10 15:59:38 +01:00
Morgan Funtowicz	aae74065df	Added QuestionAnsweringPipeline unit tests.	2019-12-10 13:37:20 +01:00
Morgan Funtowicz	a7d3794a29	Remove token_type_ids for compatibility with DistilBert	2019-12-10 13:37:20 +01:00
Morgan Funtowicz	fe0f552e00	Use attention_mask everywhere.	2019-12-10 13:37:20 +01:00
Morgan Funtowicz	348e19aa21	Expose attention_masks and input_lengths arguments to batch_encode_plus	2019-12-10 13:37:18 +01:00
Morgan Funtowicz	c2407fdd88	Enable the Tensorflow backend.	2019-12-10 13:37:14 +01:00
Morgan Funtowicz	f116cf599c	Allow hiding frameworks through environment variables (NO_TF, NO_TORCH).	2019-12-10 13:37:07 +01:00
Morgan Funtowicz	6e61e06051	batch_encode_plus generates the encoder_attention_mask to avoid attending over padded values.	2019-12-10 13:37:07 +01:00
Morgan Funtowicz	02110485b0	Added batching, topk, chars index and scores.	2019-12-10 13:36:55 +01:00
Morgan Funtowicz	e1d89cb24d	Added QuestionAnsweringPipeline with batch support.	2019-12-10 13:36:55 +01:00
Morgan Funtowicz	81babb227e	Added download command through the cli. It allows predownloading models and tokenizers.	2019-12-10 12:18:59 +01:00
thomwolf	31a3a73ee3	updating CLI	2019-12-10 12:18:59 +01:00
thomwolf	7c1697562a	compatibility with sklearn and keras	2019-12-10 12:12:22 +01:00
thomwolf	b81ab431f2	updating AutoModels and AutoConfiguration - adding pipelines	2019-12-10 12:11:33 +01:00
thomwolf	2d8559731a	add pipeline - train	2019-12-10 11:34:16 +01:00
thomwolf	72c36b9ea2	[WIP] - CLI	2019-12-10 11:33:14 +01:00
Thomas Wolf	e57d00ee10	Merge pull request #1984 from huggingface/squad-refactor [WIP] Squad refactor	2019-12-10 11:07:26 +01:00
Thomas Wolf	ecabbf6d28	Merge pull request #2107 from huggingface/encoder-mask-shape create encoder attention mask from shape of hidden states	2019-12-10 10:07:56 +01:00
Julien Chaumond	1d18930462	Harmonize `no_cuda` flag with other scripts	2019-12-09 20:37:55 -05:00
Rémi Louf	f7eba09007	clean for release	2019-12-09 20:37:55 -05:00
Rémi Louf	2a64107e44	improve device usage	2019-12-09 20:37:55 -05:00
Rémi Louf	c0707a85d2	add README	2019-12-09 20:37:55 -05:00
Rémi Louf	ade3cdf5ad	integrate ROUGE	2019-12-09 20:37:55 -05:00
Rémi Louf	076602bdc4	prevent BERT weights from being downloaded twice	2019-12-09 20:37:55 -05:00
Rémi Louf	5909f71028	add py-rouge dependency	2019-12-09 20:37:55 -05:00
Rémi Louf	a1994a71ee	simplified model and configuration	2019-12-09 20:37:55 -05:00
Rémi Louf	3a9a9f7861	default output dir to documents dir	2019-12-09 20:37:55 -05:00
Rémi Louf	693606a75c	update the docs	2019-12-09 20:37:55 -05:00
Rémi Louf	c0443df593	remove beam search	2019-12-09 20:37:55 -05:00
Rémi Louf	2403a66598	give transformers API to BertAbs	2019-12-09 20:37:55 -05:00
Rémi Louf	4d18199902	cast bool tensor to long for pytorch < 1.3	2019-12-09 20:37:55 -05:00
Rémi Louf	9f75565ea8	setup training	2019-12-09 20:37:55 -05:00
Rémi Louf	4735c2af07	tweaks to the BeamSearch API	2019-12-09 20:37:55 -05:00
Rémi Louf	ba089c780b	share pretrained embeddings	2019-12-09 20:37:55 -05:00
Rémi Louf	9660ba1cbd	Add beam search	2019-12-09 20:37:55 -05:00
Rémi Louf	1c71ecc880	load the pretrained weights for encoder-decoder We currently save the pretrained_weights of the encoder and decoder in two separate directories `encoder` and `decoder`. However, for the `from_pretrained` function to operate with automodels we need to specify the type of model in the path to the weights. The path to the encoder/decoder weights is handled by the `PreTrainedEncoderDecoder` class in the `save_pretrained` function. Since there is no easy way to infer the type of model that was initialized for the encoder and decoder, we add a parameter `model_type` to the function. This is not an ideal solution as it is error-prone, and the model type should be carried by the Model classes somehow. This is a temporary fix that should be changed before merging.	2019-12-09 20:37:55 -05:00
Rémi Louf	07f4cd73f6	update function to add special tokens Since I started my PR the `add_special_token_single_sequence` function has been deprecated for another; I replaced it with the new function.	2019-12-09 20:37:55 -05:00
Pierric Cistac	5c877fe94a	fix albert links	2019-12-09 18:53:00 -05:00
Bilal Khan	79526f82f5	Remove unnecessary epoch variable	2019-12-09 16:24:35 -05:00
Bilal Khan	9626e0458c	Add functionality to continue training from last saved global_step	2019-12-09 16:24:35 -05:00
Bilal Khan	2d73591a18	Stop saving current epoch	2019-12-09 16:24:35 -05:00

2460 Commits