transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-25 07:18:58 +06:00

Author	SHA1	Message	Date
Funtowicz Morgan	efae154929	never_split on slow tokenizers should not split (#4723 ) * Ensure tokens in never_split are not split when using basic tokenizer before wordpiece. * never_split only use membership attempt to use a set() which is 10x faster for this operation. * Use union to concatenate two sets. * Updated docstring for never_split parameter. * Avoid set.union() if never_split is None * Added comments. * Correct docstring format.	2020-06-03 16:48:28 -04:00
Lysandre Debut	2e4de76231	Update encode documentation (#4751 )	2020-06-03 16:30:59 -04:00
Patrick von Platen	ed4df85572	fix beam search bug in tf as well (#4745 )	2020-06-03 12:53:23 -04:00
Sylvain Gugger	1b5820a565	Unify label args (#4722 ) * Deprecate masked_lm_labels argument * Apply to all models * Better error message	2020-06-03 09:36:26 -04:00
Abhishek Kumar Mishra	3e5928c57d	Adding notebooks for Fine Tuning [Community Notebook] (#4732 ) * Added links to more community notebooks Added links to 3 more community notebooks from the git repo: https://github.com/abhimishra91/transformers-tutorials Different Transformers models are fine tuned on Dataset using PyTorch * Update README.md * Update README.md * Update README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-06-03 11:07:26 +02:00
Julien Chaumond	99207bd112	Pipelines: miscellanea of QoL improvements and small features... (#4632 ) * [hf_api] Attach all unknown attributes for future-proof compatibility * [Pipeline] NerPipeline is really a TokenClassificationPipeline * modelcard.py: I don't think we need to force the download * Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer * FillMaskPipeline: also output token in string form * TextClassificationPipeline: option to return all scores, not just the argmax * Update docs/source/main_classes/pipelines.rst	2020-06-03 03:51:31 -04:00
David Mezzetti	8ed47aa10b	bert-small-cord19 model cards (#4730 ) * Create README.md * Create README.md * Create README.md	2020-06-03 03:40:14 -04:00
Patrick von Platen	9ca485734a	[Reformer] Improved memory if input is shorter than chunk length (#4720 ) * improve handling of short inputs for reformer * correct typo in assert statement * fix other tests	2020-06-02 23:08:39 +02:00
Jin Young Sohn	b231a413f5	Add cache_dir to save features in GLUE + Differentiate match/mismatch for MNLI metrics (#4621 ) * Glue task cleanup * Enable writing cache to cache_dir in case dataset lives in readOnly filesystem. * Differentiate match vs mismatch for MNLI metrics. * Style * Fix pytype * Fix type * Use cache_dir in mnli mismatch eval dataset * Small Tweaks Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-02 13:40:14 -04:00
Sam Shleifer	70f7423436	TFRobertaModelIntegrationTest requires tf (#4726 )	2020-06-02 12:59:00 -04:00
Lysandre	d976ef262e	Repin versions	2020-06-02 10:27:15 -04:00
Julien Chaumond	b42586ea56	Fix CI after killing archive maps (#4724 ) * 🐛 Fix model ids for BART and Flaubert	2020-06-02 10:21:09 -04:00
Lysandre	b43c78e5d3	Release: v2.11.0	2020-06-02 09:49:09 -04:00
Julien Chaumond	d4c2cb402d	Kill model archive maps (#4636 ) * Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI	2020-06-02 09:39:33 -04:00
Patrick von Platen	47a551d17b	[pipeline] Tokenizer should not add special tokens for text generation (#4686 ) * allow to not add special tokens * remove print	2020-06-02 11:03:46 +02:00
Funtowicz Morgan	f6d5046af1	Override get_vocab for fast tokenizer. (#4717 )	2020-06-02 11:02:27 +02:00
Lysandre Debut	88762a2f8c	Specify PyTorch versions for examples (#4710 )	2020-06-02 04:29:28 -04:00
Lorenzo Ampil	d3ef14f931	Add community notebook for sentiment span extraction (#4700 )	2020-06-02 09:59:53 +02:00
Sylvain Gugger	7677936316	Make docstring match args (#4711 )	2020-06-01 15:22:51 -04:00
Lysandre	6449c494d0	close #4685	2020-06-01 12:57:52 -04:00
Julien Chaumond	ec8717d5d8	[config] Ensure that id2label always takes precedence over num_labels	2020-06-01 16:54:55 +02:00
Julien Chaumond	751a1e0890	[config] Ensure that id2label always takes precedence over num_labels Fixes bug reported in https://github.com/huggingface/transformers/issues/4669 See #3967 for context	2020-06-01 16:25:56 +02:00
Rens	ec62b7d953	Fix onnx export input names order (#4641 ) * pass on tokenizer to pipeline * order input names when convert to onnx * update style * remove unused imports * make ordered inputs list needs to be mutable * add test custom bert model * remove unused imports	2020-06-01 16:12:48 +02:00
Victor SANH	bf760c80b5	finish README	2020-06-01 09:23:31 -04:00
Victor SANH	9d7d9b3ae0	weird import	2020-06-01 09:23:31 -04:00
Victor SANH	2a3c88a659	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	4ac462bfb8	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	35fa0bbca0	clarify README	2020-06-01 09:23:31 -04:00
Victor SANH	cc746a5020	flake8 compliance	2020-06-01 09:23:31 -04:00
Victor SANH	b11386e158	less prints in saving prunebert	2020-06-01 09:23:31 -04:00
Victor SANH	8b5d4003ab	complete README	2020-06-01 09:23:31 -04:00
Victor SANH	5c8e5b3709	complying with isort	2020-06-01 09:23:31 -04:00
Victor SANH	db2a3b2e01	space	2020-06-01 09:23:31 -04:00
Victor SANH	5f8f2d849a	add floppy bert model notebook	2020-06-01 09:23:31 -04:00
Victor SANH	b41948f5cd	add requirements	2020-06-01 09:23:31 -04:00
Victor SANH	fb8f4277b2	add scripts	2020-06-01 09:23:31 -04:00
Victor SANH	d489a6d3d5	add masked_run_*	2020-06-01 09:23:31 -04:00
Victor SANH	e4c07faf0a	add sparsity modules	2020-06-01 09:23:31 -04:00
Mehrdad Farahani	667003e447	Create README.md (#4665 )	2020-06-01 08:29:09 -04:00
Mehrdad Farahani	ed23f5909e	HooshvareLab readme parsbert-armananer (#4666 ) Readme for HooshvareLab/bert-base-parsbert-armananer-uncased	2020-06-01 08:28:43 -04:00
Mehrdad Farahani	3750b9b0b0	HooshvareLab readme parsbert-peymaner (#4667 ) Readme for HooshvareLab/bert-base-parsbert-peymaner-uncased	2020-06-01 08:28:25 -04:00
Mehrdad Farahani	036c2c6b02	Update HooshvareLab/bert-base-parsbert-uncased (#4687 ) mBERT results added regarding NER datasets!	2020-06-01 08:27:00 -04:00
Manuel Romero	74872c19d3	Create README.md (#4684 )	2020-06-01 05:45:54 -04:00
Patrick von Platen	0866669e75	[EncoderDecoder] Fix initialization and save/load bug (#4680 ) * fix bug * add more tests	2020-05-30 01:25:19 +02:00
Patrick von Platen	6f82aea66b	Include `nlp` notebook for model evaluation (#4676 )	2020-05-29 19:38:56 +02:00
Wei Fang	33b7532e69	Fix longformer attention mask type casting when using apex (#4574 ) * Fix longformer attention mask casting when using apex * remove extra type casting	2020-05-29 18:13:30 +02:00
Patrick von Platen	56ee2560be	[Longformer] Better handling of global attention mask vs local attention mask (#4672 ) * better api * improve automatic setting of global attention mask * fix longformer bug * fix global attention mask in test * fix global attn mask flatten * fix slow tests * update docstring * update docs and make more robust * improve attention mask	2020-05-29 17:58:42 +02:00
Simon Böhm	e2230ba77b	Fix BERT example code for NSP and Multiple Choice (#3953 ) Change the example code to use encode_plus since the token_type_id wasn't being correctly set.	2020-05-29 11:55:55 -04:00
Zhangyx	3a5d1ea2a5	Fix two bugs: 1. Index of test data of SST-2. 2. Label index of MNLI data. (#4546 )	2020-05-29 11:12:24 -04:00
Patrick von Platen	9c17256447	[Longformer] Multiple choice for longformer (#4645 ) * add multiple choice for longformer * add models to docs * adapt docstring * add test to longformer * add longformer for mc in init and modeling auto * fix tests	2020-05-29 13:46:08 +02:00

5759 Commits