transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Stefan Schweter	2a4b9e09c0	NER: Add new WNUT’17 example (#4681) * ner: add preprocessing script for examples that splits longer sentences * ner: example shell scripts use local preprocessing now * ner: add new example section for WNUT’17 NER task. Remove old English CoNLL-03 results * ner: satisfy black and isort	2020-06-04 19:13:17 -04:00
Setu Shah	0e1869cc28	Add drop_last arg for data loader	2020-06-04 18:30:31 -04:00
prajjwal1	48a05026de	removed deprecated use of Variable api from pplm example	2020-06-04 18:07:49 -04:00
Sylvain Gugger	12d0eb5f3e	Don't access pad_token_id if there is no pad_token (#4773)	2020-06-04 17:57:04 -04:00
Manuel Romero	17a88d3192	Create model card for T5-base fine-tuned for Sentiment Span Extraction (#4737)	2020-06-04 16:59:56 -04:00
Oren Amsalem	fb52143cf6	Create README.md (#4743)	2020-06-04 16:59:37 -04:00
Suraj Parmar	5f077a3445	Model Card for RoBERTa trained on Sanskrit (#4763) * Model card for SanBERTa Model Card for RoBERTa trained on Sanskrit * Model card for SanBERTa model card for RoBERTa trained on Sanskrit	2020-06-04 16:58:40 -04:00
Sylvain Gugger	cd4e07a85e	Add note about doc generation (#4770)	2020-06-04 13:43:14 -04:00
Jason Phang	492b352ab6	Remove unnecessary model_type arg in example (#4771)	2020-06-04 13:41:24 -04:00
Lysandre Debut	e645b9ab94	Codecov setup (#4768) * Codecov setup * Understanding codecov	2020-06-04 11:44:38 -04:00
Sam Shleifer	2b8b6c929e	[cleanup] PretrainedModel.generate: remove unused kwargs (#4761)	2020-06-04 08:13:52 -04:00
Funtowicz Morgan	5bf9afbf35	Introduce a new tensor type for return_tensors on tokenizer for NumPy (#4585) * Refactor tensor creation in tokenizers. * Make sure to convert string to TensorType * Refactor convert_to_tensors_ * Introduce numpy tensor creation * Format * Add unittest for TensorType creation from str * sorting imports * Added unittests for numpy tensor conversion. * Do not use in-place version for squeeze as numpy doesn't provide such feature. * Added extra parameter prepend_batch_axis: bool on prepare_for_model. * Ensure test_np_encode_plus_sent_to_model is not executed if encoder/decoder model. * style. * numpy tests require_torch for now while flax not merged. * Hopefully will make flake8 happy. * One more time 🎶	2020-06-04 06:57:01 +02:00
Funtowicz Morgan	efae154929	never_split on slow tokenizers should not split (#4723) * Ensure tokens in never_split are not split when using basic tokenizer before wordpiece. * never_split only use membership attempt to use a set() which is 10x faster for this operation. * Use union to concatenate two sets. * Updated docstring for never_split parameter. * Avoid set.union() if never_split is None * Added comments. * Correct docstring format.	2020-06-03 16:48:28 -04:00
Lysandre Debut	2e4de76231	Update encode documentation (#4751)	2020-06-03 16:30:59 -04:00
Patrick von Platen	ed4df85572	fix beam search bug in tf as well (#4745)	2020-06-03 12:53:23 -04:00
Sylvain Gugger	1b5820a565	Unify label args (#4722) * Deprecate masked_lm_labels argument * Apply to all models * Better error message	2020-06-03 09:36:26 -04:00
Abhishek Kumar Mishra	3e5928c57d	Adding notebooks for Fine Tuning [Community Notebook] (#4732) * Added links to more community notebooks Added links to 3 more community notebooks from the git repo: https://github.com/abhimishra91/transformers-tutorials Different Transformers models are fine tuned on Dataset using PyTorch * Update README.md * Update README.md * Update README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-06-03 11:07:26 +02:00
Julien Chaumond	99207bd112	Pipelines: miscellanea of QoL improvements and small features... (#4632) * [hf_api] Attach all unknown attributes for future-proof compatibility * [Pipeline] NerPipeline is really a TokenClassificationPipeline * modelcard.py: I don't think we need to force the download * Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer * FillMaskPipeline: also output token in string form * TextClassificationPipeline: option to return all scores, not just the argmax * Update docs/source/main_classes/pipelines.rst	2020-06-03 03:51:31 -04:00
David Mezzetti	8ed47aa10b	bert-small-cord19 model cards (#4730) * Create README.md * Create README.md * Create README.md	2020-06-03 03:40:14 -04:00
Patrick von Platen	9ca485734a	[Reformer] Improved memory if input is shorter than chunk length (#4720) * improve handling of short inputs for reformer * correct typo in assert statement * fix other tests	2020-06-02 23:08:39 +02:00
Jin Young Sohn	b231a413f5	Add cache_dir to save features in GLUE + Differentiate match/mismatch for MNLI metrics (#4621) * Glue task cleanup * Enable writing cache to cache_dir in case dataset lives in readOnly filesystem. * Differentiate match vs mismatch for MNLI metrics. * Style * Fix pytype * Fix type * Use cache_dir in mnli mismatch eval dataset * Small Tweaks Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-02 13:40:14 -04:00
Sam Shleifer	70f7423436	TFRobertaModelIntegrationTest requires tf (#4726)	2020-06-02 12:59:00 -04:00
Lysandre	d976ef262e	Repin versions	2020-06-02 10:27:15 -04:00
Julien Chaumond	b42586ea56	Fix CI after killing archive maps (#4724) * 🐛 Fix model ids for BART and Flaubert	2020-06-02 10:21:09 -04:00
Lysandre	b43c78e5d3	Release: v2.11.0	2020-06-02 09:49:09 -04:00
Julien Chaumond	d4c2cb402d	Kill model archive maps (#4636) * Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI	2020-06-02 09:39:33 -04:00
Patrick von Platen	47a551d17b	[pipeline] Tokenizer should not add special tokens for text generation (#4686) * allow to not add special tokens * remove print	2020-06-02 11:03:46 +02:00
Funtowicz Morgan	f6d5046af1	Override get_vocab for fast tokenizer. (#4717)	2020-06-02 11:02:27 +02:00
Lysandre Debut	88762a2f8c	Specify PyTorch versions for examples (#4710)	2020-06-02 04:29:28 -04:00
Lorenzo Ampil	d3ef14f931	Add community notebook for sentiment span extraction (#4700)	2020-06-02 09:59:53 +02:00
Sylvain Gugger	7677936316	Make docstring match args (#4711)	2020-06-01 15:22:51 -04:00
Lysandre	6449c494d0	close #4685	2020-06-01 12:57:52 -04:00
Julien Chaumond	ec8717d5d8	[config] Ensure that id2label always takes precedence over num_labels	2020-06-01 16:54:55 +02:00
Julien Chaumond	751a1e0890	[config] Ensure that id2label always takes precedence over num_labels Fixes bug reported in https://github.com/huggingface/transformers/issues/4669 See #3967 for context	2020-06-01 16:25:56 +02:00
Rens	ec62b7d953	Fix onnx export input names order (#4641) * pass on tokenizer to pipeline * order input names when convert to onnx * update style * remove unused imports * make ordered inputs list needs to be mutable * add test custom bert model * remove unused imports	2020-06-01 16:12:48 +02:00
Victor SANH	bf760c80b5	finish README	2020-06-01 09:23:31 -04:00
Victor SANH	9d7d9b3ae0	weird import	2020-06-01 09:23:31 -04:00
Victor SANH	2a3c88a659	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	4ac462bfb8	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	35fa0bbca0	clarify README	2020-06-01 09:23:31 -04:00
Victor SANH	cc746a5020	flake8 compliance	2020-06-01 09:23:31 -04:00
Victor SANH	b11386e158	less prints in saving prunebert	2020-06-01 09:23:31 -04:00
Victor SANH	8b5d4003ab	complete README	2020-06-01 09:23:31 -04:00
Victor SANH	5c8e5b3709	complying with isort	2020-06-01 09:23:31 -04:00
Victor SANH	db2a3b2e01	space	2020-06-01 09:23:31 -04:00
Victor SANH	5f8f2d849a	add floppy bert model notebook	2020-06-01 09:23:31 -04:00
Victor SANH	b41948f5cd	add requirements	2020-06-01 09:23:31 -04:00
Victor SANH	fb8f4277b2	add scripts	2020-06-01 09:23:31 -04:00
Victor SANH	d489a6d3d5	add masked_run_*	2020-06-01 09:23:31 -04:00
Victor SANH	e4c07faf0a	add sparsity modules	2020-06-01 09:23:31 -04:00

4121 Commits