transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 17:52:35 +06:00

Author	SHA1	Message	Date
Sam Shleifer	3723f30a18	[cleanup] MarianTokenizer: delete unused constants (#4802 )	2020-06-05 14:57:24 -04:00
Sylvain Gugger	acaa2e6267	Clean-up code (#4790 )	2020-06-05 12:36:22 -04:00
Sylvain Gugger	fa661ce749	Add model summary (#4789 ) * Add model summary * Add link to pretrained models	2020-06-05 12:22:50 -04:00
Lysandre Debut	79ab881eb1	No silent error when d_head already in the configuration (#4747 ) * No silent error when d_head already in the configuration * Update src/transformers/configuration_xlnet.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-05 12:01:43 -04:00
Julien Chaumond	b9109f2de1	[doc] Make it clearer that `text-generation` does not involve training	2020-06-05 14:59:22 +02:00
Sylvain Gugger	ceaab8dd22	Add .vs to gitignore (#4774 )	2020-06-05 07:56:11 -04:00
Julien Plu	f9414f7553	Tensorflow improvements (#4530 ) * Better None gradients handling * Apply Style * Apply Style * Create a loss class per task to compute its respective loss * Add loss classes to the ALBERT TF models * Add loss classes to the BERT TF models * Add question answering and multiple choice to TF Camembert * Remove prints * Add multiple choice model to TF DistilBERT + loss computation * Add question answering model to TF Electra + loss computation * Add token classification, question answering and multiple choice models to TF Flaubert * Add multiple choice model to TF Roberta + loss computation * Add multiple choice model to TF XLM + loss computation * Add multiple choice and question answering models to TF XLM-Roberta * Add multiple choice model to TF XLNet + loss computation * Remove unused parameters * Add task loss classes * Reorder TF imports + add new model classes * Add new model classes * Bugfix in TF T5 model * Bugfix for TF T5 tests * Bugfix in TF T5 model * Fix TF T5 model tests * Fix T5 tests + some renaming * Fix inheritance issue in the AutoX tests * Add tests for TF Flaubert and TF XLM Roberta * Add tests for TF Flaubert and TF XLM Roberta * Remove unused piece of code in the TF trainer * bugfix and remove unused code * Bugfix for TF 2.2 * Apply Style * Divide TFSequenceClassificationAndMultipleChoiceLoss into their two respective name * Apply style * Mirror the PT Trainer in the TF one: fp16, optimizers and tb_writer as class parameter and better dataset handling * Fix TF optimizations tests and apply style * Remove useless parameter * Bugfix and apply style * Fix TF Trainer prediction * Now the TF models return the loss such as their PyTorch couterparts * Apply Style * Ignore some tests output * Take into account the SQuAD cls_index, p_mask and is_impossible parameters for the QuestionAnswering task models. * Fix names for SQuAD data * Apply Style * Fix conflicts with 2.11 release * Fix conflicts with 2.11 * Fix wrongname * Add better documentation on the new create_optimizer function * Fix isort * logging_dir: use same default as PyTorch Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-04 19:45:53 -04:00
Théophile Blard	ccd26c2862	Create model card for tblard/allocine (#4775 ) https://huggingface.co/tblard/tf-allocine	2020-06-04 19:15:07 -04:00
Stefan Schweter	2a4b9e09c0	NER: Add new WNUT’17 example (#4681 ) * ner: add preprocessing script for examples that splits longer sentences * ner: example shell scripts use local preprocessing now * ner: add new example section for WNUT’17 NER task. Remove old English CoNLL-03 results * ner: satisfy black and isort	2020-06-04 19:13:17 -04:00
Setu Shah	0e1869cc28	Add drop_last arg for data loader	2020-06-04 18:30:31 -04:00
prajjwal1	48a05026de	removed deprecared use of Variable api from pplm example	2020-06-04 18:07:49 -04:00
Sylvain Gugger	12d0eb5f3e	Don't access pad_token_id if there is no pad_token (#4773 )	2020-06-04 17:57:04 -04:00
Manuel Romero	17a88d3192	Create model card for T5-base fine-tuned for Sentiment Span Extraction (#4737 )	2020-06-04 16:59:56 -04:00
Oren Amsalem	fb52143cf6	Create README.md (#4743 )	2020-06-04 16:59:37 -04:00
Suraj Parmar	5f077a3445	Model Card for RoBERTa trained on Sanskrit (#4763 ) * Model cad for SanBERTa Model Card for RoBERTa trained on Sanskrit * Model card for SanBERTa model card for RoBERTa trained on Sanskrit	2020-06-04 16:58:40 -04:00
Sylvain Gugger	cd4e07a85e	Add note about doc generation (#4770 )	2020-06-04 13:43:14 -04:00
Jason Phang	492b352ab6	Remove unnecessary model_type arg in example (#4771 )	2020-06-04 13:41:24 -04:00
Lysandre Debut	e645b9ab94	Codecov setup (#4768 ) * Codecov setup * Understanding codecov	2020-06-04 11:44:38 -04:00
Sam Shleifer	2b8b6c929e	[cleanup] PretrainedModel.generate: remove unused kwargs (#4761 )	2020-06-04 08:13:52 -04:00
Funtowicz Morgan	5bf9afbf35	Introduce a new tensor type for return_tensors on tokenizer for NumPy (#4585 ) * Refactor tensor creation in tokenizers. * Make sure to convert string to TensorType * Refactor convert_to_tensors_ * Introduce numpy tensor creation * Format * Add unittest for TensorType creation from str * sorting imports * Added unittests for numpy tensor conversion. * Do not use in-place version for squeeze as numpy doesn't provide such feature. * Added extra parameter prepend_batch_axis: bool on prepare_for_model. * Ensure test_np_encode_plus_sent_to_model is not executed if encoder/decoder model. * style. * numpy tests require_torch for now while flax not merged. * Hopefully will make flake8 happy. * One more time 🎶	2020-06-04 06:57:01 +02:00
Funtowicz Morgan	efae154929	never_split on slow tokenizers should not split (#4723 ) * Ensure tokens in never_split are not splitted when using basic tokenizer before wordpiece. * never_split only use membership attempt to use a set() which is 10x faster for this operation. * Use union to concatenate two sets. * Updated docstring for never_split parameter. * Avoid set.union() if never_split is None * Added comments. * Correct docstring format.	2020-06-03 16:48:28 -04:00
Lysandre Debut	2e4de76231	Update encode documentation (#4751 )	2020-06-03 16:30:59 -04:00
Patrick von Platen	ed4df85572	fix beam search bug in tf as well (#4745 )	2020-06-03 12:53:23 -04:00
Sylvain Gugger	1b5820a565	Unify label args (#4722 ) * Deprecate masked_lm_labels argument * Apply to all models * Better error message	2020-06-03 09:36:26 -04:00
Abhishek Kumar Mishra	3e5928c57d	Adding notebooks for Fine Tuning [Community Notebook] (#4732 ) * Added links to more community notebooks Added links to 3 more community notebooks from the git repo: https://github.com/abhimishra91/transformers-tutorials Different Transformers models are fine tuned on Dataset using PyTorch * Update README.md * Update README.md * Update README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-06-03 11:07:26 +02:00
Julien Chaumond	99207bd112	Pipelines: miscellanea of QoL improvements and small features... (#4632 ) * [hf_api] Attach all unknown attributes for future-proof compatibility * [Pipeline] NerPipeline is really a TokenClassificationPipeline * modelcard.py: I don't think we need to force the download * Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer * FillMaskPipeline: also output token in string form * TextClassificationPipeline: option to return all scores, not just the argmax * Update docs/source/main_classes/pipelines.rst	2020-06-03 03:51:31 -04:00
David Mezzetti	8ed47aa10b	bert-small-cord19 model cards (#4730 ) * Create README.md * Create README.md * Create README.md	2020-06-03 03:40:14 -04:00
Patrick von Platen	9ca485734a	[Reformer] Improved memory if input is shorter than chunk length (#4720 ) * improve handling of short inputs for reformer * correct typo in assert statement * fix other tests	2020-06-02 23:08:39 +02:00
Jin Young Sohn	b231a413f5	Add cache_dir to save features in GLUE + Differentiate match/mismatch for MNLI metrics (#4621 ) * Glue task cleaup * Enable writing cache to cache_dir in case dataset lives in readOnly filesystem. * Differentiate match vs mismatch for MNLI metrics. * Style * Fix pytype * Fix type * Use cache_dir in mnli mismatch eval dataset * Small Tweaks Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-02 13:40:14 -04:00
Sam Shleifer	70f7423436	TFRobertaModelIntegrationTest requires tf (#4726 )	2020-06-02 12:59:00 -04:00
Lysandre	d976ef262e	Repin versions	2020-06-02 10:27:15 -04:00
Julien Chaumond	b42586ea56	Fix CI after killing archive maps (#4724 ) * 🐛 Fix model ids for BART and Flaubert	2020-06-02 10:21:09 -04:00
Lysandre	b43c78e5d3	Release: v2.11.0	2020-06-02 09:49:09 -04:00
Julien Chaumond	d4c2cb402d	Kill model archive maps (#4636 ) * Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI	2020-06-02 09:39:33 -04:00
Patrick von Platen	47a551d17b	[pipeline] Tokenizer should not add special tokens for text generation (#4686 ) * allow to not add special tokens * remove print	2020-06-02 11:03:46 +02:00
Funtowicz Morgan	f6d5046af1	Override get_vocab for fast tokenizer. (#4717 )	2020-06-02 11:02:27 +02:00
Lysandre Debut	88762a2f8c	Specify PyTorch versions for examples (#4710 )	2020-06-02 04:29:28 -04:00
Lorenzo Ampil	d3ef14f931	Add community notebook for sentiment span extraction (#4700 )	2020-06-02 09:59:53 +02:00
Sylvain Gugger	7677936316	Make docstring match args (#4711 )	2020-06-01 15:22:51 -04:00
Lysandre	6449c494d0	close #4685	2020-06-01 12:57:52 -04:00
Julien Chaumond	ec8717d5d8	[config] Ensure that id2label always takes precedence over num_labels	2020-06-01 16:54:55 +02:00
Julien Chaumond	751a1e0890	[config] Ensure that id2label always takes precedence over num_labels Fixes bug reported in https://github.com/huggingface/transformers/issues/4669 See #3967 for context	2020-06-01 16:25:56 +02:00
Rens	ec62b7d953	Fix onnx export input names order (#4641 ) * pass on tokenizer to pipeline * order input names when convert to onnx * update style * remove unused imports * make ordered inputs list needs to be mutable * add test custom bert model * remove unused imports	2020-06-01 16:12:48 +02:00
Victor SANH	bf760c80b5	finish README	2020-06-01 09:23:31 -04:00
Victor SANH	9d7d9b3ae0	weird import	2020-06-01 09:23:31 -04:00
Victor SANH	2a3c88a659	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	4ac462bfb8	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	35fa0bbca0	clarify README	2020-06-01 09:23:31 -04:00
Victor SANH	cc746a5020	flake8 compliance	2020-06-01 09:23:31 -04:00
Victor SANH	b11386e158	less prints in saving prunebert	2020-06-01 09:23:31 -04:00

4129 Commits