transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-16 11:08:23 +06:00

Author	SHA1	Message	Date
Stas Bekman	d8ca57d2ce	fix/hide warnings (#7837)	2020-10-16 03:19:51 -04:00
vblagoje	c6e865ac2b	Remove masked_lm_labels from returned dictionary (#7818 )	2020-10-16 03:12:10 -04:00
Sam Shleifer	96e47d9229	[cleanup] assign todos, faster bart-cnn test (#7835 ) * 2 beam output * unassign/remove TODOs * remove one more	2020-10-16 03:11:18 -04:00
rmroczkowski	7b13bd01df	Herbert polish model (#7798) * HerBERT transformer model for Polish language understanding. * HerbertTokenizerFast generated with HerbertConverter * Herbert base and large model cards * Herbert model cards with tags * Herbert tensorflow models * Herbert model tests based on Bert test suite * src/transformers/tokenization_herbert.py edited online with Bitbucket * src/transformers/tokenization_herbert.py edited online with Bitbucket * docs/source/model_doc/herbert.rst edited online with Bitbucket * Herbert tokenizer tests and bug fixes * src/transformers/configuration_herbert.py edited online with Bitbucket * Copyrights and tests for TFHerbertModel * model_cards/allegro/herbert-base-cased/README.md edited online with Bitbucket * model_cards/allegro/herbert-large-cased/README.md edited online with Bitbucket * Bug fixes after testing * Reformat modified_only_fixup * Proper order of configuration * Herbert proper documentation formatting * Formatting with make modified_only_fixup * Dummies fixed * Adding missing models to documentation * Removing HerBERT model as it is a simple extension of BERT * Update model_cards/allegro/herbert-base-cased/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Update model_cards/allegro/herbert-large-cased/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com> * HerbertTokenizer deprecated configuration removed Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-10-16 03:06:51 -04:00
Julien Chaumond	99898dcd27	[Pipelines] Fix links to model lists (#7826 )	2020-10-16 02:57:02 -04:00
Lysandre Debut	52c9e84285	Fix DeBERTa integration tests (#7729 )	2020-10-16 02:49:13 -04:00
Stas Bekman	2255c2c7a0	[seq2seq] get_git_info fails gracefully (#7843 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-10-16 00:22:43 -04:00
Katarina Slama	dfa4c26bc0	Typo and fix the input of labels to `cross_entropy` (#7841 ) The current version caused some errors. The changes fixed it for me. Hope this is helpful!	2020-10-15 19:36:31 -04:00
Stas Bekman	a5a8eeb772	fix DeprecationWarning (#7834) in `tests/test_utils_check_copies.py` I was getting intermittently: ``` utils/check_copies.py:52 /mnt/nvme1/code/transformers-comet/utils/check_copies.py:52: DeprecationWarning: invalid escape sequence \s while line_index < len(lines) and re.search(f"^{indent}(class|def)\s+{name}", lines[line_index]) is None: ``` So this should fix it.	2020-10-15 16:21:09 -04:00
David S. Lim	9c71cca316	model card for bert-base-NER (#7799 ) * model card for bert-base-NER * add meta data up top Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-10-15 21:55:00 +02:00
Stas Bekman	4dbca50022	fix wandb/comet problems (#7830 ) * fix wandb/comet problems * simplify * Update src/transformers/integrations.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-15 15:23:24 -04:00
Julien Chaumond	e7aa64838c	[model_cards] facebook/bart-large-mnli: register ZSC for the inference API cc @Narsil @mfuntowicz @joeddav	2020-10-15 19:02:10 +02:00
Sylvain Gugger	2ce3ddab2d	Small fixes to NotebookProgressCallback (#7813 )	2020-10-15 10:30:34 -04:00
Julien Chaumond	6f45dd2fac	[model_cards] Fix yaml for Facebook/wmt19-* see `d99ed7ad61`	2020-10-15 16:14:08 +02:00
Julien Chaumond	d99ed7ad61	[model_cards] Facebook: add thumbnail	2020-10-15 12:53:29 +02:00
Lysandre	2485b8b0ac	Set XLA example time to 500s	2020-10-15 12:34:29 +02:00
Lysandre	2dba7d5702	Notebook catch all errors	2020-10-15 12:21:32 +02:00
Nicolas Patry	9ade8e7499	Upgrading TFAutoModelWithLMHead to (#7730 ) - TFAutoModelForCausalLM - TFAutoModelForMaskedLM - TFAutoModelForSeq2SeqLM as per deprecation warning. No tests as it simply removes current warnings from tests.	2020-10-15 05:26:08 -04:00
Sylvain Gugger	62b5622e6b	Add specific notebook ProgressCallback (#7793)	2020-10-15 05:05:08 -04:00
Nicolas Patry	0911b6bd86	Improving Pipelines by defaulting to framework='tf' when pytorch seems unavailable. (#7728) * Improving Pipelines by defaulting to framework='tf' when pytorch seems unavailable. * Actually changing the default resolution order to account for model defaults. Adding new tests for each pipeline to check that pipeline(task) also works without manually specifying the framework.	2020-10-15 09:42:07 +02:00
Julien Plu	3a134f7c67	Fix TF savedmodel in Roberta (#7795 ) * Remove wrong parameter. * Same in Longformer	2020-10-14 23:48:50 +02:00
Nils Reimers	3032de9369	Model Card (#7752 ) * Create README.md * Update model_cards/sentence-transformers/LaBSE/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-10-14 13:30:58 -04:00
sarahlintang	3fdbeba83c	[model_cards] sarahlintang/IndoBERT (#7748 ) * Create README.md * Update model_cards/sarahlintang/IndoBERT/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-10-14 13:10:31 -04:00
Julien Chaumond	ba654270b3	[model_cards] rename to correct model name	2020-10-14 19:02:48 +02:00
Zhuosheng Zhang	08978487e7	Create README.md (#7722 )	2020-10-14 12:56:12 -04:00
Sagor Sarker	3557509127	added evaluation results for classification task (#7790 )	2020-10-14 12:50:43 -04:00
Sylvain Gugger	bb9559a7f9	Don't use `store_xxx` on optional bools (#7786 ) * Don't use `store_xxx` on optional bools * Refine test * Refine test	2020-10-14 12:05:02 -04:00
Sylvain Gugger	a1d1b332d0	Add predict step accumulation (#7767 ) * Add eval_accumulation_step and clean distributed eval * Add TPU test * Add TPU stuff * Fix arg name * Fix Seq2SeqTrainer * Fix total_size * Update src/transformers/trainer_pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Doc and add test to TPU * Add unit test * Adapt name Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-14 11:41:45 -04:00
Sam Shleifer	8feb0cc967	fix examples/rag imports, tests (#7712 )	2020-10-14 11:35:00 -04:00
XiaoqiJiao	890e790e16	[model_cards] TinyBERT (HUAWEI Noah's Ark Lab) (#7775 )	2020-10-14 09:31:01 -04:00
Jonathan Chang	121dd4332b	Add batch inferencing support for GPT2LMHeadModel (#7552 ) * Add support for gpt2 batch inferencing * add test * remove typo Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-10-14 13:40:24 +02:00
Quentin Lhoest	0c64b18840	Fix bert position ids in DPR convert script (#7776 ) * fix bert position ids in DPR convert script * style	2020-10-14 05:30:02 -04:00
Sylvain Gugger	7968051aba	Fix typo	2020-10-13 17:30:46 -04:00
Sam Shleifer	2977bd528f	Faster pegasus tokenization test with reduced data size (#7762 )	2020-10-13 16:22:29 -04:00
François Lagunas	2d6e2ad4fa	Adding optional trial argument to model_init (#7759 ) * Adding optional trial argument to model_init Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-13 17:07:02 +02:00
Tiger	7e73c12805	fixed lots of typos. (#7758 )	2020-10-13 10:00:20 -04:00
Noam Wies	8cb4ecca25	Avoid unnecessary DDP synchronization when gradient_accumulation_steps > 1 (#7742 ) * use DDP no_sync when possible * fix is_nlp_available addition mistake * reformat trainer.py * reformat trainer.py * drop support for pytorch < 1.2 * return support for pytorch < 1.2	2020-10-13 09:46:44 -04:00
Lysandre Debut	52f7d74398	Do not softmax when num_labels==1 (#7726 ) * Do not softmax when num_labels==1 * Update src/transformers/pipelines.py Co-authored-by: Funtowicz Morgan <mfuntowicz@users.noreply.github.com> Co-authored-by: Funtowicz Morgan <mfuntowicz@users.noreply.github.com>	2020-10-13 09:42:27 -04:00
Patrick von Platen	82b09a8481	[Rag] Fix loading of pretrained Rag Tokenizer (#7756 ) * fix rag * Update tokenizer save_pretrained Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-10-13 14:34:22 +02:00
Patrick von Platen	2d4e928d97	Update PULL_REQUEST_TEMPLATE.md Putting my name on a couple more issues to directly redirect them to me	2020-10-13 12:18:31 +02:00
Felipe Curti	dcba9ee03b	Gpt1 for sequence classification (#7683 ) * Add Documentation for GPT-1 Classification * Add GPT-1 with Classification head * Add tests for GPT-1 Classification * Add GPT-1 For Classification to auto models * Remove authorized missing keys, change checkpoint to openai-gpt	2020-10-13 05:06:15 -04:00
Lysandre Debut	f34b4cd1bd	ElectraTokenizerFast (#7754 )	2020-10-13 04:50:41 -04:00
Sam Shleifer	9c2b2db2cd	[marian] Automate Tatoeba-Challenge conversion (#7709 )	2020-10-12 12:24:25 -04:00
Alex Combessie	aacac8f708	Add license info to nlptown/bert-base-multilingual-uncased-sentiment (#7738 )	2020-10-12 11:56:10 -04:00
Lysandre Debut	1f1d950b28	Fix #7331 (#7732 )	2020-10-12 09:10:52 -04:00
Julien Plu	d9ffb87efb	Fix tf text class (#7724 ) * Fix test * fix generic text classification * fix test * Fix tests	2020-10-12 08:45:15 -04:00
sgugger	d6175a4268	Fix code quality	2020-10-12 08:22:27 -04:00
Jonathan Chang	1d5ea34f6a	Fix trainer callback (#7720) Fix a bug that happens when subclassing Trainer and overwriting evaluate() without calling prediction_loop()	2020-10-12 07:45:12 -04:00
Kelvin	f176e70723	The input training data files (multiple files in glob format). (#7717) Very often, splitting large files into smaller files can prevent the tokenizer from running out of memory in environments like Colab that do not have swap memory.	2020-10-12 07:44:02 -04:00
AndreaSottana	34fcfb44e3	Update tokenization_utils_base.py (#7696 ) Minor spelling corrections in docstrings. "information" is uncountable in English and has no plural.	2020-10-12 06:09:20 -04:00

5522 Commits