transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-07 14:50:07 +06:00

Author	SHA1	Message	Date
Nicolas Patry	be236361f1	Adding `batch_size` support for (almost) all pipelines (#13724) * Tentative enabling of `batch_size` for pipelines. * Add systematic test for pipeline batching. * Enabling batch_size on almost all pipelines - Not `zero-shot` (it's already passing stuff as batched so trickier) - Not `QA` (preprocess uses squad features, we need to switch to real tensors at this boundary). * Adding `min_length_for_response` for conversational. * Making CTC, speech mappings available regardless of framework. * Attempt at fixing automatic tests (ffmpeg not enabled for fast tests) * Removing ffmpeg dependency in tests. * Small fixes. * Slight cleanup. * Adding docs and addressing comments. * Quality. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/zero_shot_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Improving docs. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com> * N -> observed_batch_size softmax trick. * Follow `padding_side`. * Supporting image pipeline batching (and padding). * Rename `unbatch` -> `loader_batch`. * unbatch_size forgot. * Custom padding for offset mappings. * Attempt to remove librosa. * Adding require_audio. * torchaudio. * Back to using datasets librosa. * Adding help to set a pad_token on the tokenizer. * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Quality. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2021-10-29 11:34:18 +02:00
Nicolas Patry	013bdc6d65	Fixing Backward compatibility for zero-shot (#13855) Fixes #13846	2021-10-05 23:06:47 -04:00
Nicolas Patry	0eabe49204	Fixing zero-shot backward compatibility (#13725) Fixes #13697	2021-09-24 07:38:17 -04:00
Nicolas Patry	aacd2123ee	Fixing #13381 (#13400) * Fixing #13381 * Enabling automatic LED models.	2021-09-09 14:23:52 -04:00
Nicolas Patry	b89a964d3f	Moving `zero-shot-classification` pipeline to new testing. (#13299) * Moving `zero-shot-classification` pipeline to new testing. * Cleaning up old mixins. * Fixing tests `sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english` is corrupted in PT. * Adding warning.	2021-08-27 15:46:11 +02:00
Sylvain Gugger	d4c834d2e0	Fix from_pretrained with corrupted state_dict (#12939) * Fix from_pretrained with corrupted state_dict * Adapt test * Use better checkpoint * Style * Clean up	2021-08-04 11:48:39 +02:00
Vyom Pathak	fd3b12e8c3	Fixed: Better names for nlp variables in pipelines' tests and docs. (#11752) * Fixed: Better names for nlp variables in pipelines' tests and docs. * Fixed: Better variable names	2021-05-18 09:47:28 -04:00
Joe Davison	966ba081c9	zero-shot pipeline multi_class -> multi_label (#10727)	2021-03-15 16:02:46 -06:00
Sylvain Gugger	00aa9dbca2	Copyright (#8970) * Add copyright everywhere missing * Style	2020-12-07 18:36:34 -05:00
Thomas Wolf	f4e04cd2c6	[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073) * Fixing roberta for slow-fast tests * WIP getting equivalence on pipelines * slow-to-fast equivalence - working on question-answering pipeline * optional FAISS tests * Pipeline Q&A * Move pipeline tests to their own test job again * update tokenizer to add sequence id methods * update to tokenizers 0.9.4 * set sentencepiece as optional * clean up squad * clean up pipelines to use sequence_ids * style/quality * wording * Switch to use_fast = True by default * update tests for use_fast at True by default * fix rag tokenizer test * removing protobuf from required dependencies * fix NER test for use_fast = True by default * fixing example tests (Q&A examples use slow tokenizers for now) * protobuf in main deps extras["sentencepiece"] and example deps * fix protobuf install test * try to fix seq2seq by switching to slow tokenizers for now * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-15 22:50:59 +01:00
Nicolas Patry	84caa23301	Fix the behaviour of DefaultArgumentHandler (removing it). (#8180) * Some work to fix the behaviour of DefaultArgumentHandler by removing it. * Fixing specific pipelines argument checking.	2020-11-02 12:33:50 +01:00
Joe Davison	3e58b6b7b8	infer entailment label id on zero shot pipeline (#8059) * add entailment dim argument * rename dim -> id * fix last name change, style * rm arg, auto-infer only * typo * rm superfluous import	2020-10-27 14:09:55 -04:00
Thomas Wolf	3a40cdf58d	[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970) * WIP refactoring pipeline tests - switching to fast tokenizers * fix dialog pipeline and fill-mask * refactoring pipeline tests backbone * make large tests slow * fix tests (tf Bart inactive for now) * fix doc... * clean up for merge * fixing tests - remove bart from summarization until there is TF * fix quality and RAG * Add new translation pipeline tests - fix JAX tests * only slow for dialog * Fixing the missing TF-BART imports in modeling_tf_auto * spin out pipeline tests in separate CI job * adding pipeline test to CI YAML * add slow pipeline tests * speed up tf and pt join test to avoid redoing all the standalone pt and tf tests * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/pipelines.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add require_torch and require_tf in is_pt_tf_cross_test Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-23 15:58:19 +02:00

13 Commits