transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

Author	SHA1	Message	Date
Nicolas Patry	0aa1153ffb	Revert error back into warning for byte fallback conversion. (#22607 )	2023-04-06 14:00:29 +02:00
Nicolas Patry	1670be4bde	Adding Llama FastTokenizer support. (#22264)

* Adding Llama FastTokenizer support.

- Requires https://github.com/huggingface/tokenizers/pull/1183 version
- Only support byte_fallback for llama, raise otherwise (safety net).
- Lots of questions are special tokens

How to test:

```python
from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

if False:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids

    assert encoded == encoded2, f"{encoded} != {encoded2}"

    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)

    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```

The converter + some test script. The test script. Tmp save. Adding Fast tokenizer + tests. Adding the tokenization tests. Correct combination. Small fix. Fixing tests. Fixing with latest update. Rebased. fix copies + normalized added tokens + copies. Adding doc. TMP. Doc + split files. Doc. Versions + try import. Fix Camembert + warnings -> Error. Fix by ArthurZucker. Not a decorator.

* Fixing comments.
* Adding more to docstring.
* Doc rewriting.	2023-04-06 09:53:03 +02:00
Matt	e577bd0f13	Use native TF checkpoints for the BLIP TF tests (#22593 ) * Use native TF checkpoints for the TF tests * Remove unneeded exceptions	2023-04-05 18:43:14 +01:00
Matt	2a91a9ef66	Fix PT-TF equivalence test for GPT1 (#22586 ) * Re-enable skipped test and fix the hidden state shape issue * Actually fix the bug instead of just doing something wrong	2023-04-05 13:16:00 +01:00
Joao Gante	861ff890d6	Generate: `TextIteratorStreamer` timeout (#22576 )	2023-04-05 09:57:46 +01:00
Sylvain Gugger	11fd2c773b	Skip failing test	2023-04-04 21:26:17 -04:00
Matt	edb704b26e	Fix inverted conditional in TF common test! (#22540 ) * Fix inverted conditional in TF common test! * Make the same change in the PT tests file * Make sure hidden states for GPT2 have the same output shape in PT/TF * Minor fix to PT implementation of token classification loss * Skip loss equivalence test for TFHubert because it keeps overflowing to inf * Compute LM loss for TF the (weird) way it's computed in PT * Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert * Fix - don't try to access the hidden states property when output is a tuple	2023-04-04 21:59:54 +01:00
Shubhamai	900677487d	Flax Regnet (#21867 ) * initial commit * review changes * post model PR merge * updating doc	2023-04-04 12:41:12 -04:00
Matt	5f3ea66bc0	Add TF port of BLIP (#22090 ) * Initial commit * more stash commit * Yet another stash commit * yet more stash commit * Mostly working except for docs / repo consistency * Stop importing model list from torch file * Add TF BLIP models to docs * Add auto classes * Move get_text_features and get_image_features * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py 
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/blip/test_modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use channels_last convolutions in TF (better performance + compatibility) * Remove _shape function * Move multi-line statement to one line in PT + TF * Specify tf.keras.layers instead of importing from it * Remove test_gradient_checkpointing and empty test_training methods * move some multi-line statements to one line * Update docstring for generate * Remove pruned heads set * Remove self.seq_len_dim * Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states * ensure original model follows config in more cases * Skip the same cross-attention tests in the PT tests - didn't realize we did it twice! 
* Add training args throughout the models and layers * make fixup * Fix docstring for inputs_embeds * Add docstring for is_decoder * Add docstrings to text models * Remove redundant computation * Add unpack_inputs / keras_serializable * Add modeling_tf_blip to doctests * Add config classes for keras serialization * Changes to allow model porting with pt-to-tf * Quick fix to decoder head and test tweaks * Revert an issue with masking the embeddings outputs * Allow missing keys in some equivalence tests (for unused layers) * Add tf-pt equivalence tests back in * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Refactor invert_attention_mask out into tf_utils * Re-enable cross-tests on the PT side too --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 16:05:22 +01:00
Nicolas Patry	a515d0a77c	Soft error whisper. (#22475 ) * Soft error whisper. * Fix format. --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-94.taildb5d.ts.net>	2023-04-04 16:21:57 +02:00
Viktor Scherbakov	871598be55	Implemented safetensors checkpoints save/load for Trainer (#22498 ) * implemented safetensors save/load * remove duplicated file * added tests * more tests * style fix * fix tf tests * change to list comprehension Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * review fixes + safe load for sharded checkpoint * style fix * remove rogue import * remove partial to avoid undefined exception * use naming alias instead of safetensors.torch * fix safe sharding in tests * grammar Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor corrections * style --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 09:05:04 -04:00
Arthur	00b5887b94	🚨🚨🚨 `[NLLB Tokenizer]` Fix the prefix tokens 🚨🚨🚨 (#22313) * fix the prefix tokens * update fast and test values * add legacy behaviour Co-authored-by: sgugger <sylvain.gugger@gmail.com> * update disclaimer, link issue PR and behavioral changes * Apply suggestions from code review Co-authored-by: Lysandre Debut <hi@lysand.re> * styling * make a quote * quote this time --------- Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Lysandre Debut <hi@lysand.re>	2023-04-04 14:53:06 +02:00
TheWall9	ad5e9b6c6a	[Roformer] Fixing a bug in RoFormerEncoder where it was ignoring the length of past_key_values when generating as a decoder (#22416) * fix RoFormerEncoder position embedding when generating as decoder * make fixup * add test case to check generation with past key values * remove duplicated code	2023-04-04 12:50:33 +02:00
Joao Gante	1905384fd5	Generate: Add text streamer decoding options (#22544 )	2023-04-04 09:03:13 +01:00
Younes Belkada	159ff3342c	Update test_image_processing_pix2struct.py (#22543 )	2023-04-03 15:26:35 -04:00
Sylvain Gugger	c14d31294e	Skip failing test	2023-04-03 14:07:40 -04:00
Thibault Douzon	4e441e529c	fix LayoutLMv3TokenizerFast subword label after 'Ġ' token (#21695) LayoutLMv3TokenizerFast produces an empty 'Ġ' token with `offset_mapping = (0, 0)`. The next token is wrongly assumed to also be the beginning of a word and isn't correctly assigned `pad_token_label`. Modify the test with text that produces a 'Ġ' token. Remove the copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`. Solves issue: #19978	2023-04-03 10:32:36 -04:00
Joao Gante	a55a822adf	Generate: `TextIteratorStreamer` (streamer for gradio) (#22501 ) * haha text go brrr (but in gradio)	2023-04-03 15:04:37 +01:00
Mohammed Jabir	7d25c9c81e	added biogpt token classifier (#22447 ) * added biogpt token classifier * fix reviews * Updated modeling_biogpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-04-03 09:20:02 -04:00
Arthur	c0f99b4d2e	Fix llama tokenizer (#22402) * draft * update tokenization llama and conversion script * more updates * initial commit * style * default pad to None * draft tokenization tests * update test * update tokenization tests * nits * update * versioning test * major fix * fix more tests * finish fixing special masks * last nit * more nits * add encode decode tests * add more * fix token type ids * style	2023-04-03 09:07:32 -04:00
Eli Simhayev	9eae4aa576	[Time-Series] fix past_observed_mask type (#22076 ) added > 0.5 to `past_observed_mask`	2023-04-03 09:07:21 -04:00
Sylvain Gugger	c612628045	Test fetch v2 (#22367 ) * Test fetcher v2 * Fix regexes * Remove sanity check * Fake modification to OPT * Fixes some .sep issues * Remove fake OPT change * Fake modif for BERT * Fake modif for init * Exclude SageMaker tests * Fix test and remove fake modif * Fake setup modif * Fake pipeline modif * Remove all fake modifs * Adds options to skip/force tests * [test-all-models] Fake modif for BERT * Try this way * Does the command actually work? * [test-all-models] Try again! * [skip circleci] Remove fake modif * Remove debug statements * Add the list of important models * Quality * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * Address review comments * Fix and add test * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Address review comments --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-03-31 16:18:43 -04:00
Nicolas Patry	d143087d18	Making sure we can use safetensors to serialize all the time. (#22437 ) * Making sure we can use safetensors to serialize all the time. * Expanding the tests for increased coverage. * Update the test. * Getting current state of affairs. * Tentative fix. * Fixing black version. * Fixing the worst offenders. * Try to modify less files. * Fixing blip_2 (Weird solution right now). * Fixing deta. * Fix blip ? * Missing extra newline. * No deta modification. * Adding some comments. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Addressing comments. * Addressing comments. * creating warn_once. * Warning_once ! --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-31 16:07:35 +02:00
Arthur	349e1242d9	[NLLB-MoE] `model_type` update for auto mapping (#22470 ) edit default model type and testing path set to hf-internal-testing	2023-03-30 15:36:07 +02:00
Joao Gante	228792a9dc	Generate: basic token streaming (#22449 ) * haha tokens go brrrr	2023-03-30 12:00:12 +01:00
amyeroberts	f0aeb1be17	Skip flaky NLLB Moe test for now (#22463 ) Skip flaky test for now	2023-03-30 11:30:19 +01:00
amyeroberts	154c6bb7ac	Rescale image back if it was scaled during PIL conversion (#22458 ) * Rescale image back if it was scaled during PIL conversion * do_rescale is defined if PIL image passed in	2023-03-30 11:29:11 +01:00
Younes Belkada	b844f8a9ab	[`Pix2Struct`] Fix slow test (#22448 ) fix slow test	2023-03-29 17:40:45 +02:00
Yih-Dar	8894b81742	Use real tokenizers if tiny version(s) creation has issue(s) (#22428 ) Fix some tiny model creation issues Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-29 16:16:23 +02:00
Younes Belkada	33f4cb1093	[`bnb`] fix bnb failing test (#22439 ) * fix bnb failing test * fix * fix * fixup	2023-03-29 15:13:00 +02:00
Arthur	8d9c3836be	Add clean_up_tokenization_spaces to config (#22341 ) * add draft changes * fix failing wav2vec * style * make sure that the argument is saved + add tests * style * fixup * update test * default clean_up_tokenization_spaces to False for Bloom and Llama * Update code based on review Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> * style * quality --------- Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com>	2023-03-29 13:21:07 +02:00
Arthur	19ade2426a	[WIP]`NLLB-MoE` Adds the moe model (#22024) * Initial commit * update modeling code * update doc * add functions necessary * fix imports * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix more common tests * style * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but implementing top2 * update * ❗local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplification * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a little bit * add support for unslip models when converting * fixup * style * update tests * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉encoder and decoder logits match 🎉 * styling * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! 
* cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-27 19:42:00 +02:00
NielsRogge	0e708178ed	[Pix2Struct] Add support to resize embeddings (#22394 ) * First draft * Fix integration test * Remove script * Fix test and typos * Fix one more test * Skip tied embeddings test * Remove line * Address comments	2023-03-27 11:38:07 -04:00
Joao Gante	7dcd8703ef	Generate: support for left-padding on GPTNeoX and Llama (#22382 )	2023-03-27 15:48:23 +01:00
Shubhamai	a0cbbba31f	Resnet flax (#21472 ) * [WIP] flax resnet * added pretrained flax models, results reproducible * Added pretrained flax models, results reproducible * working on tests * no real code change, just some comments * [flax] adding support for batch norm layers * fixing bugs related to pt+flax integration * removing loss from modeling flax output class * fixing classifier tests * fixing comments, model output * cleaning comments * review changes * review changes * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * renaming Flax to PyTorch --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-03-24 19:45:57 +00:00
Mitch Naylor	57f25f4b7f	Add Mega: Moving Average Equipped Gated Attention (#21766 ) * add mega file structure and plain pytorch version of mega source code * added config class with old naming conventions * filled in mega documentation * added config class and embeddings with optional token types * updated notes * starting the conversion process, deleted intermediate and added use_cache back to config * renamed config attributes in modeling_mega.py * checkpointing before refactoring incremental decoding functions * removed stateful incremental key/values for EMA and self-attention * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention * bug fix in attention mask handling in MovingAverageGatedAttention * removed incremental state from GatedCrossAttention and removed IncrementalState class * finished gated cross attention and got MegaLayer working * fixed causal masking in mega decoder * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids * added optional dense hidden layer for masked and causal LM classes * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention * removed before_attn_fn in Mega class and updated docstrings and comments up to there * bug fix in MovingAverageGatedAttention masking * working conversion of MLM checkpoint in scratchpad script -- perfect matches * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters * renamed gamma and beta parameters to avoid HF renaming when loading from 
checkpoint * finished checkpoint conversion script * cleanup old class in mega config script * removed 'copied from' statements and passing integration tests * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing * fixed tuple output of megamodel * all common tests passing after fixing issues in decoder, gradient retention, and initialization * added mega-specific tests, ready for more documentation and style checks * updated docstrings; checkpoint before style fixes * style and quality checks, fixed initialization problem in float_tensor, ready for PR * added mega to toctree * removed unnecessary arg in megaconfig * removed unused arg and fixed code samples with leftover roberta models * Apply suggestions from code review Applied all suggestions except the one renaming a class, as I'll need to update that througout Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACTFN, and added sequencenorm to layer norms * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights() * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files * variable names in NFFN * manual Mega->MEGA changes in docs * Mega->MEGA in config auto * style and quality fixes * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments * commit before dealing with merge conflicts * made 
new attention activation functions available in ACT2FN and added generation test from OPT * style and quality in activations and tests * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings * style and quality fixes after latest updates, before rotary position ids * causal mask in MegaBlock docstring + added missing device passing * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR * style and quality fixes + readme updates pointing to main --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-24 08:17:27 -04:00
Joao Gante	0fa46524ac	Generate: Add GPTNeoX integration test (#22346 )	2023-03-24 11:33:16 +00:00
Yih-Dar	e8cc02555e	Automatically create/update tiny models (#22275 ) * Automatically create or update tiny models * Skip failed tests * update workflow file * use revision --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-23 19:14:17 +01:00
Joao Gante	502fec779b	Generate: add test for left-padding support (#22322 )	2023-03-23 17:00:22 +00:00
Sylvain Gugger	80e3b36361	Really fix quality due to ruff release	2023-03-22 20:56:22 -04:00
Sylvain	ef28df0572	Fix quality due to ruff release	2023-03-22 20:45:08 -04:00
Yih-Dar	8b05ace014	Fix PipelineTests skip conditions (#22320 ) * check what tests fail * Skip failing tests * Skip failing tests * Skip failing tests * Skip failing tests * clean up * clean up --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-22 20:02:24 +01:00
Luc CAILLIAU	d62e7d8842	Chunkable token classification pipeline (#21771 ) * Chunkable classification pipeline The TokenClassificationPipeline is now able to process sequences longer than 512. No matter the framework, the model, the tokenizer. We just have to pass process_all=True and a stride number (optional). The behavior remains the same if you don't pass these optional parameters. For overlapping parts when using stride above 0, we consider only the max scores for each overlapped token in all chunks where the token is. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * update with latest black format * update black format * Update token_classification.py * Update token_classification.py * format correction * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update comments * Update src/transformers/pipelines/token_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * Update token_classification.py Correct spaces, remove process_all and keep only stride. If stride is provided, the pipeline is applied to the whole text. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update chunk aggregation Update the chunk aggregation strategy based on entities aggregation. 
* Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py Remove unnecessary pop from outputs dict * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add chunking tests * correct formating * correct formatting * correct model id for test chunking * update scores with nested simplify * Update test_pipelines_token_classification.py * Update test_pipelines_token_classification.py * update model to a tiny one * Update test_pipelines_token_classification.py * Adding smaller test for chunking. * Fixup * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-22 14:13:20 -04:00
Younes Belkada	0f68a7f408	Add Pix2Struct (#21400 ) * v1 all keys match * clean up * forward pass ok * add correct image transform * generate works, logits matching * clean up * more refactor * revert * revert * clean up * clean ups * clean up * refactor * refactor * fix doc * fix tokenizer test * fix toctree * revert toctree * oops * few fixes * replace to `pixel_embeds` * make fixup * test processing & feat extractor * fix some tests * more fixes * make fixup * clean up * more clean up * add a single slow test * fix test * make fixup * fix * fix authors * fix toctree * update docs * add docstring * revert change * Update src/transformers/models/pix2struct/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix tokenizer * fix processor test * fix test * make fixup * refactor * fix config * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * format * fix * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup * add docstring * fix issues * fix * fix * fix * add slow test * fix * fix * fix batched issue * fix training issues * fix ci test * fix slow test * fix conversion script * remove unneeded classes * fix slow test * fix require backends * fix masked fill * revert * fix softmax * add large models support * fix conditional generation * few fixes * add instructions * rm unneeded file * Update src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py * fix ci test * fix ci test really * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix nit * fix nits * fix image processors nits * docstring * clean up * fix nit * fix tests * docstring nit * fix reshape * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: 
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix nit * fix repetition * refactor processor * make patch size consistent * refactor forward * fix docstring * fix max_patches issue * update docstirng * update docstring * fix coped from * add skip reasons * few fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * format * fix doctests * refactor and fix * fix doc build issue * fix processor test * small fix conversion script * replace correct weights * make fixup * fix some issues * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert config and fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more details * fixes * fix processor * fix processor test * fix * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * fix processor * Update src/transformers/models/pix2struct/modeling_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add copied * make fixup * fix copies * update docstring * refactor * fix docstring * fix conversion script * fix vqa issue * replace to `flattened_patches` * nit * fix numpy issue * fix image processors * add batched vqa support * fix vqa conversion * make fixup * fix conversion script * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add correct docstring * update docstring * fix module level + channel dim * use `make_list_of_images` * refactor * correct docstring * fix authors * remove `data_format` * add header text test * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * make fixup * add checkpoints --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-22 16:53:52 +01:00
Joao Gante	fd3eb3e3cd	Beef up Llama tests (#22314 ) * tmp commit * beef up llama tests	2023-03-22 15:20:48 +00:00
Joao Gante	12febc20db	Generate: Export TF generate with a TF tokenizer (#22310 ) * Export TF generate with a TF tokenizer * remove unused lines	2023-03-22 15:00:20 +00:00
silentghoul-spec	48bef3a734	Fixed bug to calculate correct xpath_sub_list in MarkupLMTokenizer (#22302 ) Fixed bug to calculate correct xpath_sub_list in MarkupLMTokenizer. Earlier xpath_sub_list was same as xpath_tags_list Co-authored-by: dusejat <dusejat@amazon.com>	2023-03-22 12:07:49 +00:00
Alara Dirik	0558914dff	Add MaskedImageModelingOutput (#22212 ) * Add MaskedImageModelingOutput	2023-03-22 07:35:47 +03:00
Yih-Dar	67c2dbdb54	Time to Say Goodbye, torch 1.7 and 1.8 (#22291 ) * time to say goodbye, torch 1.7 and 1.8 * clean up torch_int_div * clean up is_torch_less_than_1_8-9 * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-21 19:22:01 +01:00
Gerald Cuder	5a2b77a6c1	Fix error in mixed precision training of `TFCvtModel` (#22267 ) * Make sure CVT can be trained using mixed precision * Add test for keras-fit with mixed-precision * Update tests/models/cvt/test_modeling_tf_cvt.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> --------- Co-authored-by: gcuder <Gerald.Cuder@iacapps.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2023-03-21 12:12:57 +00:00
lewtun	f251441387	Add LlamaForSequenceClassification (#22209 ) * Add LlamaForSequenceClassification * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Add docstring * Add test * Add input embedding getter and setter * Remove dead code --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-03-17 14:39:26 +01:00
Yih-Dar	5110e5748e	🔥py38 + torch 2 🔥🔥🔥🚀 (#22204 ) * py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 22:59:23 +01:00
Jason Phang	0041be5b3d	LLaMA Implementation (#21955 ) * LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by: Stella Biderman <stellabiderman@gmail.com>	2023-03-16 09:00:53 -04:00
Yih-Dar	52a57f7c7c	Update expected values in `MgpstrModelIntegrationTest` (#22195 ) Update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 11:48:52 +00:00
Anahita Bhiwandiwalla	16121bae5c	Update BridgeTowerForContrastiveLearning (#22145 ) * Use return_loss for BridgeTowerForContrastiveLearning, add example * fix tests * Update example in BridgeTowerForContrastiveLearning * Update test_modeling_bridgetower.py * update model output format * minor update * Update src/transformers/models/bridgetower/modeling_bridgetower.py * make style --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-15 20:54:38 +01:00
Sylvain Gugger	42ad693b7b	Regression pipeline device (#22190 ) * Fix regression in pipeline when device=-1 is passed * Add regression test	2023-03-15 14:13:38 -04:00
amyeroberts	737681477c	Revert 22152 MaskedImageCompletionOutput changes (#22187 ) Revert changes	2023-03-15 18:37:23 +01:00
amyeroberts	c6318c3788	to_pil - don't rescale if int and in range 0-255 (#22158 ) * Don't rescale if in and in range 0-255 * Raise value error if int values too large * Update tests/test_image_transforms.py * Update tests/test_image_transforms.py	2023-03-14 15:43:44 +00:00
Alara Dirik	3b22bfbc6a	Create MaskedImageCompletionOutput and fix ViT docs (#22152 ) * create MaskedImageCompletionOutput * fix bugs * fix bugs	2023-03-14 13:55:18 +00:00
Alara Dirik	cdddfbffa1	Add ConvNeXT V2 (#21679 ) * Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues	2023-03-14 12:08:14 +03:00
Yih-Dar	6c2ad00c46	Move `is_pipeline_test_to_skip` to specific model test classes (#21999 ) * Move `is_pipeline_test_to_skip` to specific model test classes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-14 10:03:02 +01:00
Patrick von Platen	f780557a34	[Safetensors] Add explicit flag to from pretrained (#22083 ) * [Safetensors] Add explicit flag to from pretrained * add test * remove @ * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 21:39:06 +01:00
Younes Belkada	d979cf6efd	[`Whiper`] add `get_input_embeddings` to `WhisperForAudioClassification` (#22133 ) * add `get_input_embeddings` to `WhisperForAudioClassification` * add common tests * fix another common test * Update tests/models/whisper/test_modeling_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-13 19:46:01 +01:00
Younes Belkada	6652e7da0d	[`Blip2`] skip accelerate test (#22124 ) skip accelerate test	2023-03-13 15:03:21 +01:00
wangpeng	102b5ff4a8	add new model of MGP-STR (#21418 ) * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * remove representation_size from MGPSTRConfig * reformat configuration_mgp_str.py * format test_processor_mgp_str.py * add test for tokenizer and complete model/processer test and model file * rm Unnecessary tupple in modeling_mgp_str * reduce hidden_size/layers/label_size in test_model * add integration tests and change MGPSTR to Mgpstr * add test for logit values * reformat test model file --------- Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>	2023-03-13 10:11:31 +00:00
Yih-Dar	2f320661f3	Revert "[GPT2] Propose fix for #21080 " (#22093 ) Revert "[GPT2] Propose fix for #21080 (#21853)" to avoid CI failure This reverts commit `a3fef89b26`.	2023-03-10 22:08:21 +01:00
Dean Wyatte	2f4cdd97f5	handle numpy inputs in whole word mask data collator (#22032 )	2023-03-10 10:50:29 -05:00
Arthur	a3fef89b26	[GPT2] Propose fix for #21080 (#21853 ) * Make sure position ids are masked * test that padded input produce the same results * fix failing tests * fixup * fix batch test	2023-03-10 07:15:25 -05:00
Yih-Dar	ab81d31d20	Skip 3 tests for `WhisperEncoderModelTest` (#22060 ) * skip 3 tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-09 19:09:23 +01:00
Stas Bekman	ec24132b6c	[deepspeed] offload + non-cpuadam optimizer exception (#22043 ) * [deepspeed] offload + non-cpuadam optimizer exception * flip * revert min version	2023-03-09 08:12:57 -08:00
Lucain	923110b74f	Remove set_access_token usage + fail tests if FutureWarning (#22051 ) * Remove set_access_token usage + fail tests if FutureWarning * do not fail on FutureWarning in CI --------- Co-authored-by: testbot <lucainp@hf.co>	2023-03-09 09:23:48 -05:00
Yih-Dar	1cbac6867b	Mark all `BridgeTower` tests slow for now (#22039 ) * slow me --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-08 21:48:29 +01:00
Anahita Bhiwandiwalla	de81adf978	[WIP] Add BridgeTowerForContrastiveLearning (#21964 ) * Add BridgeTower for ITC * Fix review feedback * Rename BridgeTowerForITC, cleanup * Fix style and quality * implement tests --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com>	2023-03-08 09:00:54 -05:00
Yih-Dar	dfe9a31973	Update `AudioClassificationPipelineTests::test_small_model_pt` for PT 2.0.0 (#22023 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-08 13:56:47 +01:00
Yih-Dar	b338414e61	Update tiny model creation script and some others files (#22006 ) * Update 1 * Update 2 * Update 3 * Update 4 * Update 5 * Update 6 * Update 7 * Update 8 * Update 9 * Update 10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 22:31:14 +01:00
Eli Simhayev	8abe4930d3	[Time-Series] informer model (#21099 ) * added informer to gitignore * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works! 
* WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when itearting zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask 😃 * Revert "remove unused libs for this PR for creating the env" This reverts commit `11a081e09e`. 
* fixes * make style * fix initial tests * fix more tests * dry * make style * remove unused files * style * added integration tests * fix num_static_real_features * fix header * remove unused function * fix example * fix docs * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/modeling_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixes for reviewer * use prediction_length from model * fix style * fixed informer.mdx * added to index * updated readme * undo * make fix-copies * typo * fix copy * added Informer to toctree * in order * fixed comments * remove unneeded new lines in docs * make static real and cat optional * fix use of distil conv layers * fixed integration test * added checkpoint for convlayer * make fix-copies * updated from time series model * make fix-copies * copy decoder * fix unit tests * updated scaling config * fix integration tests * IGNORE_NON_TESTED * IGNORE_NON_AUTO_CONFIGURED * IGNORE_NON_AUTO_CONFIGURED * updated check configs * fix formatting * undo change from time series * prediction_length should not be None * aliign with the blog: prettify ProbSparse and change attention_factor to sampling_factor * make style * make fix-copies * niels CR: update contributed by * niels CR: update configuration_informer.py 
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: update kashif -> huggingface Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: `sampling_factor` only relevant when `attention_type`=prob * make style * fixed U_part: added multiplication by `L_Q` * fixed bug: remove `is not None` from `if config.distil` * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check * fix integration tests * updated model hub * do not shift as in training * undo * fix make-copies * make fix-copies * added `if prediction_length is None` * changed `ProbSparseAttention` to `InformerProbSparseAttention` * changed `V_sum` -> `v_mean_dim_time` * changed `ConvLayer` to `InformerConvLayer` and fixed `super()` * TimeSeriesTansformer->Informer in decoder's Copied from * more descriptive in ProbSparse * make style * fix coped from * Revert "added `if prediction_length is None`" This reverts commit `b4cbddfa05`. * fixed indent * use InformerSinusoidalPositionalEmbedding * make fix-style * fix from #21860 * fix name * make fix-copies * use time series utils * fix dec num_heads * docstring * added time series util doc * _import_structure * formatting * changes from review * make style * fix docs * fix doc * removed NegativeLogLikelihood --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-07 21:36:38 +01:00
NielsRogge	dde718e7a6	[DETR and friends] Remove is_timm_available (#21814 ) * First draft * Fix to_dict * Improve conversion script * Update config * Remove timm dependency * Fix dummies * Fix typo, add integration test * Upload 101 model as well * Remove timm dummies * Fix style --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2023-03-07 15:19:39 -05:00
Arthur	2156662dea	[TF] Fix creating a PR while pushing in TF framework (#21968 ) * add create pr arg * style * add test * ficup * update test * last nit fix typo * add `is_pt_tf_cross_test` marker for the tsts	2023-03-07 17:32:08 +01:00
Sanchit Gandhi	7c39318136	[Whisper] Add model for audio classification (#21754 ) * [Whisper] Add model for audio classification * make fix-copies * add to docs * add docstring * empty returns * add code example * switch to fleurs * stick everything on one line	2023-03-07 16:20:21 +01:00
Yih-Dar	9402788b34	Skip `test_multi_gpu_data_parallel_forward` for some model tests (#21991 ) skip test_multi_gpu_data_parallel_forward for some model tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 14:23:36 +01:00
NielsRogge	95408e9953	[DETR, YOLOS] Fix device bug (#21974 ) * Fix integration test * Add test * Add test	2023-03-07 07:34:04 -05:00
Elad Segal	eec46b4f75	Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens (#21959 ) * Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens * fix docs * Empty commit * formatting	2023-03-07 11:59:22 +00:00
amyeroberts	4063fd9cba	Add check before int casting for PIL conversion (#21969 ) * Add check before int casting for PIL conversion * Line length * Tidier logic	2023-03-07 11:14:09 +00:00
Yih-Dar	5b28b78332	Update `Jukebox` tests (#21984 ) * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-07 04:20:14 +01:00
Yih-Dar	f2a2616b74	Update expected values for `test_xglm_sample` (#21975 ) update expected values for xglm Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 18:07:31 +01:00
Yih-Dar	9474abdf47	Use larger atol in `torch.allclose` for some tests (#21966 ) Use larger atol Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 17:41:00 +01:00
Yih-Dar	fcf813417a	Update expected values in `XLMProphetNetModelIntegrationTest` (#21957 ) update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-06 09:15:44 +01:00
Arthur	718e9d777f	[CLAP] Support batched inputs for CLAP. Fixes pipeline issues (#21931 ) * fix pipeline * fix feature_extraction clap * you can now batch the `is_longer` attribute * add tests * fixup * add expected scores * comment on is_longert	2023-03-03 18:42:18 +01:00
Yih-Dar	d4306daea1	Fix `AlignModelTest` tests (#21923 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-03 14:47:09 +01:00
Yih-Dar	fa9d2ad7ec	Update `model_split_percents` for `WhisperModelTest` (#21922 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-03 14:35:08 +01:00
Yih-Dar	9f5bfe1b99	Avoid modeling tests run in pipeline CI jobs (#21911 ) * rework is_pipeline_test * bring back 3 tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 21:23:06 +01:00
Kashif Rasul	db979f7588	[time series] Add Time series inputs tests (#21846 ) * intial test of inputs * added test for generation * remove asserts * fixed test * Update tests/models/time_series_transformer/test_modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-02 20:43:35 +01:00
Yih-Dar	88e5c51a15	Temporarily skip 3 tests in `BridgeTowerModelTest` (#21908 ) skip for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 19:16:03 +01:00
Yih-Dar	e6de918676	Add Blip and Blip2 for pipeline tests (#21904 ) * fix * add to tests * style and quality * add missing --------- Co-authored-by: NielsRogge <NielsRogge@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-02 18:20:34 +01:00
Nicolas Patry	1325459105	Refactor whisper asr pipeline to include language too. (#21427 ) * [WIP] whisper refacto to support language output. * Handling merges. * A bit more cleanup and comments. * Many improvements. Lots of details everywhere. * Cleanup old code and tests. * Handle lone timestamp tokens (just recover when something bad happens). * Adding return_language example. * No ffmpeg. * Hmm. * Some corrections. * Both fast and slow. * New black. * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove print. * Undoing tests modifications. * Smaller test modifications. * Rename. * Remove maxDiff. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-02 18:12:19 +01:00
Connor Henderson	8e5a1b2abb	Make schedulers picklable by making lr_lambda fns global (#21768 ) * Make schedulers picklable by making lr_lambda fns global * add unused _get_constant_schedule_lr_lambda arg * remove unneeded _get_constant_schedule_lr_lamda * add test * make style * rebase, remove torch dep, put lambda back * repo-consistency and style	2023-03-02 12:08:43 -05:00
Kian Sierra McGettigan	6bf885375a	Prophetnet batch dimension inversion fix (#21870 ) * decoder forward pass is working * no model has forward pass returning attentions * decoder ngram changed to not mix batch size * current basic forward pass returns identical result * passed test_model attentions * passed test_encoder_decoder_model_generate * passed test_headmasking * removed old block * removed comments bug/fixme * removed bug comments * applied styling * applied fix-copies * applied ngram forward comments * corrected dimension notation * applied styling and comment fixes * changed asserts for raise ValueError * changed question gen test * updated hidden_states integration test * applied styling	2023-03-02 12:07:45 -05:00
Sylvain Gugger	50a8ed3ee0	Mark pipeline tests to skip them easily (#21887 ) * Mark pipeline tests to skip them easily * Mark the mixin as pipeline test * Update src/transformers/testing_utils.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-03-02 10:55:36 -05:00
Arthur	c87654dca1	[Whisper] Add rescaling function with `do_normalize` (#21263 ) * add `zero_mean_unit_var_norm` function * normalize before MEL computation * fixup * add simple test * quality * Update tests/models/whisper/test_feature_extraction_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixup * use attention masks if padding was applied * Update based on review Co-authored-by: bofeng huang <bofenghuang7@gmail.com> --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: bofeng huang <bofenghuang7@gmail.com>	2023-03-02 14:17:21 +01:00
Yih-Dar	36ee128375	Fix `WhisperModelTest` (#21883 ) * force on the same device * fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 20:41:27 +01:00

2693 Commits