transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-02 19:21:31 +06:00

Author	SHA1	Message	Date
Ondřej Cífka	bcd42c4af9	Fix `kwargs` handling in `generate_with_fallback` (#29225 ) * Fix generate_with_fallback *kwargs Change pop to get * Delete keys from kwargs to prevent overriding generation_config * Revert to passing kwargs by reference, but make a (shallow) copy * dict -> copy.copy * Add test_whisper_longform_multi_batch_beam	2024-04-03 17:51:03 +02:00
Ren Xuancheng	851f253f4d	Fix Qwen2Tokenizer (#29929 ) qwen2: fixed tokens starting with # in slow tokenizer; add tests Co-authored-by: jklj077 <17811943+jklj077@users.noreply.github.com>	2024-04-03 17:42:43 +02:00
Yih-Dar	b44df05bc0	Update `tests/utils/tiny_model_summary.json` (#29941 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-03 09:25:01 +02:00
Nicolas Patry	9b0a8ea7d1	Hard error when ignoring tensors. (#27484 ) (#29906 ) * Hard error when ignoring tensors. (#27484) * [WIP] Hard error when ignoring tensors. * Better selection/error when saving a checkpoint. - Find all names we should normally drop (those are in the transformers config) - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving) - Clone those disjoint tensors getting rid of the issue - Find all identical names (those should be declared in the config but we try to find them all anyway.) - For all identical names: - If they are in the config, just ignore them everything is fine - If they are not, warn about them. - For all remainder tensors which are shared yet neither identical NOR disjoint. raise a hard error. * Adding a failing test on `main` that passes here. * We don't need to keep the subfolder logic in this test. * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add small tests. * Dead variable. * Fixup. * Fixing tied_Weights_keys on generic models. * Fixup + T5 encoder/decoder tying (with different layers) * Code quality. * Dynamic member. * trigger * Fixing encoder name for other types of encoder/decoder combos. * Fix scoping. * Update .github/workflows/self-scheduled.yml Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fixing the tied_weights after the call. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-02 16:59:05 +02:00
Minsub Lee (Matt)	15cd68713d	Fix `skip_special_tokens` for `Wav2Vec2CTCTokenizer._decode` (#29311 ) * Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode * Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode * Exclude pad_token filtering since it is used as CTC-blank token * Add small test for skip_special_tokens * Update decoding test for added new token	2024-04-02 16:55:11 +02:00
Yoach Lacombe	0d04b1e25a	Add Flash Attention 2 support to Musicgen and Musicgen Melody (#29939 ) * add FA2 to o.g Musicgen * make style * add FA2 support to Musicgen Melody * add generation FA2 tests to o.g Musicgen * make style and fix copies * add Musicgen to FA2 docs + deprecate list * add sdpa supports to Musicgen's * make style and fix copies * refactor attention implementation arguments * add Copied from to sdpa tests * add copied form in sdpa tests melody * add copied for FA2 generation tests * add FA2 inference copied from * make style	2024-04-02 11:23:49 +01:00
théo gigant	fed27ffc7e	Adding FlaxNoRepeatNGramLogitsProcessor (#29677 ) * fix issue with logit processor in beam search in Flax * adding FlaxNoRepeatNGramLogitsProcessor class + unit test * style correction and code verification * add FlaxNoRepeatNGramLogitsProcessor to the test_processor_list and test_processor_list_jitted tests * fix an issue where ngrams are banned only if they appear ==1 time + update description of get_previous_ngrams * replace non-jit compatible masking of ngrams that are not yet generated with jittable version * Revert "fix issue with logit processor in beam search in Flax" This reverts commit `09b70d7e4d`. * add FlaxNoRepeatNGramLogitsProcessor to _get_logits_processor * change the method of casting to boolean of banned tokens indices * fix code style * remove some useless operations + significantly faster computation of update indices using jax.lax.fori_loop * remove useless loop iterations * set some variables that were calculated and used multiple times * fix format	2024-04-02 11:39:33 +02:00
Hovnatan Karapetyan	416711c3ea	Fix 29807 sinusoidal positional encodings in Flaubert, Informer and XLM (#29904 ) * Fix sinusoidal_embeddings in FlaubertModel * Fix for Informer * Fix for XLM * Move sinusoidal emb for XLM * Move sinusoidal emb for Flaubert * Small cleanup * Add comments on tests code copied from * Add with Distilbert->	2024-04-02 10:27:26 +02:00
Arthur	83b26dd79d	[`generate`] fix breaking change for patch (#29976 ) * fix bug and add tests * nit * otherway to get the cur len instead of attention mask * more places where this might have been broken * nit * oups * inputs_embeds vs input_embeds * test generated outptus * style * nit * fix * skip failing biogpt	2024-04-02 09:51:45 +02:00
Joao Gante	c9f6e5e351	Generate: move misplaced test (#29902 )	2024-04-01 12:45:25 +01:00
Fanli Lin	e4f5b57a3b	[tests] fix the wrong output in `ImageToTextPipelineTests.test_conditional_generation_llava` (#29975 ) bug fix	2024-04-01 13:08:39 +02:00
Arthur	fa2c49b00b	Fix copies main ci (#29979 ) * fix copies * nit * style * Update utils/check_copies.py	2024-04-01 12:43:58 +02:00
Yoach Lacombe	569f6c7d43	Fix FA2 tests (#29909 ) * fix FA2 tests * refactor inference test name	2024-04-01 07:51:00 +00:00
Zach Mueller	3b8e2932ce	Rework tests to compare trainer checkpoint args (#29883 ) * Start rework * Fix failing test * Include max * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-30 22:19:17 -04:00
Yih-Dar	43d17c1836	Mark `test_eager_matches_sdpa_generate` flaky for some models (#29479 ) * fix * revert for qwen2 * revert for qwen2 * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-03-29 11:51:20 +01:00
Arthur	536ea2aca2	[`LlamaSlowConverter`] Slow to Fast better support (#29797 ) * fix * fix test * style * nit * rather rely on concert token to id * fix quality * Update src/transformers/convert_slow_tokenizer.py	2024-03-28 16:19:32 +01:00
Yu Chin Fabian Lim	4df5b9b4b2	Allow GradientAccumulationPlugin to be configured from AcceleratorConfig (#29589 ) * add gradient_accumulation_kwargs to AcceleratorConfig * add suggestions from @muellerzr to docstrings, new behavior and tests * Documentation suggestions from @muellerz Co-authored-by: Zach Mueller <muellerzr@gmail.com> * addressed @muellerzr comments regarding tests and test utils * moved accelerate version to top of file. * @muellerzr's variable fix Co-authored-by: Zach Mueller <muellerzr@gmail.com> * address @amyeroberts. fix tests and docstrings * address @amyeroberts additional suggestions --------- Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 14:01:40 +00:00
Arthur	a2a7f71604	[ `TokenizationLlama`] fix the way we convert tokens to strings to keep leading spaces 🚨 breaking fix (#29453 ) * nit * update test and fix test * fixup	2024-03-28 13:58:40 +01:00
Joao Gante	441de62f49	RoPE models: add numerical sanity-check test for RoPE scaling (#29808 ) * add hard rope scaling test * make fixup * quick rope scaling tests * add copy statements	2024-03-28 11:25:50 +00:00
Christopher Keibel	aac7099c92	add functions to inspect model and optimizer status to trainer.py (#29838 ) * add functions to get number of params which require grad, get optimizer group for parameters and get learning rates of param groups to trainer.py * add tests and raise ValueError when optimizer is None * add second layer to test and freeze its weigths * check if torch is available before running tests * use decorator to check if torch is available Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix test indentation Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 10:37:16 +00:00
Joao Gante	248d5d23a2	Tests: replace `torch.testing.assert_allclose` by `torch.testing.assert_close` (#29915 ) * replace torch.testing.assert_allclose by torch.testing.assert_close * missing atol rtol	2024-03-28 09:53:31 +00:00
Eduardo Pacheco	22d159ddf9	Adding Flash Attention 2 Support for GPT2 (#29226 ) * First commit to add flash attention 2 for GPT-2 * more improvements * Make GPT2 pass tests and fixed Decison Transformers copies * Fixed missing arg * fix copies * Added expected speedup * Update src/transformers/models/gpt2/modeling_gpt2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt2/modeling_gpt2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt2/modeling_gpt2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Added test * Fixed attn attribute * Update docs/source/en/model_doc/gpt2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/gpt2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update Decision transformer attentions * More updates * Passing tests * Fix copies * Fix copies part 2 * Decision transformer updates * Update src/transformers/models/gpt2/modeling_gpt2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix copies * Decision transformer not supporting flash attn * Addressed comments * Addressed comments * Addressed comments --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-28 09:31:24 +00:00
Lorenzo Verardo	a25037beb9	MixtralSparseMoeBlock: add gate jitter (#29865 ) This commit adds gate jitter to MixtralSparseMoeBlock's input data before passing it through the MoE layer, if turned on.	2024-03-27 16:14:26 +01:00
Raushan Turganbay	0efcf32351	Move `eos_token_id` to stopping criteria (#29459 ) * add eos stopping criteria * minor fix * Update tests/generation/test_stopping_criteria.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * check eos is not None and fix tests * make style and fixup * Update src/transformers/generation/stopping_criteria.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/generation/test_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/generation/test_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/generation/stopping_criteria.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/generation/stopping_criteria.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/generation/stopping_criteria.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * camel case everywhere * call stopping criteria list for candidate ids * make style and fixup * Empty commit * Empty commit to pass flaky test * set max length in PromptLookupCandidateGenerator * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * lets fix this typo in docs * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update PR * empty commit --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-27 12:18:10 +00:00
Lysandre Debut	4d8427f739	Reimplement "Automatic safetensors conversion when lacking these files" (#29846 ) * Automatic safetensors conversion when lacking these files (#29390) * Automatic safetensors conversion when lacking these files * Remove debug * Thread name * Typo * Ensure that raises do not affect the main thread * Catch all errors	2024-03-27 08:58:08 +01:00
Hovnatan Karapetyan	a81cf9ee90	Fix 29807, sinusoidal positional encodings overwritten by post_init() (#29813 ) * Check for requires_grad when initing weights * Add unit test * Move sinusoidal positional encoding generation after post_init() * Add modules to skip init list * Move create_sinusoidal_embeddings to _init_weights	2024-03-27 06:28:00 +01:00
Anton Vlasjuk	cefb819f7a	Mamba `slow_forward` gradient fix (#29563 ) * FIX: Cached slow forward in mamba - additionally added mamba cached test - added unused test (mamba causal lm forward and backward) - fixed typo: "causl" --> "causal" * formatting * fix: use real `slow_forward` call instead of torch module's * add shape assertion for mixer block test * adjust shape assertion	2024-03-27 04:52:12 +01:00
Bo Zheng	1c39974a4c	Add Qwen2MoE (#29377 ) * add support for qwen2 MoE models * update docs * add support for qwen2 MoE models * update docs * update model name & test * update readme * update class names & readme & model_doc of Qwen2MoE. * update architecture name * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * update modeling_qwen2_moe.py * fix model architecture * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * update modeling_qwen2_moe.py * fix model architecture * fix style * fix test when there are sparse and non sparse layers * fixup * Update README.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup * fixup * add archive back * add support for qwen2 MoE models * update docs * update model name & test * update readme * update class names & readme & model_doc of Qwen2MoE. * update architecture name * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * update modeling_qwen2_moe.py * fix model architecture * fixup * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * fix style * fix test when there are sparse and non sparse layers * fixup * add archive back * fix integration test * fixup --------- Co-authored-by: bozheng-hit <dsoul0621@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-27 02:11:55 +01:00
Yanyi Liu	ef60995858	Add `cosine_with_min_lr` scheduler in Trainer (#29341 ) * Add cosine_with_min_lr scheduler * Update error message for missing min_lr or min_lr_rate	2024-03-26 13:57:07 +01:00
Zhihao Lin	998b5bb56f	Allow `bos_token_id is None` during the generation with `inputs_embeds` (#29772 ) * update * add ut * update	2024-03-26 12:51:00 +00:00
yunxiangtang	b32bf85b58	Replace 'decord' with 'av' in VideoClassificationPipeline (#29747 ) * replace the 'decord' with 'av' in VideoClassificationPipeline * fix the check of backend in VideoClassificationPipeline * adjust the order of imports * format 'video_classification.py' * format 'video_classification.py' with ruff --------- Co-authored-by: wanqiancheng <13541261013@163.com>	2024-03-26 10:12:24 +00:00
Jonathan Flynn	b5a6d6eeab	Add warnings if training args differ from checkpoint trainer state (#29255 ) * add warnings if training args differ from checkpoint args stored in trainer_state.json * run formatting and styling * add a test * format and styling --------- Co-authored-by: Jonathan Flynn <jonl.flynn@guardian.co.uk>	2024-03-26 07:13:13 +01:00
Yuki Watanabe	8e9a2207b3	Populate torch_dtype from model to pipeline (#28940 ) * Populate torch_dtype from model to pipeline Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * use property Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * lint Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * Remove default handling Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>	2024-03-25 10:46:40 +01:00
Lysandre Debut	39114c0383	Remove static pretrained maps from the library's internals (#29112 ) * [test_all] Remove static pretrained maps from the library's internals * Deprecate archive maps instead of removing them * Revert init changes * [test_all] Deprecate instead of removing * [test_all] PVT v2 support * [test_all] Tests should all pass * [test_all] Style * Address review comments * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * [test_all] trigger tests * [test_all] LLAVA * [test_all] Bad rebase --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-25 10:33:38 +01:00
fxmarty	13b23704a8	Correct llava mask & fix missing setter for `vocab_size` (#29389 ) * correct llava mask * fix vipllava as wlel * mask out embedding for padding tokens * add test * fix style * add setter * fix test on suggestion	2024-03-22 19:57:08 +08:00
Raushan Turganbay	fadb053379	Change in-place operations to out-of-place in LogitsProcessors (#29680 ) * change in-place -> out-of-place * add tests * add more tests * naming consistency * fix doctest * forgot min-length processors * empty * Revert "fix doctest" This reverts commit `4772768457`. * revert change in docstring * Update tests/generation/test_logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/generation/test_logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-21 16:37:33 +00:00
Raushan Turganbay	b469ebc5cf	Prepend `bos token` to Blip generations (#29642 ) * prepend "bos" to blip generation * minor changes * Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/instructblip/modeling_instructblip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add generation tester mixin --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-21 16:33:18 +00:00
Matt	de627f5a14	Cast bfloat16 to float32 for Numpy conversions (#29755 ) * Cast bfloat16 to float32 for Numpy conversions * Add test	2024-03-21 14:04:11 +00:00
Arthur	ff841900e4	[`BC 4.37 -> 4.38`] for Llama family, memory and speed (#29753 ) * attempt to fix * the actual fix that works with compilation! * this? * temporary update * nit? * dispatcg to memory efficient? * update both models that have static cache support * fix copies fix compile * make sure fix * fix cohere and gemma * fix beams? * nit * slipped through the cracks * nit * nits * update * fix-copies * skip failing tests * nits	2024-03-20 23:47:01 +01:00
Zach Mueller	c78f57729f	Update test reqs to include sentencepiece (#29756 ) * Update test reqs * Clean	2024-03-20 15:53:42 +00:00
NielsRogge	d91fd7f92c	Add LLaVa-1.6, bis (#29586 ) * First draft * Fix tests, add docs * Improve docstrings * Fix test * Address comments * Address comments * Remove vocab_size attribute * Remove batch_size * Address comment * Add image processor tests * Support fx * Update docstring * Add support for 34b * Convert 34b model * Add integration tests * Update checkpoints * Convert vicuna-13b, remove doc tests * Remove script * Remove file * Address comments * Improve docstrings * Deprecate vocab_size * Remove aspect_ratio_setting * Address comments * Update READMEs * Add tips about chat templates * Fix tests * Deprecate vocab_size safely * Update tests --------- Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-20 15:51:12 +00:00
Matt	9d999481b2	Add correct batched handling for apply_chat_template (#29222 ) * Add correct batched handling for apply_chat_template * Fix warning method * Add error for incompatible options * expand tests * Add a skip for markuplm * Add skips for other layout models * Skip for LayoutLMv2 * Slightly update the warning message * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * typo fix * Update docstring for conversation kwarg * Update return docstring * Remove the warning, improve error message * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/test_tokenization_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/test_tokenization_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove return_dict=None * Fix up some merge cruft * More merge cruft * Add another skip * Add another skip --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-20 15:50:22 +00:00
amyeroberts	3c17c529cc	SuperPointModel -> SuperPointForKeypointDetection (#29757 )	2024-03-20 15:41:03 +00:00
Matt	11ef35e828	Support sharded safetensors in TF (#29350 ) * Initial commit (still lots of unfinished bits) * (Still untested) add safetensors sharding to save_pretrained * Fix savetensors saving, update default shard size to match PT * Add proper loading of TF-format safetensors * Revert default size in case that changes things * Fix incorrect index name * Update loading priority * Update tests * Make the tests a little more stringent * Expand tests * Add sharded cross-test * Fix argument name * One more test fix * Adding mlx to the list of allowed formats * Remove irrelevant block for safetensors * Refactor warning logging into a separate function * Remove unused skip_logger_warnings arg * Update src/transformers/modeling_tf_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Move function def --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-20 14:22:35 +00:00
NielsRogge	776c9d3af8	[Tests] Remove unused code (#29737 ) Remove unused code	2024-03-20 13:26:00 +01:00
Joao Gante	1a5c500f12	Tests: Musicgen tests + `make fix-copies` (#29734 ) * make fix-copies * some tests fixed * tests fixed	2024-03-20 08:45:53 +01:00
Joao Gante	4294f0c358	Llama: partial 4d masks (#29731 ) * partial 4d masks * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-19 17:32:01 +00:00
Raushan Turganbay	425ba56cdf	Clean-up generation tests after moving methods to private (#29582 ) * clean-up tests * refine comments * fix musicgen tests * make style * remove slow decorator from a test * more clean-up * fix other failing tests	2024-03-19 17:03:31 +00:00
StevenBucaille	56baa03380	Implementation of SuperPoint and AutoModelForKeypointDetection (#28966 ) * Added SuperPoint docs * Added tests * Removed commented part * Commit to create and fix add_superpoint branch with a new branch * Fixed dummy_pt_objects * Committed missing files * Fixed README.md * Apply suggestions from code review Fixed small changes Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Moved ImagePointDescriptionOutput from modeling_outputs.py to modeling_superpoint.py * Removed AutoModelForKeypointDetection and related stuff * Fixed inconsistencies in image_processing_superpoint.py * Moved infer_on_model logic simply in test_inference * Fixed bugs, added labels to forward method with checks whether it is properly a None value, also added tests about this logic in test_modeling_superpoint.py * Added tests to SuperPointImageProcessor to ensure that images are properly converted to grayscale * Removed remaining mentions of MODEL_FOR_KEYPOINT_DETECTION_MAPPING * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixed from (w, h) to (h, w) as input for tests * Removed unnecessary condition * Moved last_hidden_state to be the first returned * Moved last_hidden_state to be the first returned (bis) * Moved last_hidden_state to be the first returned (ter) * Switched image_width and image_height in tests to match recent changes * Added config as first SuperPointConvBlock init argument * Reordered README's after merge * Added missing first config argument to SuperPointConvBlock instantiations * Removed formatting error * Added SuperPoint to README's de, pt-br, ru, te and vi * Checked out README_fr.md * Fixed README_fr.md * Test fix README_fr.md * Test fix README_fr.md * Last make fix-copies ! * Updated checkpoint path * Removed unused SuperPoint doc * Added missing image * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Removed unnecessary import * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added SuperPoint to _toctree.yml --------- Co-authored-by: steven <steven.bucaillle@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>	2024-03-19 14:43:02 +00:00
Arthur	2f9a3edbb9	[`GemmaConverter`] use user_defined_symbols (#29473 ) * use user_defined_symbols * fixup * nit * add a very robust test * make sure all models are tested with the `pretrained_tokenizer_to_test` * should we make sure we test all of them? * merge * remove the id * fix test * update * ousies * oups * fixup * fix copies check * remove `pretrained_tokenizer_to_test`	2024-03-19 15:13:56 +01:00
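
Note on commit bcd42c4af9 (`kwargs` handling in `generate_with_fallback`): the message mentions changing `pop` to `get` and working on a shallow copy of the kwargs. The sketch below only illustrates the general pitfall that such a change guards against. It is not the library's code, and the function names `run_with_pop` and `run_with_copy` and the `num_beams` key are hypothetical choices made for the example.

```python
import copy


def run_with_pop(kwargs):
    # Hypothetical helper: reads an option with pop(), which also removes
    # the key from the dict the caller passed in.
    return kwargs.pop("num_beams", 1)


def run_with_copy(kwargs):
    # Hypothetical helper: takes a shallow copy first, so removing keys
    # only affects the local copy and the caller's dict stays intact.
    local_kwargs = copy.copy(kwargs)
    return local_kwargs.pop("num_beams", 1)


caller_kwargs = {"num_beams": 4}
first = run_with_pop(caller_kwargs)   # reads the caller's value
second = run_with_pop(caller_kwargs)  # key is gone, falls back to the default

safe_kwargs = {"num_beams": 4}
safe_first = run_with_copy(safe_kwargs)
safe_second = run_with_copy(safe_kwargs)  # caller's dict was never mutated
```

When a function is called repeatedly with the same dict, as a fallback loop would do, the copy-based variant sees the same options on every call.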


3558 Commits
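
Note on commit ef60995858 (`cosine_with_min_lr` scheduler): the log only names the scheduler and its `min_lr` / `min_lr_rate` options. As a rough illustration of what a cosine schedule with a floor can look like, the sketch below computes a learning-rate multiplier with linear warmup followed by cosine decay down to `min_lr_rate`. The function name `cosine_with_min_lr_factor`, its parameters, and the exact formula are assumptions made for this example and may differ from the Trainer's implementation.

```python
import math


def cosine_with_min_lr_factor(step, warmup_steps, total_steps, min_lr_rate):
    """Illustrative multiplier applied to the base learning rate at `step`."""
    if step < warmup_steps:
        # Linear warmup from 0 up to 1.
        return step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    # Rescale so the factor decays from 1 to min_lr_rate rather than to 0.
    return min_lr_rate + (1.0 - min_lr_rate) * cosine


# Example: 10 warmup steps, 100 total steps, floor at 10% of the base rate.
factors = [cosine_with_min_lr_factor(s, 10, 100, 0.1) for s in (0, 10, 55, 100)]
```

Under these assumptions the multiplier starts at 0, reaches 1 at the end of warmup, passes through the midpoint between 1 and `min_lr_rate` halfway through the decay, and ends at `min_lr_rate`.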