transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
amyeroberts	c53fcd8381	Mark MobileNetV1ModelTest::test_batching_equivalence as flaky (#31258 ) * Mark MobileNetV1ModelTest::test_batching_equivalence as flaky * Add link to issue * woops	2024-06-06 14:47:58 +01:00
Omar Salman	681183974a	Enable dynamic resolution input for Beit (#31053 ) * Initial attempt * Updates: PR suggestions * Interpolate the relative position bias when interpolate_pos_encoding is True * Add slow tag for the added tests * Add in DATA2VEC_VISION_INPUTS_DOCSTRING	2024-06-06 14:47:41 +01:00
Marc Sun	99895ae5e2	fix accelerate tests for roberta xl (#31288 ) * fix accelerate tests for roberta xl * style	2024-06-06 14:44:35 +01:00
Baole Ai	5ba8ac54f5	Fix _save_tpu: use _maybe_convert_to_cpu instead of to cpu. (#31264 ) * Fix _save_tpu: use _maybe_convert_to_cpu instead of to cpu. * fix lint	2024-06-06 09:42:55 -04:00
dependabot[bot]	14ff5dd962	Bump transformers from 3.5.1 to 4.38.0 in /examples/research_projects/bertology (#31256 ) Bump transformers in /examples/research_projects/bertology Bumps [transformers](https://github.com/huggingface/transformers) from 3.5.1 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v3.5.1...v4.38.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-06-06 12:42:40 +01:00
Huazhong Ji	9e9679c022	fix: `str` should be used not `int` when setting env variables (#31272 )	2024-06-06 12:41:31 +01:00
Lucain	9ef93fccad	Switch from `cached_download` to `hf_hub_download` in remaining occurrences (#31284 ) Switch from hf_hub_url to hf_hub_download in remaining occurences	2024-06-06 12:05:59 +01:00
Raushan Turganbay	5fabd1e83b	Generation: fix handling of special tokens (#31254 ) * fix special tokens in generatioon * fix test * add warning * fix the check * warn once * fix	2024-06-06 15:21:32 +05:00
Raushan Turganbay	7729b77478	Make mamba use cache (#31116 ) * make mamba use cache * uss cache naming as in mamba * fix musicgen	2024-06-06 13:37:29 +05:00
Zhiyuan Chen	f5c0fa9f6f	fix loading special_tokens_map_file (#31012 )	2024-06-06 09:15:27 +02:00
Ranggi Hwang	9b85e405ab	[`SwitchTransformer`] Significant performance improvement on MoE blocks (#31173 ) * SwitchTransformer MoE layer performance improvement * make fixup * comments about shapes * make fixup	2024-06-06 09:10:12 +02:00
graham	8177aa0e1a	no need for explicit EXTRA_TOKENS in processing_paligemma.py (#31022 ) no need for explicit EXTRA_TOKENS	2024-06-06 08:41:41 +02:00
amyeroberts	940fde8daf	Skip failing JetMOE generation tests (#31266 ) Skip failing tests for now	2024-06-05 19:06:46 +01:00
Cyril Vallez	bd5091df8d	Reduce by 2 the memory requirement in `generate()` 🔥🔥🔥 (#30536 ) * Fix contrastive_search for new cache structure, and improve performance by removing inneficient torch.stack(torch.split(x, top_k, dim=0)) * Fix _contrastive_search for non-standard cache using ellipsis slicing * Fix all outputs.logits memory leaks for all decoding strategies! * Fix small error in _contrastive_search() * Make all necessary change and revert for the new class * Apply coding style * Remove pipes in type hints for compatibility * correct type hint * apply style * Use DynamicCache by default and solve conflicts * Fix rebase issues * Add `_supports_dynamic_cache_class` in models for models that support DynamicCache but not other caches to make DynamicCache the default for more models * Create generation config to return legacy format by default, or to choose not to * style * Fix case when use_cache is False * Remove default DynamicCache in assiste_decoding if assistant_model does not support it + fix _seen_tokens when cropping cache * Update prepare_inputs_for_generation() for case with empty DynamicCache * Correct return of args in _assisted_decoding * Remove EfficientDynamicCache as it is no longer needed * Correct mistake in generation config * Move cache logic of assisted decoding to AssistedCandidateGenerator.__init__ * change DynamicCache function names from "split" to "batch_split" for readability + apply coding style * Remove `_supports_dynamic_cache_class` attribute after rebase * Correct missing line lost in conflict resolution during rebasing * Add special case for Jamba * Fix jamba test * Coding style * coding style * Correct missing import in rebasing * Simplify _validate_model_kwargs based on removal of _supports_dynamic_cache attribute * Simplify code paths in _contrastive_search * coding style * Update docstrings of cache methods * Update prepare_inputs_for_generation() -> past_key_values are always Cache objects	2024-06-05 17:05:01 +02:00
Yih-Dar	d6276f0fc5	Add condition to `benchmark` job in `push-important-models.yml` (#31259 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-05 15:19:16 +02:00
Dhaivat Bhatt	b72752f068	Fix circular reference issue in CLIPTokenizerFast (#31075 )	2024-06-05 14:01:13 +02:00
bastrob	464d986b6c	Add missing Flaubert tokenizer tests (#30492 ) * add flaubert tokenization test, enrich inheritance in FlaubertTokenizer. * fix quality code ci * ensure parameter consistency * fix ci * fix copyright year and flatten vocab list. * fix style	2024-06-05 13:52:16 +02:00
Huazhong Ji	41cf4097f7	enable deterministic mode for npu (#31253 )	2024-06-05 07:35:35 -04:00
Vaibhav Srivastav	4a6024921f	doc: add info about wav2vec2 bert in older wav2vec2 models. (#31120 ) * doc: add info about wav2vec2 bert in older wav2vec2 models. * apply suggestions from review. * forward contrib credits from review --------- Co-authored-by: Sanchit Gandhi <sanchit-gandhi@users.noreply.github.com>	2024-06-05 11:56:11 +01:00
dependabot[bot]	c39aaea972	Bump transformers from 3.5.1 to 4.38.0 in /examples/research_projects/deebert (#31244 ) Bump transformers in /examples/research_projects/deebert Bumps [transformers](https://github.com/huggingface/transformers) from 3.5.1 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v3.5.1...v4.38.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-06-05 11:12:58 +01:00
amyeroberts	54659048a2	Early labels validation (#31240 ) * Move label validation checks - fail early * Remove some formatting changes - add back labels change wav2vec2	2024-06-05 10:50:55 +01:00
Yih-Dar	03ea160937	Benchmark GitHub Actions workflow (#31163 ) * benchmark workflow * benchmark workflow * benchmark workflow * benchmark workflow * build * build * build * build * build * build * build * build * build * build * build * build * build * build --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-05 10:39:00 +02:00
James Braza	63fb253df0	Fixing `name 'torch' is not defined` in `bitsandbytes` integration (#31243 ) Fixed torch definition error	2024-06-05 08:00:30 +02:00
Yury Sulsky	66875ac070	Specify dtype=torch.bool to avoid xla error (#31191 ) The StoppingCriteriaList allocates is_done without specifying dtype=torch.bool. On XLA this allocates a float tensor and causes a failure on the following line: is_done = is_done | criteria(input_ids, scores, **kwargs) by attempting to OR float with bool.	2024-06-05 07:50:54 +02:00
dependabot[bot]	8685b3c5d2	Bump transformers from 4.26.0 to 4.38.0 in /examples/research_projects/vqgan-clip (#31242 ) Bump transformers in /examples/research_projects/vqgan-clip Bumps [transformers](https://github.com/huggingface/transformers) from 4.26.0 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.26.0...v4.38.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-06-04 22:11:45 +01:00
Yih-Dar	3714f3f86b	Upload (daily) CI results to Hub (#31168 ) * build * build * build * build * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-04 21:20:54 +02:00
amyeroberts	99de3a844b	Move out common backbone config param validation (#31144 ) * Move out common validation * Add missing backbone config arguments	2024-06-04 18:15:37 +01:00
Younes Belkada	485d913dfb	Blip: Deprecate `BlipModel` (#31235 ) * deprecate blip * mention deprecation on docs	2024-06-04 18:29:45 +02:00
Yih-Dar	fd3238b4b0	Fix `MistralIntegrationTest` (#31231 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-04 18:04:08 +02:00
Manuel Faysse	2965b20459	add no split modules for xlmrobertaxl (#31223 )	2024-06-04 15:46:19 +01:00
Jacklanda	821b772ab9	Add new line switch before logging "*** Running {description} ***" (#31225 ) ✨ Add new line switch before logging "*** Running {description} ***". Signed-off-by: jacklanda <yonyonlau@gmail.com>	2024-06-04 13:38:17 +01:00
amyeroberts	4ba66fdb4c	Fix pipeline tests - torch imports (#31227 ) * Fix pipeline tests - torch imports * Frameowrk dependant float conversion	2024-06-04 12:30:23 +01:00
Chujie Zheng	6b22a8f2d8	fix bf16 issue in text classification pipeline (#30996 ) * fix logits dtype * Add bf16/fp16 tests for text_classification pipeline * Update test_pipelines_text_classification.py * fix * fix	2024-06-04 11:20:48 +01:00
Kristen Pereira	de460e28e1	Add dynamic resolution input/interpolate position embedding to deit (#31131 ) * Added interpolate pos encoding feature and test to deit * Added interpolate pos encoding feature and test for deit TF model * readded accidentally delted test for multi_gpu * storing only patch_size instead of entire config and removed commented code * Update modeling_tf_deit.py to remove extra line Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-06-04 10:29:01 +01:00
Raushan Turganbay	d64e4da713	Video-LLaVa: handle any number of frames (#31221 ) video-llava can handle more frames	2024-06-04 14:20:03 +05:00
Max Strobel	36ade4a32b	fix(PatchTST): Wrong dropout used for PretainHead (#31117 ) * fix(PatchTST): Wrong dropout used for PretainHead * feat(PatchTST): remove unused config.dropout --------- Co-authored-by: Strobel Maximilian (IFAG PSS SIS SCE ACM) <Maximilian.Strobel@infineon.com>	2024-06-04 10:11:36 +01:00
DomHudson	e83cf58145	Fix sentence fragment within test comments (#31218 )	2024-06-04 10:09:24 +01:00
Raushan Turganbay	83238eeebc	Pass device in Logits Processor's init (#29804 ) * add device in logits processor * remove device when not needed * codestyle * tests * forgot `melody` version * Update src/transformers/models/whisper/generation_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * codestyle * updates --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-06-04 10:19:19 +05:00
Aaron Jimenez	c73ee1333d	[docs] Spanish translation of tokenizer_summary.md (#31154 ) * add tokenizer_summary to es/_toctree.yml * add tokenizer_summary to es/ * fix link to Transformes XL in en/ * translate until Subword tokenization section * fix GPT link in en/ * fix other GPT link in en/ * fix typo in en/ * translate the doc * run make fixup * Remove .md in Transformer XL link * fix some link issues in es/ * fix typo	2024-06-03 16:52:23 -07:00
Yih-Dar	8a1a23ae4d	Fix GPU OOM for `mistral.py::Mask4DTestHard` (#31212 ) * build * build * build * build --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-03 19:25:15 +02:00
miivanov90	df5abae894	Set greater_is_better to False if metric_for_best_model ends with "loss" (#31142 ) * update to not(endswith(loss)) * ruff formatting	2024-06-03 17:52:28 +01:00
Younes Belkada	924c46d40c	Cohere: Fix copied from (#31213 ) Update modeling_cohere.py	2024-06-03 18:29:31 +02:00
Jade Choghari	98dd842339	Wrong translation FR : Contents = Contenu (#31186 ) Update index.md - Contents = Contenu French typo - Contents = Contenu	2024-06-03 17:40:14 +02:00
Qubitium	c6c78733d7	Rename sanity_evaluation to eval_on_start (#31192 ) * Rename sanity_evaluation to eval_on_start * move arg back to last	2024-06-03 16:32:21 +01:00
Bojun Feng	c230504b36	Fix typo in utils (#31169 ) fix typo	2024-06-03 17:27:53 +02:00
Sangbum Daniel Choi	874ac129bb	fix the get_size_with_aspect_ratio in max_size situation (#30902 ) * fix the get_size_with_aspect_ratio in max_size situation * make fix-up * add more general solution * consider when max_size is not defined * fix typo * fix typo * simple fix * fix error * fix if else error * fix error of size overwrite * fix yolos image processing * fix detr image processing * make * add longest related test script * Update src/transformers/models/yolos/image_processing_yolos.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more test * add test script about longest size * remove deprecated --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-06-03 16:12:08 +01:00
Isotr0py	e4628434d8	Add Qwen2 GGUF loading support (#31175 ) * add qwen2 gguf support * Update docs * fix qwen2 tokenizer * add qwen2 gguf test * fix typo in qwen2 gguf test * format code * Remove mistral, clarify the error message * format code * add typing and update docstring	2024-06-03 14:55:10 +01:00
Yih-Dar	df848acc5d	Fix `test_compile_static_cache` (#30991 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-03 15:16:28 +02:00
NielsRogge	70c8713872	🚨 [Mistral and friends] Update MLP (#31057 ) Update MLP	2024-06-03 14:57:07 +02:00
Joao Gante	d475f76745	SlidingWindowCache: reduce differences to other Cache classes (#30970 ) * tmp commit * sliding window with fewer differences * make fixup + rebase * missing overwrite	2024-06-03 14:04:24 +02:00
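
Note on commit 9e9679c022 (`str` should be used not `int` when setting env variables): the log does not show the changed code, so the following is only a minimal sketch of the general Python behaviour behind that kind of fix. The helper names `try_set_raw` and `set_env_flag` and the variable name `GR_DEMO_FLAG` are hypothetical and do not come from the transformers repository.

```python
import os


def try_set_raw(name, value):
    """Try to store value in os.environ unchanged; report whether it worked."""
    try:
        os.environ[name] = value  # os.environ only accepts str values
    except TypeError:
        return False
    return True


def set_env_flag(name, value):
    """Store value in os.environ after converting it to str."""
    os.environ[name] = str(value)


# Assigning an int directly is rejected by os.environ.
assigned_raw = try_set_raw("GR_DEMO_FLAG", 1)

# Converting to str first is accepted.
set_env_flag("GR_DEMO_FLAG", 1)
stored = os.environ["GR_DEMO_FLAG"]
```

Values read back from `os.environ` are always strings, so any consumer of such a flag has to parse it again (for example with `int(stored)`).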

16093 Commits