transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Lucain	9ef93fccad	Switch from `cached_download` to `hf_hub_download` in remaining occurrences (#31284 ) Switch from hf_hub_url to hf_hub_download in remaining occurrences	2024-06-06 12:05:59 +01:00
Raushan Turganbay	5fabd1e83b	Generation: fix handling of special tokens (#31254 ) * fix special tokens in generation * fix test * add warning * fix the check * warn once * fix	2024-06-06 15:21:32 +05:00
Raushan Turganbay	7729b77478	Make mamba use cache (#31116 ) * make mamba use cache * use cache naming as in mamba * fix musicgen	2024-06-06 13:37:29 +05:00
Zhiyuan Chen	f5c0fa9f6f	fix loading special_tokens_map_file (#31012 )	2024-06-06 09:15:27 +02:00
Ranggi Hwang	9b85e405ab	[`SwitchTransformer`] Significant performance improvement on MoE blocks (#31173 ) * SwitchTransformer MoE layer performance improvement * make fixup * comments about shapes * make fixup	2024-06-06 09:10:12 +02:00
graham	8177aa0e1a	no need for explicit EXTRA_TOKENS in processing_paligemma.py (#31022 ) no need for explicit EXTRA_TOKENS	2024-06-06 08:41:41 +02:00
amyeroberts	940fde8daf	Skip failing JetMOE generation tests (#31266 ) Skip failing tests for now	2024-06-05 19:06:46 +01:00
Cyril Vallez	bd5091df8d	Reduce by 2 the memory requirement in `generate()` 🔥🔥🔥 (#30536 ) * Fix contrastive_search for new cache structure, and improve performance by removing inefficient torch.stack(torch.split(x, top_k, dim=0)) * Fix _contrastive_search for non-standard cache using ellipsis slicing * Fix all outputs.logits memory leaks for all decoding strategies! * Fix small error in _contrastive_search() * Make all necessary change and revert for the new class * Apply coding style * Remove pipes in type hints for compatibility * correct type hint * apply style * Use DynamicCache by default and solve conflicts * Fix rebase issues * Add `_supports_dynamic_cache_class` in models for models that support DynamicCache but not other caches to make DynamicCache the default for more models * Create generation config to return legacy format by default, or to choose not to * style * Fix case when use_cache is False * Remove default DynamicCache in assisted_decoding if assistant_model does not support it + fix _seen_tokens when cropping cache * Update prepare_inputs_for_generation() for case with empty DynamicCache * Correct return of args in _assisted_decoding * Remove EfficientDynamicCache as it is no longer needed * Correct mistake in generation config * Move cache logic of assisted decoding to AssistedCandidateGenerator.__init__ * change DynamicCache function names from "split" to "batch_split" for readability + apply coding style * Remove `_supports_dynamic_cache_class` attribute after rebase * Correct missing line lost in conflict resolution during rebasing * Add special case for Jamba * Fix jamba test * Coding style * coding style * Correct missing import in rebasing * Simplify _validate_model_kwargs based on removal of _supports_dynamic_cache attribute * Simplify code paths in _contrastive_search * coding style * Update docstrings of cache methods * Update prepare_inputs_for_generation() -> past_key_values are always Cache objects	2024-06-05 17:05:01 +02:00
Yih-Dar	d6276f0fc5	Add condition to `benchmark` job in `push-important-models.yml` (#31259 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-05 15:19:16 +02:00
Dhaivat Bhatt	b72752f068	Fix circular reference issue in CLIPTokenizerFast (#31075 )	2024-06-05 14:01:13 +02:00
bastrob	464d986b6c	Add missing Flaubert tokenizer tests (#30492 ) * add flaubert tokenization test, enrich inheritance in FlaubertTokenizer. * fix quality code ci * ensure parameter consistency * fix ci * fix copyright year and flatten vocab list. * fix style	2024-06-05 13:52:16 +02:00
Huazhong Ji	41cf4097f7	enable deterministic mode for npu (#31253 )	2024-06-05 07:35:35 -04:00
Vaibhav Srivastav	4a6024921f	doc: add info about wav2vec2 bert in older wav2vec2 models. (#31120 ) * doc: add info about wav2vec2 bert in older wav2vec2 models. * apply suggestions from review. * forward contrib credits from review --------- Co-authored-by: Sanchit Gandhi <sanchit-gandhi@users.noreply.github.com>	2024-06-05 11:56:11 +01:00
dependabot[bot]	c39aaea972	Bump transformers from 3.5.1 to 4.38.0 in /examples/research_projects/deebert (#31244 ) Bump transformers in /examples/research_projects/deebert Bumps [transformers](https://github.com/huggingface/transformers) from 3.5.1 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v3.5.1...v4.38.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-06-05 11:12:58 +01:00
amyeroberts	54659048a2	Early labels validation (#31240 ) * Move label validation checks - fail early * Remove some formatting changes - add back labels change wav2vec2	2024-06-05 10:50:55 +01:00
Yih-Dar	03ea160937	Benchmark GitHub Actions workflow (#31163 ) * benchmark workflow * benchmark workflow * benchmark workflow * benchmark workflow * build * build * build * build * build * build * build * build * build * build * build * build * build * build --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-05 10:39:00 +02:00
James Braza	63fb253df0	Fixing `name 'torch' is not defined` in `bitsandbytes` integration (#31243 ) Fixed torch definition error	2024-06-05 08:00:30 +02:00
Yury Sulsky	66875ac070	Specify dtype=torch.bool to avoid xla error (#31191 ) The StoppingCriteriaList allocates is_done without specifying dtype=torch.bool. On XLA this allocates a float tensor and causes a failure on the following line: is_done = is_done \| criteria(input_ids, scores, **kwargs) by attempting to OR float with bool.	2024-06-05 07:50:54 +02:00
dependabot[bot]	8685b3c5d2	Bump transformers from 4.26.0 to 4.38.0 in /examples/research_projects/vqgan-clip (#31242 ) Bump transformers in /examples/research_projects/vqgan-clip Bumps [transformers](https://github.com/huggingface/transformers) from 4.26.0 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.26.0...v4.38.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-06-04 22:11:45 +01:00
Yih-Dar	3714f3f86b	Upload (daily) CI results to Hub (#31168 ) * build * build * build * build * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-04 21:20:54 +02:00
amyeroberts	99de3a844b	Move out common backbone config param validation (#31144 ) * Move out common validation * Add missing backbone config arguments	2024-06-04 18:15:37 +01:00
Younes Belkada	485d913dfb	Blip: Deprecate `BlipModel` (#31235 ) * deprecate blip * mention deprecation on docs	2024-06-04 18:29:45 +02:00
Yih-Dar	fd3238b4b0	Fix `MistralIntegrationTest` (#31231 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-04 18:04:08 +02:00
Manuel Faysse	2965b20459	add no split modules for xlmrobertaxl (#31223 )	2024-06-04 15:46:19 +01:00
Jacklanda	821b772ab9	Add new line switch before logging *** Running {description} * (#31225 ) ✨ Add new line switch before logging "* Running {description} ***". Signed-off-by: jacklanda <yonyonlau@gmail.com>	2024-06-04 13:38:17 +01:00
amyeroberts	4ba66fdb4c	Fix pipeline tests - torch imports (#31227 ) * Fix pipeline tests - torch imports * Framework dependent float conversion	2024-06-04 12:30:23 +01:00
Chujie Zheng	6b22a8f2d8	fix bf16 issue in text classification pipeline (#30996 ) * fix logits dtype * Add bf16/fp16 tests for text_classification pipeline * Update test_pipelines_text_classification.py * fix * fix	2024-06-04 11:20:48 +01:00
Kristen Pereira	de460e28e1	Add dynamic resolution input/interpolate position embedding to deit (#31131 ) * Added interpolate pos encoding feature and test to deit * Added interpolate pos encoding feature and test for deit TF model * re-added accidentally deleted test for multi_gpu * storing only patch_size instead of entire config and removed commented code * Update modeling_tf_deit.py to remove extra line Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-06-04 10:29:01 +01:00
Raushan Turganbay	d64e4da713	Video-LLaVa: handle any number of frames (#31221 ) video-llava can handle more frames	2024-06-04 14:20:03 +05:00
Max Strobel	36ade4a32b	fix(PatchTST): Wrong dropout used for PretrainHead (#31117 ) * fix(PatchTST): Wrong dropout used for PretrainHead * feat(PatchTST): remove unused config.dropout --------- Co-authored-by: Strobel Maximilian (IFAG PSS SIS SCE ACM) <Maximilian.Strobel@infineon.com>	2024-06-04 10:11:36 +01:00
DomHudson	e83cf58145	Fix sentence fragment within test comments (#31218 )	2024-06-04 10:09:24 +01:00
Raushan Turganbay	83238eeebc	Pass device in Logits Processor's init (#29804 ) * add device in logits processor * remove device when not needed * codestyle * tests * forgot `melody` version * Update src/transformers/models/whisper/generation_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * codestyle * updates --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-06-04 10:19:19 +05:00
Aaron Jimenez	c73ee1333d	[docs] Spanish translation of tokenizer_summary.md (#31154 ) * add tokenizer_summary to es/_toctree.yml * add tokenizer_summary to es/ * fix link to Transformer XL in en/ * translate until Subword tokenization section * fix GPT link in en/ * fix other GPT link in en/ * fix typo in en/ * translate the doc * run make fixup * Remove .md in Transformer XL link * fix some link issues in es/ * fix typo	2024-06-03 16:52:23 -07:00
Yih-Dar	8a1a23ae4d	Fix GPU OOM for `mistral.py::Mask4DTestHard` (#31212 ) * build * build * build * build --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-03 19:25:15 +02:00
miivanov90	df5abae894	Set greater_is_better to False if metric_for_best_model ends with "loss" (#31142 ) * update to not(endswith(loss)) * ruff formatting	2024-06-03 17:52:28 +01:00
Younes Belkada	924c46d40c	Cohere: Fix copied from (#31213 ) Update modeling_cohere.py	2024-06-03 18:29:31 +02:00
Jade Choghari	98dd842339	Wrong translation FR : Contents = Contenu (#31186 ) Update index.md - Contents = Contenu French typo - Contents = Contenu	2024-06-03 17:40:14 +02:00
Qubitium	c6c78733d7	Rename sanity_evaluation to eval_on_start (#31192 ) * Rename sanity_evaluation to eval_on_start * move arg back to last	2024-06-03 16:32:21 +01:00
Bojun Feng	c230504b36	Fix typo in utils (#31169 ) fix typo	2024-06-03 17:27:53 +02:00
Sangbum Daniel Choi	874ac129bb	fix the get_size_with_aspect_ratio in max_size situation (#30902 ) * fix the get_size_with_aspect_ratio in max_size situation * make fix-up * add more general solution * consider when max_size is not defined * fix typo * fix typo * simple fix * fix error * fix if else error * fix error of size overwrite * fix yolos image processing * fix detr image processing * make * add longest related test script * Update src/transformers/models/yolos/image_processing_yolos.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more test * add test script about longest size * remove deprecated --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-06-03 16:12:08 +01:00
Isotr0py	e4628434d8	Add Qwen2 GGUF loading support (#31175 ) * add qwen2 gguf support * Update docs * fix qwen2 tokenizer * add qwen2 gguf test * fix typo in qwen2 gguf test * format code * Remove mistral, clarify the error message * format code * add typing and update docstring	2024-06-03 14:55:10 +01:00
Yih-Dar	df848acc5d	Fix `test_compile_static_cache` (#30991 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-03 15:16:28 +02:00
NielsRogge	70c8713872	🚨 [Mistral and friends] Update MLP (#31057 ) Update MLP	2024-06-03 14:57:07 +02:00
Joao Gante	d475f76745	SlidingWindowCache: reduce differences to other Cache classes (#30970 ) * tmp commit * sliding window with fewer differences * make fixup + rebase * missing overwrite	2024-06-03 14:04:24 +02:00
fxmarty	221aaec6ec	Ignore non-causal mask in more cases with SDPA (#30138 ) * update non-causal mask for sdpa * add test * update docstrings * add one more test * fix cross attention bug * gentler atol/rtol	2024-06-03 19:08:41 +08:00
Pavithra Devi M	f4f696255f	Fix Cannot convert [array()] to EagerTensor of dtype int64 (#31109 ) While running the model.prepare_tf_dataset() method, it raises the error below: ``` TypeError: Cannot convert [array([322., 1.])] to EagerTensor of dtype int64 ``` This happens in the "DataCollatorForSeq2Seq" function when we try to convert the labels to tensors. While converting the labels to tensors, the labels can be in the format of a list of lists or a list of ndarrays. There is no problem converting the list-of-lists labels. There is a problem when the list of ndarrays holds float values (like below). ``` [array([322., 1.])] ``` So the exception is raised while trying to convert this label to tensors using the code below. ``` batch["labels"] = tf.constant(batch["labels"], dtype=tf.int64) ``` The labels are always integer values, so they got converted to float values in the label padding operation below. ``` batch["labels"] = [ call(label) if padding_side == "right" else np.concatenate([[self.label_pad_token_id] * (max_label_length - len(label)), label]) for label in labels ] ``` Here we have 2 cases: 1 - Concatenating an array having integer padding token value with labels. 2 - Concatenating an empty array with labels. ---------------------------------------------------------------------------------------- Case 1: Concatenating an array having integer padding token value with labels. WORKS AS EXPECTED: ---------------------------------------------------------------------------------------- ``` label = np.array([233, 1]) max_label_length = 4 label_pad_token_id = -100 np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label]) o/p: array([-100, -100, 233, 1]) ``` ---------------------------------------------------------------------------------------- Case 2: Concatenating an empty array with labels. GIVES THE ISSUE: This scenario can happen when the label has the maximum label length -- No padding needed. ---------------------------------------------------------------------------------------- ``` label = np.array([233, 1]) max_label_length = 2 label_pad_token_id = -100 np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label]) o/p: array([233., 1.]) ``` ---------------------------------------------------------------------------------------- Solution: ---------------------------------------------------------------------------------------- We need to concatenate an ndarray of dtype int with labels. AFTER FIX: ---------- Case 1: ``` label = np.array([233, 1]) max_label_length = 4 label_pad_token_id = -100 np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64),label]) o/p: array([-100, -100, 233, 1]) ``` Case 2: ``` label = np.array([233, 1]) max_label_length = 2 label_pad_token_id = -100 np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64),label]) o/p: array([233, 1]) ```	2024-06-03 10:49:03 +01:00
Arthur	1749841a0e	[`GemmaModel`] fix small typo (#31202 ) * fixes * fix-copies	2024-06-03 11:02:38 +02:00
Ahmed Moubtahij	39b2ff69d6	Token healing (#30081 ) * token healing impl + trie with extensions * make fixup * prefix-robust space tokenization * examples readme and requirements * make fixup * allow input prompt and model * redundant defaults * Specialized Trie * make fixup * updated tests with new inherited Tree * input ids to auto device_map * rm unused import * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * naming convention * Revert "naming convention" This reverts commit dd39d9c5b7a969e2d8a8d2a8e54f121b82dc44f0. * naming convention * last -hopefully- changes --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-06-03 10:53:15 +02:00
amyeroberts	5b5b48b11d	Remove copied froms for deprecated models (#31153 ) * Remove copied froms for deprecated models * Remove automatically in script	2024-06-03 09:42:53 +01:00
CharlesCNorton	97e5a7072c	Fix typo: use_safetenstors to use_safetensors (#31184 ) Corrected a typo in security.md. Changed `use_safetenstors` to `use_safetensors` in the section discussing the usage of safe formats for loading models to prevent arbitrary code execution.	2024-06-03 10:33:02 +02:00
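
The dtype promotion described in commit f4f696255f can be reproduced in a few lines. This is a minimal sketch rather than the code from the pull request: the helper names `pad_left_buggy` and `pad_left_fixed` are hypothetical, and only the `np.concatenate` / `np.array` calls are taken from the commit message.

```python
import numpy as np

LABEL_PAD_TOKEN_ID = -100  # pad value used in the commit message's examples


def pad_left_buggy(label, max_label_length):
    # Hypothetical helper: pads with a plain Python list. When no padding is
    # needed the list is empty, NumPy converts it to a float64 array, and the
    # concatenated result is promoted to float.
    padding = [LABEL_PAD_TOKEN_ID] * (max_label_length - len(label))
    return np.concatenate([padding, label])


def pad_left_fixed(label, max_label_length):
    # Hypothetical helper: builds the padding as an explicit int64 array, so
    # even an empty padding keeps the result an integer array.
    padding = np.array(
        [LABEL_PAD_TOKEN_ID] * (max_label_length - len(label)), dtype=np.int64
    )
    return np.concatenate([padding, label])


label = np.array([233, 1])
no_pad_buggy = pad_left_buggy(label, 2)  # label already at maximum length
no_pad_fixed = pad_left_fixed(label, 2)
padded_fixed = pad_left_fixed(label, 4)  # two pad tokens prepended
```

Checking `no_pad_buggy.dtype` against `no_pad_fixed.dtype` shows the difference the commit addresses: only the first is a float array.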

16087 Commits
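
Commit df5abae894 describes defaulting `greater_is_better` from the metric name. The rule can be sketched as below, assuming it applies only when the caller has not set the value explicitly; `default_greater_is_better` is a hypothetical helper written for illustration, not the library's actual code.

```python
from typing import Optional


def default_greater_is_better(
    metric_for_best_model: str, greater_is_better: Optional[bool] = None
) -> bool:
    # Hypothetical helper: an explicit user choice always wins (assumption).
    if greater_is_better is not None:
        return greater_is_better
    # Per the commit title: metrics whose name ends with "loss" are
    # better when lower; everything else is treated as better when higher.
    return not metric_for_best_model.endswith("loss")
```

With this rule, a metric named `eval_loss` defaults to lower-is-better, while a name such as `accuracy` defaults to higher-is-better.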