transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-19 04:28:26 +06:00

Author	SHA1	Message	Date
Younes Belkada	924c46d40c	Cohere: Fix copied from (#31213 ) Update modeling_cohere.py	2024-06-03 18:29:31 +02:00
Jade Choghari	98dd842339	Wrong translation FR : Contents = Contenu (#31186 ) Update index.md - Contents = Contenu French typo - Contents = Contenu	2024-06-03 17:40:14 +02:00
Qubitium	c6c78733d7	Rename sanity_evaluation to eval_on_start (#31192 ) * Rename sanity_evaluation to eval_on_start * move arg back to last	2024-06-03 16:32:21 +01:00
Bojun Feng	c230504b36	Fix typo in utils (#31169 ) fix typo	2024-06-03 17:27:53 +02:00
Sangbum Daniel Choi	874ac129bb	fix the get_size_with_aspect_ratio in max_size situation (#30902 ) * fix the get_size_with_aspect_ratio in max_size situation * make fix-up * add more general solution * consider when max_size is not defined * fix typo * fix typo * simple fix * fix error * fix if else error * fix error of size overwrite * fix yolos image processing * fix detr image processing * make * add longest related test script * Update src/transformers/models/yolos/image_processing_yolos.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more test * add test script about longest size * remove deprecated --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-06-03 16:12:08 +01:00
Isotr0py	e4628434d8	Add Qwen2 GGUF loading support (#31175 ) * add qwen2 gguf support * Update docs * fix qwen2 tokenizer * add qwen2 gguf test * fix typo in qwen2 gguf test * format code * Remove mistral, clarify the error message * format code * add typing and update docstring	2024-06-03 14:55:10 +01:00
Yih-Dar	df848acc5d	Fix `test_compile_static_cache` (#30991 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-06-03 15:16:28 +02:00
NielsRogge	70c8713872	🚨 [Mistral and friends] Update MLP (#31057 ) Update MLP	2024-06-03 14:57:07 +02:00
Joao Gante	d475f76745	SlidingWindowCache: reduce differences to other Cache classes (#30970 ) * tmp commit * sliding window with fewer differences * make fixup + rebase * missing overwrite	2024-06-03 14:04:24 +02:00
fxmarty	221aaec6ec	Ignore non-causal mask in more cases with SDPA (#30138 ) * update non-causal mask for sdpa * add test * update docstrings * add one more test * fix cross attention bug * gentler atol/rtol	2024-06-03 19:08:41 +08:00
Pavithra Devi M	f4f696255f	Fix Cannot convert [array()] to EagerTensor of dtype int64 (#31109 ) While running the model.prepare_tf_dataset() method, it raises the error below: ``` TypeError: Cannot convert [array([322., 1.])] to EagerTensor of dtype int64 ``` This happens in the "DataCollatorForSeq2Seq" function when we try to convert the labels to tensors. While converting the labels to tensors, the labels can be in the format of a list of lists or a list of ndarrays. There is no problem converting the list-of-lists labels. There is a problem when the list of ndarrays contains float values (like below). ``` [array([322., 1.])] ``` So the exception is raised while trying to convert this label to tensors using the code below. ``` batch["labels"] = tf.constant(batch["labels"], dtype=tf.int64) ``` The labels are always integer values, so they were converted to float values in the label padding operation below. ``` batch["labels"] = [ call(label) if padding_side == "right" else np.concatenate([[self.label_pad_token_id] * (max_label_length - len(label)), label]) for label in labels ] ``` Here we have 2 cases: 1 - Concatenating an array having integer padding token value with labels. 2 - Concatenating an empty array with labels. -------- Case 1: Concatenating an array having integer padding token value with labels. WORKS AS EXPECTED: -------- ``` label = np.array([233, 1]) max_label_length = 4 label_pad_token_id = -100 np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label]) o/p: array([-100, -100, 233, 1]) ``` -------- Case 2: Concatenating an empty array with labels. GIVES THE ISSUE: This scenario can happen when the label has the maximum label length -- no padding needed. -------- ``` label = np.array([233, 1]) max_label_length = 2 label_pad_token_id = -100 np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label]) o/p: array([233., 1.]) ``` -------- Solution: -------- We need to concatenate an ndarray of dtype int with labels. AFTER FIX: -------- Case 1: ``` label = np.array([233, 1]) max_label_length = 4 label_pad_token_id = -100 np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64),label]) o/p: array([-100, -100, 233, 1]) ``` Case 2: ``` label = np.array([233, 1]) max_label_length = 2 label_pad_token_id = -100 np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64),label]) o/p: array([233, 1]) ```	2024-06-03 10:49:03 +01:00
Arthur	1749841a0e	[`GemmaModel`] fix small typo (#31202 ) * fixes * fix-copies	2024-06-03 11:02:38 +02:00
Ahmed Moubtahij	39b2ff69d6	Token healing (#30081 ) * token healing impl + trie with extensions * make fixup * prefix-robust space tokenization * examples readme and requirements * make fixup * allow input prompt and model * redundant defaults * Specialized Trie * make fixup * updated tests with new inherited Tree * input ids to auto device_map * rm unused import * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * naming convention * Revert "naming convention" This reverts commit dd39d9c5b7a969e2d8a8d2a8e54f121b82dc44f0. * naming convention * last -hopefully- changes --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-06-03 10:53:15 +02:00
amyeroberts	5b5b48b11d	Remove copied froms for deprecated models (#31153 ) * Remove copied froms for deprecated models * Remove automatically in script	2024-06-03 09:42:53 +01:00
CharlesCNorton	97e5a7072c	Fix typo: use_safetenstors to use_safetensors (#31184 ) Corrected a typo in security.md. Changed `use_safetenstors` to `use_safetensors` in the section discussing the usage of safe formats for loading models to prevent arbitrary code execution.	2024-06-03 10:33:02 +02:00
Arthur	96eb06286b	Diff converter v2 (#30868 ) * current working example! * commit regex and result file * update * nit * push the conversion file * oups * roadmap and nits * attempt diffs for 3 files * persimmon * nit * add diff file that is the same as the modeling_llama.py * fix rope nits * updates * updates with converted versions * give some breathing space to the code * delete * update * update * push the actual result * update regex patterns * update regex patterns * fix some issues * fix some issues * fix some issues * updates * updates * updates * updates * updates * revert changes done to llama * updates * update gemma * updates * oups * current state * current state * update * ouiiii * nit * clear diffs * nit * fixup * update * doc 🚀 * 🔥 * for now use gemma * deal with comments * style * handle funtions * deal with assigns * todos * process inheritage * keep decorators? * 🤗 * deal with duplicates * fixup * correctly remove duplicate code * run ruff post script * ruff deals pretty well with imports, let's leave it to him * ah maybe not lol * for now remove all imports from child. * nit * conversion of llama * okay * convert starcoder2 * synch with main * update llama diff * updates * https://docs.astral.sh/ruff/rules/redefined-while-unused/ fixes the imports, bit needs later version of ruff * updates * okay actual state * non zero exit * update! * revert unrelated * remove other diff files * updates * cleanup * update * less diff! * stash * current updates * updates * No need for call * finished fining deps * update * current changes * current state * current state * new status * nit * finally * fixes * nits * order is now expected * use logger info instead of prints * fixup * up * nit * update * nits * update * correct merge * update * update * update * add warning * update caution message * update * better merging strategy * copy class statements :wink * fixups * nits * update * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits * smaller header * do cleanup some stuff * even simpler header? * fixup * updates * ruff * update examples * nit * TODO * state * OUUUUUUF * current state * nits * final state * add a readme * fixup * remove diff llama * fix * nit * dummy noy funny * ruff format tests src utils --check * everless diffs * less diffs and fix test * fixes * naming nit? * update converter and add supper example * nits * updated for function signatures * update * update * add converted dummies * autoformat * single target assign fix * fixup * fix some imports * fixes * don't push them * `# noqa: F841` --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-31 18:37:43 +02:00
Vallepu Vamsi Krishna	372baec2e6	Added description of quantization_config (#31133 ) * Description of quantization_config Added missing description about quantization_config in replace_with_bnb_linear for better readability. * Removed trailing spaces	2024-05-31 18:23:11 +02:00
Pavel Iakubovskii	cdc813113a	Instance segmentation examples (#31084 ) * Initial setup * Metrics * Overfit on two batches * Train 40 epochs * Memory leak debugging * Trainer fine-tuning * Draft * Fixup * Trained end-to-end * Add requirements * Rewrite evaluator * nits * Add readme * Add instance-segmentation to the table * Support void masks * Remove sh * Update docs * Add pytorch test * Add accelerate test * Update examples/pytorch/instance-segmentation/README.md * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Fix consistency oneformer * Fix imports * Fix imports sort * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Add resources to docs * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove explicit model_type argument * Fix tests * Update readme * Note about other models --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-31 16:56:17 +01:00
Aymeric Roucher	9837a25481	Add streaming, various fixes (#30838 ) * Implement streaming run in ReAct agents * Allow additional imports in code agents * Python interpreter: support classes and exceptions, fixes	2024-05-31 14:16:23 +02:00
Marc Sun	f8e6ba454c	[trainer] add sanity evaluation option (#31146 ) * add sanity evaluation * fix * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-05-31 12:44:20 +02:00
Younes Belkada	fc5d3e112a	Quantization: Enhance bnb error message (#31160 ) enhance error message	2024-05-31 12:36:46 +02:00
Asif Ajrof	bd9d1ddf41	Update sam.md (#31130 ) The `mask` variable is not defined, probably a writing mistake; it should be `segmentation_map`. `segmentation_map` should be a `1`-channel image rather than `RGB`. On a different note, the `mask_url` is the same as `raw_image`; a better example could be provided.	2024-05-31 12:34:29 +02:00
Marc Sun	48cada87c3	Fix quantized cache output (#31143 )	2024-05-31 12:08:55 +02:00
Yih-Dar	d19566e852	pytest -rsfE (#31140 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-31 10:35:54 +02:00
Arthur	f3f640dce1	helper (#31152 ) * helper * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updates * more doc --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-31 08:49:33 +02:00
Younes Belkada	6bd511a45a	Workflow: Remove `IS_GITHUB_CI` (#31147 ) remove `IS_GITHUB_CI`	2024-05-30 17:21:10 +02:00
Younes Belkada	f5590deaa8	Docs / Quantization: Replace all occurences of `load_in_8bit` with bnb config (#31136 ) Replace all occurences of `load_in_8bit` with bnb config	2024-05-30 16:47:35 +02:00
zspo	cda9c82a63	fix get_scheduler when name is warmup_stable_decay (#31128 ) fix get_scheduler args	2024-05-30 15:25:43 +01:00
Younes Belkada	5e5c4d629d	FIX / Quantization: Add extra validation for bnb config (#31135 ) add validation for bnb config	2024-05-30 11:45:03 +02:00
Yih-Dar	2b9e252b16	Cleanup docker build (#31119 ) * remove * build --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-29 19:43:51 +02:00
Dhruv Pai	5c88253556	Add on_optimizer_step to callback options (#31095 ) * Modified test * Added on_optimizer_step to callbacks * Move callback after step is called * Added on optimizer step callback	2024-05-29 16:20:59 +02:00
Joao Gante	4af705c6ce	Add VLM generation default contributor (#31115 ) * add Raushan * add Raushan	2024-05-29 15:17:14 +01:00
Younes Belkada	cb879c5801	FIX / Docs: Fix GPTQ expected number of bits (#31111 ) Update overview.md	2024-05-29 15:56:28 +02:00
Yih-Dar	1f84141391	Fix nightly circleci (#31114 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-29 15:42:39 +02:00
Zach Mueller	d16053c867	Rm maintainer + migrate (#31089 )	2024-05-29 09:35:37 -04:00
Matt	0bef4a2738	Fix faulty rstrip in module loading (#31108 )	2024-05-29 13:33:26 +01:00
Matt	97a58a5d2c	Fix env.py in cases where torch is not present (#31113 ) * Fix env.py in cases where torch is not present * Simplify the fix (and avoid some issues)	2024-05-29 13:20:36 +01:00
Huazhong Ji	c8861376ad	Improve `transformers-cli env` reporting (#31003 ) * Improve `transformers-cli env` reporting * move the line `"Using GPU in script?": "<fill in>"` to in if conditional statement * same option for npu	2024-05-29 11:57:54 +01:00
Lucain	c3044ec2f3	Use `HF_HUB_OFFLINE` + fix has_file in offline mode (#31016 ) * Fix has_file in offline mode * harmonize env variable for offline mode * Switch to HF_HUB_OFFLINE * fix test * revert test_offline to test TRANSFORMERS_OFFLINE * Add new offline test * merge conflicts * docs	2024-05-29 11:55:43 +01:00
Younes Belkada	bfe6f513b9	FEAT: Add mistral v3 conversion script (#30981 ) * add mistral v3 conversion script * Update src/transformers/models/mistral/convert_mistral_weights_to_hf.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-05-29 11:43:54 +02:00
Raushan Turganbay	d521ba5797	Quantized KV cache: update quanto (#31052 ) * quanto latest version was refactored * add error msg * incorrect compare sign * Update src/transformers/cache_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-29 14:25:44 +05:00
amyeroberts	a564d10afe	Deprecate low use models (#30781 ) * Deprecate models - graphormer - time_series_transformer - xlm_prophetnet - qdqbert - nat - ernie_m - tvlt - nezha - mega - jukebox - vit_hybrid - x_clip - deta - speech_to_text_2 - efficientformer - realm - gptsan_japanese * Fix up * Fix speech2text2 imports * Make sure message isn't indented * Fix docstrings * Correctly map for deprecated models from model_type * Uncomment out * Add back time series transformer and x-clip * Import fix and fix-up * Fix up with updated ruff	2024-05-28 18:07:07 +01:00
Younes Belkada	7f08817be4	Docs / Quantization: Redirect deleted page (#31063 ) Update _redirects.yml	2024-05-28 18:29:22 +02:00
Younes Belkada	3264be4114	TST: Fix instruct-blip tests (#31088 ) * fix flan t5 tests * better format	2024-05-28 18:29:11 +02:00
Jonny Li	476890e9ae	Fix DeepSpeed compatibility with weight_norm (#30881 ) (#31018 )	2024-05-28 17:25:15 +01:00
Albert Villanova del Moral	aada568f73	Fix PretrainedConfig docstring with deprecated resume_download (#31014 )	2024-05-28 17:47:35 +02:00
Yih-Dar	3af7bf30ad	skip `test_multi_gpu_data_parallel_forward` for `vit` and `deit` (#31086 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-28 17:44:52 +02:00
Younes Belkada	ab19f907fd	FIX / OPT: Fix OPT multi-GPU training for `OPTForQuestionAnswering` (#31092 ) Update modeling_opt.py	2024-05-28 17:06:00 +02:00
Younes Belkada	94d416f018	FIX: Add `accelerate` as a hard requirement (#31090 ) add accelerate	2024-05-28 17:05:44 +02:00
Sigbjørn Skjæret	22dab246c5	Render chat template tojson filter as unicode (#31041 ) * Render chat template tojson filter as unicode * ruff--	2024-05-28 15:02:51 +01:00
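
The label-padding fix described in commit f4f696255f (#31109) comes down to NumPy dtype promotion: an empty Python list is converted to a `float64` array, so concatenating it with integer labels yields floats. The following is a minimal sketch of that behaviour, not the actual `DataCollatorForSeq2Seq` code; the function names `pad_left_before_fix` and `pad_left_after_fix` and the constant `LABEL_PAD_TOKEN_ID` are hypothetical, chosen here for illustration only.

```python
import numpy as np

# Hypothetical constant mirroring label_pad_token_id = -100 from the commit message.
LABEL_PAD_TOKEN_ID = -100


def pad_left_before_fix(label, max_label_length):
    """Left-pad with a plain Python list, as in the pre-fix expression."""
    padding = [LABEL_PAD_TOKEN_ID] * (max_label_length - len(label))
    # When no padding is needed, `padding` is [], which NumPy treats as float64.
    return np.concatenate([padding, label])


def pad_left_after_fix(label, max_label_length):
    """Left-pad with an explicitly typed int64 array, as in the post-fix expression."""
    padding = np.array(
        [LABEL_PAD_TOKEN_ID] * (max_label_length - len(label)), dtype=np.int64
    )
    # Even an empty padding array keeps dtype int64, so the result stays integer.
    return np.concatenate([padding, label])


label = np.array([233, 1], dtype=np.int64)

# Case 2 from the commit message: the label already has the maximum length.
print(pad_left_before_fix(label, 2).dtype)  # float64
print(pad_left_after_fix(label, 2).dtype)   # int64

# Case 1: padding is actually required; values are padded on the left.
print(pad_left_after_fix(label, 4).tolist())  # [-100, -100, 233, 1]
```

With the integer dtype preserved, a later conversion such as `tf.constant(batch["labels"], dtype=tf.int64)` no longer receives float arrays.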

16052 Commits