transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-02 19:21:31 +06:00

Author	SHA1	Message	Date
Daniel Bustamante Ospina	aa4a0f8ef3	Remove fast tokenization warning in Data Collators (#28213 )	2024-01-02 18:32:23 +00:00
Marco Carosi	5be46dfc09	[Whisper] Fix errors with MPS backend introduced by new code on word-level timestamps computation (#28288 ) * Update modeling_whisper.py to support MPS backend Fixed some issues with the MPS backend. First, torch.std_mean is not implemented and is not scheduled for implementation, while the separate torch.std and torch.mean are. Second, the MPS backend does not support float64, so it cannot cast from float32 to float64. Moving the double() call to where the matrix is on the CPU fixes the issue and should not change the logic. * Found another instruction in modeling_whisper.py not implemented for MPS After a load test, where I transcribed a 2-hour audio file, I hit a branch that was not fixed in the previous commit. Similar fix, where torch.std_mean is changed into torch.std and torch.mean * Update modeling_whisper.py removed trailing white spaces Removed trailing white spaces * Update modeling_whisper.py to use is_torch_mps_available() Using is_torch_mps_available() instead of capturing the NotImplemented exception * Update modeling_whisper.py sorting the import block Sorting the utils import block * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-02 16:22:28 +00:00
frankenliu	87ae2a4632	Fix bug: divide by zero in _maybe_log_save_evaluate() (#28251 ) Co-authored-by: liujizhong1 <liujizhong1@xiaomi.com>	2024-01-02 14:19:42 +00:00
hoshi-hiyouga	502a10a6f8	Fix trainer saving safetensors: metadata is None (#28219 ) * Update trainer.py * format	2024-01-02 12:58:29 +00:00
Dean Wyatte	cad9f5c6cc	Update docs around mixing hf scheduler with deepspeed optimizer (#28223 ) update docs around mixing hf scheduler with deepspeed optimizer	2024-01-02 11:48:17 +00:00
Stas Bekman	3cefac1d97	small typo (#28229 ) Update modeling_utils.py	2023-12-26 21:52:10 +01:00
Sourab Mangrulkar	3b7675b2b8	fix FA2 when using quantization (#28203 )	2023-12-26 08:36:41 +05:30
Younes Belkada	fa21ead73d	[`Awq`] Enable the possibility to skip quantization for some target modules (#27950 ) * v1 * add docstring * add tests * add awq 0.1.8 * oops * fix test	2023-12-25 11:06:56 +01:00
Younes Belkada	29e7a1e183	[`Llava`] Fix llava index errors (#28032 ) * fix llava index errors * forward contrib credits from original implementation and fix * better fix * final fixes and fix all tests * fix * fix nit * fix tests * add regression tests --------- Co-authored-by: gullalc <gullalc@users.noreply.github.com>	2023-12-22 17:47:38 +01:00
lin yudong	68fa1e855b	update the logger message with accordant weights_file_name (#28181 ) Co-authored-by: yudong.lin <yudong.lin@funplus.com>	2023-12-22 15:05:10 +00:00
Anindyadeep	74d9d0cebb	Fixing visualization code for object detection to support both types of bounding box. (#27842 ) * fix: minor enhancement and fix in bounding box visualization example The example that visualizes the bounding box did not consider an edge case where the bounding box can be un-normalized. So with the same code, we cannot get results for a different dataset with un-normalized bounding boxes. This commit fixes that. * run make clean * add an additional note on the scenarios where the box viz code works --------- Co-authored-by: Anindyadeep <anindya@pop-os.localdomain>	2023-12-22 13:24:40 +00:00
Yoach Lacombe	5da3db3fd5	[Whisper] Fix word-level timestamps with bs>1 or num_beams>1 (#28114 ) * fix frames * use smaller chunk length * correct beam search + tentative stride * fix whisper word timestamp in batch * add test batch generation with return token timestamps * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * clean a test * make style + correct typo * write clearer comments * explain test in comment --------- Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-12-22 12:43:11 +00:00
Yih-Dar	c4df7c1668	Drop `feature_extractor_type` when loading an image processor file (#28195 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-22 13:19:04 +01:00
Yih-Dar	bb3bd44739	Fix the check of models supporting FA/SDPA not run (#28202 ) * add check_support_list.py * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-22 12:56:11 +01:00
Michael Feil	e37ab52dff	Bug: `training_args.py` fix missing import with accelerate with version `accelerate==0.20.1` (#28171 ) * fix-accelerate-version * updated with exported ACCELERATE_MIN_VERSION, * update string in ACCELERATE_MIN_VERSION	2023-12-22 11:41:35 +00:00
NielsRogge	c9fb250a25	Add Swinv2 backbone (#27742 ) * First draft * More improvements * More improvements * Make all tests pass * Remove script * Update image processor * Address comments * Use new gradient checkpointing method * Convert checkpoints, add integration test * Do not keep aspect ratio for now * Set keep_aspect_ratio=False for beit, add integration test * Remove print statement	2023-12-22 11:12:56 +00:00
Nicholas Neo	1ef86c4f56	Fix: [SeamlessM4T - S2TT] Bug in batch loading of audio in torch.Tensor format in the SeamlessM4TFeatureExtractor class (#27914 ) * fixes: code fixes on is_batched condition to also check for batched audio data in torch.Tensor format instead of only just checking for batched audio data in np.ndarray format * Update src/transformers/models/seamless_m4t/feature_extraction_seamless_m4t.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * refactor: code refactoring to remove torch framework dependency * docs: updated docstring to add torch tensor compatibility * test: add test cases to incorporate torch tensor inputs * test: ran make fix-copies for code conformity * test: refactor test to separate the test_call into test_call_numpy and test_call_torch --------- Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>	2023-12-22 10:47:30 +00:00
Dean Wyatte	548a8f6119	Fix ONNX export for causal LM sequence classifiers by removing reverse indexing (#28144 ) * normalize reverse indexing for causal lm sequence classifiers * normalize reverse indexing for causal lm sequence classifiers * normalize reverse indexing for causal lm sequence classifiers * use modulo instead * unify modulo-based sequence lengths	2023-12-22 10:33:44 +00:00
Yih-Dar	71f460578d	Update `docs/source/en/perf_infer_gpu_one.md` (#28198 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-22 10:40:22 +01:00
Younes Belkada	3a8769f6a9	[`Docs`] Add 4-bit serialization docs (#28182 ) * add 4-bit serialization docs * up * up	2023-12-22 10:18:32 +01:00
amyeroberts	3657748b4d	Update YOLOS slow test values (#28187 ) Update test values	2023-12-21 18:17:07 +00:00
amyeroberts	cd1350ce9b	Fix slow backbone tests - out_indices must match stage name ordering (#28186 ) Indices must match stage name ordering	2023-12-21 18:16:50 +00:00
Matt	260b9d2179	Even more TF test fixes (#28146 ) * Fix vision text dual encoder * Small cleanup for wav2vec2 (not fixed yet) * Small fix for vision_encoder_decoder * Fix SAM builds * Update TFBertTokenizer test with modern exporting + tokenizer * Fix DeBERTa * Fix DeBERTav2 * Try RAG fix but it's impossible to test locally * Actually fix RAG now that I got FAISS working somehow * Fix Wav2Vec2, add sermon * Fix Hubert	2023-12-21 15:14:46 +00:00
Arthur	f9a98c476c	[`Mixtral` & `Mistral`] Add support for sdpa (#28133 ) * some nits * update test * add support for sdpa * remove some dummy inputs * all good * style * nits * fixes * fix more copies * nits * styling * fix * Update src/transformers/models/mistral/modeling_mistral.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add a slow test just to be sure * fixup --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-12-21 12:38:22 +01:00
Sanchit Gandhi	814619f54f	[Whisper] Use torch for stft if available (#26119 ) * [Whisper] Use torch for stft if available * update docstring * mock patch decorator * fit on one line	2023-12-21 11:04:05 +00:00
Joao Gante	7e93ce40c5	Fix `input_embeds` docstring in encoder-decoder architectures (#28168 )	2023-12-21 11:01:54 +00:00
Poedator	4f7806ef7e	[bnb] Let's make serialization of 4bit models possible (#26037 ) * updated bitsandbytes.py * rm test_raise_* from test_4bit.py * add test_4bit_serialization.py * modeling_utils bulk edits * bnb_ver 0.41.3 in integrations/bitsandbytes.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * @slow reinstated Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * bnb ver 0.41.3 in src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * rm bnb version todo in integrations/bitsandbytes.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * moved 4b serialization tests to test_4bit * tests upd for opt * to torch_device Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * ruff fixes to tests * rm redundant bnb version check in mod_utils Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * restore _hf_peft_config_loaded modeling_utils.py::2188 Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * restore _hf_peft_config_loaded test in modeling_utils.py::2199 Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fixed NOT getattr(self, "is_8bit_serializable") Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * setting model.is_4bit_serializable * rm separate fp16_statistics arg from set_module... * rm else branch in integrations::bnb::set_module * bnb 4bit dtype check * upd comment on 4bit weights * upd tests for FP4 safe --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-12-21 11:54:44 +01:00
Dean Wyatte	e268d7e5dc	disable test_retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest (#28169 ) disable retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest	2023-12-21 08:39:44 +01:00
amyeroberts	1d77735947	Fix yolos resizing (#27663 ) * Fix yolos resizing * Update tests * Add a test	2023-12-20 20:55:51 +00:00
Joao Gante	45b70384a7	Generate: fix speculative decoding (#28166 ) Co-authored-by: Merve Noyan <merveenoyan@gmail.com>	2023-12-20 18:55:35 +00:00
Steven Liu	01c081d138	[docs] Trainer docs (#28145 ) * fsdp, debugging, gpu selection * fix hfoption * fix	2023-12-20 10:37:23 -08:00
amyeroberts	ee298a16a2	Align backbone stage selection with out_indices & out_features (#27606 ) * Iterate over out_features instead of stage_names * Update for all backbones * Add tests * Fix * Align timm backbone behaviour with other backbones * Fix tests * Stricter checks on set out_features and out_indices * Revert back stage selection logic * Remove out-of-order logic * Document restriction in docstrings	2023-12-20 18:33:17 +00:00
amyeroberts	224ab70969	Update FA2 exception msg to point to hub discussions (#28161 ) * Update FA2 exception msg to point to hub discussions * Use path for hub url	2023-12-20 16:52:16 +00:00
Yih-Dar	9924df9eb2	Avoid unnecessary warnings when loading `CLIPConfig` (#28108 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-20 17:24:53 +01:00
Yih-Dar	7938c8c836	Fix weights not properly initialized due to shape mismatch (#28122 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-20 14:20:02 +01:00
peter-sk	769a9542de	move code to Trainer.evaluate to enable use of that function with multiple datasets (#27844 ) * move code to Trainer.evaluate to enable use of that function with multiple datasets * test * update doc string * and a tip * forgot the type --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-12-20 10:55:56 +01:00
Jong-hun Shin	cd9f9d63f1	[gpt-neox] Add attention_bias config to support model trained without attention biases (#28126 ) * add attention_bias hparam for a model trained without attention biases * fix argument documentation error	2023-12-20 10:05:32 +01:00
Sourab Mangrulkar	def581ef51	Fix FA2 integration (#28142 ) * fix fa2 * fix FA2 for popular models * improve warning and add Younes as co-author Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix the warning * Add Tip * typo fix * nit --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-12-20 14:25:07 +05:30
Abolfazl Shahbazi	b134f6857e	Remove deprecated CPU dockerfiles (#28149 ) Signed-off-by: Abolfazl Shahbazi <abolfazl.shahbazi@intel.com>	2023-12-20 05:51:35 +01:00
Aaron Jimenez	38611086d2	[docs] Fix mistral link in mixtral.md (#28143 ) Fix mistral link in mixtral.md	2023-12-19 10:34:14 -08:00
Mike Zellinger	23f8e4db77	Update modeling_utils.py (#28127 ) In docstring for PreTrainedModel.resize_token_embeddings, correct definition of new_num_tokens parameter to read "the new number of tokens" (meaning the new size of the vocab) rather than "the number of new tokens" (number of newly added tokens only).	2023-12-19 09:07:57 -08:00
Arthur	4a04b4ccca	[`Mixtral`] Fix loss + nits (#28115 ) * default config should not use sliding window * update the doc * nits * add a proper test * update * update * update expected value * Update src/transformers/tokenization_utils_fast.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * convert to float * average then N*2 comment * revert nit * good to go * fixup * Update tests/models/mixtral/test_modeling_mixtral.py Co-authored-by: Lysandre Debut <hi@lysand.re> * revert unrelated change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Lysandre Debut <hi@lysand.re>	2023-12-19 17:31:54 +01:00
Joao Gante	ac974199c8	Generate: speculative decoding (#27979 ) * speculative decoding * fix test * space * better comments * remove redundant test * test nit * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * PR comments --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-12-19 13:58:30 +00:00
amyeroberts	bd7a356135	Update split string in doctest to reflect #28087 (#28135 )	2023-12-19 13:55:09 +00:00
qihqi	5aec50ecaf	When saving a model on TPU, make a copy to be moved to CPU (#27993 ) * When saving a model, make a copy to be moved to CPU, don't move the original model * make deepcopy inside of _save_tpu * Move to tpu without copy	2023-12-19 10:08:51 +00:00
Aaron Jimenez	4edffda636	[Doc] Fix token link in What 🤗 Transformers can do (#28123 ) Fix token link	2023-12-18 15:06:54 -08:00
Mike Salvatore	c52b515e94	Fix a typo in tokenizer documentation (#28118 )	2023-12-18 19:44:35 +01:00
Steven Liu	a52e180a0f	[docs] General doc fixes (#28087 ) * doc fix friday * deprecated objects * update not_doctested * update toctree	2023-12-18 10:44:09 -08:00
Rockerz	08a6e7a702	Fix indentation error - semantic_segmentation.md (#28117 ) Update semantic_segmentation.md	2023-12-18 12:47:54 -05:00
Matt	71d47f0ad4	More TF fixes (#28081 ) * More build_in_name_scope() * Make sure we set the save spec now we don't do it with dummies anymore * make fixup	2023-12-18 15:26:03 +00:00

14821 Commits