transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-16 02:58:23 +06:00

Author	SHA1	Message	Date
Sanchit Gandhi	c9cf337772	[Whisper Tokenizer] Skip special tokens when decoding with timestamps (#23945 )	2023-06-02 16:26:59 +02:00
Claudius Kienle	8940d315aa	Trainer: fixed evaluate raising `KeyError` for ReduceLROnPlateau (#23952 ) Trainer: fixed KeyError on evaluate for ReduceLROnPlateau Co-authored-by: Claudius Kienle <claudius.kienle@artiminds.com>	2023-06-02 08:53:48 -04:00
Kihoon Son	2fdba73a99	🌐 [i18n-KO] Translated object_detection.mdx to Korean (#23164 ) * translated object_detection.mdx Co-Authored-By: Hyeonseo Yun <0525_hhgus@naver.com> Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com> Co-Authored-By: simso <3035487+simso@users.noreply.github.com> Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> --------- Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: simso <3035487+simso@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>	2023-06-02 07:43:55 -04:00
Patrick von Platen	dcb5e18c9e	add new mms functions to doc (#23954 )	2023-06-02 11:35:52 +01:00
Shehan Munasinghe	07c54413ac	Add MobileViTv2 (#22820 ) * generated code from add-new-model-like * Add code for modeling, config, and weight conversion * add tests for image-classification, update modeling and config * add code, tests for semantic-segmentation * make style, make quality, make fix-copies * make fix-copies * Update modeling_mobilevitv2.py fix bugs * Update _toctree.yml * update modeling, config fix bugs * Edit docs - fix bug MobileViTv2v2 -> MobileViTv2 * Update mobilevitv2.mdx * update docstrings * Update configuration_mobilevitv2.py make style * Update convert_mlcvnets_to_pytorch.py remove unused options * Update convert_mlcvnets_to_pytorch.py make style * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style, make quality * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Remove MobileViTv2ImageProcessor Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style * Add suggestions from code review Rename MobileViTv2 -> MobileViTV2 Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_mobilevitv2.py make style * Update serialization.mdx * Update modeling_mobilevitv2.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-02 10:37:02 +01:00
Patrick von Platen	5dfd407b37	[MMS] Scaling Speech Technology to 1,000+ Languages | Add attention adapter to Wav2Vec2 (#23813 ) * add fine-tuned with adapter layer * Add set_target_lang to tokenizer * Implement load adapter * add tests * make style * Apply suggestions from code review * Update src/transformers/models/wav2vec2/tokenization_wav2vec2.py * make fix-copies * Apply suggestions from code review * make fix-copies * make style again * mkae style again * fix doc string * Update tests/models/wav2vec2/test_tokenization_wav2vec2.py * Apply suggestions from code review * fix * Correct wav2vec2 adapter * mkae style * Update src/transformers/models/wav2vec2/modeling_wav2vec2.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add more nice docs * finish * finish * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review * all finish --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-02 10:30:24 +01:00
wasupandceacar	f49a3453ca	Fix `ReduceLROnPlateau` object has no attribute 'get_last_lr' (#23944 ) * Fix 'ReduceLROnPlateau' object has no attribute 'get_last_lr' * fix style	2023-06-01 16:10:52 -04:00
Kashif Rasul	c62b01d0b0	use _make_causal_mask in clip/vit models (#23942 ) use _make_causal_mask in clip models	2023-06-01 16:10:24 -04:00
Marc Sun	e03a9cc0cd	Modify device_map behavior when loading a model using from_pretrained (#23922 ) * Modify device map behavior for 4/8 bits model * Remove device_map arg for training 4/8 bit model * Remove index Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add Exceptions * Modify comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix formatting * Get current device with accelerate * Revert "Get current device with accelerate" This reverts commit `46f0079910`. * Fix Exception * Modify quantization doc * Fix error Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-06-01 13:21:22 -04:00
Brendon Soong	d1fa349e78	#23675 Registering Malay language (#23689 ) * #23675 Registering Malay language * removing untranslated files * some translate * more updates to toctree * inc index * additional translations for toctree * translations of more sections * removing untranslated file * translated index.mdx to malay	2023-06-01 13:17:27 -04:00
Lysandre Debut	dc67da0182	Revert "Update stale.yml to use HuggingFaceBot" (#23943 ) Revert "Update stale.yml to use HuggingFaceBot (#23941)" This reverts commit `5929f86ebb`.	2023-06-01 11:58:11 -04:00
Matt	8088ca4185	Make TF ESM inv_freq non-trainable like PyTorch (#23940 ) Make TF inv_freq non-trainable like PyTorch	2023-06-01 16:15:00 +01:00
Lysandre Debut	5929f86ebb	Update stale.yml to use HuggingFaceBot (#23941 )	2023-06-01 10:54:50 -04:00
Adam Lewis	857d4e1c87	rename DocumentQuestionAnsweringTool parameter input to match docstring (#23939 ) rename encode input to match docstring	2023-06-01 10:54:01 -04:00
Sylvain Gugger	9193188276	Pin rhoknp (#23937 )	2023-06-01 10:25:43 -04:00
Sheon Han	af2c36793f	Fix doc string nits (#23929 )	2023-06-01 10:10:15 -04:00
fxmarty	9a35a7b9e1	Effectively allow `encoder_outputs` input to be a tuple in pix2struct (#23932 ) consistentcy	2023-06-01 09:07:57 -04:00
Sanchit Gandhi	9603ef890a	[Flax Whisper] Update decode docstring (#23908 )	2023-06-01 14:36:45 +02:00
Sylvain Gugger	fabe17a726	Skip device placement for past key values in decoder models (#23919 )	2023-05-31 15:32:21 -04:00
NielsRogge	6affd9cd7c	[PushToHub] Make it possible to upload folders (#23920 ) Add first draft	2023-05-31 15:31:28 -04:00
Sylvain Gugger	4aa13224a5	Update the update metadata job to use upload_folder (#23917 )	2023-05-31 14:10:14 -04:00
Sylvain Gugger	3ff443a6d9	Re-enable squad test (#23912 ) * Re-enable squad test * [all-test] * [all-test] Fix all test command * Fix the all-test	2023-05-31 13:44:26 -04:00
Sourab Mangrulkar	d13021e35f	remove the extra `accelerator.prepare` (#23914 ) remove the extra `accelerator.prepare` that slipped in with multiple update from main 😅	2023-05-31 23:04:55 +05:30
amyeroberts	c608b8fc93	Bug fix - flip_channel_order for channels first images (#23701 ) Bug fix - flip_channel_order for channels_first	2023-05-31 17:12:27 +01:00
Sylvain Gugger	0b3d092f63	Empty circleci config (#23913 ) * Try easy first * Add an empty job * Fix name * Fix method	2023-05-31 12:02:05 -04:00
amyeroberts	8714b964ee	Raise error if loss can't be calculated - ViT MIM (#23872 ) Raise error if loss can't be calculated	2023-05-31 17:01:53 +01:00
Hari	404d925384	add conditional statement for auxiliary loss calculation (#23899 ) * add conditional statement for auxiliary loss calculation * fix style and copies	2023-05-31 16:40:23 +01:00
Younes Belkada	c63bfc3023	[`RWKV`] Fix RWKV 4bit (#23910 ) fix RWKV 4bit	2023-05-31 17:36:56 +02:00
Zachary Mueller	55451c66ce	Upgrade safetensors version (#23911 ) * Upgrade safetensors * Second table	2023-05-31 11:30:39 -04:00
Connor Henderson	7adce8b532	fix: Replace `add_prefix_space` in `get_prompt_ids` with manual space for FastTokenizer compatibility (#23796 ) * add ' ' replacement for add_prefix_space * add fast tokenizer test	2023-05-31 10:52:35 -04:00
Zachary Mueller	84bac652f3	Move import check to before state reset (#23906 ) * Move import check to before state reset * Guard better	2023-05-31 10:49:43 -04:00
Younes Belkada	e42869b091	[`bnb`] add warning when no linear (#23894 ) * add warning for gpt2-like models * more details * adapt from suggestions	2023-05-31 16:40:07 +02:00
Sanchit Gandhi	8f915c450d	Unpin numba (#23162 ) * fix for ragged list * unpin numba * make style * np.object -> object * propagate changes to tokenizer as well * np.long -> "long" * revert tokenization changes * check with tokenization changes * list/tuple logic * catch numpy * catch else case * clean up * up * better check * trigger ci * Empty commit to trigger CI	2023-05-31 14:59:30 +01:00
Xinyu Yang	d99f11e898	ensure banned_mask and indices in same device (#23901 ) * ensure banned_mask and indices in same device * ensure banned_mask and indices in same device switch the order in which indices and banned_mask are created and create banned_mask on the proper device	2023-05-31 09:47:46 -04:00
Thomas Wang	d68d6665f9	Support shared tensors (#23871 ) * Suport shared storage * Really be sure we have the same storage * Make style * - Refactor storage identifier mechanism - Group everything into a single for loop * Make style * PR * make style * Update src/transformers/pytorch_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-31 09:42:30 -04:00
Sylvain Gugger	68d53bc717	Fix Trainer when model is loaded on a different GPU (#23792 )	2023-05-31 07:54:26 -04:00
Calico	0963a2508b	fix(configuration_llama): add `keys_to_ignore_at_inference` to `LlamaConfig` (#23891 )	2023-05-31 07:39:51 -04:00
Sylvain Gugger	00f6ba0e7e	Skip failing test for now	2023-05-31 06:31:33 -04:00
Sourab Mangrulkar	a73b1d59a3	accelerate deepspeed and gradient accumulation integrate (#23236 ) * mixed precision support via accelerate * fix issues * fix for the sharded ddp case * fix flax and tf failing tests * `refactor the place to create `Accelerator` object * move ddp prep to accelerate * fix 😅 * resolving comments * move fsdp handling to accelerate * fixex * fix saving * shift torch dynamo handling to accelerate * shift deepspeed integration and save & load utils to accelerate * fix accelerate launcher support * oops * fix 🐛 * save ckpt fix * Trigger CI * nasty 🐛 😅 * as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate * make tests happy * quality ✨ * loss tracked needs to account for grad_acc * fixing the deepspeed tests * quality ✨ * 😅😅😅 * tests 😡 * quality ✨ * Trigger CI * resolve comments and fix the issue with the previous merge from branch * Trigger CI * accelerate took over deepspeed integration --------- Co-authored-by: Stas Bekman <stas@stason.org>	2023-05-31 15:16:22 +05:30
Denisa Roberts	88f50a1e89	Add TensorFlow implementation of EfficientFormer (#22620 ) * Add tf code for efficientformer * Fix return dict bug - return last hidden state after last stage * Fix corresponding return dict bug * Override test tol * Change default values of training to False * Set training to default False X3 * Rm axis from ln * Set init in dense projection * Rm debug stuff * Make style; all tests pass. * Modify year to 2023 * Fix attention biases codes * Update the shape list logic * Add a batch norm eps config * Remove extract comments in test files * Add conditional attn and hidden states return for serving output * Change channel dim checking logic * Add exception for withteacher model in training mode * Revert layer count for now * Add layer count for conditional layer naming * Transpose for conv happens only in main layer * Make tests smaller * Make style * Update doc * Rm from_pt * Change to actual expect image class label * Remove stray print in tests * Update image processor test * Remove the old serving output logic * Make style * Make style * Complete test	2023-05-31 10:43:12 +01:00
Sylvain Gugger	9fea71b465	Fix last instances of kbit -> quantized (#23797 )	2023-05-31 11:38:20 +02:00
Sam Passaglia	38dbbc2640	Fix bug leading to missing token in GPTSanJapaneseTokenizer (#23883 ) * add \n * removed copied from header	2023-05-31 11:32:27 +02:00
Sourab Mangrulkar	03db591047	shift torch dynamo handling to accelerate (#23168 ) * mixed precision support via accelerate * fix issues * fix for the sharded ddp case * fix flax and tf failing tests * `refactor the place to create `Accelerator` object * move ddp prep to accelerate * fix 😅 * resolving comments * move fsdp handling to accelerate * fixex * fix saving * shift torch dynamo handling to accelerate	2023-05-31 14:42:07 +05:30
Sourab Mangrulkar	0b774074a5	move fsdp handling to accelerate (#23158 ) * mixed precision support via accelerate * fix issues * fix for the sharded ddp case * fix flax and tf failing tests * `refactor the place to create `Accelerator` object * move ddp prep to accelerate * fix 😅 * resolving comments * move fsdp handling to accelerate * fixex * fix saving	2023-05-31 14:10:46 +05:30
Sohyun Sim	015829e6c4	🌐 [i18n-KO] Translated `pad_truncation.mdx` to Korean (#23823 ) * docs: ko: pad_truncation.mdx * feat: manual draft * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> --------- Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>	2023-05-31 10:23:59 +02:00
Sourab Mangrulkar	1cf148a6aa	Smangrul/accelerate ddp integrate (#23151 ) * mixed precision support via accelerate * fix issues * fix for the sharded ddp case * fix flax and tf failing tests * `refactor the place to create `Accelerator` object * move ddp prep to accelerate * fix 😅 * resolving comments	2023-05-31 13:42:49 +05:30
Sourab Mangrulkar	9f0646a555	Smangrul/accelerate mp integrate (#23148 ) * mixed precision support via accelerate * fix issues * fix for the sharded ddp case * fix flax and tf failing tests * `refactor the place to create `Accelerator` object * address comments by removing debugging print statements	2023-05-31 12:27:51 +05:30
Abhinav Patil	de9255de27	Adds AutoProcessor.from_pretrained support for MCTCTProcessor (#23856 ) Adds support for AutoProcessor.from_pretrained to MCTCTProcessor models	2023-05-30 14:36:18 -04:00
George	6451ad0471	Editing issue with pickle def with lambda function (#23869 ) * Editing issue with pickle def with lambda function * fix type * Made helper function private * delete tab --------- Co-authored-by: georgebredis <9454-georgebredis@users.noreply.gitlab.aicrowd.com>	2023-05-30 13:26:37 -04:00
Arthur	af2aac51fc	[from_pretrained] imporve the error message when `_no_split_modules` is not defined (#23861 ) * Better warning * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * format line --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-30 17:12:14 +02:00

15053 Commits