The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change makes the visualization actually vertical.
fix typo:
from:
`model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")`
to:
`model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")`
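For context, a minimal runnable sketch of the corrected call (assumes TensorFlow is installed; the checkpoint name is the one from the doc):

```python
from transformers import TFAutoModelForQuestionAnswering

# The auto class must be instantiated via `from_pretrained`,
# not called directly with a checkpoint name.
model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```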
* update doc
* revert
* typo fix
* refine
* add dtypes
* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* no comma
* use avx512-vnni
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
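To illustrate the CPU-training setup the updated doc describes, a hedged sketch (assumes `intel_extension_for_pytorch` is installed and a CPU with AVX512-VNNI support; flag names follow `TrainingArguments`):

```python
from transformers import TrainingArguments

# Sketch: bf16 mixed precision plus IPEX optimizations on a CPU with
# AVX512-VNNI; assumes intel_extension_for_pytorch is installed.
training_args = TrainingArguments(
    output_dir="./outputs",
    use_cpu=True,    # train on CPU only
    use_ipex=True,   # enable Intel Extension for PyTorch
    bf16=True,       # bfloat16 autocast on supported hardware
)
```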
* Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None'
* Fixed the ruff formatting issue
* fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None'
* Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py
* test fail update
* fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modeling_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py
* Removed the myvenv file
* Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py
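For illustration, the shape of the change (a hypothetical output dataclass, not one of the edited files):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import torch

@dataclass
class ExampleModelOutput:
    # `...` marks a variable-length homogeneous tuple: one tensor per layer.
    hidden_states: Optional[Tuple[torch.FloatTensor, ...]] = None
    attentions: Optional[Tuple[torch.FloatTensor, ...]] = None
```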
* fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask
* format code using black and ruff
* skip computing mask if attention_mask=None
* add tests for load balancing loss Mixtral-Moe
* fix assert loss is different in mixtral_test
* fix pad_len
* use assertNotAlmostEqual and print to debug
* remove print for debug
* minor updates
* reduce rtol and atol
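A simplified sketch of the attention-mask handling described above (illustrative only, not the exact `load_balancing_loss_func` from modeling_mixtral.py): padding positions are excluded from both the expert-assignment counts and the router probabilities.

```python
import torch
import torch.nn.functional as F

def load_balancing_loss_sketch(router_logits, num_experts, top_k, attention_mask=None):
    # router_logits: (batch_size * seq_len, num_experts)
    routing_weights = F.softmax(router_logits, dim=-1)
    _, selected_experts = torch.topk(routing_weights, top_k, dim=-1)
    # One-hot over experts for each top_k choice: (N, top_k, num_experts)
    expert_mask = F.one_hot(selected_experts, num_experts).float()

    if attention_mask is None:
        tokens_per_expert = expert_mask.mean(dim=0)           # (top_k, num_experts)
        router_prob_per_expert = routing_weights.mean(dim=0)  # (num_experts,)
    else:
        # Exclude padding tokens from both statistics.
        pad = attention_mask.reshape(-1).float()              # (N,)
        tokens_per_expert = (expert_mask * pad[:, None, None]).sum(dim=0) / pad.sum()
        router_prob_per_expert = (routing_weights * pad[:, None]).sum(dim=0) / pad.sum()

    return (tokens_per_expert * router_prob_per_expert[None, :]).sum() * num_experts
```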
* fix a hidden bug of GenerationConfig
* keep `sort_keys=True` to maintain visibility
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update configuration_utils.py
if `obj` is a list, also check the items in the list
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
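A hedged sketch of the recursive check described above (a hypothetical standalone helper, not the verbatim `configuration_utils.py` code):

```python
def convert_keys_to_string(obj):
    # Stringify dict keys so json.dumps(..., sort_keys=True) never has to
    # compare keys of mixed types; with the fix, list items are checked too.
    if isinstance(obj, dict):
        return {str(key): convert_keys_to_string(value) for key, value in obj.items()}
    elif isinstance(obj, list):
        return [convert_keys_to_string(item) for item in obj]
    return obj
```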
* add dataloader prefetch factor in training args and trainer
* remove trailing spaces
* prevent setting `dataloader_prefetch_factor` when `dataloader_num_workers == 0`
dataloader_prefetch_factor only works when data is loaded in a different process than the main one. This commit adds the necessary checks to avoid having prefetch_factor set when there is no such worker process (see the sketch below).
* Remove whitespaces in empty line
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
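A hedged sketch of the check (a hypothetical standalone version of the validation added to `training_args.py`):

```python
def validate_prefetch_factor(dataloader_num_workers: int, dataloader_prefetch_factor) -> None:
    # prefetch_factor is only meaningful when worker processes exist,
    # mirroring the same constraint in torch.utils.data.DataLoader.
    if dataloader_num_workers == 0 and dataloader_prefetch_factor is not None:
        raise ValueError(
            "dataloader_prefetch_factor can only be set when the data is loaded "
            "in a different process, i.e. when dataloader_num_workers > 0."
        )
```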
* Enable instantiating model with pretrained backbone weights
* Update tests so backbone checkpoint isn't passed in
* Remove doc updates until changes made in modeling code
* Clarify pretrained import
* Update configs - docs and validation check
* Update src/transformers/utils/backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Clarify exception message
* Update config init in tests
* Add test for when use_timm_backbone=True
* Small test updates
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
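A hedged usage sketch (DETR chosen as an example model; parameter names as referenced in this PR, though the exact set of supported models may vary):

```python
from transformers import DetrConfig, DetrForObjectDetection

# Sketch: build a model whose backbone is instantiated with pretrained
# weights rather than from a separately passed checkpoint.
config = DetrConfig(
    use_timm_backbone=True,
    backbone="resnet50",           # timm backbone name
    use_pretrained_backbone=True,  # load the backbone's pretrained weights
)
model = DetrForObjectDetection(config)
```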
* Update convert_llava_weights_to_hf.py script
* Remove the config update that adds padding to `vocab_size` and `text_config.vocab_size`, which causes a `ValueError` exception.
* Remove keys that end with `inv_freq` from the state dict.
* Add examples and instructions for creating `model_state_dict.bin` that can be used by the script.
* Update convert_llava_weights_to_hf.py
* Update convert_vipllava_weights_to_hf.py
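A hedged sketch of the state-dict cleanup described above (a standalone version; the conversion script does this inline, and the filename follows the instructions added in this PR):

```python
import torch

state_dict = torch.load("model_state_dict.bin", map_location="cpu")
# Drop rotary-embedding buffers; they are recomputed when the model loads.
state_dict = {k: v for k, v in state_dict.items() if not k.endswith("inv_freq")}
```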
* [DETA] fix freeze/unfreeze function
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add freeze/unfreeze test case in DETA
* fix typo
* fix typo 2
* fix: enable aux and enc loss in the training pipeline
* Add unsynced variables from original DETA for training
* modifications to pass the CI tests
* make style
* make fix
* manual make fix
* change the 'two_stage' configuration default to True in deta_modeling_test, plus a minor change to the dist check
* remove print
* divide configuration in DetaModel and DetaForObjectDetection
* images smaller than 224 will give a topk error
* pred_boxes and logits should be equivalent in size to two_stage_num_proposals
* add missing part in DetaConfig
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add a docstring to the configuration and tidy up the TODO part
* change distribute related code to accelerate
* Update src/transformers/models/deta/configuration_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/deta/test_modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* protect importing accelerate
* change variable name to a more specific one
* wrong import
* fix aux_loss in conditional_detr
* add test aux_loss
* add aux_loss test in deta and table_transformer
* fix yolos since it doesn't have an auxiliary loss function
* fix maskformer auxiliary_loss related code
* make style
* change param 'auxiliary_loss' to 'use_auxiliary_loss'
* change param 'auxiliary_loss' to 'use_auxiliary_loss' in tests
* make style & fix-copies, also revert yolos related parameter
* revert variable name 'use_auxiliary_loss' to 'auxiliary_loss' due to DetrConfig
* revert variable name in yolos
* revert maskformer
* add aux_loss test in maskformer
* make style
* Update src/transformers/models/yolos/configuration_yolos.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
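For reference, a hedged sketch of the freeze/unfreeze pattern this fix restores (hypothetical helpers; the actual methods live in modeling_deta.py):

```python
import torch.nn as nn

def freeze_module(module: nn.Module) -> None:
    # Detach a submodule (e.g. the backbone) from gradient updates.
    for param in module.parameters():
        param.requires_grad_(False)

def unfreeze_module(module: nn.Module) -> None:
    for param in module.parameters():
        param.requires_grad_(True)
```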
* Allow non-special tokens to be added
* Add test, fix token adding code
* Revert changes to id_to_token and token_to_id
* Update the ESM tokenizer to be a bit more standardized
* Update src/transformers/models/esm/tokenization_esm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
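A hedged usage sketch of the standardized behavior (checkpoint name assumed; `special_tokens` defaults to False):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
# Non-special tokens can now be added through the standard API.
num_added = tokenizer.add_tokens(["<new_motif>"])
print(num_added)  # number of tokens actually added to the vocabulary
```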
* Update convert_llava_weights_to_hf.py
Fix call to `tokenizer.add_tokens`
* Add special_tokens to tokenizer.add_tokens in convert_vipllava_weights_to_hf.py
* finalize
* make fix-copies for whisper
* [Tests] Make sure that we don't run tests multiple times
* Update src/transformers/models/whisper/modeling_whisper.py
* [Tests] Make sure that we don't run tests multiple times
* fix more
* improve
* improve
* improve further
* improve more
* improve
* fix more
* git commit and git push
* fix more
* fix more
* fix more
* New try
* Fix more whisper stuff
* Improve
* correct more
* correct more
* correct more
* Fix some tests
* Add more tests
* correct more
* correct more
* correct more
* push
* correct more
* Fix more
* Better
* without decoder mask
* correct more
* clean
* save intermediate
* Fix more
* Fix VAD for large-v2
* Save new
* Correct more
* make cleaner
* correct tests
* correct src
* Finish
* Fix more
* Fix more
* finish
* Fix edge cases
* fix return_dict_in_generate
* fix all tests
* make style
* add docstrings
* add docstrings
* Fix logit processor
* make style
* fix pipeline test
* fix more style
* Apply suggestions from code review
* apply feedback from Sanchit
* correct more
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* correct more
* correct more
* correct more
* Fix staticmethod
* correct more
* fix
* fix slow tests
* make style
* fix tokenizer test
* fix tokenizer test
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* finish
* finish
* revert kwargs change
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
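A hedged end-to-end sketch exercising the Whisper generation paths touched above (checkpoint and audio file assumed; options are standard pipeline arguments):

```python
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    chunk_length_s=30,
)
result = asr("audio.mp3", return_timestamps=True)
print(result["text"])
print(result["chunks"])  # segment-level timestamps
```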