transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-20 04:58:22 +06:00

Author	SHA1	Message	Date
Zhan Ling	1f5590d32e	Move CLIP _no_split_modules to CLIPPreTrainedModel (#27841 ) Add _no_split_modules to CLIPModel	2024-01-30 02:15:58 +01:00
Omar Sanseviero	a989c6c6eb	Don't allow passing `load_in_8bit` and `load_in_4bit` at the same time (#28266 ) * Update quantization_config.py * Style * Protect from setting directly * add tests * Update tests/quantization/bnb/test_4bit.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-01-30 01:43:40 +01:00
ThibaultLengagne	cd2eb8cb2b	Add French translation: french README.md (#28696 ) * doc: french README Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add Depth Anything Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add french link in other docs Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add missing links in fr docs * doc: fix several mistakes in translation Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> --------- Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> Co-authored-by: Sarapuce <alexandreh@padok.fr>	2024-01-29 10:07:49 -08:00
Ajay Patel	a055d09e11	Support saving only PEFT adapter in checkpoints when using PEFT + FSDP (#28297 ) * Update trainer.py * Revert "Update trainer.py" This reverts commit 0557e2cc9effa3a41304322032239a3874b948a7. * Make trainer.py use adapter_only=True when using FSDP + PEFT * Support load_best_model with adapter_only=True * Ruff format * Inspect function args for save_ load_ fsdp utility functions and only pass adapter_only=True if they support it	2024-01-29 17:10:15 +00:00
Sanchit Gandhi	da3c79b245	[Whisper] Make tokenizer normalization public (#28136 ) * [Whisper] Make tokenizer normalization public * add to docs	2024-01-29 16:07:35 +00:00
xkszltl	e694e985d7	Fix typo of `Block`. (#28727 )	2024-01-29 15:25:00 +00:00
amyeroberts	9e8f35fa28	Mark test_constrained_beam_search_generate as flaky (#28757 ) * Make test_constrained_beam_search_generate as flaky * Update tests/generation/test_utils.py	2024-01-29 15:22:25 +00:00
amyeroberts	0f8d015a41	Pin pytest version <8.0.0 (#28758 ) * Pin pytest version <8.0.0 * Update setup.py * make deps_table_update	2024-01-29 15:22:14 +00:00
Julien Chaumond	26aa03a252	small doc update for CamemBERT (#28644 )	2024-01-29 15:46:32 +01:00
Nate Cibik	0548af54cc	Enable Gradient Checkpointing in Deformable DETR (#28686 ) * Enabled gradient checkpointing in Deformable DETR * Enabled gradient checkpointing in Deformable DETR encoder * Removed # Copied from headers in modeling_deta.py to break dependence on Deformable DETR code	2024-01-29 10:10:40 +00:00
Wesley Gifford	f72c7c22d9	PatchtTST and PatchTSMixer fixes (#28083 ) * 🐛 fix .max bug * remove prediction_length from regression output dimensions * fix parameter names, fix output names, update tests * ensure shape for PatchTST * ensure output shape for PatchTSMixer * update model, batch, and expected for regression distribution test * update test expected Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * standardize on patch_length Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make arguments more explicit Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * adjust prepared inputs Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> --------- Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-29 10:09:26 +00:00
Vinyzu	3a08cc485f	[Docs] Fix Typo in English & Japanese CLIP Model Documentation (TMBD -> TMDB) (#28751 ) * [Docs] Fix Typo in English CLIP model_doc * [Docs] Fix Typo in Japanese CLIP model_doc	2024-01-29 10:06:51 +00:00
Klaus Hipp	39fa400969	Fix input data file extension in examples (#28741 )	2024-01-29 10:06:31 +00:00
Yih-Dar	5649c0cbb8	Fix `DepthEstimationPipeline`'s docstring (#28733 ) * fix * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-29 10:42:55 +01:00
Angela Yi	243e186efb	Add serialization logic to pytree types (#27871 ) * Add serialized type name to pytrees * Modify context * add serde test	2024-01-29 10:41:20 +01:00
amyeroberts	f1cc615721	[`Siglip`] protect from imports if sentencepiece not installed (#28737 ) [Siglip] protect from imports if sentencepiece not installed	2024-01-28 15:10:14 +00:00
Joao Gante	03cc17775b	Generate: deprecate old src imports (#28607 )	2024-01-27 15:54:19 +00:00
Joao Gante	a28a76996c	Falcon: removed unused function (#28605 )	2024-01-27 15:52:59 +00:00
Sanchit Gandhi	de13a951b3	[Flax] Update no init test for Flax v0.7.1 (#28735 )	2024-01-26 18:20:39 +00:00
Steven Liu	abe0289e6d	[docs] Fix datasets in guides (#28715 ) * change datasets * fix	2024-01-26 09:29:07 -08:00
Yih-Dar	f8b7c4345a	Unpin pydantic (#28728 ) * try pydantic v2 * try pydantic v2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-26 17:39:33 +01:00
Scruel Tao	3aea38ce61	fix: suppress `GatedRepoError` to use cache file (fix #28558 ). (#28566 ) * fix: suppress `GatedRepoError` to use cache file (fix #28558). * move condition_to_return parameter back to outside.	2024-01-26 16:25:08 +00:00
Matt	708b19eb09	Stop confusing the TF compiler with ModelOutput objects (#28712 ) * Stop confusing the TF compiler with ModelOutput objects * Stop confusing the TF compiler with ModelOutput objects	2024-01-26 12:22:29 +00:00
Yih-Dar	a638de1987	Fix `weights_only` (#28725 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-26 13:00:49 +01:00
Shukant Pal	d6ac8f4ad2	Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled(… (#28717 ) Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS It seems like enable_progress_bar() and disable_progress_bar() sync up with huggingface_hub, but the initial value is always True. This changes will make sure the user's preference is respected implicity on initialization.	2024-01-26 11:59:34 +00:00
D	3a46e30dd1	[`docs`] Update preprocessing.md (#28719 ) * Update preprocessing.md adjust ImageProcessor link to working target (same as in lower section of file) * Update preprocessing.md	2024-01-26 11:58:57 +00:00
Turetskii Mikhail	1f47a24aa1	fix: corrected misleading log message in save_pretrained function (#28699 )	2024-01-26 11:52:53 +00:00
Facico	bbe30c6968	support PeftMixedModel signature inspect (#28321 ) * support PeftMixedModel signature inspect * import PeftMixedModel only peft>=0.7.0 * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix styling * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * style fixup * fix note --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-26 12:05:01 +01:00
fxmarty	8eb74c1c89	Fix duplicate & unnecessary flash attention warnings (#28557 ) * fix duplicate & unnecessary flash warnings * trigger ci * warning_once * if/else order --------- Co-authored-by: Your Name <you@example.com>	2024-01-26 09:37:04 +01:00
Yih-Dar	142ce68389	Don't fail when `LocalEntryNotFoundError` during `processor_config.json` loading (#28709 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-26 09:02:32 +01:00
Peter Götz	2875195887	[`docs`] Improve visualization for vertical parallelism (#28583 ) The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change visualizes the model indeed vertically.	2024-01-25 17:55:11 +00:00
Fanli Lin	4cbd876e42	[`Vilt`] align input and model dtype in the ViltPatchEmbeddings forward pass (#28633 ) align dtype	2024-01-25 15:03:20 +00:00
Yusuf	24f1a00e4c	Update question_answering.md (#28694 ) fix typo: from: "model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")" to: model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")	2024-01-25 14:06:38 +00:00
Merve Noyan	2000095666	Improve Backbone API docs (#28666 ) Update backbones.md	2024-01-25 11:51:58 +00:00
Tom Aarsen	7fa4b36eba	[`chore`] Add missing space in warning (#28695 ) Add missing space in warning	2024-01-25 09:34:52 +00:00
NielsRogge	963db81a5a	Add Depth Anything (#28654 ) * First draft * More improvements * More improvements * More improvements * More improvements * Add docs * Remove file * Add copied from * Address comments * Address comments * Address comments * Fix style * Update docs * Convert all checkpoints, add integration test * Rename checkpoints * Add pretrained backbone attributes * Fix default config * Address comment * Add figure to docs * Fix bug thanks to @xenova * Update conversion script * Fix integration test	2024-01-25 09:34:50 +01:00
Steven Liu	f40b87de0c	[docs] Fix doc format (#28684 ) * fix hfoptions * revert changes to other files * fix	2024-01-24 11:18:59 -08:00
Fanli Lin	8278b1538e	improve efficient training on CPU documentation (#28646 ) * update doc * revert * typo fix * refine * add dtypes * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * no comma * use avx512-vnni --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-01-24 09:07:13 -08:00
nakranivaibhav	5d29530ea2	Improved type hinting for all attention parameters (#28479 ) * Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None' * Fixed the ruff formatting issue * fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None' * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py * test fail update * fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modleing_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py * test fail update * Removed the myvenv file * Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py	2024-01-24 16:47:34 +00:00
Steven Liu	738ec75c90	[docs] DeepSpeed (#28542 ) * config * optim * pre deploy * deploy * save weights, memory, troubleshoot, non-Trainer * done	2024-01-24 08:31:28 -08:00
amyeroberts	bb6aa8bc5f	Add back in generation types (#28681 )	2024-01-24 14:37:30 +00:00
jeffhataws	0549000c5b	Use save_safetensor to disable safe serialization for XLA (#28669 ) * Use save_safetensor to disable safe serialization for XLA https://github.com/huggingface/transformers/issues/28438 * Style fixup	2024-01-24 11:57:45 +00:00
Khai Mai	c5c69096b3	Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#28517 ) * fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask * format code using black and ruff * skip computing mask if attention_mask=None * add tests for load balancing loss Mixtral-Moe * fix assert loss is different in mixtral_test * fix pad_leng * use assertNotAlmostEqual and print to debug * remove print for debug * minor updates * reduce rtol and atol	2024-01-24 10:12:14 +01:00
Vladimir Pinera	5f81266fb0	Update README_es.md (#28612 ) Fixing grammatical errors in the text	2024-01-23 21:09:01 +00:00
Zhenwei	39c3c0a72a	fix a hidden bug of `GenerationConfig`, now the `generation_config.json` can be loaded successfully (#28604 ) * fix a hidden bug of GenerationConfig * keep `sort_keys=True` to maintain visibility * Update src/transformers/generation/configuration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update configuration_utils.py in case `obj` is a list, check the items in the list --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-23 17:48:38 +00:00
Matt	ebc8f47bd9	Remove deprecated eager_serving fn (#28665 ) * Remove deprecated eager_serving fn * Fix the input_signature docstring while I'm here	2024-01-23 16:53:07 +00:00
cmathw	9a4521dd9b	Support single token decode for `CodeGenTokenizer` (#28628 ) convert token id to list in .decode()	2024-01-23 16:27:24 +01:00
Quentin Meeus	5b5e71dc41	add dataloader prefetch factor in training args and trainer (#28498 ) * add dataloader prefetch factor in training args and trainer * remove trailing spaces * prevent dataloader_num_workers == 0 and dataloader_prefetch_factor != None dataloader_prefetch_factor works only when data is loaded in a different process as the main one. This commit adds the necessary checks to avoid having prefetch_factor set when there is no such process. * Remove whitespaces in empty line * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-23 15:08:18 +00:00
Zach Mueller	582d104b93	Fix windows err with checkpoint race conditions (#28637 ) Fix windows err	2024-01-23 14:30:36 +01:00
Scruel Tao	c475eca9cd	`tensor_size` - fix copy/paste error msg typo (#28660 ) Fix copy/paste error msg typo	2024-01-23 11:22:02 +00:00


16108 Commits