Commit Graph

16108 Commits

Author SHA1 Message Date
Zhan Ling
1f5590d32e
Move CLIP _no_split_modules to CLIPPreTrainedModel (#27841)
Add _no_split_modules to CLIPModel
2024-01-30 02:15:58 +01:00
Omar Sanseviero
a989c6c6eb
Don't allow passing load_in_8bit and load_in_4bit at the same time (#28266)
* Update quantization_config.py

* Style

* Protect from setting directly

* add tests

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-01-30 01:43:40 +01:00
ThibaultLengagne
cd2eb8cb2b
Add French translation: french README.md (#28696)
* doc: french README

Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>

* doc: Add Depth Anything

Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>

* doc: Add french link in other docs

Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>

* doc: Add missing links in fr docs

* doc: fix several mistakes in translation

Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>

---------

Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>
Co-authored-by: Sarapuce <alexandreh@padok.fr>
2024-01-29 10:07:49 -08:00
Ajay Patel
a055d09e11
Support saving only PEFT adapter in checkpoints when using PEFT + FSDP (#28297)
* Update trainer.py

* Revert "Update trainer.py"

This reverts commit 0557e2cc9effa3a41304322032239a3874b948a7.

* Make trainer.py use adapter_only=True when using FSDP + PEFT

* Support load_best_model with adapter_only=True

* Ruff format

* Inspect function args for save_ load_ fsdp utility functions and only pass adapter_only=True if they support it
2024-01-29 17:10:15 +00:00
Sanchit Gandhi
da3c79b245
[Whisper] Make tokenizer normalization public (#28136)
* [Whisper] Make tokenizer normalization public

* add to docs
2024-01-29 16:07:35 +00:00
xkszltl
e694e985d7
Fix typo of Block. (#28727) 2024-01-29 15:25:00 +00:00
amyeroberts
9e8f35fa28
Mark test_constrained_beam_search_generate as flaky (#28757)
* Make test_constrained_beam_search_generate as flaky

* Update tests/generation/test_utils.py
2024-01-29 15:22:25 +00:00
amyeroberts
0f8d015a41
Pin pytest version <8.0.0 (#28758)
* Pin pytest version <8.0.0

* Update setup.py

* make deps_table_update
2024-01-29 15:22:14 +00:00
Julien Chaumond
26aa03a252
small doc update for CamemBERT (#28644) 2024-01-29 15:46:32 +01:00
Nate Cibik
0548af54cc
Enable Gradient Checkpointing in Deformable DETR (#28686)
* Enabled gradient checkpointing in Deformable DETR

* Enabled gradient checkpointing in Deformable DETR encoder

* Removed # Copied from headers in modeling_deta.py to break dependence on Deformable DETR code
2024-01-29 10:10:40 +00:00
Wesley Gifford
f72c7c22d9
PatchtTST and PatchTSMixer fixes (#28083)
* 🐛 fix .max bug

* remove prediction_length from regression output dimensions

* fix parameter names, fix output names, update tests

* ensure shape for PatchTST

* ensure output shape for PatchTSMixer

* update model, batch, and expected for regression distribution test

* update test expected

Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>

* Update tests/models/patchtst/test_modeling_patchtst.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/patchtst/test_modeling_patchtst.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/patchtst/test_modeling_patchtst.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* standardize on patch_length

Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>

* Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Make arguments more explicit

Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>

* adjust prepared inputs

Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>

---------

Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>
Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-29 10:09:26 +00:00
Vinyzu
3a08cc485f
[Docs] Fix Typo in English & Japanese CLIP Model Documentation (TMBD -> TMDB) (#28751)
* [Docs] Fix Typo in English CLIP model_doc

* [Docs] Fix Typo in Japanese CLIP model_doc
2024-01-29 10:06:51 +00:00
Klaus Hipp
39fa400969
Fix input data file extension in examples (#28741) 2024-01-29 10:06:31 +00:00
Yih-Dar
5649c0cbb8
Fix DepthEstimationPipeline's docstring (#28733)
* fix

* fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-29 10:42:55 +01:00
Angela Yi
243e186efb
Add serialization logic to pytree types (#27871)
* Add serialized type name to pytrees

* Modify context

* add serde test
2024-01-29 10:41:20 +01:00
amyeroberts
f1cc615721
[Siglip] protect from imports if sentencepiece not installed (#28737)
[Siglip] protect from imports if sentencepiece not installed
2024-01-28 15:10:14 +00:00
Joao Gante
03cc17775b
Generate: deprecate old src imports (#28607) 2024-01-27 15:54:19 +00:00
Joao Gante
a28a76996c
Falcon: removed unused function (#28605) 2024-01-27 15:52:59 +00:00
Sanchit Gandhi
de13a951b3
[Flax] Update no init test for Flax v0.7.1 (#28735) 2024-01-26 18:20:39 +00:00
Steven Liu
abe0289e6d
[docs] Fix datasets in guides (#28715)
* change datasets

* fix
2024-01-26 09:29:07 -08:00
Yih-Dar
f8b7c4345a
Unpin pydantic (#28728)
* try pydantic v2

* try pydantic v2

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-26 17:39:33 +01:00
Scruel Tao
3aea38ce61
fix: suppress GatedRepoError to use cache file (fix #28558). (#28566)
* fix: suppress `GatedRepoError` to use cache file (fix #28558).

* move condition_to_return parameter back to outside.
2024-01-26 16:25:08 +00:00
Matt
708b19eb09
Stop confusing the TF compiler with ModelOutput objects (#28712)
* Stop confusing the TF compiler with ModelOutput objects

* Stop confusing the TF compiler with ModelOutput objects
2024-01-26 12:22:29 +00:00
Yih-Dar
a638de1987
Fix weights_only (#28725)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-26 13:00:49 +01:00
Shukant Pal
d6ac8f4ad2
Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled(… (#28717)
Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS

It seems like enable_progress_bar() and disable_progress_bar() sync up with huggingface_hub, but the initial value is always True. This changes will make sure the user's preference is respected implicity on initialization.
2024-01-26 11:59:34 +00:00
D
3a46e30dd1
[docs] Update preprocessing.md (#28719)
* Update preprocessing.md

adjust ImageProcessor link to working target (same as in lower section of file)

* Update preprocessing.md
2024-01-26 11:58:57 +00:00
Turetskii Mikhail
1f47a24aa1
fix: corrected misleading log message in save_pretrained function (#28699) 2024-01-26 11:52:53 +00:00
Facico
bbe30c6968
support PeftMixedModel signature inspect (#28321)
* support PeftMixedModel signature inspect

* import PeftMixedModel only peft>=0.7.0

* Update src/transformers/trainer.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* fix styling

* Update src/transformers/trainer.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style fixup

* fix note

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-26 12:05:01 +01:00
fxmarty
8eb74c1c89
Fix duplicate & unnecessary flash attention warnings (#28557)
* fix duplicate & unnecessary flash warnings

* trigger ci

* warning_once

* if/else order

---------

Co-authored-by: Your Name <you@example.com>
2024-01-26 09:37:04 +01:00
Yih-Dar
142ce68389
Don't fail when LocalEntryNotFoundError during processor_config.json loading (#28709)
* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-26 09:02:32 +01:00
Peter Götz
2875195887
[docs] Improve visualization for vertical parallelism (#28583)
The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change visualizes the model indeed vertically.
2024-01-25 17:55:11 +00:00
Fanli Lin
4cbd876e42
[Vilt] align input and model dtype in the ViltPatchEmbeddings forward pass (#28633)
align dtype
2024-01-25 15:03:20 +00:00
Yusuf
24f1a00e4c
Update question_answering.md (#28694)
fix typo:

from:

 "model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")"

to:
model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
2024-01-25 14:06:38 +00:00
Merve Noyan
2000095666
Improve Backbone API docs (#28666)
Update backbones.md
2024-01-25 11:51:58 +00:00
Tom Aarsen
7fa4b36eba
[chore] Add missing space in warning (#28695)
Add missing space in warning
2024-01-25 09:34:52 +00:00
NielsRogge
963db81a5a
Add Depth Anything (#28654)
* First draft

* More improvements

* More improvements

* More improvements

* More improvements

* Add docs

* Remove file

* Add copied from

* Address comments

* Address comments

* Address comments

* Fix style

* Update docs

* Convert all checkpoints, add integration test

* Rename checkpoints

* Add pretrained backbone attributes

* Fix default config

* Address comment

* Add figure to docs

* Fix bug thanks to @xenova

* Update conversion script

* Fix integration test
2024-01-25 09:34:50 +01:00
Steven Liu
f40b87de0c
[docs] Fix doc format (#28684)
* fix hfoptions

* revert changes to other files

* fix
2024-01-24 11:18:59 -08:00
Fanli Lin
8278b1538e
improve efficient training on CPU documentation (#28646)
* update doc

* revert

* typo fix

* refine

* add dtypes

* Update docs/source/en/perf_train_cpu.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* no comma

* use avx512-vnni

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-01-24 09:07:13 -08:00
nakranivaibhav
5d29530ea2
Improved type hinting for all attention parameters (#28479)
* Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None'

* Fixed the ruff formatting issue

* fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None'

* Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py

* test fail update

* fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modleing_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py

* Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py

* test fail update

* Removed the myvenv file

* Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py
2024-01-24 16:47:34 +00:00
Steven Liu
738ec75c90
[docs] DeepSpeed (#28542)
* config

* optim

* pre deploy

* deploy

* save weights, memory, troubleshoot, non-Trainer

* done
2024-01-24 08:31:28 -08:00
amyeroberts
bb6aa8bc5f
Add back in generation types (#28681) 2024-01-24 14:37:30 +00:00
jeffhataws
0549000c5b
Use save_safetensor to disable safe serialization for XLA (#28669)
* Use save_safetensor to disable safe serialization for XLA

https://github.com/huggingface/transformers/issues/28438

* Style fixup
2024-01-24 11:57:45 +00:00
Khai Mai
c5c69096b3
Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#28517)
* fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask

* format code using black and ruff

* skip computing mask if attention_mask=None

* add tests for load balancing loss Mixtral-Moe

* fix assert loss is different in mixtral_test

* fix pad_leng

* use assertNotAlmostEqual and print to debug

* remove print for debug

* minor updates

* reduce rtol and atol
2024-01-24 10:12:14 +01:00
Vladimir Pinera
5f81266fb0
Update README_es.md (#28612)
Fixing grammatical errors in the text
2024-01-23 21:09:01 +00:00
Zhenwei
39c3c0a72a
fix a hidden bug of GenerationConfig, now the generation_config.json can be loaded successfully (#28604)
* fix a hidden bug of GenerationConfig

* keep `sort_keys=True` to maintain visibility

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update configuration_utils.py

in case `obj` is a list, check the items in the list

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-23 17:48:38 +00:00
Matt
ebc8f47bd9
Remove deprecated eager_serving fn (#28665)
* Remove deprecated eager_serving fn

* Fix the input_signature docstring while I'm here
2024-01-23 16:53:07 +00:00
cmathw
9a4521dd9b
Support single token decode for CodeGenTokenizer (#28628)
convert token id to list in .decode()
2024-01-23 16:27:24 +01:00
Quentin Meeus
5b5e71dc41
add dataloader prefetch factor in training args and trainer (#28498)
* add dataloader prefetch factor in training args and trainer

* remove trailing spaces

* prevent dataloader_num_workers == 0 and dataloader_prefetch_factor != None

dataloader_prefetch_factor works only when data is loaded in a different process as the main one. This commit adds the necessary checks to avoid having prefetch_factor set when there is no such process.

* Remove whitespaces in empty line

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-23 15:08:18 +00:00
Zach Mueller
582d104b93
Fix windows err with checkpoint race conditions (#28637)
Fix windows err
2024-01-23 14:30:36 +01:00
Scruel Tao
c475eca9cd
tensor_size - fix copy/paste error msg typo (#28660)
Fix copy/paste error msg typo
2024-01-23 11:22:02 +00:00