Phuc Van Phan
5af2c62696
docs: add space to docs ( #26067 )
...
* docs: add space to docs
* docs: remove redundant space
2023-09-11 22:03:26 +01:00
Patrick von Platen
ce2e7ef3d9
[Core] Add lazy import structure to imports ( #26090 )
...
* improve import time
* Update src/transformers/integrations/__init__.py
* sort import
2023-09-11 17:20:29 +02:00
Phuc Van Phan
9cebae64ad
docs: update link huggingface map ( #26077 )
2023-09-11 12:57:04 +01:00
Hang
7fd2d68613
only main process should call _save on deepspeed zero3 ( #25959 )
...
only main process should call _save when deepspeed zero3 is enabled
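A minimal sketch of the idea in the entry above, assuming a Trainer-like save path (helper and attribute names are simplified, not the exact transformers internals):

```python
# Illustrative sketch: under DeepSpeed ZeRO-3 every rank must join the gather
# of the sharded weights, but only the main process should write them to disk.
def save_zero3_checkpoint(trainer, output_dir: str) -> None:
    # All ranks participate in consolidating the partitioned parameters.
    state_dict = trainer.model_wrapped._zero3_consolidated_16bit_state_dict()
    # Only the main process calls _save to write the checkpoint.
    if trainer.args.should_save:
        trainer._save(output_dir, state_dict=state_dict)
```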
2023-09-11 12:56:36 +01:00
Arthur
95b374952d
[CITests] skip failing tests until #26054 is merged ( #26063 )
...
* skip failing tests until #26054 is merged
* fixup
2023-09-09 05:43:26 +02:00
Arthur
09b2de6eb7
[CodeLlamaTokenizerFast] Fix set_infilling_processor to properly reset ( #26041 )
...
* fix `set_infilling_processor` to properly reset
* Add docstring!
* fixups
* more details in the documentation about the tokenization
* style
2023-09-08 22:03:09 +02:00
Harheem Kim
d53606031f
🌐 [i18n-KO] Translated llama.md to Korean ( #26044 )
...
* docs: ko-llama.md
* fix: chatgpt draft
* feat: manual edits
* fix: resolve suggestions
2023-09-08 12:38:41 -07:00
Angela Yi
6c26faa159
Skip warning if tracing with dynamo ( #25581 )
...
* Ignore warning if tracing with dynamo
* fix import error
* separate to function
* add test
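A hedged sketch of the pattern described above: detect when TorchDynamo is tracing and skip the warning in that case (helper name and placement are illustrative, not the exact transformers code):

```python
import torch


def _is_dynamo_tracing() -> bool:
    # Illustrative helper: True while torch.compile / TorchDynamo is tracing,
    # so callers can skip warnings that would otherwise interfere with tracing.
    dynamo = getattr(torch, "_dynamo", None)
    return bool(dynamo is not None and dynamo.is_compiling())


def warn_unless_tracing(logger, message: str) -> None:
    if not _is_dynamo_tracing():
        logger.warning(message)
```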
2023-09-08 21:13:33 +02:00
Thien Tran
18ee1fe762
Update missing docs on activation_dropout and fix Dropout docs for SEW-D ( #26031 )
...
* add missing doc for activation dropout
* fix doc for SEW-D dropout
* deprecate hidden_dropout for SEW-D
2023-09-08 14:51:54 +01:00
Alexander Krauck
0c67a72c9a
Fix Dropout Implementation in Graphormer ( #24817 )
...
This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically:
1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`.
2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers.
These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.
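A hedged sketch of the distinction this entry restores, with simplified module names (not the actual Graphormer classes): attention applies its own attention_dropout to the attention weights, and the feed-forward block applies activation_dropout after the activation instead of reusing the regular dropout everywhere.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedForward(nn.Module):
    # Simplified feed-forward block: activation_dropout sits between the
    # activation and the second projection; the regular dropout is applied
    # to the block's output, mirroring the fix described above.
    def __init__(self, hidden_size, ffn_size, dropout=0.1, activation_dropout=0.1):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, ffn_size)
        self.fc2 = nn.Linear(ffn_size, hidden_size)
        self.activation_dropout = nn.Dropout(activation_dropout)
        self.dropout = nn.Dropout(dropout)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        hidden_states = self.activation_dropout(F.gelu(self.fc1(hidden_states)))
        return self.dropout(self.fc2(hidden_states))
```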
2023-09-08 12:49:39 +01:00
dumpmemory
fb7d246951
Fix inconsistent training loss after resuming from an old checkpoint ( #25872 )
...
* fix inconsistent loss after resume #25340
* fix typo
* clean code
* reformatted code
* adjust code according to comments
* adjust check_dataloader_randomsampler location
* return sampler only
* handle sampler is None
* Update src/transformers/trainer_pt_utils.py
thanks @amyeroberts
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-07 20:00:22 +01:00
MyungHa Kwon
c5e66a40a4
Punctuation fix ( #26025 )
...
fix typo
2023-09-07 19:54:52 +01:00
raghavanone
00efd64e51
Fix vilt config docstring parameter to match value in init ( #26017 )
...
* Fix vilt config init parameter to match the ones in documentation
* Fix the documentation
2023-09-07 19:53:43 +01:00
Muskan Kumar
02c4a77f57
Added HerBERT to README.md ( #26020 )
...
* Added HerBERT to README.md
* Update README.md to contain HerBERT (#26016)
* Resolved #26016: Updated READMEs and index.md to contain HerBERT
Updated READMEs and ran make fix-copies
2023-09-07 19:51:45 +01:00
Sanchit Gandhi
2af87d018e
[VITS] Fix nightly tests ( #25986 )
...
* fix tokenizer
* make bs even
* fix multi gpu test
* style
* model forward
* fix torch import
* revert tok pin
2023-09-07 17:49:14 +01:00
CokeDong
3744126c87
Add tgs speed metrics ( #25858 )
...
* Add tgs metrics
* bugfix and black formatting
* workaround for tokens counting
* formatting and bugfix
* Fix
* Add opt-in for tgs metrics
* make style and fix error
* Fix doc
* fix docbuild
* hf-doc-build
* fix
* test
* Update src/transformers/training_args.py
renaming
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* Update src/transformers/training_args.py
renaming
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* Fix some symbol
* test
* Update src/transformers/trainer_utils.py
match naming patterns
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/trainer.py
nice
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix reviews
* Fix
* Fix black
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
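A usage sketch for the opt-in throughput metric added in this entry, assuming it is exposed as `include_tokens_per_second` on TrainingArguments (check your installed version's docstring for the exact flag name):

```python
from transformers import TrainingArguments

# Assumed flag name for the opt-in "tgs" (tokens per second per device) metric;
# when enabled, a tokens-per-second figure is added to the reported speed metrics.
args = TrainingArguments(
    output_dir="out",
    include_tokens_per_second=True,
)
```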
2023-09-07 17:17:30 +01:00
Yih-Dar
0188739a74
Fix CircleCI config ( #26023 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-07 14:51:35 +02:00
Kai
df04959e55
fix _resize_token_embeddings setting lm head size to 0 when deepspeed zero3 is enabled ( #26024 )
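This entry addresses the fact that under ZeRO-3 the embedding weights are sharded, so their shape reads as 0 on ranks that do not own the partition. A hedged sketch of the usual guard (not the exact transformers fix):

```python
import deepspeed


def embedding_rows(embedding) -> int:
    # Under ZeRO-3 the weight is partitioned across ranks, so .shape can read
    # as 0 locally; gather it read-only before measuring the vocabulary size.
    with deepspeed.zero.GatheredParameters(embedding.weight, modifier_rank=None):
        return embedding.weight.shape[0]
```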
2023-09-07 10:10:40 +01:00
Zach Mueller
e3a9716384
Fix err with FSDP ( #25991 )
...
* Fix err
* Use version check
2023-09-07 09:52:53 +05:30
Marc Sun
fa6107c97e
modify context length for GPTQ + version bump ( #25899 )
...
* add new arg for gptq
* add tests
* add min version autogptq
* fix order
* skip test
* fix
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix style
* change model path
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-09-06 11:45:47 -04:00
Matt
300d6a4a62
Remove Falcon from undocumented list ( #26008 )
...
Remove falcon from undocumented list
2023-09-06 15:49:04 +01:00
Harheem Kim
fa522d8d7b
🌐 [i18n-KO] Translated llm_tutorial.md to Korean ( #25791 )
...
* docs: ko: llm_tutorial.md
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
* fix: resolve suggestions
2023-09-06 07:40:03 -07:00
zspo
3e203f92be
Fix small typo in README.md ( #25934 )
...
* fix some small bugs in readme
* Update docs/README.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-06 14:07:29 +01:00
Matt
842e99f1b9
TF-OPT attention mask fixes ( #25238 )
...
* stash commit
* More OPT updates
* Update src/transformers/models/opt/modeling_tf_opt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-06 13:37:27 +01:00
Lysandre Debut
f6301b9a13
Falcon: fix revision propagation ( #26006 )
...
* Fix revision propagation
* Cleaner
2023-09-06 07:21:00 -04:00
Nino Risteski
f6295c6c53
Update README.md ( #26003 )
...
fixed a typo
2023-09-06 10:55:11 +01:00
tju_skywalker
172f42c512
save space when converting hf model to megatron model. ( #25950 )
...
* fix convert megatron model too large
* fix convert megatron model too large
2023-09-05 16:47:48 -04:00
Tanay Mehta
b8def68934
Fix Mega chunking error when using decoder-only model ( #25765 )
...
* add: potential fix to mega chunking in decoder only model bug
* add: decoder with chunking test
* add: input_mask passed with input_ids
2023-09-05 21:50:14 +02:00
Arthur
4fa0aff21e
[VITS] tokenizer integration test: fix revision did not exist ( #25996 )
...
* revision did not exist
* correct revision
2023-09-05 21:21:33 +02:00
Arthur
d0354e5e86
[CI] Fix red CI and ensure failed ERRORs are shown ( #25995 )
...
* start with error too
* fix ?
* start with nit
* one more path
* use `job_name`
* mark pipeline test as slow
2023-09-05 20:16:00 +02:00
Injin Paek
6206f599e1
Add LLaMA resources ( #25859 )
...
* docs: feat: model resources for llama
* fix: resolve suggestion
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
2023-09-05 10:50:08 -07:00
Sanchit Gandhi
8d518013ef
[Wav2Vec2 Conformer] Fix inference float16 ( #25985 )
...
* [Wav2Vec2 Conformer] Fix inference float16
* fix test
* fix test more
* clean pipe test
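A minimal sketch of the half-precision inference path this fix targets (the checkpoint id and dummy audio are illustrative, not tied to this PR):

```python
import torch
from transformers import AutoModelForCTC, AutoProcessor

model_id = "facebook/wav2vec2-conformer-rope-large-960h-ft"  # example checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# One second of dummy 16 kHz audio, run through the model in float16.
audio = torch.randn(16000).numpy()
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values.to("cuda", dtype=torch.float16)).logits
```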
2023-09-05 18:26:06 +01:00
Sourab Mangrulkar
6bc517ccd4
deepspeed resume from ckpt fixes and adding support for deepspeed optimizer and HF scheduler ( #25863 )
...
* Add support for deepspeed optimizer and HF scheduler
* fix bug
* fix the import
* fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario
* fix loading of hf scheduler when loading deepspeed checkpoint
* fix import of `DeepSpeedSchedulerWrapper`
* add tests
* add the comment and skip the failing tests
* address comment
2023-09-05 22:31:20 +05:30
raghavanone
1110b565d6
Add TFDebertaV2ForMultipleChoice ( #25932 )
...
* Add TFDebertaV2ForMultipleChoice
* Import newer model in main init
* Fix import issues
* Fix copies
* Add doc
* Fix tests
* Fix copies
* Fix docstring
2023-09-05 17:13:06 +01:00
andreeahedes
da1af21dbb
PegasusX add _no_split_modules ( #25933 )
...
* no_split_modules
* no_split_modules
* inputs_embeds+pos same device
* update _no_split_modules
* update _no_split_modules
2023-09-05 16:34:34 +01:00
Abhilash Majumder
70a98024b1
Patch with accelerate xpu ( #25714 )
...
* patch with accelerate xpu
* patch with accelerate xpu
* formatting
* fix tests
* revert ruff unrelated fixes
* revert ruff unrelated fixes
* revert ruff unrelated fixes
* fix test
* review fixes
* review fixes
* black fixed
* review commits
* review commits
* style fix
* use pytorch_utils
* revert markuplm test
2023-09-05 15:41:42 +01:00
Yih-Dar
aa5c94d38d
Show failed tests on CircleCI layout in a better way ( #25895 )
...
* update
* update
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 15:49:33 +02:00
Joao Gante
9a70d6e56f
Trainer: delegate default generation values to generation_config ( #25987 )
2023-09-05 14:47:00 +01:00
Sahel Sharify
aea761499f
Update training_args.py to remove the runtime error ( #25920 )
...
This change iterates through a list of keys rather than dict items while updating the dict elements. It fixes the following error:
File "..../transformers/training_args.py", line 1544, in __post_init__
for k, v in self.fsdp_config.items():
RuntimeError: dictionary keys changed during iteration
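A minimal reproduction of the error above and the fix pattern: iterate over a snapshot of the keys instead of items() when the loop mutates the dict (the prefix-stripping transform here is illustrative):

```python
fsdp_config = {"fsdp_min_num_params": 2000, "fsdp_offload_params": False}

# Buggy: adding/removing keys inside `for k, v in fsdp_config.items():`
# raises "RuntimeError: dictionary keys changed during iteration".
#
# Fixed: take a snapshot of the keys first, as the change above does.
for k in list(fsdp_config.keys()):
    if k.startswith("fsdp_"):
        fsdp_config[k[len("fsdp_"):]] = fsdp_config.pop(k)
```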
2023-09-05 12:43:51 +01:00
Traun Leyden
7011cd8667
Update RAG README.md with correct path to examples/seq2seq ( #25953 )
...
Update README.md with correct path to examples/seq2seq
2023-09-05 12:31:59 +01:00
Julien Chaumond
6316ce8d27
[doc] Always call it Agents for consistency ( #25958 )
2023-09-05 12:27:20 +01:00
Yih-Dar
391f26459a
Use main in conversion script ( #25973 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 13:04:49 +02:00
Kai
6f125aaa48
fix typo ( #25981 )
...
rename doanloading to downloading
2023-09-05 11:13:06 +01:00
Susnato Dhar
52a46dc57b
Add Pop2Piano space demo ( #25975 )
...
Update pop2piano.md
2023-09-05 11:07:02 +01:00
Huazhong Ji
1cc3bc22fe
nn.Identity is not required to be compatible with PyTorch < 1.1.0 as the minimum PyTorch version we currently support is 1.10.0 ( #25974 )
...
nn.Identity is not required to be compatible with PyTorch < 1.1.0 as the minimum PyTorch version we currently support is 1.10.0
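A sketch of the simplification (illustrative, not the removed code verbatim): with the supported-version floor at 1.10.0, the version-gated fallback can simply become nn.Identity.

```python
import torch.nn as nn

# Before (illustrative): a fallback Identity module was kept for very old
# PyTorch releases that predate nn.Identity.
#
#     if torch_version >= "1.1.0":
#         Identity = nn.Identity
#     else:
#         class Identity(nn.Module):
#             def forward(self, x):
#                 return x
#
# After: the minimum supported PyTorch is 1.10.0, so nn.Identity always exists.
Identity = nn.Identity
```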
2023-09-05 11:37:54 +02:00
Yih-Dar
fbbe1b8a40
Fix test_load_img_url_timeout ( #25976 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 11:34:28 +02:00
Yih-Dar
feec56959a
Fix Detr CI ( #25972 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 11:19:56 +02:00
Susnato Dhar
404ff8fc17
Fix typo ( #25966 )
...
* Update feature_extraction_clap.py
* changed all lenght to length
2023-09-05 10:12:25 +02:00
Lysandre
d8e13b3e04
v4.34.dev.0
2023-09-04 15:12:11 -04:00
Younes Belkada
49b69fe0d4
[Falcon] Remove SDPA for falcon to support earlier versions of PyTorch (< 2.0) ( #25947 )
...
* remove SDPA for falcon
* revert previous behaviour and add warning
* nit
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-09-04 14:34:04 -04:00