Commit Graph

13943 Commits

Author SHA1 Message Date
Phuc Van Phan
5af2c62696
docs: add space to docs (#26067)
* docs: add space to docs

* docs: remove reduntant space
2023-09-11 22:03:26 +01:00
Patrick von Platen
ce2e7ef3d9
[Core] Add lazy import structure to imports (#26090)
* improve import time

* Update src/transformers/integrations/__init__.py

* sort import
2023-09-11 17:20:29 +02:00
Phuc Van Phan
9cebae64ad
docs: update link huggingface map (#26077) 2023-09-11 12:57:04 +01:00
Hang
7fd2d68613
only main process should call _save on deepspeed zero3 (#25959)
only main process should call _save when deepspeed zero3
2023-09-11 12:56:36 +01:00
Arthur
95b374952d
[CITests] skip failing tests until #26054 is merged (#26063)
* skip failing tests until #26054 is merged

* fixup
2023-09-09 05:43:26 +02:00
Arthur
09b2de6eb7
[CodeLlamaTokenizerFast] Fix fix set_infilling_processor to properly reset (#26041)
* fix `set_infilling_processor` to properly reset

* Add docstring!

* fixups

* more details in the docuemtation about the tokenization

* styl;e
2023-09-08 22:03:09 +02:00
Harheem Kim
d53606031f
🌐 [i18n-KO] Translated llama.md to Korean (#26044)
* docs: ko-llama.md

* fix: chatgpt draft

* feat: manual edits

* fix: resolve suggestions
2023-09-08 12:38:41 -07:00
Angela Yi
6c26faa159
Skip warning if tracing with dynamo (#25581)
* Ignore warning if tracing with dynamo

* fix import error

* separate to function

* add test
2023-09-08 21:13:33 +02:00
Thien Tran
18ee1fe762
Update missing docs on activation_dropout and fix DropOut docs for SEW-D (#26031)
* add missing doc for activation dropout

* fix doc for SEW-D dropout

* deprecate hidden_dropout for SEW-D
2023-09-08 14:51:54 +01:00
Alexander Krauck
0c67a72c9a
Fix Dropout Implementation in Graphormer (#24817)
This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically:

1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`.
2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers.

These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.
2023-09-08 12:49:39 +01:00
dumpmemory
fb7d246951
Try to fix training Loss inconsistent after resume from old checkpoint (#25872)
* fix loss inconsistent after resume  #25340

* fix typo

* clean code

* reformatted code

* adjust code according to comments

* adjust check_dataloader_randomsampler location

* return sampler only

* handle sampler is None

* Update src/transformers/trainer_pt_utils.py

thanks @amyeroberts

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-07 20:00:22 +01:00
MyungHa Kwon
c5e66a40a4
Punctuation fix (#26025)
fix typo
2023-09-07 19:54:52 +01:00
raghavanone
00efd64e51
Fix vilt config docstring parameter to match value in init (#26017)
* Fix vilt config init parameter to match the ones in documentation

* Fix the documentation
2023-09-07 19:53:43 +01:00
Muskan Kumar
02c4a77f57
Added HerBERT to README.md (#26020)
* Added HerBERT to README.md

* Update README.md to contain HerBERT (#26016)

* Resolved #26016: Updated READMEs and index.md to contain Herbert

Updated READMEs and ran make fix-copies
2023-09-07 19:51:45 +01:00
Sanchit Gandhi
2af87d018e
[VITS] Fix nightly tests (#25986)
* fix tokenizer

* make bs even

* fix multi gpu test

* style

* model forward

* fix torch import

* revert tok pin
2023-09-07 17:49:14 +01:00
CokeDong
3744126c87
Add tgs speed metrics (#25858)
* Add tgs metrics

* bugfix and black formatting

* workaround for tokens counting

* formating and bugfix

* Fix

* Add opt-in for tgs metrics

* make style and fix error

* Fix doc

* fix docbuild

* hf-doc-build

* fix

* test

* Update src/transformers/training_args.py

renaming

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Update src/transformers/training_args.py

renaming

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Fix some symbol

* test

* Update src/transformers/trainer_utils.py

match nameing patterns

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/trainer.py

nice

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fix reviews

* Fix

* Fix black

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-07 17:17:30 +01:00
Yih-Dar
0188739a74
Fix CircleCI config (#26023)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-07 14:51:35 +02:00
Kai
df04959e55
fix _resize_token_embeddings will set lm head size to 0 when enabled deepspeed zero3 (#26024) 2023-09-07 10:10:40 +01:00
Zach Mueller
e3a9716384
Fix err with FSDP (#25991)
* Fix err

* Use version check
2023-09-07 09:52:53 +05:30
Marc Sun
fa6107c97e
modify context length for GPTQ + version bump (#25899)
* add new arg for gptq

* add tests

* add min version autogptq

* fix order

* skip test

* fix

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix style

* change model path

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-09-06 11:45:47 -04:00
Matt
300d6a4a62
Remove Falcon from undocumented list (#26008)
Remove falcon from undocumented list
2023-09-06 15:49:04 +01:00
Harheem Kim
fa522d8d7b
🌐[i18n-KO] Translated llm_tutorial.md to Korean (#25791)
* docs: ko: llm_tutoroal.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* fix: resolve suggestions
2023-09-06 07:40:03 -07:00
zspo
3e203f92be
Fix small typo README.md (#25934)
* fix some samll bugs in readme

* Update docs/README.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-06 14:07:29 +01:00
Matt
842e99f1b9
TF-OPT attention mask fixes (#25238)
* stash commit

* More OPT updates

* Update src/transformers/models/opt/modeling_tf_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-06 13:37:27 +01:00
Lysandre Debut
f6301b9a13
Falcon: fix revision propagation (#26006)
* Fix revision propagation

* Cleaner
2023-09-06 07:21:00 -04:00
Nino Risteski
f6295c6c53
Update README.md (#26003)
fixed a typo
2023-09-06 10:55:11 +01:00
tju_skywalker
172f42c512
save space when converting hf model to megatron model. (#25950)
* fix convert megatron model too large

* fix convert megatron model too large
2023-09-05 16:47:48 -04:00
Tanay Mehta
b8def68934
Fix Mega chunking error when using decoder-only model (#25765)
* add: potential fix to mega chunking in decoder only model bug

* add: decoder with chunking test

* add: input_mask passed with input_ids
2023-09-05 21:50:14 +02:00
Arthur
4fa0aff21e
[VITS] tokenizer integration test: fix revision did not exist (#25996)
* revision did not exist

* correct revision
2023-09-05 21:21:33 +02:00
Arthur
d0354e5e86
[CI] Fix red CI and ERROR failed should show (#25995)
* start with error too

* fix ?

* start with nit

* one more path

* use `job_name`

* mark pipeline test as slow
2023-09-05 20:16:00 +02:00
Injin Paek
6206f599e1
Add LLaMA resources (#25859)
* docs: feat: model resources for llama

* fix: resolve suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
2023-09-05 10:50:08 -07:00
Sanchit Gandhi
8d518013ef
[Wav2Vec2 Conformer] Fix inference float16 (#25985)
* [Wav2Vec2 Conformer] Fix inference float16

* fix test

* fix test more

* clean pipe test
2023-09-05 18:26:06 +01:00
Sourab Mangrulkar
6bc517ccd4
deepspeed resume from ckpt fixes and adding support for deepspeed optimizer and HF scheduler (#25863)
* Add support for deepspeed optimizer and HF scheduler

* fix bug

* fix the import

* fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario

* fix loading of hf scheduler when loading deepspeed checkpoint

* fix import of `DeepSpeedSchedulerWrapper`

* add tests

* add the comment and skip the failing tests

* address comment
2023-09-05 22:31:20 +05:30
raghavanone
1110b565d6
Add TFDebertaV2ForMultipleChoice (#25932)
* Add TFDebertaV2ForMultipleChoice

* Import newer model in main init

* Fix import issues

* Fix copies

* Add doc

* Fix tests

* Fix copies

* Fix docstring
2023-09-05 17:13:06 +01:00
andreeahedes
da1af21dbb
PegasusX add _no_split_modules (#25933)
* no_split_modules

* no_split_modules

* inputs_embeds+pos same device

* update _no_split_modules

* update _no_split_modules
2023-09-05 16:34:34 +01:00
Abhilash Majumder
70a98024b1
Patch with accelerate xpu (#25714)
* patch with accelerate xpu

* patch with accelerate xpu

* formatting

* fix tests

* revert ruff unrelated fixes

* revert ruff unrelated fixes

* revert ruff unrelated fixes

* fix test

* review fixes

* review fixes

* black fixed

* review commits

* review commits

* style fix

* use pytorch_utils

* revert markuplm test
2023-09-05 15:41:42 +01:00
Yih-Dar
aa5c94d38d
Show failed tests on CircleCI layout in a better way (#25895)
* update

* update

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 15:49:33 +02:00
Joao Gante
9a70d6e56f
Trainer: delegate default generation values to generation_config (#25987) 2023-09-05 14:47:00 +01:00
Sahel Sharify
aea761499f
Update training_args.py to remove the runtime error (#25920)
This cl iterates through a list of keys rather than dict items while updating the dict elements. Fixes the following error:
File "..../transformers/training_args.py", line 1544, in post_init
for k, v in self.fsdp_config.items():
RuntimeError: dictionary keys changed during iteration
2023-09-05 12:43:51 +01:00
Traun Leyden
7011cd8667
Update RAG README.md with correct path to examples/seq2seq (#25953)
Update README.md with correct path to examples/seq2seq
2023-09-05 12:31:59 +01:00
Julien Chaumond
6316ce8d27
[doc] Always call it Agents for consistency (#25958) 2023-09-05 12:27:20 +01:00
Yih-Dar
391f26459a
Use main in conversion script (#25973)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 13:04:49 +02:00
Kai
6f125aaa48
fix typo (#25981)
rename doanloading to downloading
2023-09-05 11:13:06 +01:00
Susnato Dhar
52a46dc57b
Add Pop2Piano space demo. (#25975)
Update pop2piano.md
2023-09-05 11:07:02 +01:00
Huazhong Ji
1cc3bc22fe
nn.Identity is not required to be compatible with PyTorch < 1.1.0 as the minimum PyTorch version we currently support is 1.10.0 (#25974)
nn.Identity is not required to be compatible with PyTorch < 1.1.0 as the
minimum PyTorch version we currently support is 1.10.0
2023-09-05 11:37:54 +02:00
Yih-Dar
fbbe1b8a40
Fix test_load_img_url_timeout (#25976)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 11:34:28 +02:00
Yih-Dar
feec56959a
Fix Detr CI (#25972)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-05 11:19:56 +02:00
Susnato Dhar
404ff8fc17
Fix typo (#25966)
* Update feature_extraction_clap.py

* changed all lenght to length
2023-09-05 10:12:25 +02:00
Lysandre
d8e13b3e04 v4.34.dev.0 2023-09-04 15:12:11 -04:00
Younes Belkada
49b69fe0d4
[Falcon] Remove SDPA for falcon to support earlier versions of PyTorch (< 2.0) (#25947)
* remove SDPA for falcon

* revert previous behaviour and add warning

* nit

* Update src/transformers/models/falcon/modeling_falcon.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/models/falcon/modeling_falcon.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-09-04 14:34:04 -04:00