Yih-Dar
e84bf1f734
⚠️ Time to say goodbye to py37 ( #24091 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 07:22:39 +02:00
Dario Sučić
12240925cf
Add bitsandbytes support for gpt2 models ( #24504 )
...
* Add bitsandbytes support for gpt2 models
* Guard Conv1D import to pass tensorflow test
* Appease ruff linter
* Fix 4bit test and remove int8 test boilerplate
* Update tests/bnb/test_mixed_int8.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-06-28 05:55:32 +02:00
Sylvain Gugger
89b6ee49fd
Finishing tidying keys to ignore on load ( #24535 )
2023-06-27 21:35:15 -04:00
MS Kim(tony9402)
04f46a22d8
Fix Typo ( #24530 )
...
* Fix Typo
* Fix all copies
2023-06-27 15:38:14 -04:00
amyeroberts
462f77cbce
Allow backbones not in backbones_supported - Maskformer Mask2Former ( #24532 )
...
Allow backbones not in backbones_supported
2023-06-27 20:34:36 +01:00
Sylvain Gugger
8e5d1619b3
Clean load keys ( #24505 )
...
* Preliminary work on some models
* Fix test load missing and make sure nonpersistent buffers are tested
* Always ignore nonpersistent buffers if in state_dict
* Treat models
* More models
* Treat remaining models
* Fix quality
* Fix tests
* Remove draft
* This test is not needed anymore
* Fix copies
* Fix last test
* Newly added models
* Fix last tests
* Address review comments
2023-06-27 14:45:40 -04:00
NielsRogge
53194991e9
[Mask2Former] Remove SwinConfig ( #24259 )
...
Remove SwinConfig
2023-06-27 13:33:55 -04:00
Zach Mueller
fb6a62762f
Fix LR scheduler based on bs from auto bs finder ( #24521 )
...
* One solution
* args -> self
2023-06-27 13:28:26 -04:00
Sylvain Gugger
38db04ece0
Find module name in an OS-agnostic fashion ( #24526 )
...
* Find module name in an OS-agnostic fashion
* address review comment
2023-06-27 13:21:19 -04:00
Yih-Dar
7d150d68ff
Update `huggingface_hub` commit sha ( #24527 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-27 17:41:55 +02:00
Wang, Yi
4e8929dcbb
set model to training mode before accelerate.prepare ( #24520 )
2023-06-27 10:09:38 -04:00
Sebastian
06910f5a76
[`T5`] Add T5ForQuestionAnswering and MT5ForQuestionAnswering ( #24481 )
...
* Adding T5ForQuestionAnswering
* Changed weight initialization that results in better initial loss when fine-tuning
* Update to class variables
* Running make fixup
* Running make fix-copies
* Remove model_parallel
* Adding MT5ForQuestionAnswering
* Adding docs
* Fix wrong doc
* Update src/transformers/models/mt5/modeling_mt5.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/t5/modeling_t5.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* File formatting
* Undoing change
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-06-27 10:07:06 -04:00
Sourab Mangrulkar
bcf02ec701
Update hyperparameter_search.py ( #24515 )
...
* Update hyperparameter_search.py
* resolve comments
2023-06-27 18:42:15 +05:30
Wang, Yi
6fe8d198e3
use accelerate autocast in jit eval path, since mix precision logic is… ( #24460 )
...
use accelerate autocast in jit eval path, since mix precision logic is in accelerator currently
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-06-27 08:33:21 -04:00
Hyeonseo Yun
0863436b6c
🌐 [i18n-KO] Translated `tflite.mdx` to Korean ( #24435 )
...
* docs: ko: tflite.mdx
* feat: nmt and manual edit `tflite.mdx`
* revised: resolve suggestions tflite.mdx
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* revised: resolve suggestions and new line tflite.mdx
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
---------
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-06-27 08:18:42 -04:00
Yih-Dar
4abd3ee479
Fix poor past ci ( #24485 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-27 14:14:17 +02:00
Xiaoli Wang
239ace152b
Fix TypeError: Object of type int64 is not JSON serializable ( #24340 )
...
* Fix TypeError: Object of type int64 is not JSON serializable
* Convert numpy.float64 and numpy.int64 to float and int for json serialization
* Black reformatted examples/pytorch/token-classification/run_ner_no_trainer.py
* make style
2023-06-27 12:15:49 +01:00
Joao Gante
ac19871ce2
Generate: `min_tokens_to_keep` has to be `>= 1` ( #24453 )
2023-06-27 11:48:23 +01:00
Joao Gante
5f3efdf762
Generate: `group_beam_search` requires `diversity_penalty>0.0` ( #24456 )
...
* add exception
* update docs
2023-06-27 10:46:39 +01:00
hukuda222
43479ef98f
🚨 🚨 Fix group beam search ( #24407 )
...
* group_beam_search now works correctly
* add argument descriptions
* add a comment
* format
* make style
* change comment
* Update src/transformers/generation/beam_search.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
---------
Co-authored-by: shogo.fujita <shogo.fujita@legalontech.jp>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-06-27 10:43:10 +01:00
Gema Parreño
68c92981ff
Fix link in utils ( #24501 )
...
* fix link
* new link
---------
Co-authored-by: Gema <gema@mbp-de-gema-2.lan>
2023-06-26 14:26:09 -04:00
Yih-Dar
7b4e3b5b40
Compute `dropout_probability` only in training mode (SpeechT5) ( #24498 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-26 19:43:06 +02:00
Tomoko Uchida
c9fd49853f
Fix 'local_rank' AttributeError in Trainer class ( #24297 )
...
fix attribute error
2023-06-26 13:38:29 -04:00
Yih-Dar
850cf4af0c
Compute `dropout_probability` only in training mode ( #24486 )
...
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-26 18:36:47 +02:00
Younes Belkada
9895670e95
[`InstructBlip`] Add accelerate support for instructblip ( #24488 )
...
* add accelerate support for instructblip
* add `_keep_in_fp32_modules`
* dynamically adapt `_no_split_modules`
* better fix
* same logic for `_keep_in_fp32_modules`
2023-06-26 18:36:27 +02:00
Sylvain Gugger
5757923888
Add support for for loops in python interpreter ( #24429 )
...
Add support for for loops
2023-06-26 09:58:14 -04:00
condor-cp
c2aa5e17e4
Update token_classification.md ( #24484 )
...
Add link to PyTorch CrossEntropyLoss so that one understands why '-100' is ignored by the loss function.
2023-06-26 08:42:38 -04:00
Yih-Dar
3ca022238b
Update `InstructBlipModelIntegrationTest` ( #24490 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-26 14:37:12 +02:00
Sourab Mangrulkar
195a9e5bdb
deepspeed z1/z2 state dict fix ( #24489 )
...
* deepspeed z2/z1 state_dict bloating fix
* update
* version check
2023-06-26 17:45:37 +05:30
Wang, Yi
c8aff1d3e6
when resume from peft checkpoint, the model should be trainable ( #24463 )
2023-06-26 08:07:27 -04:00
Younes Belkada
914289ac4b
[`pipeline`] Fix str device issue ( #24396 )
...
* fix str device issue
* fixup
* adapt from suggestions
* forward contrib credits from suggestions
* better fix
* added backward compatibility for older PT versions
* final fixes
* oops
* Attempting something with less branching.
---------
Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-06-26 13:58:36 +02:00
amyeroberts
892399c5ff
Update AlbertModel type annotation ( #24450 )
...
Update type annotation
2023-06-26 10:59:42 +01:00
Meghan Cowan
be2d9f2e47
Fix tpu_metrics_debug ( #24452 )
...
fix for tpu metrics debugs string
2023-06-26 10:59:07 +01:00
Matthijs Hollemans
3b84d86b57
add missing alignment_heads to Whisper integration test ( #24487 )
...
add missing alignment heads
2023-06-26 11:50:10 +02:00
NielsRogge
868363abb9
Add InstructBLIP ( #23460 )
...
* Squash 88 commits
* Use markdown
* Remove mdx files due to bad rebase
* Fix modeling files due to bad rebase
* Fix style
* Update comment
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-26 11:23:57 +02:00
Matt
8e164c5400
Improved keras imports ( #24448 )
...
* An end to accursed version-specific imports
* No more K.is_keras_tensor() either
* Update dependency tables
* Use a cleaner call context function getter
* Add a cap to <2.14
* Add cap to examples requirements too
2023-06-23 19:09:34 +01:00
Yih-Dar
1e9da2b0a6
Update `JukeboxConfig.from_pretrained` ( #24443 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-23 15:00:52 +02:00
Sanchit Gandhi
8767958fc1
Allow dict input for audio classification pipeline ( #23445 )
...
* Allow dict input for audio classification pipeline
* make style
* Empty commit to trigger CI
* Empty commit to trigger CI
* check for torchaudio
* add pip instructions
Co-authored-by: Sylvain <sylvain.gugger@gmail.com>
* Update src/transformers/pipelines/audio_classification.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* asr -> audio class
* asr -> audio class
---------
Co-authored-by: Sylvain <sylvain.gugger@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-06-23 13:50:37 +01:00
Sourab Mangrulkar
a6f37f8879
fixes issue when saving fsdp via accelerate's FSDP plugin ( #24446 )
2023-06-23 18:03:57 +05:30
Yih-Dar
2898fd3968
Fix some `TFWhisperModelIntegrationTests` ( #24428 )
...
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-06-23 14:27:49 +02:00
Moon Gi Cho
5e9f6752ee
Fix typo ( #24440 )
2023-06-23 08:21:08 -04:00
Bowen Bao
a28325e25e
Replace python random with torch.rand to enable dynamo.export ( #24434 )
...
* Replace python random with torch.rand to enable dynamo.export
* revert changes to flax model code
* Remove unused random import
* Fix torch template
* Move torch.manual_seed(0) to right location
2023-06-23 08:17:21 -04:00
Sourab Mangrulkar
c036c814f4
fix the grad_acc issue at epoch boundaries ( #24415 )
...
* fix the grad_acc issue at epoch boundaries
Co-Authored-By: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
* add contributors.
Co-authored-by: sumpster
* address comments
---------
Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
2023-06-23 17:43:07 +05:30
Younes Belkada
468aed39af
[`Trainer`] Fix `.to` call on 4bit models ( #24444 )
...
* fix `.to` call on 4bit models
* better check
2023-06-23 13:35:04 +02:00
Sanchit Gandhi
ea91c2adca
[AutoModel] Add AutoModelForTextEncoding ( #24305 )
...
* [AutoModel] Add AutoModelForTextEncoding
* add mt5
* add other models
* add to docs
* fix tf imports
* add tf to docs / init
* up
* fix inits
* add to dummy objects
2023-06-23 10:01:37 +01:00
Weiming Zhao
feb83521ec
[llama] Fix comments in weights converter ( #24436 )
...
Explain the reason to clone tensor
2023-06-22 20:38:53 -04:00
Yih-Dar
2c977e4a90
Save `site-packages` as cache in CircleCI job ( #24424 )
...
* fix
* fix
* Upgrade complete!
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-22 23:16:35 +02:00
Sylvain Gugger
2834c17ad2
Clarify batch size displayed when using DataParallel ( #24430 )
2023-06-22 14:46:20 -04:00
Alex Hall
b6295b26c5
Refactor hyperparameter search backends ( #24384 )
...
* Refactor hyperparameter search backends
* Simpler refactoring without abstract base class
* black
* review comments: specify name in class, use methods instead of callable class attributes, name constant better
* review comments: safer bool checking, log multiple available backends
* test ALL_HYPERPARAMETER_SEARCH_BACKENDS vs HPSearchBackend in unit test, not module. format with black.
* copyright
2023-06-22 14:28:25 -04:00
Matt
a1c4b63076
TF CI fix for Segformer ( #24426 )
...
Fix segformer so compilation can figure out the channel dim
2023-06-22 15:49:13 +01:00