Yih-Dar
77db28dc52
Update some torchscript tests after #24505 ( #24566 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-29 16:05:24 +02:00
Sanchit Gandhi
1c1c90756d
Add Musicgen ( #24109 )
...
* Add Audiocraft
* add cross attention
* style
* add for lm
* convert and verify
* introduce t5
* split configs
* load t5 + lm
* clean conversion
* copy from t5
* style
* start pattern provider
* make generation work
* style
* fix pos embs
* propagate shape changes
* propagate shape changes
* style
* delay pattern: pad tokens at end
* audiocraft -> musicgen
* fix inits
* add mdx
* style
* fix pad token in processor
* override generate and add todos
* add init to test
* undo pattern delay mask after gen
* remove cfg logits processor
* remove cfg logits processor
* remove logits processor in favour of mask
* clean pos embs
* make fix copies
* update readmes
* clean pos emb
* refactor encoder/decoder
* make fix copies
* update conversion
* fix config imports
* update config docs
* make style
* send pattern mask to device
* pattern mask with delay
* recover prompted audio tokens
* fix docstrings
* laydown test file
* pattern edge case
* remove t5 ref
* add processing class
* config refactor
* better pattern comment
* check if mask is not present
* check if mask is not present
* refactor to auto class
* remove encoder configs
* fix processor
* processor import
* start updating conversion
* start updating tests
* make style
* convert t5, encodec, lm
* convert as composite
* also convert processor
* run generate
* classifier free gen
* comments and clean up
* make style
* docs for logit proc
* docstring for uncond gen
* start lm tests
* work tests
* let the lm generate
* refactor: reshape inside forward
* undo greedy loop changes
* from_enc_dec -> from_sub_model
* fix input id shapes in docstrings
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* undo generate changes
* from sub model config
* Update src/transformers/models/musicgen/modeling_musicgen.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make generate work again
* generate uncond -> get uncond inputs
* remove prefix allowed tokens fn
* better error message
* logit proc checks
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* make decoder only tests work
* composite fast tests
* make style
* uncond generation
* feat extr padding
* make audio prompt work
* fix inputs docstrings
* unconditional inputs: dict -> model output
* clean up tests
* more clean up tests
* make style
* t5 encoder -> auto text encoder
* remove comments
* deal with frames
* fix auto text
* slow tests
* nice mdx
* remove can generate
* todo - hub id
* convert m/l
* make fix copies
* only import generation with torch
* ignore decoder from tests
* don't wrap uncond inputs
* make style
* cleaner uncond inputs
* add example to musicgen forward
* fix docs
* ignore MusicGen Model/ForConditionalGeneration in auto mapping
* add doc section to toctree
* add to doc tests
* add processor tests
* fix push to hub in conversion
* tips for decoder only loading
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix conversion for s / m / l checkpoints
* import stopping criteria from module
* remove from pipeline tests
* fix uncond docstring
* decode audio method
* fix docs
* org: sanchit-gandhi -> facebook
* fix max pos embeddings
* remove auto doc (not compatible with shapes)
* bump max pos emb
* make style
* fix doc
* fix config doc
* fix config doc
* ignore musicgen config from docstring
* make style
* fix config
* fix config for doctest
* consistent from_sub_models
* don't automap decoder
* fix mdx save audio file
* fix mdx save audio file
* processor batch decode for audio
* remove keys to ignore
* update doc md
* update generation config
* allow changes for default generation config
* update tests
* make style
* fix docstring for uncond
* fix processor test
* fix processor test
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-06-29 14:48:59 +01:00
Sylvain Gugger
2dc5e1a120
Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" ( #24574 )
...
Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549 )"
This reverts commit c5e29d4381.
2023-06-29 08:14:43 -04:00
Joao Gante
4f1b31c2ee
Docs: 4 bit doc corrections ( #24572 )
...
4 bit doc corrections
2023-06-29 13:13:20 +01:00
MS Kim(tony9402)
1fd52e6e60
Fix annotations ( #24571 )
...
* fix annotations
* fix copies
2023-06-29 08:05:19 -04:00
MS Kim(tony9402)
63cc30e71b
Fix Typo ( #24559 )
2023-06-29 08:04:07 -04:00
amyeroberts
ae454f41d4
Update old existing feature extractor references ( #24552 )
...
* Update old existing feature extractor references
* Typo
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Address comments from review - update 'feature extractor'
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2023-06-29 10:17:36 +01:00
Pasquale De Marinis
10c2ac7bc6
Fixed OwlViTModel inplace operations ( #24529 )
...
* fixed OwlViTModel inplace operations
* fixed operands order in owlvit
2023-06-29 10:17:26 +02:00
condor-cp
66954ea25e
Update masked_language_modeling.md ( #24560 )
...
See https://github.com/huggingface/transformers/issues/24546
2023-06-28 17:54:20 -04:00
Yih-Dar
fd6735102a
Make PT/Flax tests runnable on GPU ( #24557 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 20:11:01 +02:00
Yih-Dar
faae8d8255
Update PT/Flax weight conversion after #24030 ( #24556 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 19:44:31 +02:00
Younes Belkada
33b5ef5cdf
[InstructBlip] Add instruct blip int8 test ( #24555 )
...
* add 8bit instructblip test
* update tests
2023-06-28 19:06:30 +02:00
amyeroberts
c70c88a268
Fix processor __init__ bug if image processor undefined ( #24554 )
...
Make sure feature_extractor is defined in all cases
2023-06-28 17:17:27 +01:00
Younes Belkada
903b97d8df
[gpt2-int8] Add gpt2-xl int8 test ( #24543 )
...
add gpt2-xl test
2023-06-28 18:02:13 +02:00
Yih-Dar
b0651655be
Update EncodecIntegrationTest ( #24553 )
...
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 18:01:41 +02:00
Yih-Dar
6c57ce1558
Update PT/TF weight conversion after #24030 ( #24547 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 16:36:57 +02:00
Max Ryabinin
c5e29d4381
Fix typing annotations for FSDP and DeepSpeed in TrainingArguments ( #24549 )
...
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments
* Change dict to Dict
2023-06-28 10:36:17 -04:00
Frank995
daccde143d
Allow for warn_only selection in enable_full_determinism ( #24496 )
...
* Warn only in enable full determinism
* Add option in the function definition
2023-06-28 08:54:36 -04:00
Yih-Dar
11cb6e0f7e
Unpin DeepSpeed and require DS >= 0.9.3 ( #24541 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 14:01:22 +02:00
Yih-Dar
e84bf1f734
⚠️ Time to say goodbye to py37 ( #24091 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 07:22:39 +02:00
Dario Sučić
12240925cf
Add bitsandbytes support for gpt2 models ( #24504 )
...
* Add bitsandbytes support for gpt2 models
* Guard Conv1D import to pass tensorflow test
* Appease ruff linter
* Fix 4bit test and remove int8 test boilerplate
* Update tests/bnb/test_mixed_int8.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-06-28 05:55:32 +02:00
Sylvain Gugger
89b6ee49fd
Finishing tidying keys to ignore on load ( #24535 )
2023-06-27 21:35:15 -04:00
MS Kim(tony9402)
04f46a22d8
Fix Typo ( #24530 )
...
* Fix Typo
* Fix all copies
2023-06-27 15:38:14 -04:00
amyeroberts
462f77cbce
Allow backbones not in backbones_supported - Maskformer Mask2Former ( #24532 )
...
Allow backbones not in backbones_supported
2023-06-27 20:34:36 +01:00
Sylvain Gugger
8e5d1619b3
Clean load keys ( #24505 )
...
* Preliminary work on some models
* Fix test load missing and make sure nonpersistent buffers are tested
* Always ignore nonpersistent buffers if in state_dict
* Treat models
* More models
* Treat remaining models
* Fix quality
* Fix tests
* Remove draft
* This test is not needed anymore
* Fix copies
* Fix last test
* Newly added models
* Fix last tests
* Address review comments
2023-06-27 14:45:40 -04:00
NielsRogge
53194991e9
[Mask2Former] Remove SwinConfig ( #24259 )
...
Remove SwinConfig
2023-06-27 13:33:55 -04:00
Zach Mueller
fb6a62762f
Fix LR scheduler based on bs from auto bs finder ( #24521 )
...
* One solution
* args -> self
2023-06-27 13:28:26 -04:00
Sylvain Gugger
38db04ece0
Find module name in an OS-agnostic fashion ( #24526 )
...
* Find module name in an OS-agnostic fashion
* address review comment
2023-06-27 13:21:19 -04:00
Yih-Dar
7d150d68ff
Update huggingface_hub commit sha ( #24527 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-27 17:41:55 +02:00
Wang, Yi
4e8929dcbb
set model to training mode before accelerate.prepare ( #24520 )
2023-06-27 10:09:38 -04:00
Sebastian
06910f5a76
[T5] Add T5ForQuestionAnswering and MT5ForQuestionAnswering ( #24481 )
...
* Adding T5ForQuestionAnswering
* Changed weight initialization that results in better initial loss when fine-tuning
* Update to class variables
* Running make fixup
* Running make fix-copies
* Remove model_parallel
* Adding MT5ForQuestionAnswering
* Adding docs
* Fix wrong doc
* Update src/transformers/models/mt5/modeling_mt5.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/t5/modeling_t5.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* File formatting
* Undoing change
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-06-27 10:07:06 -04:00
Sourab Mangrulkar
bcf02ec701
Update hyperparameter_search.py ( #24515 )
...
* Update hyperparameter_search.py
* resolve comments
2023-06-27 18:42:15 +05:30
Wang, Yi
6fe8d198e3
use accelerate autocast in jit eval path, since mix precision logic is… ( #24460 )
...
use accelerate autocast in jit eval path, since mix precision logic is in accelerator currently
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-06-27 08:33:21 -04:00
Hyeonseo Yun
0863436b6c
🌐 [i18n-KO] Translated tflite.mdx to Korean ( #24435 )
...
* docs: ko: tflite.mdx
* feat: nmt and manual edit `tflite.mdx`
* revised: resolve suggestions tflite.mdx
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* revised: resolve suggestions and new line tflite.mdx
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
---------
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-06-27 08:18:42 -04:00
Yih-Dar
4abd3ee479
Fix poor past ci ( #24485 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-27 14:14:17 +02:00
Xiaoli Wang
239ace152b
Fix TypeError: Object of type int64 is not JSON serializable ( #24340 )
...
* Fix TypeError: Object of type int64 is not JSON serializable
* Convert numpy.float64 and numpy.int64 to float and int for json serialization
* Black reformatted examples/pytorch/token-classification/run_ner_no_trainer.py
* make style
2023-06-27 12:15:49 +01:00
Joao Gante
ac19871ce2
Generate: min_tokens_to_keep has to be >= 1 ( #24453 )
2023-06-27 11:48:23 +01:00
Joao Gante
5f3efdf762
Generate: group_beam_search requires diversity_penalty>0.0 ( #24456 )
...
* add exception
* update docs
2023-06-27 10:46:39 +01:00
hukuda222
43479ef98f
🚨 🚨 Fix group beam search ( #24407 )
...
* group_beam_search now works correctly
* add argument descriptions
* add a comment
* format
* make style
* change comment
* Update src/transformers/generation/beam_search.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
---------
Co-authored-by: shogo.fujita <shogo.fujita@legalontech.jp>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-06-27 10:43:10 +01:00
Gema Parreño
68c92981ff
Fix link in utils ( #24501 )
...
* fix link
* new link
---------
Co-authored-by: Gema <gema@mbp-de-gema-2.lan>
2023-06-26 14:26:09 -04:00
Yih-Dar
7b4e3b5b40
Compute dropout_probability only in training mode (SpeechT5) ( #24498 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-26 19:43:06 +02:00
Tomoko Uchida
c9fd49853f
Fix 'local_rank' AttributeError in Trainer class ( #24297 )
...
fix attribute error
2023-06-26 13:38:29 -04:00
Yih-Dar
850cf4af0c
Compute dropout_probability only in training mode ( #24486 )
...
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-26 18:36:47 +02:00
Younes Belkada
9895670e95
[InstructBlip] Add accelerate support for instructblip ( #24488 )
...
* add accelerate support for instructblip
* add `_keep_in_fp32_modules`
* dynamically adapt `_no_split_modules`
* better fix
* same logic for `_keep_in_fp32_modules`
2023-06-26 18:36:27 +02:00
Sylvain Gugger
5757923888
Add support for for loops in python interpreter ( #24429 )
...
Add support for for loops
2023-06-26 09:58:14 -04:00
condor-cp
c2aa5e17e4
Update token_classification.md ( #24484 )
...
Add link to PyTorch CrossEntropyLoss so that one understands why '-100' is ignored by the loss function.
2023-06-26 08:42:38 -04:00
Yih-Dar
3ca022238b
Update InstructBlipModelIntegrationTest ( #24490 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-26 14:37:12 +02:00
Sourab Mangrulkar
195a9e5bdb
deepspeed z1/z2 state dict fix ( #24489 )
...
* deepspeed z2/z1 state_dict bloating fix
* update
* version check
2023-06-26 17:45:37 +05:30
Wang, Yi
c8aff1d3e6
when resume from peft checkpoint, the model should be trainable ( #24463 )
2023-06-26 08:07:27 -04:00
Younes Belkada
914289ac4b
[pipeline] Fix str device issue ( #24396 )
...
* fix str device issue
* fixup
* adapt from suggestions
* forward contrib credits from suggestions
* better fix
* added backward compatibility for older PT versions
* final fixes
* oops
* Attempting something with less branching.
---------
Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-06-26 13:58:36 +02:00