amyeroberts
30409af6e1
Update InstructBLIP & Align values after rescale update ( #25209 )
...
* Update InstructBLIP values
Note: the tests are not independent. Running the test independently produces different logits compared to running all the integration tests.
* Update test values after rescale update
* Remove left over commented out code
* Revert to previous rescaling logic
* Update rescale tests
2023-08-03 11:01:10 +01:00
Tom Aarsen
15082a9dc6
Docs: Update list of `report_to` logging integrations in docstring ( #25281 )
...
* Update list of logging integrations in docstring
Also update type hint
* Also add 'flyte' to report_to callback list
* Revert 'report_to' type hint update
Due to CLI breaking
2023-08-03 11:34:45 +02:00
Yih-Dar
2bd7a27a67
CI with `pytest_num_workers=8` for torch/tf jobs ( #25274 )
...
n8
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 22:00:32 +02:00
Yih-Dar
bd90cda9a6
CI with `num_hidden_layers=2` 🚀 🚀 🚀 ( #25266 )
...
* CI with layers=2
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 20:22:36 +02:00
Patrick von Platen
b28ebb2655
[MMS] Fix mms ( #25267 )
...
* [MMS] Fix mms
* [MMS] Fix mms
* fix mms loading
* Apply suggestions from code review
* make style
* Update tests/models/wav2vec2/test_modeling_wav2vec2.py
2023-08-02 18:11:15 +02:00
Kevin Lloyd Bernal
ad8321512d
recommend DeepSpeed's Argument Parsing documentation ( #25268 )
2023-08-02 11:48:39 -04:00
heuristicwave
bef02fd6b9
🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean ( #24943 )
...
* doc: ko: perf_infer_gpu_many.mdx
* feat: chatgpt draft
* fix: manual edits
* Update docs/source/ko/perf_infer_gpu_many.md
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
---------
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-08-02 16:06:35 +02:00
Yih-Dar
8edd0da960
Remove `pytest_options={"rA": None}` in CI ( #25263 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 14:53:05 +02:00
Euan Ong
1baeed5bdf
Fix return_dict_in_generate bug in InstructBlip generate function ( #25246 )
...
Fix bug in InstructBlip generate function
Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`).
This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object.
2023-08-02 13:43:54 +01:00
Ashish Thomas Chempolil
eec0d84e6a
[DOCS] Add example and modified docs of EtaLogitsWarper ( #25125 )
...
* added example and modified docs for EtaLogitsWarper
* make style
* fixed styling issue on 544
* removed error info and added set_seed
* Update src/transformers/generation/logits_process.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/generation/logits_process.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* updated the results
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-08-02 11:55:56 +01:00
Yupeng Jia
8021c684ec
Fix some bugs for two stage training of deformable detr ( #25045 )
...
* Update modeling_deformable_detr.py
Fix bugs for two stage training
* Update modeling_deformable_detr.py
* Add test_two_stage_training to DeformableDetrModelTest
---------
Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai>
2023-08-02 11:30:36 +01:00
amyeroberts
1b35409768
Update rescale tests - cast to float after rescaling to reflect #25229 ( #25259 )
...
Rescale tests - cast to float after rescaling to reflect #25229
2023-08-02 11:29:55 +01:00
Sourab Mangrulkar
904e7e0f3c
resolving zero3 init when using accelerate config with Trainer ( #25227 )
...
* resolving zero3 init when using accelerate config with Trainer
* refactor
* fix
* fix import
2023-08-02 15:07:27 +05:30
Yih-Dar
149cb0cce2
Add `token` argument in example scripts ( #25172 )
...
* fix
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 11:17:31 +02:00
YQ
c6a8768dab
add pathname and line number to logging formatter in debug mode ( #25203 )
...
* add pathname and lineno to logging formatter in debug mode
* use TRANSFORMERS_VERBOSITY="detail" to print pathname and lineno
2023-08-02 09:44:43 +01:00
YQ
2230d149f0
fix get_keys_to_not_convert() to return correct modules for full precision inference ( #25105 )
...
* add test for `get_keys_to_not_convert`
* add minimum patch to keep mpt lm_head from 8bit quantization
* add revision to
2023-08-02 04:21:52 -04:00
Sylvain Gugger
f6f567d0be
Fix set of model parallel in the Trainer when no GPUs are available ( #25239 )
2023-08-02 03:29:00 -04:00
amyeroberts
d27e4c18fe
Move rescale dtype recasting to match torchvision ToTensor ( #25229 )
...
Move dtype recasting to match torchvision ToTensor
2023-08-01 12:33:12 +01:00
Younes Belkada
3170af71e1
[`Detr`] Fix detr BatchNorm replacement issue ( #25230 )
...
* fix detr weird issue
* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies
* fix copies
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-08-01 12:21:48 +02:00
Younes Belkada
05ebb0264e
[`MPT`] Add `require_bitsandbytes` on MPT integration tests ( #25201 )
...
* add `require_bitsandbytes` on MPT integration tests
* add it on mpt as well
2023-08-01 12:20:34 +02:00
Younes Belkada
972fdcc778
[`Docs`/`quantization`] Clearer explanation on how things work under the hood. + remove outdated info ( #25216 )
...
* clearer explanation on how things work under the hood.
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add `load_in_4bit` in `from_pretrained`
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-08-01 10:56:52 +02:00
Younes Belkada
77c3973e8f
[`Pix2Struct`] Fix pix2struct cross attention ( #25200 )
...
* fix pix2struct cross attention
* fix torchscript slow test
2023-08-01 10:56:37 +02:00
Wang, Yi
4033ea7167
make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… ( #25193 )
...
make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-08-01 01:35:49 -04:00
Yih-Dar
0fd8d2aa2c
Fix docker image build failure ( #25214 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 20:13:15 +02:00
Yih-Dar
1b4f6199c6
Update tiny model info. and pipeline testing ( #25213 )
...
* update tiny_model_summary.json
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 19:35:33 +02:00
Younes Belkada
e0c50b274a
[`pipeline`] revisit device check for pipeline ( #25207 )
...
* revisit device check for pipeline
* let's raise an error.
2023-07-31 18:43:21 +02:00
Stas Bekman
5220606607
[quantization.md] fix ( #25190 )
...
Update quantization.md
2023-07-31 09:37:29 -07:00
Yih-Dar
9ca3aa0156
Fix `all_model_classes` in `FlaxBloomGenerationTest` ( #25211 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 17:32:05 +02:00
Younes Belkada
59dcea3fe4
[`PreTrainedModel`] Wrap `cuda` and `to` method correctly ( #25206 )
...
wrap `cuda` and `to` method correctly
2023-07-31 17:25:09 +02:00
Yih-Dar
67b85f24de
Better error message in `_prepare_output_docstrings` ( #25202 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 16:15:02 +02:00
Joao Gante
4a564490e1
Musicgen: CFG is manually added ( #25173 )
2023-07-31 11:21:11 +01:00
amyeroberts
05cda5df34
🚨 🚨 🚨 Fix rescale ViVit Efficientnet ( #25174 )
...
* Fix rescaling bug
* Add tests
* Update integration tests
* Fix up
* Update src/transformers/image_transforms.py
* Update test - new possible order in list
2023-07-28 19:52:51 +01:00
Sanchit Gandhi
03f98f9683
[MusicGen] Fix integration tests ( #25169 )
...
* move to device
* update with cuda values
* fix fp16
* more rigorous
2023-07-28 18:50:15 +01:00
Yoni Gottesman
c90e14fb0f
Fix beam search to sample at least 1 non eos token ( #25103 ) ( #25115 )
2023-07-28 13:20:24 -04:00
Sohyun Sim
31f137c04f
🌐 [i18n-KO] Translated `transformers_agents.md` to Korean ( #24881 )
...
* docs: ko: transformers_agents.md
* docs: ko: transformers_agents.md
* feat: deepl draft
* fix: manual edits
* fix: resolve suggestions
Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com>
Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com>
---------
Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com>
Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com>
2023-07-28 13:06:37 -04:00
Younes Belkada
dd9d45b6ec
[`InstructBlip`] Fix instructblip slow test ( #25171 )
...
* fix instruct blip slow test
* Update tests/models/instructblip/test_modeling_instructblip.py
2023-07-28 17:00:10 +02:00
Younes Belkada
add0895dd9
[`Mpt`] Fix mpt slow test ( #25170 )
...
fix mpt slow test
2023-07-28 16:45:09 +02:00
Yih-Dar
d53b8ad780
Update `use_auth_token` -> `token` in example scripts ( #25167 )
...
* pytorch examples
* tensorflow examples
* flax examples
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-28 15:33:45 +02:00
Alexander Markov
3cbc560d03
added compiled model support for inference ( #25124 )
...
* added compiled model support for inference
* linter
* Fix tests
* linter
* linter
* remove inference mode from pipelines
* Linter
---------
Co-authored-by: amarkov <alexander@inworld.ai>
2023-07-28 08:28:04 -04:00
Alan Ji
afa96fffdf
make run_generation more generic for other devices ( #25133 )
...
* make run_generation more generic for other devices
* use Accelerate to support any device type it supports.
* make style
* fix error usage of accelerator.prepare_model
* use `PartialState` to make sure everything is running on the right device
---------
Co-authored-by: statelesshz <jihuazhong1@huawei.com>
2023-07-28 08:20:10 -04:00
jiqing-feng
d23d2c27c2
Represent query_length in a different way to solve jit issue ( #25164 )
...
Fix jit trace
2023-07-28 08:19:10 -04:00
YQ
2a78720104
override .cuda() to check if model is already quantized ( #25166 )
2023-07-28 08:17:24 -04:00
Lucain
c1dba1111b
Add test when downloading from gated repo ( #25039 )
2023-07-28 08:14:27 -04:00
Lucain
6232c380f2
Fix `.push_to_hub` and cleanup `get_full_repo_name` usage ( #25120 )
...
* Fix .push_to_hub and cleanup get_full_repo_name usage
* Do not rely on Python bool conversion magic
* request changes
2023-07-28 11:40:08 +02:00
Sylvain Gugger
400e76ef11
Add new model in doc table of content ( #25148 )
2023-07-27 13:41:50 -04:00
Sanchit Gandhi
e93103632b
Add bloom flax ( #25094 )
...
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix
- do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass:
- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test
- remove device from build alibi
* refactor call
- refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes
- remove unused code
- refactor a bit
- revert import `torch`
* major refactoring
- change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* per-weight check
* style
* line formats
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-27 18:24:56 +01:00
Yih-Dar
0c790ddbd1
More `token` things ( #25146 )
...
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-27 17:42:07 +02:00
Yoach Lacombe
0b92ae3489
Add offload support to Bark ( #25037 )
...
* initial Bark offload proposal
* use hooks instead of manually offloading
* add test of bark offload to cpu feature
* Apply nit suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docstrings of offload
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unnecessary set_seed in Bark tests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-07-27 15:35:17 +01:00
Arthur
9cea3e7b80
[`MptConfig`] support from pretrained args ( #25116 )
...
* support from pretrained args
* draft addition of tests
* update test
* use parent assert true
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-07-27 16:24:52 +02:00
Zach Mueller
a1c4954d25
🚨 🚨 🚨 Change default from `adamw_hf` to `adamw_torch` 🚨 🚨 🚨 ( #25109 )
...
* Change defaults
* Sylvain's comments
2023-07-27 09:11:28 -04:00