YQ
2a78720104
override .cuda() to check if model is already quantized ( #25166 )
2023-07-28 08:17:24 -04:00
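The pattern behind this commit — refusing `.cuda()` on an already-quantized model because moving int8/4-bit weights would corrupt them — can be sketched as follows. This is an illustrative stand-in, not the actual transformers code; the class and attribute names are hypothetical.

```python
# Minimal sketch of guarding .cuda() on a quantized model (hypothetical
# names; the real check lives in transformers' PreTrainedModel).
class Model:
    def __init__(self, quantized=False):
        self.is_quantized = quantized

    def cuda(self):
        if self.is_quantized:
            # Quantized weights are already dispatched; a raw device move
            # would invalidate them, so fail loudly instead.
            raise ValueError(
                "Calling .cuda() is not supported for quantized models."
            )
        return self  # a real model would move its parameters here
```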
Lucain
c1dba1111b
Add test when downloading from gated repo ( #25039 )
2023-07-28 08:14:27 -04:00
Lucain
6232c380f2
Fix .push_to_hub and cleanup get_full_repo_name usage ( #25120 )
...
* Fix .push_to_hub and cleanup get_full_repo_name usage
* Do not rely on Python bool conversion magic
* request changes
2023-07-28 11:40:08 +02:00
Sylvain Gugger
400e76ef11
Add new model in doc table of content ( #25148 )
2023-07-27 13:41:50 -04:00
Sanchit Gandhi
e93103632b
Add bloom flax ( #25094 )
...
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix
- do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass:
- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test
- remove device from build alibi
* refactor call
- refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes
- remove unused code
- refactor a bit
- revert import `torch`
* major refactoring
- change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* per-weight check
* style
* line formats
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-27 18:24:56 +01:00
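Several bullets in the Bloom Flax commit above revolve around ALiBi (the linear attention bias that replaces position embeddings in Bloom). A minimal NumPy sketch of the bias construction, assuming a power-of-two head count (the real code also interpolates slopes for other head counts and is written in Flax):

```python
import numpy as np

def alibi_slopes(num_heads):
    # Per-head slopes for a power-of-two head count: 2^(-8/n), 2^(-16/n), ...
    start = 2.0 ** (-8.0 / num_heads)
    return start ** np.arange(1, num_heads + 1)

def build_alibi_bias(num_heads, seq_len):
    # Linear bias over key positions, shape (num_heads, 1, seq_len);
    # it is added to the attention scores before the softmax, and the
    # "do the alibi broadcasting only once" bullet refers to hoisting
    # exactly this computation out of the per-layer call.
    slopes = alibi_slopes(num_heads)        # (num_heads,)
    positions = np.arange(seq_len)          # (seq_len,)
    return slopes[:, None, None] * positions[None, None, :]
```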
Yih-Dar
0c790ddbd1
More token things ( #25146 )
...
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-27 17:42:07 +02:00
Yoach Lacombe
0b92ae3489
Add offload support to Bark ( #25037 )
...
* initial Bark offload proposal
* use hooks instead of manually offloading
* add test of bark offload to cpu feature
* Apply nit suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docstrings of offload
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unnecessary set_seed in Bark tests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-07-27 15:35:17 +01:00
Arthur
9cea3e7b80
[MptConfig] support from pretrained args ( #25116 )
...
* support from pretrained args
* draft addition of tests
* update test
* use parent assertTrue
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-07-27 16:24:52 +02:00
Zach Mueller
a1c4954d25
🚨 🚨 🚨 Change default from adamw_hf to adamw_torch 🚨 🚨 🚨 ( #25109 )
...
* Change defaults
* Sylvain's comments
2023-07-27 09:11:28 -04:00
Bram Vanroy
9a220ce30c
Clarify 4/8 bit loading log message ( #25134 )
...
* clarify 4/8 bit loading log message
* make style
2023-07-27 09:09:27 -04:00
Arthur
9429642e2d
[T5/LlamaTokenizer] default legacy to None to not always warn ( #25131 )
...
default legacy to None
2023-07-27 14:43:18 +02:00
Pbihao
de9e3b5945
fix delete all checkpoints when save_total_limit is set to 1 ( #25136 )
2023-07-27 08:34:02 -04:00
Sourab Mangrulkar
a004237926
fix deepspeed load best model at end when the model gets sharded ( #25057 )
2023-07-27 07:11:43 +05:30
amyeroberts
1689aea733
Move center_crop to BaseImageProcessor ( #25122 )
2023-07-26 18:30:38 +01:00
amyeroberts
659829b6ae
MaskFormer - enable return_dict in order to compile ( #25052 )
...
* Enable return_dict in order to compile
* Update tests
2023-07-26 16:23:30 +01:00
Eric Bezzam
b914ec9847
Fix ViT docstring regarding default dropout values. ( #25118 )
...
Fix docstring for dropout.
2023-07-26 11:08:57 -04:00
amyeroberts
1486d2aec2
Move common image processing methods to BaseImageProcessor ( #25089 )
...
Move out common methods
2023-07-26 15:09:17 +01:00
Yih-Dar
d30cf3d02f
Fix past CI after #24334 ( #25113 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:34:42 +02:00
Yih-Dar
224da5df69
update use_auth_token -> token ( #25083 )
...
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:09:59 +02:00
Leo
c53c8e490c
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … ( #24772 )
...
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor."
Co-authored-by: 刘长伟 <hzliuchw@corp.netease.com>
2023-07-26 09:07:21 -04:00
David Reguera
04a5c859b0
Add descriptive docstring to TemperatureLogitsWarper ( #24892 )
...
* Add descriptive docstring to TemperatureLogitsWarper
It addresses https://github.com/huggingface/transformers/issues/24783
* Remove niche features
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Commit suggestion
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Refactor the examples to simpler ones
* Add a missing comma
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Make args description more compact
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Remove extra text after making description more compact
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Fix linter
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-26 08:58:26 -04:00
Yih-Dar
31acba5697
Fix PvtModelIntegrationTest::test_inference_fp16 ( #25106 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 14:57:44 +02:00
Kihoon Son
ee63520a7b
🌐 [i18n-KO] Translated pipeline_webserver.md to Korean ( #24828 )
...
* translated pipeline_webserver.md
Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update pipeline_webserver.md
* Apply suggestions from code review
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>
2023-07-26 08:40:37 -04:00
Shauray Singh
277d3aed0a
documentation for llama2 models ( #25102 )
...
* fix documentation
* changes
2023-07-26 08:30:33 -04:00
Marc Sun
a5cc30d72a
fix tied_params for meta tensor ( #25101 )
...
* fix tied_params for meta tensor
* remove duplicate
2023-07-25 18:08:45 -04:00
dependabot[bot]
f1deb21fce
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert ( #25097 )
...
Bump certifi in /examples/research_projects/visual_bert
Bumps [certifi](https://github.com/certifi/python-certifi ) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22 )
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:14 -04:00
dependabot[bot]
45bde362d2
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer ( #25098 )
...
Bump certifi in /examples/research_projects/decision_transformer
Bumps [certifi](https://github.com/certifi/python-certifi ) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22 )
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:05 -04:00
dependabot[bot]
6b8dbc283c
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/lxmert ( #25096 )
...
Bump certifi in /examples/research_projects/lxmert
Bumps [certifi](https://github.com/certifi/python-certifi ) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22 )
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:24:50 -04:00
Yih-Dar
da5ff18a4a
Fix doctest ( #25031 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-25 22:10:06 +02:00
Sebastian Husch Lee
8f36ab3e22
[T5, MT5, UMT5] Add [T5, MT5, UMT5]ForSequenceClassification ( #24726 )
...
* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner
2023-07-25 21:02:49 +02:00
Yih-Dar
21150cb0f3
Hotfix for failing MusicgenForConditionalGeneration tests ( #25091 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-25 20:26:00 +02:00
Arthur
f9cc333805
[PreTrainedTokenizerFast] Keep properties from fast tokenizer ( #25053 )
...
* draft solution
* use `setdefault`
* nits
* add tests and fix truncation issue
* fix test
* test passes locally
* quality
* updates
* update tests
2023-07-25 18:45:01 +02:00
Connor Henderson
0779fc8eb8
Edit err message and comment in test_model_is_small ( #25087 )
...
* Edit err message and comment in
* put back 80M comment
2023-07-25 12:24:36 -04:00
Arthur
2fac342238
[TF] Also apply patch to support left padding ( #25085 )
...
* tf versions
* apply changes to other models
* 3 models slipped through the cracks
2023-07-25 11:23:09 -04:00
Arthur
f104522718
[ForSequenceClassification] Support left padding ( #24979 )
...
* support left padding
* nit
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
2023-07-25 16:19:43 +02:00
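Why left padding needs explicit support in *ForSequenceClassification heads: the pooled logit is read at the last real token, so its index must be located per row rather than assumed to sit at the end of the sequence. A hedged pure-Python sketch of that lookup (a hypothetical helper, not the PR's code, which does the equivalent with tensor ops):

```python
def last_token_index(input_ids, pad_token_id):
    # Index of the last non-pad token in each row; assumes every row
    # contains at least one real token.
    return [max(i for i, tok in enumerate(row) if tok != pad_token_id)
            for row in input_ids]

# Right padding: the last real token is mid-sequence.
# Left padding: it is at the very end — naively using seq_len - 1 would
# read a pad position under right padding, hence the per-row search.
```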
Yih-Dar
1e662f0f07
Allow generic composite models to pass more kwargs ( #24927 )
...
* fix
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-25 16:07:00 +02:00
김준재_T3056
b51312e24d
🌐 [i18n-KO] Translated perf_infer_cpu.md to Korean ( #24920 )
...
* docs: ko: perf_infer_cpu.md
* feat: chatgpt draft
* fix: manual edits
* Update docs/source/ko/_toctree.yml
* Update docs/source/ko/perf_infer_cpu.md
* Update docs/source/ko/perf_infer_cpu.md
This part bothered me too. I'll apply the suggestion!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
Agreed! I was too tied to the original text!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
As you said, I was too attached to the source text
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
Thank you for the better word choice!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
I couldn't come up with the term "cycle" at the time...
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
The context reads more naturally now!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
No need to stick to the original formatting after all!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
---------
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
2023-07-25 16:04:14 +02:00
Gema Parreño
b99f7bd4fc
[DOCS] add example NoBadWordsLogitsProcessor ( #25046 )
...
* add example NoBadWordsLogitsProcessor
* fix L764 & L767
* make style
2023-07-25 09:41:48 -04:00
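The idea behind `NoBadWordsLogitsProcessor`, documented in the commit above, can be shown in a few lines: banned token ids get their logits set to negative infinity so they can never be sampled. This is an illustrative re-implementation for single-token bad words only; the real processor also handles multi-token sequences by tracking prefixes.

```python
import math

def ban_tokens(logits, bad_word_ids):
    # Replace the score of every banned token id with -inf so that
    # softmax assigns it zero probability (single-token case only).
    banned = set(bad_word_ids)
    return [-math.inf if i in banned else score
            for i, score in enumerate(logits)]
```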
Arthur
dcb183f4bd
[MPT] Add MosaicML's MPT model to transformers ( #24629 )
...
* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke ande updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match at 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix nit
* add another slow test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-25 14:32:40 +02:00
Xiaoke Huang
1dbc1440a7
Fix: repeat per sample for SAM image embeddings ( #25074 )
...
Repeat per sample for SAM image embeddings
2023-07-25 08:30:14 -04:00
Harheem Kim
cb8abee511
🌐 [i18n-KO] Translated hpo_train.md to Korean ( #24968 )
...
* docs: ko: hpo_train.mdx
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
2023-07-25 08:28:20 -04:00
Arthur
f2c1df93f5
[generate] Only warn users if the generation_config's max_length is set to the default value ( #25030 )
...
* check max length is default
* nit
* update warning: no-longer deprecate
* comment in the configuration_utils in case max length's default gets changed in the future
2023-07-25 14:20:37 +02:00
Alan Ji
c879318cc5
replace per_gpu_eval_batch_size with per_device_eval_batch_size in readme of multiple-choice task ( #25078 )
...
replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size`
in readme of multiple-choice
2023-07-25 08:11:56 -04:00
Susnato Dhar
25e443c0d4
Fix broken link in README_hd.md ( #25067 )
...
Update README_hd.md
2023-07-25 08:09:01 -04:00
Xuehai Pan
6bc61aa7af
Set TF32 flag for PyTorch cuDNN backend ( #25075 )
2023-07-25 08:04:48 -04:00
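For context on the TF32 commit above: PyTorch exposes two independent TF32 switches, one for matmuls and one for cuDNN (convolutions); this PR concerns aligning the cuDNN one. A configuration sketch of the flags involved:

```python
import torch

# TF32 trades mantissa precision (10 bits) for large speedups on
# Ampere-and-newer GPUs. The two flags are independent:
torch.backends.cuda.matmul.allow_tf32 = True  # matrix multiplications
torch.backends.cudnn.allow_tf32 = True        # cuDNN convolutions
```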
Injin Paek
5dba88b2d2
fix: add TOC anchor link ( #25066 )
2023-07-25 08:02:33 -04:00
Sylvain Gugger
f295fc8a16
Fix last models for common tests that are too big. ( #25058 )
...
* Fix last models for common tests that are too big.
* Remove print statement
2023-07-25 07:56:04 -04:00
Sangam Lee
ee1eb3b325
🌐 [i18n-KO] Translated perf_hardware.md to Korean ( #24966 )
...
* docs: ko: perf_hardware.md
* feat: nmt draft
* fix: manual edits
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
* Fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: fix rendering error of perf_hardware.md
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
2023-07-25 07:44:24 -04:00
Haewon Kim
f6fe1d5514
🌐 [i18n-KO] Translated <tf_xla>.md to Korean ( #24904 )
...
* docs: ko: tf_xla.md
* feat: chatgpt draft
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: resolve suggestions
2023-07-25 07:43:22 -04:00
Kashif Rasul
faf25c040d
[Docs] fix rope_scaling doc string ( #25072 )
...
fix rope_scaling doc string
2023-07-25 07:34:10 -04:00