Commit Graph

15053 Commits

Author SHA1 Message Date
YQ
2a78720104
override .cuda() to check if model is already quantized (#25166) 2023-07-28 08:17:24 -04:00
Lucain
c1dba1111b
Add test when downloading from gated repo (#25039) 2023-07-28 08:14:27 -04:00
Lucain
6232c380f2
Fix .push_to_hub and cleanup get_full_repo_name usage (#25120)
* Fix .push_to_hub and cleanup get_full_repo_name usage

* Do not rely on Python bool conversion magic

* request changes
2023-07-28 11:40:08 +02:00
Sylvain Gugger
400e76ef11
Add new model in doc table of content (#25148) 2023-07-27 13:41:50 -04:00
Sanchit Gandhi
e93103632b
Add bloom flax (#25094)
* First commit

* step 1 working

* add alibi

* placeholder for `scan`

* add matrix mult alibi

* beta scaling factor for bmm

* working v1 - simple forward pass

* move layer_number from attribute to arg in call

* partial functioning scan

* hacky working scan

* add more modifs

* add test

* update scan for new kwarg order

* fix position_ids problem

* fix bug in attention layer

* small fix

- do the alibi broadcasting only once

* prelim refactor

* finish refactor

* alibi shifting

* incorporate dropout_add to attention module

* make style

* make padding work again

* update

* remove bogus file

* up

* get generation to work

* clean code a bit

* added small tests

* adding alibi test

* make CI tests pass:

- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work

* fix few nits

* fix nit onnx

* fix onnx nit

* add missing dtype args to nn.Modules

* remove debugging statements

* fix scan generate

* Update modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* fix small test issue + make style

* clean up

* Update tests/models/bloom/test_modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* fix function name

* small fix test

* forward contrib credits from PR17761

* Fix failing test

* fix small typo documentation

* fix non passing test

- remove device from build alibi

* refactor call

- refactor `FlaxBloomBlockCollection` module

* make style

* upcast to fp32

* cleaner way to upcast

* remove unused args

* remove layer number

* fix scan test

* make style

* fix i4 casting

* fix slow test

* Update src/transformers/models/bloom/modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove `layer_past`

* refactor a bit

* fix `scan` slow test

* remove useless import

* major changes

- remove unused code
- refactor a bit
- revert import `torch`

* major refactoring

- change build alibi

* remove scan

* fix tests

* make style

* clean-up alibi

* add integration tests

* up

* fix batch norm conversion

* style

* style

* update pt-fx cross tests

* update copyright

* Update src/transformers/modeling_flax_pytorch_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* per-weight check

* style

* line formats

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-27 18:24:56 +01:00
Yih-Dar
0c790ddbd1
More token things (#25146)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-27 17:42:07 +02:00
Yoach Lacombe
0b92ae3489
Add offload support to Bark (#25037)
* initial Bark offload proposal

* use hooks instead of manually offloading

* add test of bark offload to cpu feature

* Apply nit suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docstrings of offload

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove unnecessary set_seed in Bark tests

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-07-27 15:35:17 +01:00
Arthur
9cea3e7b80
[MptConfig] support from pretrained args (#25116)
* support from pretrained args

* draft addition of tests

* update test

* use parent assert true

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-07-27 16:24:52 +02:00
Zach Mueller
a1c4954d25
🚨🚨🚨Change default from adamw_hf to adamw_torch 🚨🚨🚨 (#25109)
* Change defaults

* Sylvain's comments
2023-07-27 09:11:28 -04:00
Bram Vanroy
9a220ce30c
Clarify 4/8 bit loading log message (#25134)
* clarify 4/8 bit loading log message

* make style
2023-07-27 09:09:27 -04:00
Arthur
9429642e2d
[T5/LlamaTokenizer] default legacy to None to not always warn (#25131)
default legacy to None
2023-07-27 14:43:18 +02:00
Pbihao
de9e3b5945
fix delete all checkpoints when save_total_limit is set to 1 (#25136) 2023-07-27 08:34:02 -04:00
Sourab Mangrulkar
a004237926
fix deepspeed load best model at end when the model gets sharded (#25057) 2023-07-27 07:11:43 +05:30
amyeroberts
1689aea733
Move center_crop to BaseImageProcessor (#25122) 2023-07-26 18:30:38 +01:00
amyeroberts
659829b6ae
MaskFormer - enable return_dict in order to compile (#25052)
* Enable return_dict in order to compile

* Update tests
2023-07-26 16:23:30 +01:00
Eric Bezzam
b914ec9847
Fix ViT docstring regarding default dropout values. (#25118)
Fix docstring for dropout.
2023-07-26 11:08:57 -04:00
amyeroberts
1486d2aec2
Move common image processing methods to BaseImageProcessor (#25089)
Move out common methods
2023-07-26 15:09:17 +01:00
Yih-Dar
d30cf3d02f
Fix past CI after #24334 (#25113)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:34:42 +02:00
Yih-Dar
224da5df69
update use_auth_token -> token (#25083)
* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:09:59 +02:00
Leo
c53c8e490c
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772)
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor."

Co-authored-by: 刘长伟 <hzliuchw@corp.netease.com>
2023-07-26 09:07:21 -04:00
David Reguera
04a5c859b0
Add descriptive docstring to TemperatureLogitsWarper (#24892)
* Add descriptive docstring to TemperatureLogitsWarper

It addresses https://github.com/huggingface/transformers/issues/24783

* Remove niche features

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Commit suggestion

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Refactor the examples to simpler ones

* Add a missing comma

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Make args description more compact

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Remove extra text after making description more compact

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix linter

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-26 08:58:26 -04:00
Yih-Dar
31acba5697
Fix PvtModelIntegrationTest::test_inference_fp16 (#25106)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 14:57:44 +02:00
Kihoon Son
ee63520a7b
🌐[i18n-KO] Translated pipeline_webserver.md to Korean (#24828)
* translated pipeline_webserver.md

Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>

* Update pipeline_webserver.md

* Apply suggestions from code review

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>
2023-07-26 08:40:37 -04:00
Shauray Singh
277d3aed0a
documentation for llama2 models (#25102)
* fix documentation

* changes
2023-07-26 08:30:33 -04:00
Marc Sun
a5cc30d72a
fix tied_params for meta tensor (#25101)
* fix tied_params for meta tensor

* remove duplicate
2023-07-25 18:08:45 -04:00
dependabot[bot]
f1deb21fce
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert (#25097)
Bump certifi in /examples/research_projects/visual_bert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:14 -04:00
dependabot[bot]
45bde362d2
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer (#25098)
Bump certifi in /examples/research_projects/decision_transformer

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:05 -04:00
dependabot[bot]
6b8dbc283c
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/lxmert (#25096)
Bump certifi in /examples/research_projects/lxmert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:24:50 -04:00
Yih-Dar
da5ff18a4a
Fix doctest (#25031)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-25 22:10:06 +02:00
Sebastian Husch Lee
8f36ab3e22
[T5, MT5, UMT5] Add [T5, MT5, UMT5]ForSequenceClassification (#24726)
* Initial addition of t5forsequenceclassification

* Adding imports and adding tests

* Formatting

* Running make fix-copies

* Adding mt5forseq

* Formatting

* run make fix-copies

* Adding to docs

* Add model_parallel

* Fix bug

* Fix

* Remove TODO

* Fixing tests for T5ForSequenceClassification

* Undo changes to dependency_versions_table.py

* Change classification head to work with T5Config directly

* Change seq length to let tests pass

* PR comments for formatting

* Formatting

* Initial addition of UMT5ForSequenceClassification

* Adding to inits and formatting

* run make fix-copies

* Add doc for UMT5ForSeqClass

* Update UMT5 config

* Fix docs

* Skip torch fx test for SequenceClassification

* Formatting

* Add skip to UMT5 tests as well

* Fix umt5 tests

* Running make fix-copies

* PR comments

* Fix for change to sentence_representation

* Rename seq_len to hidden_size since that's what it is

* Use base_model to follow format of the rest of the library

* Update docs

* Extract the decoder_input_ids changes and make one liner

* Make one-liner
2023-07-25 21:02:49 +02:00
Yih-Dar
21150cb0f3
Hotfix for failing MusicgenForConditionalGeneration tests (#25091)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-25 20:26:00 +02:00
Arthur
f9cc333805
[ PreTrainedTokenizerFast] Keep properties from fast tokenizer (#25053)
* draft solution

* use `setdefault`

* nits

* add tests and fix truncation issue

* fix test

* test passes locally

* quality

* updates

* update tests
2023-07-25 18:45:01 +02:00
Connor Henderson
0779fc8eb8
Edit err message and comment in test_model_is_small (#25087)
* Edit err message and comment in

* put back 80M comment
2023-07-25 12:24:36 -04:00
Arthur
2fac342238
[TF] Also apply patch to support left padding (#25085)
* tf versions

* apply changes to other models

* 3 models slipped through the cracks
2023-07-25 11:23:09 -04:00
Arthur
f104522718
[ ForSequenceClassification] Support left padding (#24979)
* support left padding

* nit

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
2023-07-25 16:19:43 +02:00
Yih-Dar
1e662f0f07
Allow generic composite models to pass more kwargs (#24927)
* fix

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-25 16:07:00 +02:00
김준재_T3056
b51312e24d
🌐 [i18n-KO] Translated perf_infer_cpu.md to Korean (#24920)
* docs: ko: perf_infer_cpu.md

* feat: chatgpt draft

* fix: manual edits

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/perf_infer_cpu.md

* Update docs/source/ko/perf_infer_cpu.md

This part bothered me too. I'll incorporate it!

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/perf_infer_cpu.md

I agree! I was too tied to the original!

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/perf_infer_cpu.md

As you said, I think I was too attached to the original text.

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/perf_infer_cpu.md

Thank you for the better word choice!

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/perf_infer_cpu.md

I couldn't come up with the term '주기' ('cycle') at the time...

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/perf_infer_cpu.md

The context reads more naturally now!

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/perf_infer_cpu.md

There's no need to stick rigidly to the original format!

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

* Update docs/source/ko/perf_infer_cpu.md

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>

---------

Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
2023-07-25 16:04:14 +02:00
Gema Parreño
b99f7bd4fc
[DOCS] add example NoBadWordsLogitsProcessor (#25046)
* add example NoBadWordsLogitsProcessor

* fix L764 & L767

* make style
2023-07-25 09:41:48 -04:00
Arthur
dcb183f4bd
[MPT] Add MosaicML's MPT model to transformers (#24629)
* draft add new model like

* some cleaning of the config

* nits

* add nested configs

* nits

* update

* update

* added layer norms + triton kernels

* consider only LPLayerNorm for now.

* update

* all keys match.

* Update

* fixing nits here and there

* working forward pass.

* removed einops dependency

* nits

* format

* add alibi

* byebye head mask

* refactor attention

* nits.

* format

* fix nits.

* nuke and updates

* nuke tokenizer test

* don't reshape query with kv heads

* added a bit of documentation.

* remove unneeded things

* nuke more stuff

* nit

* logits match - same generations

* rm unneeded methods

* 1 remaining failing CI test

* nit

* fix nits

* fix docs

* fix docs

* rm tokenizer

* fixup

* fixup

* fixup and fix tests

* fixed configuration object.

* use correct activation

* few minor fixes

* clarify docs a bit

* logits match to 1e-12

* skip and unskip a test

* added some slow tests.

* fix readme

* add more details

* Update docs/source/en/model_doc/mpt.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix configuration issues

* more fixes in config

* added more models

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove unneeded position ids

* fix some comments

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* revert suggestion

* mpt alibi + added batched generation

* Update src/transformers/models/mpt/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove init config

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix nit

* add another slow test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fits in one line

* some refactor because make fixup doesn't pass

* add ft notebook

* update md

* correct doc path

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-25 14:32:40 +02:00
Xiaoke Huang
1dbc1440a7
Fix: repeat per sample for SAM image embeddings (#25074)
Repeat per sample for SAM image embeddings
2023-07-25 08:30:14 -04:00
Harheem Kim
cb8abee511
🌐 [i18n-KO] Translated hpo_train.md to Korean (#24968)
* docs: ko: hpo_train.mdx

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions
2023-07-25 08:28:20 -04:00
Arthur
f2c1df93f5
[generate] Only warn users if the generation_config's max_length is set to the default value (#25030)
* check max length is default

* nit

* update warning: no-longer deprecate

* comment in the configuration_utils in case max length's default gets changed in the future
2023-07-25 14:20:37 +02:00
Alan Ji
c879318cc5
replace per_gpu_eval_batch_size with per_device_eval_batch_size in readme of multiple-choice task (#25078)
replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size`
in readme of multiple-choice
2023-07-25 08:11:56 -04:00
Susnato Dhar
25e443c0d4
Fix broken link in README_hd.md (#25067)
Update README_hd.md
2023-07-25 08:09:01 -04:00
Xuehai Pan
6bc61aa7af
Set TF32 flag for PyTorch cuDNN backend (#25075) 2023-07-25 08:04:48 -04:00
Injin Paek
5dba88b2d2
fix: add TOC anchor link (#25066) 2023-07-25 08:02:33 -04:00
Sylvain Gugger
f295fc8a16
Fix last models for common tests that are too big. (#25058)
* Fix last models for common tests that are too big.

* Remove print statement
2023-07-25 07:56:04 -04:00
Sangam Lee
ee1eb3b325
🌐 [i18n-KO] Translated perf_hardware.md to Korean (#24966)
* docs: ko: perf_hardware.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions

Co-authored-by: Haewon Kim <ehdvkf02@naver.com>

* Fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: fix rendering error of perf_hardware.md

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
2023-07-25 07:44:24 -04:00
Haewon Kim
f6fe1d5514
🌐 [i18n-KO] Translated <tf_xla>.md to Korean (#24904)
* docs: ko: tf_xla.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions
2023-07-25 07:43:22 -04:00
Kashif Rasul
faf25c040d
[Docs] fix rope_scaling doc string (#25072)
fix rope_scaling doc string
2023-07-25 07:34:10 -04:00