Susnato Dhar
b5db8ca66f
Add flash attention for gpt_bigcode ( #26479 )
...
* added flash attention of gpt_bigcode
* changed docs
* Update src/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py
* add FA-2 docs
* oops
* Update docs/source/en/perf_infer_gpu_one.md Last Nit
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* oops
* remove padding_mask
* change getattr->hasattr logic
* changed .md file
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-31 11:21:02 +00:00
Yih-Dar
9dc4ce9ea7
Disable CI runner check ( #27170 )
...
Disable runner check
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-31 11:59:21 +01:00
Seungwoo, Jeong
14bb196cc8
[docstring] Fix docstring for BlipTextConfig, BlipVisionConfig ( #27173 )
...
Update configuration_blip.py
edit docstrings
2023-10-31 10:41:56 +00:00
Akshar Goyal
9234caefb0
[docstring] Fix docstring for AltCLIPTextConfig, AltCLIPVisionConfig and AltCLIPConfig ( #27128 )
...
* [docstring] Fix docstring for AltCLIPVisionConfig, AltCLIPTextConfig + cleaned some docstring
* Removed entries from check_docstring.py
* Removed entries from check_docstring.py
* Removed entry from check_docstring.py
* [docstring] Fix docstring for AltCLIPTextConfig, AltCLIPVisionConfig and AltCLIPConfig
2023-10-31 10:20:14 +00:00
Clifford Ressel
b5c8e23f0f
Remove broken links to s-JoL/Open-Llama ( #27164 )
2023-10-31 10:17:54 +00:00
Hz, Ji
df6f36a171
deprecate function `get_default_device` in `tools/base.py` ( #26774 )
...
* get default device through `PartialState().default_device` as it has been officially released
* apply code review suggestion
* apply code review suggestion
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2023-10-31 09:15:39 +00:00
NielsRogge
8211c59b9a
[KOSMOS-2] Update docs ( #27157 )
...
Update docs
2023-10-30 21:42:19 +01:00
NielsRogge
d39352d12c
Fix import of torch.utils.checkpoint ( #27155 )
...
* Fix import
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-10-30 20:08:29 +00:00
MD FAIZAN KHAN
e971486d89
Fix: typos in README.md ( #27154 )
2023-10-30 19:12:09 +00:00
Younes Belkada
f7ea959b96
[`core` / `GC` / `tests`] Stronger GC tests ( #27124 )
...
* stronger GC tests
* better tests and skip failing tests
* break down into 3 sub-tests
* break down into 3 sub-tests
* refactor a bit
* more refactor
* fix
* last nit
* credits contrib and suggestions
* credits contrib and suggestions
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-30 19:53:46 +01:00
Hz, Ji
5bbf671276
Device agnostic trainer testing ( #27131 )
2023-10-30 18:16:40 +00:00
Rockerz
84724efd10
Translating `en/main_classes` folder docs to Japanese 🇯🇵 ( #26894 )
...
* add
* add
* add
* Add deepspeed.md
* Add
* add
* Update docs/source/ja/main_classes/callback.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/output.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/pipelines.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/text_generation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update logging.md
* Update toctree.yml
* Update docs/source/ja/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add suggestions
* m
* Update docs/source/ja/main_classes/trainer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update toctree.yml
* Update Quantization.md
* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update toctree.yml
* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-30 09:39:14 -07:00
Yeyang
9093b19b13
🌐 [i18n-ZH] Translate serialization.md into Chinese ( #27076 )
...
* docs(zh): translate serialization.md
* docs(zh): add space around links
2023-10-30 08:50:29 -07:00
Yih-Dar
3224c0c13f
Remove some Kosmos-2 `copied from` ( #27149 )
...
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 16:07:27 +01:00
Hz, Ji
cd19b19378
make tests of pytorch_example device agnostic ( #27081 )
2023-10-30 14:56:41 +00:00
Younes Belkada
6b466771b0
[`tests` / `Quantization`] Fix bnb test ( #27145 )
...
* fix bnb test
* link to GH issue
2023-10-30 15:43:08 +01:00
Yih-Dar
576994963f
Fix some tests using "common_voice" ( #27147 )
...
* Use mozilla-foundation/common_voice_11_0
* Update expected values
* Update expected values
* For test_word_time_stamp_integration
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 15:27:15 +01:00
Yih-Dar
691fd8fdde
Add `Kosmos-2` model ( #24709 )
...
* Add KOSMOS-2 model
* update
* update
* update
* address review comment - 001
* address review comment - 002
* address review comment - 003
* style
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
* address review comment - 004
* address review comment - 005
* address review comment - 006
* address review comment - 007
* address review comment - 008
* address review comment - 009
* address review comment - 010
* address review comment - 011
* update readme
* fix
* fix
* fix
* [skip ci] fix
* revert the change in _decode
* fix docstring
* fix docstring
* Update docs/source/en/model_doc/kosmos-2.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* no more Kosmos2Tokenizer
* style
* remove "returned when being computed by the model"
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* UTM5 Atten
* fix attn mask
* use present_key_value_states instead of next_decoder_cache
* style
* conversion scripts
* conversion scripts
* conversion scripts
* Add _reorder_cache
* fix doctest and copies
* rename 1
* rename 2
* rename 3
* make fixup
* fix table
* fix docstring
* rename 4
* change repo_id
* remove tip
* update md file
* make style
* update md file
* put docs/source/en/model_doc/kosmos-2.md to slow
* update conversion script
* Use CLIPImageProcessor in Kosmos2Processor
* Remove Kosmos2ImageProcessor
* Remove to_dict in Kosmos2Config
* Remove files
* fix import
* Update conversion
* normalized=False
* Not using hardcoded values like <image>
* elt --> element
* Apply suggestion
* Not using hardcoded values like </image>
* No assert
* No nested functions
* Fix md file
* copy
* update doc
* fix docstring
* fix name
* Remove _add_remove_spaces_around_tag_tokens
* Remove dummy docstring of _preprocess_single_example
* Use `BatchEncoding`
* temp
* temp
* temp
* Update
* Update
* Make Kosmos2ProcessorTest a bit pretty
* Update gradient checkpointing
* Fix gradient checkpointing test
* Remove one liner remove_special_fields
* Simplify conversion script
* fix add_eos_token
* update readme
* update tests
* Change to microsoft/kosmos-2-patch14-224
* style
* Fix doc
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-30 13:32:17 +01:00
Hz, Ji
d751dbecb2
remove the obsolete code related to fairscale FSDP ( #26651 )
...
* remove the obsolete code related to fairscale FSDP
* apply review suggestion
2023-10-30 11:55:03 +00:00
Younes Belkada
5fbed2d7ca
[`Trainer` / `GC`] Add `gradient_checkpointing_kwargs` in trainer and training arguments ( #27068 )
...
* add `gradient_checkpointing_kwargs` in trainer and training arguments
* add comment
* add test - currently failing
* now tests pass
2023-10-30 12:41:48 +01:00
Thien Tran
e830495c1c
Fix data2vec-audio note about attention mask ( #27116 )
...
fix data2vec audio note about attention mask
2023-10-30 10:52:24 +00:00
Younes Belkada
160432110c
[`FA2` / `Mistral`] Revert previous behavior with right padding + forward ( #27125 )
...
Update modeling_mistral.py
2023-10-30 11:04:50 +01:00
Yih-Dar
211ad4c9cc
Fix slack report failing for doctest ( #27042 )
...
* fix slack report for doctest
* separate reports
* style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 10:48:24 +01:00
Gema Parreño
722e936491
[Typo fix] flag config in WANDB ( #27130 )
...
typo fix flag config
2023-10-29 18:22:26 +00:00
Daniil
9e87618f2b
Fix docstring and type hint for resize ( #27104 )
...
fix docstring and type hint for resize
2023-10-27 16:50:10 -03:00
jiaqiw09
ef23b68ebf
translate transformers_agents.md to Chinese ( #27046 )
...
* update translation
* fix problems mentioned in reviews
2023-10-27 12:45:43 -07:00
Akhil
96f9e78f4c
Added Telugu [te] translation for README.md in main ( #27077 )
...
* Create index.md
* Create _toctree.yml
* Updated index.md in telugu
* Update _toctree.yml
* Create quicktour.md
* Update quicktour.md
* Create index.md
* Update quicktour.md
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Delete docs/source/hi/index.md
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update build_documentation.yml
Added telugu [te]
* Update build_pr_documentation.yml
Added Telugu [te]
* Update _toctree.yml
* Create README_te.md
Telugu translation for README.md
* Update README_te.md
Added Telugu translation for Readme.md
* Update README_te.md
* Update README_te.md
* Update README_te.md
* Update README_te.md
* Update README.md
* Update README_es.md
* Update README_es.md
* Update README_hd.md
* Update README_ja.md
* Update README_ko.md
* Update README_pt-br.md
* Update README_ru.md
* Update README_zh-hans.md
* Update README_zh-hant.md
* Update README_te.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-27 11:40:10 -07:00
Patrick von Platen
ac5893756b
[Attention Mask] Refactor all encoder-decoder attention mask ( #27086 )
...
* [FA2 Bart] Add FA2 to all Bart-like
* better
* Refactor attention mask
* remove all customized attention logic
* format
* mass rename
* replace _expand_mask
* replace _expand_mask
* mass rename
* add pt files
* mass replace & rename
* mass replace & rename
* mass replace & rename
* mass replace & rename
* Update src/transformers/models/idefics/modeling_idefics.py
* fix more
* clean more
* fix more
* make style
* fix again
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* small fix mistral
* finish
* finish
* finish
* finish
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-27 16:42:01 +02:00
Marc Sun
29c74f58ae
fix detr device map ( #27089 )
...
* fix detr device map
* add comments
2023-10-27 10:28:12 -04:00
Younes Belkada
ffff9e70ab
[`core` / `gradient_checkpointing`] Refactor GC - part 2 ( #27073 )
...
* fix
* more fixes
* fix other models
* fix long t5
* use `gradient_checkpointing_func` instead
* fix copies
* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* replace it with `is_gradient_checkpointing_set`
* remove default
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-27 16:15:22 +02:00
Marc Sun
5be1fb6d1f
Fix no split modules underlying modules ( #27090 )
...
* fix no split
* style
* remove comm
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* rename modules
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-27 09:49:20 -04:00
Lucain
66b088faf0
Provide alternative when warning on use_auth_token ( #27105 )
2023-10-27 14:32:54 +02:00
Isaac Chung
e2bffcfafd
Add early stopping for Bark generation via logits processor ( #26675 )
...
* add early stopping logits processor
* black formatted
* indent
* follow method signature
* actual logic
* check for None
* address comments on docstrings and method signature
* add unit test under `LogitsProcessorTest` wip
* unit test passing
* black formatted
* condition per sample
* add to BarkModelIntegrationTests
* wip BarkSemanticModelTest
* rename and add to kwargs handling
* not add to BarkSemanticModelTest
* correct logic and assert last outputs tokens different in test
* doc-builder style
* read from kwargs as well
* assert len of with less than that of without
* ruff
* add back seed and test case
* add original impl default suggestion
* doc-builder
* rename and use softmax
* switch back to LogitsProcessor and update docs wording
* camelCase and spelling and saving compute
* assert strictly less than
* assert less than
* expand test_generate_semantic_early_stop instead
2023-10-27 11:07:33 +01:00
Arthur
90ee9cea19
Revert "add exllamav2 arg" ( #27102 )
...
Revert "add exllamav2 arg (#26437)"
This reverts commit 8214d6e7b1.
2023-10-27 11:23:06 +02:00
Arthur
aa4198a238
[`T5Tokenizer`] Fix fast and extra tokens ( #27085 )
...
* v4.35.dev.0
* nit t5fast match t5 slow
2023-10-27 08:18:24 +02:00
Varshaa Shetty
6f31601687
Added huggingface emoji instead of the markdown format ( #27091 )
...
Added huggingface emoji instead of the markdown format as it was not displaying the required emoji in that format
2023-10-26 14:10:16 -07:00
Zach Mueller
34a640642b
Save TB logs as part of push_to_hub ( #27022 )
...
* Support runs/
* Upload runs folder as part of push to hub
* Add a test
* Add to test deps
* Update with proposed solution from Slack
* Ensure that repo gets deleted in tests
2023-10-26 12:13:19 -04:00
L. Yeung
1892592530
Correct docstrings and a typo in comments ( #27047 )
...
* docs(training_args): correct docstrings
Correct docstrings of these methods in `TrainingArguments`:
- `set_save`
- `set_logging`
* docs(training_args): adjust words in docstrings
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* docs(trainer): correct a typo in comments
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-26 08:46:17 -07:00
Marc Sun
8214d6e7b1
add exllamav2 arg ( #26437 )
...
* add exllamav2 arg
* add test
* style
* add check
* add doc
* replace by use_exllama_v2
* fix tests
* fix doc
* style
* better condition
* fix logic
* add deprecate msg
2023-10-26 10:15:05 -04:00
Patrick von Platen
d7cb5e138e
[Llama FA2] Re-add _expand_attention_mask and clean a couple things ( #27074 )
...
* clean
* clean llama
* fix more
* make style
* Apply suggestions from code review
* Apply suggestions from code review
* Update src/transformers/models/llama/modeling_llama.py
* Update src/transformers/models/llama/modeling_llama.py
* Apply suggestions from code review
* finish
* make style
2023-10-26 13:06:21 +02:00
Arthur
4864d08d3e
Add-support for commit description ( #26704 )
...
* fix
* update
* revert
* add docstring
* good to go
* update
* add a test
2023-10-26 12:37:09 +02:00
Arthur
15cd096288
Create SECURITY.md
2023-10-26 12:26:47 +02:00
Younes Belkada
fe2877ce21
Remove unneeded prints in modeling_gpt_neox.py ( #27080 )
2023-10-26 11:55:31 +02:00
Younes Belkada
efba1a1744
Bump `flash_attn` version to `2.1` ( #27079 )
...
* pin FA-2 to `2.1`
* fix on modeling
2023-10-26 11:21:04 +02:00
Zach Mueller
90412401e6
Bring back `set_epoch` for Accelerate-based dataloaders ( #26850 )
...
* Working tests!
* Fix sampler
* Fix
* Update src/transformers/trainer.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix check
* Clean
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-26 11:20:11 +02:00
dependabot[bot]
3c2692407d
Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/lxmert ( #26888 )
...
Bump urllib3 in /examples/research_projects/lxmert
Bumps [urllib3](https://github.com/urllib3/urllib3 ) from 1.26.17 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases )
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst )
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18 )
---
updated-dependencies:
- dependency-name: urllib3
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-26 09:10:29 +02:00
dependabot[bot]
9c5240af14
Bump werkzeug from 2.2.3 to 3.0.1 in /examples/research_projects/decision_transformer ( #27072 )
...
Bump werkzeug in /examples/research_projects/decision_transformer
Bumps [werkzeug](https://github.com/pallets/werkzeug ) from 2.2.3 to 3.0.1.
- [Release notes](https://github.com/pallets/werkzeug/releases )
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst )
- [Commits](https://github.com/pallets/werkzeug/compare/2.2.3...3.0.1 )
---
updated-dependencies:
- dependency-name: werkzeug
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-26 08:56:28 +02:00
corey hu
df2eebf1e7
Handle unsharded Llama2 model types in conversion script ( #27069 )
...
Handle all unsharded model types
2023-10-26 08:41:07 +02:00
Aarya Balwadkar
a2f55a65cd
Hindi translation of pipeline_tutorial.md ( #26837 )
...
* hindi translation of pipeline_tutorial.md
* Update pipeline_tutorial.md
* Update build_documentation.yml
* Update build_pr_documentation.yml
* Updated build_documentation.yml
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-25 11:21:49 -07:00
Yeyang
ba5144f7a9
🌐 [i18n-ZH] Translate custom_models.md into Chinese ( #27065 )
...
* docs(zh): translate custom_models.md
* minor fix in custom_models
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-25 11:20:32 -07:00