Yeyang
9093b19b13
🌐 [i18n-ZH] Translate serialization.md into Chinese ( #27076 )
...
* docs(zh): translate serialization.md
* docs(zh): add space around links
2023-10-30 08:50:29 -07:00
Yih-Dar
3224c0c13f
Remove some Kosmos-2 copied from ( #27149 )
...
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 16:07:27 +01:00
Hz, Ji
cd19b19378
make tests of pytorch_example device agnostic ( #27081 )
2023-10-30 14:56:41 +00:00
Younes Belkada
6b466771b0
[tests / Quantization] Fix bnb test ( #27145 )
...
* fix bnb test
* link to GH issue
2023-10-30 15:43:08 +01:00
Yih-Dar
576994963f
Fix some tests using "common_voice" ( #27147 )
...
* Use mozilla-foundation/common_voice_11_0
* Update expected values
* Update expected values
* For test_word_time_stamp_integration
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 15:27:15 +01:00
Yih-Dar
691fd8fdde
Add Kosmos-2 model ( #24709 )
...
* Add KOSMOS-2 model
* update
* update
* update
* address review comment - 001
* address review comment - 002
* address review comment - 003
* style
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
* address review comment - 004
* address review comment - 005
* address review comment - 006
* address review comment - 007
* address review comment - 008
* address review comment - 009
* address review comment - 010
* address review comment - 011
* update readme
* fix
* fix
* fix
* [skip ci] fix
* revert the change in _decode
* fix docstring
* fix docstring
* Update docs/source/en/model_doc/kosmos-2.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* no more Kosmos2Tokenizer
* style
* remove "returned when being computed by the model"
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* UTM5 Atten
* fix attn mask
* use present_key_value_states instead of next_decoder_cache
* style
* conversion scripts
* conversion scripts
* conversion scripts
* Add _reorder_cache
* fix doctest and copies
* rename 1
* rename 2
* rename 3
* make fixup
* fix table
* fix docstring
* rename 4
* change repo_id
* remove tip
* update md file
* make style
* update md file
* put docs/source/en/model_doc/kosmos-2.md to slow
* update conversion script
* Use CLIPImageProcessor in Kosmos2Processor
* Remove Kosmos2ImageProcessor
* Remove to_dict in Kosmos2Config
* Remove files
* fix import
* Update conversion
* normalized=False
* Not using hardcoded values like <image>
* elt --> element
* Apply suggestion
* Not using hardcoded values like </image>
* No assert
* No nested functions
* Fix md file
* copy
* update doc
* fix docstring
* fix name
* Remove _add_remove_spaces_around_tag_tokens
* Remove dummy docstring of _preprocess_single_example
* Use `BatchEncoding`
* temp
* temp
* temp
* Update
* Update
* Make Kosmos2ProcessorTest a bit pretty
* Update gradient checkpointing
* Fix gradient checkpointing test
* Remove one liner remove_special_fields
* Simplify conversion script
* fix add_eos_token
* update readme
* update tests
* Change to microsoft/kosmos-2-patch14-224
* style
* Fix doc
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-30 13:32:17 +01:00
Hz, Ji
d751dbecb2
remove the obsolete code related to fairscale FSDP ( #26651 )
...
* remove the obsolete code related to fairscale FSDP
* apply review suggestion
2023-10-30 11:55:03 +00:00
Younes Belkada
5fbed2d7ca
[Trainer / GC] Add gradient_checkpointing_kwargs in trainer and training arguments ( #27068 )
...
* add `gradient_checkpointing_kwargs` in trainer and training arguments
* add comment
* add test - currently failing
* now tests pass
2023-10-30 12:41:48 +01:00
Thien Tran
e830495c1c
Fix data2vec-audio note about attention mask ( #27116 )
...
fix data2vec audio note about attention mask
2023-10-30 10:52:24 +00:00
Younes Belkada
160432110c
[FA2 / Mistral] Revert previous behavior with right padding + forward ( #27125 )
...
Update modeling_mistral.py
2023-10-30 11:04:50 +01:00
Yih-Dar
211ad4c9cc
Fix slack report failing for doctest ( #27042 )
...
* fix slack report for doctest
* separate reports
* style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 10:48:24 +01:00
Gema Parreño
722e936491
[Typo fix] flag config in WANDB ( #27130 )
...
typo fix flag config
2023-10-29 18:22:26 +00:00
Daniil
9e87618f2b
Fix docstring and type hint for resize ( #27104 )
...
fix docstring and type hint for resize
2023-10-27 16:50:10 -03:00
jiaqiw09
ef23b68ebf
translate transformers_agents.md to Chinese ( #27046 )
...
* update translation
* fix problems mentioned in reviews
2023-10-27 12:45:43 -07:00
Akhil
96f9e78f4c
Added Telugu [te] translation for README.md in main ( #27077 )
...
* Create index.md
* Create _toctree.yml
* Updated index.md in telugu
* Update _toctree.yml
* Create quicktour.md
* Update quicktour.md
* Create index.md
* Update quicktour.md
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Delete docs/source/hi/index.md
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/te/quicktour.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update build_documentation.yml
Added telugu [te]
* Update build_pr_documentation.yml
Added Telugu [te]
* Update _toctree.yml
* Create README_te.md
Telugu translation for README.md
* Update README_te.md
Added Telugu translation for Readme.md
* Update README_te.md
* Update README_te.md
* Update README_te.md
* Update README_te.md
* Update README.md
* Update README_es.md
* Update README_es.md
* Update README_hd.md
* Update README_ja.md
* Update README_ko.md
* Update README_pt-br.md
* Update README_ru.md
* Update README_zh-hans.md
* Update README_zh-hant.md
* Update README_te.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-27 11:40:10 -07:00
Patrick von Platen
ac5893756b
[Attention Mask] Refactor all encoder-decoder attention mask ( #27086 )
...
* [FA2 Bart] Add FA2 to all Bart-like
* better
* Refactor attention mask
* remove all customized attention logic
* format
* mass rename
* replace _expand_mask
* replace _expand_mask
* mass rename
* add pt files
* mass replace & rename
* mass replace & rename
* mass replace & rename
* mass replace & rename
* Update src/transformers/models/idefics/modeling_idefics.py
* fix more
* clean more
* fix more
* make style
* fix again
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* finish
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* small fix mistral
* finish
* finish
* finish
* finish
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-27 16:42:01 +02:00
Marc Sun
29c74f58ae
fix detr device map ( #27089 )
...
* fix detr device map
* add comments
2023-10-27 10:28:12 -04:00
Younes Belkada
ffff9e70ab
[core / gradient_checkpointing] Refactor GC - part 2 ( #27073 )
...
* fix
* more fixes
* fix other models
* fix long t5
* use `gradient_checkpointing_func` instead
* fix copies
* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* replace it with `is_gradient_checkpointing_set`
* remove default
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-27 16:15:22 +02:00
Marc Sun
5be1fb6d1f
Fix no split modules underlying modules ( #27090 )
...
* fix no split
* style
* remove comm
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* rename modules
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-27 09:49:20 -04:00
Lucain
66b088faf0
Provide alternative when warning on use_auth_token ( #27105 )
2023-10-27 14:32:54 +02:00
Isaac Chung
e2bffcfafd
Add early stopping for Bark generation via logits processor ( #26675 )
...
* add early stopping logits processor
* black formatted
* indent
* follow method signature
* actual logic
* check for None
* address comments on docstrings and method signature
* add unit test under `LogitsProcessorTest` wip
* unit test passing
* black formatted
* condition per sample
* add to BarkModelIntegrationTests
* wip BarkSemanticModelTest
* rename and add to kwargs handling
* not add to BarkSemanticModelTest
* correct logic and assert last outputs tokens different in test
* doc-builder style
* read from kwargs as well
* assert len of with less than that of without
* ruff
* add back seed and test case
* add original impl default suggestion
* doc-builder
* rename and use softmax
* switch back to LogitsProcessor and update docs wording
* camelCase and spelling and saving compute
* assert strictly less than
* assert less than
* expand test_generate_semantic_early_stop instead
2023-10-27 11:07:33 +01:00
Arthur
90ee9cea19
Revert "add exllamav2 arg" ( #27102 )
...
Revert "add exllamav2 arg (#26437 )"
This reverts commit 8214d6e7b1.
2023-10-27 11:23:06 +02:00
Arthur
aa4198a238
[T5Tokenizer] Fix fast and extra tokens ( #27085 )
...
* v4.35.dev.0
* nit t5fast match t5 slow
2023-10-27 08:18:24 +02:00
Varshaa Shetty
6f31601687
Added huggingface emoji instead of the markdown format ( #27091 )
...
Added huggingface emoji instead of the markdown format as it was not displaying the required emoji in that format
2023-10-26 14:10:16 -07:00
Zach Mueller
34a640642b
Save TB logs as part of push_to_hub ( #27022 )
...
* Support runs/
* Upload runs folder as part of push to hub
* Add a test
* Add to test deps
* Update with proposed solution from Slack
* Ensure that repo gets deleted in tests
2023-10-26 12:13:19 -04:00
L. Yeung
1892592530
Correct docstrings and a typo in comments ( #27047 )
...
* docs(training_args): correct docstrings
Correct docstrings of these methods in `TrainingArguments`:
- `set_save`
- `set_logging`
* docs(training_args): adjust words in docstrings
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* docs(trainer): correct a typo in comments
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-26 08:46:17 -07:00
Marc Sun
8214d6e7b1
add exllamav2 arg ( #26437 )
...
* add exllamav2 arg
* add test
* style
* add check
* add doc
* replace by use_exllama_v2
* fix tests
* fix doc
* style
* better condition
* fix logic
* add deprecate msg
2023-10-26 10:15:05 -04:00
Patrick von Platen
d7cb5e138e
[Llama FA2] Re-add _expand_attention_mask and clean a couple things ( #27074 )
...
* clean
* clean llama
* fix more
* make style
* Apply suggestions from code review
* Apply suggestions from code review
* Update src/transformers/models/llama/modeling_llama.py
* Update src/transformers/models/llama/modeling_llama.py
* Apply suggestions from code review
* finish
* make style
2023-10-26 13:06:21 +02:00
Arthur
4864d08d3e
Add-support for commit description ( #26704 )
...
* fix
* update
* revert
* add docstring
* good to go
* update
* add a test
2023-10-26 12:37:09 +02:00
Arthur
15cd096288
Create SECURITY.md
2023-10-26 12:26:47 +02:00
Younes Belkada
fe2877ce21
Remove unneeded prints in modeling_gpt_neox.py ( #27080 )
2023-10-26 11:55:31 +02:00
Younes Belkada
efba1a1744
Bump flash_attn version to 2.1 ( #27079 )
...
* pin FA-2 to `2.1`
* fix on modeling
2023-10-26 11:21:04 +02:00
Zach Mueller
90412401e6
Bring back set_epoch for Accelerate-based dataloaders ( #26850 )
...
* Working tests!
* Fix sampler
* Fix
* Update src/transformers/trainer.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix check
* Clean
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-26 11:20:11 +02:00
dependabot[bot]
3c2692407d
Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/lxmert ( #26888 )
...
Bump urllib3 in /examples/research_projects/lxmert
Bumps [urllib3](https://github.com/urllib3/urllib3 ) from 1.26.17 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases )
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst )
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18 )
---
updated-dependencies:
- dependency-name: urllib3
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-26 09:10:29 +02:00
dependabot[bot]
9c5240af14
Bump werkzeug from 2.2.3 to 3.0.1 in /examples/research_projects/decision_transformer ( #27072 )
...
Bump werkzeug in /examples/research_projects/decision_transformer
Bumps [werkzeug](https://github.com/pallets/werkzeug ) from 2.2.3 to 3.0.1.
- [Release notes](https://github.com/pallets/werkzeug/releases )
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst )
- [Commits](https://github.com/pallets/werkzeug/compare/2.2.3...3.0.1 )
---
updated-dependencies:
- dependency-name: werkzeug
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-26 08:56:28 +02:00
corey hu
df2eebf1e7
Handle unsharded Llama2 model types in conversion script ( #27069 )
...
Handle all unsharded model types
2023-10-26 08:41:07 +02:00
Aarya Balwadkar
a2f55a65cd
Hindi translation of pipeline_tutorial.md ( #26837 )
...
* hindi translation of pipeline_tutorial.md
* Update pipeline_tutorial.md
* Update build_documentation.yml
* Update build_pr_documentation.yml
* Updated build_documentation.yml
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-25 11:21:49 -07:00
Yeyang
ba5144f7a9
🌐 [i18n-ZH] Translate custom_models.md into Chinese ( #27065 )
...
* docs(zh): translate custom_models.md
* minor fix in customer_models
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-25 11:20:32 -07:00
Younes Belkada
c34c50cdc0
[docs] Add MaskGenerationPipeline in docs ( #27063 )
...
* add `MaskGenerationPipeline` in docs
* Update __init__.py
* fix repo consistency and clarify docstring
* add on check docstrings
* actually we do have a tf sam
* oops
2023-10-25 19:31:36 +02:00
Akash Kundu
ba073ea9e3
[DOCS] minor fixes in README.md ( #27048 )
...
minor fixes
2023-10-25 10:21:13 -07:00
Jing Hua
a64f8c1f87
[docstring] fix incorrect llama docstring: encoder -> decoder ( #27071 )
...
fix incorrect docstring: encoder -> decoder
2023-10-25 18:09:04 +02:00
Nick Hill
0baa9246cb
Fix TypicalLogitsWarper tensor OOB indexing edge case ( #26579 )
...
* Fix TypicalLogitsWarper tensor OOB indexing edge case
This can be triggered fairly quickly with low precision, e.g. bfloat16 and typical_p = 0.99.
* Shift threshold index by one
* Use explicit named arg for clamp min
2023-10-25 11:36:43 +01:00
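A minimal sketch of the kind of out-of-bounds edge case the commit above describes, in plain Python rather than tensors. This is not the actual patch: the function name `typical_cutoff_index` and the list-based inputs are hypothetical, chosen only to illustrate why a cutoff index derived from a cumulative sum may need clamping when rounding keeps the running sum below the target mass.

```python
# Illustrative sketch only; `typical_cutoff_index` is a hypothetical helper,
# not a function from the transformers library.
def typical_cutoff_index(sorted_probs, mass):
    """Return the index of the last token kept when accumulating `mass` probability."""
    cumulative = 0.0
    below = 0
    for p in sorted_probs:
        cumulative += p
        # Count how many prefix sums stay strictly below the target mass.
        if cumulative < mass:
            below += 1
    # If every prefix sum stays below `mass` (possible with low-precision
    # rounding), `below` equals len(sorted_probs), one past the last valid
    # index. Clamp it so it can be used safely as an index.
    return min(below, len(sorted_probs) - 1)


# Probabilities that sum to less than the requested mass, as can happen
# after rounding: without the clamp the result would be 3 (out of bounds).
idx = typical_cutoff_index([0.5, 0.25, 0.125], 0.99)
```

Without the final `min`, the example input would yield an index equal to the list length.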
Younes Belkada
06e782da4e
[core] Refactor of gradient_checkpointing ( #27020 )
...
* v1
* fix
* remove `create_custom_forward`
* fixup
* fixup
* add test and fix all failing GC tests
* remove all remaining `create_custom_forward` methods
* fix idefics bug
* fixup
* replace with `__call__`
* add comment
* quality
2023-10-25 12:16:15 +02:00
Arthur
9286f0ac39
Skip-test ( #27062 )
...
* skip plbart test
* nits
* update
2023-10-25 10:47:33 +02:00
Tom Aarsen
6cbc1369a3
Fix RoPE config validation for FalconConfig + various config typos ( #26929 )
...
* Resolve incorrect ValueError in RoPE config for Falcon
* Add broken codeblock tag in Falcon Config
* Fix typo: an float -> a float
* Implement copy functionality for Fuyu and Persimmon
for RoPE scaling validation
* Make style
2023-10-24 18:37:09 +01:00
JB (Don)
a0fd34483f
Add a default decoder_attention_mask for EncoderDecoderModel during training ( #26752 )
...
* Add a default decoder_attention_mask for EncoderDecoderModel during training
Since we are already creating the default decoder_input_ids from the labels, we should also
create a default decoder_attention_mask to go with it.
* Fix test constant that relied on manual_seed()
The test was changed to use a decoder_attention_mask that ignores padding instead (which is
the default one created by BERT when attention_mask is None).
* Create the decoder_attention_mask using decoder_input_ids instead of labels
* Fix formatting in test
2023-10-24 18:26:16 +01:00
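The idea in the commit above (derive a default decoder_attention_mask from decoder_input_ids, which are themselves built from the labels) can be sketched roughly as follows. This is an assumption-laden illustration in plain Python, not the library code: the helper names, the token ids `PAD_ID` and `START_ID`, and the use of -100 as the ignored-label value are all hypothetical choices made for the example.

```python
# Illustrative sketch only: helper names and token ids are hypothetical,
# not taken from the transformers source.
PAD_ID = 0
START_ID = 101


def shift_tokens_right(labels, pad_id, start_id):
    """Build decoder_input_ids: prepend the start token and drop the last label."""
    shifted = [start_id] + labels[:-1]
    # Assumed convention: -100 marks ignored label positions; map them to padding.
    return [pad_id if tok == -100 else tok for tok in shifted]


def default_decoder_attention_mask(decoder_input_ids, pad_id):
    """1 for real tokens, 0 for padding, derived from decoder_input_ids (not labels)."""
    return [0 if tok == pad_id else 1 for tok in decoder_input_ids]


labels = [7, 8, 9, -100, -100]
decoder_input_ids = shift_tokens_right(labels, PAD_ID, START_ID)
decoder_attention_mask = default_decoder_attention_mask(decoder_input_ids, PAD_ID)
```

Building the mask from decoder_input_ids rather than from the labels keeps the two aligned position by position, since the shift moves every token one step to the right.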
Maria Khalusova
9333bf0769
[docs] Performance docs refactor p.2 ( #26791 )
...
* initial edits
* improvements for clarity and flow
* improvements for clarity and flow, removed the repeated section
* removed two docs that had no content
* Revert "removed two docs that had no content"
This reverts commit e98fa2fa0d.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feedback addressed
* more feedback addressed
* feedback addressed
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-24 13:10:06 -04:00
Patrick von Platen
13ef14e18e
Fix config silent copy in from_pretrained ( #27043 )
...
* Fix config modeling utils
* fix more
* fix attn mask bug
* Update src/transformers/modeling_utils.py
2023-10-24 19:05:37 +02:00
Alex McKinney
9da451713d
Device agnostic testing ( #25870 )
...
* adds agnostic decorators and availability fns
* renaming decorators and fixing imports
* updating some representative example tests
bloom, opt, and reformer for now
* wip device agnostic functions
* lru cache to device checking functions
* adds `TRANSFORMERS_TEST_DEVICE_SPEC`
if present, imports the target file and updates device to function
mappings
* comments `TRANSFORMERS_TEST_DEVICE_SPEC` code
* extra checks on device name
* `make style; make quality`
* updates default functions for agnostic calls
* applies suggestions from review
* adds `is_torch_available` guard
* Add spec file to docs, rename function dispatch names to backend_*
* add backend import to docs example for spec file
* change instances of to
* Move register backend to before device check as per @statelesshz changes
* make style
* make opt test require fp16 to run
---------
Co-authored-by: arsalanu <arsalanu@graphcore.ai>
Co-authored-by: arsalanu <hzji210@gmail.com>
2023-10-24 16:49:26 +02:00
Marc Sun
41496b95da
Add fuyu device map ( #26949 )
...
* add _no_split_modules
* style
* fix _no_split_modules
* add doc
2023-10-24 09:10:23 -04:00