Commit Graph

14331 Commits

Author SHA1 Message Date
Yeyang
9093b19b13
🌐 [i18n-ZH] Translate serialization.md into Chinese (#27076)
* docs(zh): translate serialization.md

* docs(zh): add space around links
2023-10-30 08:50:29 -07:00
Yih-Dar
3224c0c13f
Remove some Kosmos-2 copied from (#27149)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 16:07:27 +01:00
Hz, Ji
cd19b19378
make tests of pytorch_example device agnostic (#27081) 2023-10-30 14:56:41 +00:00
Younes Belkada
6b466771b0
[tests / Quantization] Fix bnb test (#27145)
* fix bnb test

* link to GH issue
2023-10-30 15:43:08 +01:00
Yih-Dar
576994963f
Fix some tests using "common_voice" (#27147)
* Use mozilla-foundation/common_voice_11_0

* Update expected values

* Update expected values

* For test_word_time_stamp_integration

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 15:27:15 +01:00
Yih-Dar
691fd8fdde
Add Kosmos-2 model (#24709)
* Add KOSMOS-2 model

* update

* update

* update

* address review comment - 001

* address review comment - 002

* address review comment - 003

* style

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

* address review comment - 004

* address review comment - 005

* address review comment - 006

* address review comment - 007

* address review comment - 008

* address review comment - 009

* address review comment - 010

* address review comment - 011

* update readme

* fix

* fix

* fix

* [skip ci] fix

* revert the change in _decode

* fix docstring

* fix docstring

* Update docs/source/en/model_doc/kosmos-2.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* no more Kosmos2Tokenizer

* style

* remove "returned when being computed by the model"

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* UTM5 Atten

* fix attn mask

* use present_key_value_states instead of next_decoder_cache

* style

* conversion scripts

* conversion scripts

* conversion scripts

* Add _reorder_cache

* fix doctest and copies

* rename 1

* rename 2

* rename 3

* make fixup

* fix table

* fix docstring

* rename 4

* change repo_id

* remove tip

* update md file

* make style

* update md file

* put docs/source/en/model_doc/kosmos-2.md to slow

* update conversion script

* Use CLIPImageProcessor in Kosmos2Processor

* Remove Kosmos2ImageProcessor

* Remove to_dict in Kosmos2Config

* Remove files

* fix import

* Update conversion

* normalized=False

* Not using hardcoded values like <image>

* elt --> element

* Apply suggestion

* Not using hardcoded values like </image>

* No assert

* No nested functions

* Fix md file

* copy

* update doc

* fix docstring

* fix name

* Remove _add_remove_spaces_around_tag_tokens

* Remove dummy docstring of _preprocess_single_example

* Use `BatchEncoding`

* temp

* temp

* temp

* Update

* Update

* Make Kosmos2ProcessorTest a bit pretty

* Update gradient checkpointing

* Fix gradient checkpointing test

* Remove one liner remove_special_fields

* Simplify conversion script

* fix add_eos_token

* update readme

* update tests

* Change to microsoft/kosmos-2-patch14-224

* style

* Fix doc

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-30 13:32:17 +01:00
Hz, Ji
d751dbecb2
remove the obsolete code related to fairscale FSDP (#26651)
* remove the obsolete code related to fairscale FSDP

* apple review suggestion
2023-10-30 11:55:03 +00:00
Younes Belkada
5fbed2d7ca
[Trainer / GC] Add gradient_checkpointing_kwargs in trainer and training arguments (#27068)
* add `gradient_checkpointing_kwargs` in trainer and training arguments

* add comment

* add test - currently failing

* now tests pass
2023-10-30 12:41:48 +01:00
Thien Tran
e830495c1c
Fix data2vec-audio note about attention mask (#27116)
fix data2vec audio note about attention mask
2023-10-30 10:52:24 +00:00
Younes Belkada
160432110c
[FA2/ Mistral] Revert previous behavior with right padding + forward (#27125)
Update modeling_mistral.py
2023-10-30 11:04:50 +01:00
Yih-Dar
211ad4c9cc
Fix slack report failing for doctest (#27042)
* fix slack report for doctest

* separate reports

* style

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-10-30 10:48:24 +01:00
Gema Parreño
722e936491
[Typo fix] flag config in WANDB (#27130)
typo fix flag config
2023-10-29 18:22:26 +00:00
Daniil
9e87618f2b
Fix docstring and type hint for resize (#27104)
fix docstring and type hint for resize
2023-10-27 16:50:10 -03:00
jiaqiw09
ef23b68ebf
translate transformers_agents.md to Chinese (#27046)
* update translation

* fix problems mentioned in reviews
2023-10-27 12:45:43 -07:00
Akhil
96f9e78f4c
Added Telugu [te] translation for README.md in main (#27077)
* Create index.md

* Create _toctree.yml

* Updated index.md in telugu

* Update _toctree.yml

* Create quicktour.md

* Update quicktour.md

* Create index.md

* Update quicktour.md

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Delete docs/source/hi/index.md

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/te/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update build_documentation.yml

Added telugu [te]

* Update build_pr_documentation.yml

Added Telugu [te]

* Update _toctree.yml

* Create README_te.md

Telugu translation for README.md

* Update README_te.md

Added Telugu translation for Readme.md

* Update README_te.md

* Update README_te.md

* Update README_te.md

* Update README_te.md

* Update README.md

* Update README_es.md

* Update README_es.md

* Update README_hd.md

* Update README_ja.md

* Update README_ko.md

* Update README_pt-br.md

* Update README_ru.md

* Update README_zh-hans.md

* Update README_zh-hant.md

* Update README_te.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-27 11:40:10 -07:00
Patrick von Platen
ac5893756b
[Attention Mask] Refactor all encoder-decoder attention mask (#27086)
* [FA2 Bart] Add FA2 to all Bart-like

* better

* Refactor attention mask

* remove all customized atteniton logic

* format

* mass rename

* replace _expand_mask

* replace _expand_mask

* mass rename

* add pt files

* mass replace & rename

* mass replace & rename

* mass replace & rename

* mass replace & rename

* Update src/transformers/models/idefics/modeling_idefics.py

* fix more

* clean more

* fix more

* make style

* fix again

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* finish

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* small fix mistral

* finish

* finish

* finish

* finish

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-27 16:42:01 +02:00
Marc Sun
29c74f58ae
fix detr device map (#27089)
* fix detr device map

* add comments
2023-10-27 10:28:12 -04:00
Younes Belkada
ffff9e70ab
[core/ gradient_checkpointing] Refactor GC - part 2 (#27073)
* fix

* more fixes

* fix other models

* fix long t5

* use `gradient_checkpointing_func` instead

* fix copies

* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* replace it with `is_gradient_checkpointing_set`

* remove default

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-27 16:15:22 +02:00
Marc Sun
5be1fb6d1f
Fix no split modules underlying modules (#27090)
* fix no split

* style

* remove comm

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* rename modules

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-27 09:49:20 -04:00
Lucain
66b088faf0
Provide alternative when warning on use_auth_token (#27105) 2023-10-27 14:32:54 +02:00
Isaac Chung
e2bffcfafd
Add early stopping for Bark generation via logits processor (#26675)
* add early stopping logits processor

* black formmated

* indent

* follow method signature

* actual logic

* check for None

* address comments on docstrings and method signature

* add unit test under `LogitsProcessorTest` wip

* unit test passing

* black formatted

* condition per sample

* add to BarkModelIntegrationTests

* wip BarkSemanticModelTest

* rename and add to kwargs handling

* not add to BarkSemanticModelTest

* correct logic and assert last outputs tokens different in test

* doc-builder style

* read from kwargs as well

* assert len of with less than that of without

* ruff

* add back seed and test case

* add original impl default suggestion

* doc-builder

* rename and use softmax

* switch back to LogitsProcessor and update docs wording

* camelCase and spelling and saving compute

* assert strictly less than

* assert less than

* expand test_generate_semantic_early_stop instead
2023-10-27 11:07:33 +01:00
Arthur
90ee9cea19
Revert "add exllamav2 arg" (#27102)
Revert "add exllamav2 arg (#26437)"

This reverts commit 8214d6e7b1.
2023-10-27 11:23:06 +02:00
Arthur
aa4198a238
[T5Tokenizer] Fix fast and extra tokens (#27085)
* v4.35.dev.0

* nit t5fast match t5 slow
2023-10-27 08:18:24 +02:00
Varshaa Shetty
6f31601687
Added huggingface emoji instead of the markdown format (#27091)
Added huggingface emoji instead of the markdown format as it was not displaying the required emoji in that format
2023-10-26 14:10:16 -07:00
Zach Mueller
34a640642b
Save TB logs as part of push_to_hub (#27022)
* Support runs/

* Upload runs folder as part of push to hub

* Add a test

* Add to test deps

* Update with proposed solution from Slack

* Ensure that repo gets deleted in tests
2023-10-26 12:13:19 -04:00
L. Yeung
1892592530
Correct docstrings and a typo in comments (#27047)
* docs(training_args): correct docstrings

Correct docstrings of these methods in `TrainingArguments`:

- `set_save`
- `set_logging`

* docs(training_args): adjust words in docstrings

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(trainer): correct a typo in comments

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-26 08:46:17 -07:00
Marc Sun
8214d6e7b1
add exllamav2 arg (#26437)
* add_ xllamav2 arg

* add test

* style

* add check

* add doc

* replace by use_exllama_v2

* fix tests

* fix doc

* style

* better condition

* fix logic

* add deprecate msg
2023-10-26 10:15:05 -04:00
Patrick von Platen
d7cb5e138e
[Llama FA2] Re-add _expand_attention_mask and clean a couple things (#27074)
* clean

* clean llama

* fix more

* make style

* Apply suggestions from code review

* Apply suggestions from code review

* Update src/transformers/models/llama/modeling_llama.py

* Update src/transformers/models/llama/modeling_llama.py

* Apply suggestions from code review

* finish

* make style
2023-10-26 13:06:21 +02:00
Arthur
4864d08d3e
Add-support for commit description (#26704)
* fix

* update

* revert

* add dosctring

* good to go

* update

* add a test
2023-10-26 12:37:09 +02:00
Arthur
15cd096288
Create SECURITY.md 2023-10-26 12:26:47 +02:00
Younes Belkada
fe2877ce21
Remove unneeded prints in modeling_gpt_neox.py (#27080) 2023-10-26 11:55:31 +02:00
Younes Belkada
efba1a1744
Bumpflash_attn version to 2.1 (#27079)
* pin FA-2 to `2.1`

* fix on modeling
2023-10-26 11:21:04 +02:00
Zach Mueller
90412401e6
Bring back set_epoch for Accelerate-based dataloaders (#26850)
* Working tests!

* Fix sampler

* Fix

* Update src/transformers/trainer.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix check

* Clean

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-26 11:20:11 +02:00
dependabot[bot]
3c2692407d
Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/lxmert (#26888)
Bump urllib3 in /examples/research_projects/lxmert

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-26 09:10:29 +02:00
dependabot[bot]
9c5240af14
Bump werkzeug from 2.2.3 to 3.0.1 in /examples/research_projects/decision_transformer (#27072)
Bump werkzeug in /examples/research_projects/decision_transformer

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.2.3 to 3.0.1.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/2.2.3...3.0.1)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-26 08:56:28 +02:00
corey hu
df2eebf1e7
Handle unsharded Llama2 model types in conversion script (#27069)
Handle all unshared models types
2023-10-26 08:41:07 +02:00
Aarya Balwadkar
a2f55a65cd
Hindi translation of pipeline_tutorial.md (#26837)
* hindi translation of pipeline_tutorial.md

* Update pipeline_tutorial.md

* Update build_documentation.yml

* Update build_pr_documentation.yml

* Updated build_documentation.yml

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-25 11:21:49 -07:00
Yeyang
ba5144f7a9
🌐 [i18n-ZH] Translate custom_models.md into Chinese (#27065)
* docs(zh): translate custom_models.md

* minor fix in customer_models

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-25 11:20:32 -07:00
Younes Belkada
c34c50cdc0
[docs] Add MaskGenerationPipeline in docs (#27063)
* add `MaskGenerationPipeline` in docs

* Update __init__.py

* fix repo consistency and clarify docstring

* add on check docstirngs

* actually we do have a tf sam

* oops
2023-10-25 19:31:36 +02:00
Akash Kundu
ba073ea9e3
[DOCS] minor fixes in README.md (#27048)
minor fixes
2023-10-25 10:21:13 -07:00
Jing Hua
a64f8c1f87
[docstring] fix incorrect llama docstring: encoder -> decoder (#27071)
fix incorrect docstring: encoder -> decoder
2023-10-25 18:09:04 +02:00
Nick Hill
0baa9246cb
Fix TypicalLogitsWarper tensor OOB indexing edge case (#26579)
* Fix TypicalLogitsWarper tensor OOB indexing edge case

This can be triggerd fairly quickly with low precision e.g. bfloat16 and typical_p = 0.99.

* Shift threshold index by one

* Use explicit named arg for clamp min
2023-10-25 11:36:43 +01:00
Younes Belkada
06e782da4e
[core] Refactor of gradient_checkpointing (#27020)
* v1

* fix

* remove `create_custom_forward`

* fixup

* fixup

* add test and fix all failing GC tests

* remove all remaining `create_custom_forward` methods

* fix idefics bug

* fixup

* replace with `__call__`

* add comment

* quality
2023-10-25 12:16:15 +02:00
Arthur
9286f0ac39
Skip-test (#27062)
* skip plbart test

* nits

* update
2023-10-25 10:47:33 +02:00
Tom Aarsen
6cbc1369a3
Fix RoPE config validation for FalconConfig + various config typos (#26929)
* Resolve incorrect ValueError in RoPE config for Falcon

* Add broken codeblock tag in Falcon Config

* Fix typo: an float -> a float

* Implement copy functionality for Fuyu and Persimmon

for RoPE scaling validation

* Make style
2023-10-24 18:37:09 +01:00
JB (Don)
a0fd34483f
Add a default decoder_attention_mask for EncoderDecoderModel during training (#26752)
* Add a default decoder_attention_mask for EncoderDecoderModel during training

Since we are already creating the default decoder_input_ids from the labels, we should also
create a default decoder_attention_mask to go with it.

* Fix test constant that relied on manual_seed()

The test was changed to use a decoder_attention_mask that ignores padding instead (which is
the default one created by BERT when attention_mask is None).

* Create the decoder_attention_mask using decoder_input_ids instead of labels

* Fix formatting in test
2023-10-24 18:26:16 +01:00
Maria Khalusova
9333bf0769
[docs] Performance docs refactor p.2 (#26791)
* initial edits

* improvements for clarity and flow

* improvements for clarity and flow, removed the repetead section

* removed two docs that had no content

* Revert "removed two docs that had no content"

This reverts commit e98fa2fa0d.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feedback addressed

* more feedback addressed

* feedback addressed

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-24 13:10:06 -04:00
Patrick von Platen
13ef14e18e
Fix config silent copy in from_pretrained (#27043)
* Fix config modeling utils

* fix more

* fix attn mask bug

* Update src/transformers/modeling_utils.py
2023-10-24 19:05:37 +02:00
Alex McKinney
9da451713d
Device agnostic testing (#25870)
* adds agnostic decorators and availability fns

* renaming decorators and fixing imports

* updating some representative example tests
bloom, opt, and reformer for now

* wip device agnostic functions

* lru cache to device checking functions

* adds `TRANSFORMERS_TEST_DEVICE_SPEC`
if present, imports the target file and updates device to function
mappings

* comments `TRANSFORMERS_TEST_DEVICE_SPEC` code

* extra checks on device name

* `make style; make quality`

* updates default functions for agnostic calls

* applies suggestions from review

* adds `is_torch_available` guard

* Add spec file to docs, rename function dispatch names to backend_*

* add backend import to docs example for spec file

* change instances of  to

* Move register backend to before device check as per @statelesshz changes

* make style

* make opt test require fp16 to run

---------

Co-authored-by: arsalanu <arsalanu@graphcore.ai>
Co-authored-by: arsalanu <hzji210@gmail.com>
2023-10-24 16:49:26 +02:00
Marc Sun
41496b95da
Add fuyu device map (#26949)
* add _no_split_modules

* style

* fix _no_split_modules

* add doc
2023-10-24 09:10:23 -04:00