Commit Graph

15053 Commits

Author SHA1 Message Date
Joao Gante
c0742b15cb
Generate - add beam indices output in contrained beam search (#25042) 2023-07-25 11:12:29 +01:00
Arthur
c53a6eae74
[RWKV] Add note in doc on RwkvStoppingCriteria (#25055)
* Add note in doc on `RwkvStoppingCriteria`

* give some breathing space to the code
2023-07-25 10:15:00 +02:00
Sylvain Gugger
d2295708a6
Better error message when signal is not supported on OS (#25049)
* Better error message when signal is not supported on OS

* Address review comments
2023-07-24 14:34:16 -04:00
seank021
c0d1c33022
🌐 [i18n-KO] Translated perf_train_cpu.md to Korean (#24911)
* dos: ko: perf_train_cpu.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* fix: manual edits

Co-authored-by: Haewon Kim <ehdvkf02@naver.com>

---------

Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
2023-07-24 17:54:13 +02:00
Younes Belkada
b08f41e62a
[8bit] Fix 8bit corner case with Blip2 8bit (#25047)
fix 8bit corner case with Blip2 8bit
2023-07-24 16:58:40 +02:00
Nate Brake
3611fc90e0
compute_loss in trainer failing to label shift for PEFT model when label smoothing enabled. (#25044)
* added PeftModelForCausalLM to MODEL_FOR_CAUSAL_LM_MAPPING_NAMES dict

* check for PEFT model in compute_loss section

---------

Co-authored-by: Nathan Brake <nbrake3@mmm.com>
2023-07-24 10:53:10 -04:00
Rinat
a03d13c83d
Pvt model (#24720)
* pull and push updates

* add docs

* fix modeling

* Add and run test

* make copies

* add task

* fix tests and fix small issues

* Checks on a Pull Request

* fix docs

* add desc pvt.md
2023-07-24 15:34:19 +01:00
Sylvain Gugger
afe8bfc075
Comment again print statement 2023-07-24 10:12:20 -04:00
Sylvain Gugger
42571f6eb8
Make more test models smaller (#25005)
* Make more test models tiny

* Make more test models tiny

* More models

* More models
2023-07-24 10:08:47 -04:00
Sören Brunk
8f1f0bf50f
Fix typo in LlamaTokenizerFast docstring example (#25018) 2023-07-24 09:37:58 -04:00
Zach Mueller
3b734f5042
Add dispatch_batches to training arguments (#25038)
* Dispatch batches

* Copy items
2023-07-24 09:27:19 -04:00
Sunmin Cho
9d2b983ed0
🌐 [i18n-KO] Translated testing.md to Korean (#24900)
* docs: ko: testing.md

* feat: draft

* fix: manual edits

* fix: edit ko/_toctree.yml

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions
2023-07-24 09:24:11 -04:00
Sangam Lee
383be1b763
🌐[i18n-KO] Translated performance.md to Korean (#24883)
* dos: ko: performance.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/performance.md

Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>

* Update docs/source/ko/performance.md

---------

Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
2023-07-24 09:23:34 -04:00
Iskren Ivov Chernev
efb2ba666d
Better handling missing SYS in llama conversation tokenizer (#24997)
* Better handling missing SYS in llama conversation tokenizer

The existing code failed to add SYS if the conversation has history
without SYS, but did modify the passed conversation as it did.

Rearrange the code so modification to the conversation object are taken
into account for token id generation.

* Fix formatting with black

* Avoid one-liners

* Also fix fast tokenizer

* Drop List decl
2023-07-24 09:21:10 -04:00
Lucain
6704923107
Support GatedRepoError + use raise from (#25034)
* Support GatedRepoError + use raise from

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Use token instead of use_auth_token in error messages

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-24 09:12:39 -04:00
Maria Khalusova
75317aefb3
[docs] Performance docs tidy up, part 1 (#23963)
* first pass at the single gpu doc

* overview: improved clarity and navigation

* WIP

* updated intro and deepspeed sections

* improved torch.compile section

* more improvements

* minor improvements

* make style

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feedback addressed

* mdx -> md

* link fix

* feedback addressed

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-07-24 08:57:24 -04:00
Bharat Ramanathan
54ba8608d0
fix(integrations): store serialized TrainingArgs to wandb.config without sanitization. (#25035)
fix: store training args to wandb config without sanitization.

Allows resuming runs by reusing the wandb config.

Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>
2023-07-24 08:42:39 -04:00
Arthur
0906d21203
[logging.py] set default stderr path if None (#25033)
set default logger
2023-07-24 14:31:45 +02:00
Stas Bekman
c9a82be592
[check_config_docstrings.py] improve diagnostics (#25012)
* [check_config_docstrings.py] improve diagnostics

* style

* rephrase

* fix
2023-07-23 21:17:26 -07:00
Wonhyeong Seo
b257c46a07
🌐 [i18n-KO] Updated Korean serialization.md (#24686)
fix: update ko/serialization.md

* chatgpt draft
2023-07-21 19:23:59 -04:00
Sylvain Gugger
87fba947a5
Move template doc file to md (#25004) 2023-07-21 16:49:44 -04:00
Ivan Sorokin
ea41e18cfc
improve from_pretrained for zero3 multi gpus mode (#24964)
* improve from_pretrained for zero3 multi gpus mode

* Add check if torch.distributed.is_initialized

* Revert torch.distributed

---------

Co-authored-by: Stas Bekman <stas@stason.org>
2023-07-21 15:39:28 -04:00
Arthur
95f96b45ff
[Llama] remove persistent inv_freq tensor (#24998)
remove persistent tensor
2023-07-21 18:11:08 +02:00
Younes Belkada
d3ce048c20
[bnb] Add simple check for bnb import (#24995)
add simple check for bnb
2023-07-21 17:50:52 +02:00
Yih-Dar
f1a1eb4ae1
Fix llama tokenization doctest (#24990)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-21 16:47:51 +02:00
Sylvain Gugger
a7d213189d
Use main_input_name for include_inputs_for_metrics (#24993) 2023-07-21 10:30:17 -04:00
Sylvain Gugger
a6484c89b9
Fix type annotation for deepspeed training arg (#24988) 2023-07-21 09:42:05 -04:00
Sylvain Gugger
5b7ffd5492
Avoid importing all models when instantiating a pipeline (#24960)
* Avoid importing all models when instantiating a pipeline

* Remove sums that don't work
2023-07-21 09:41:56 -04:00
Sylvain Gugger
640e1b6c6f
Remove tokenizers from the doc table (#24963) 2023-07-21 09:41:36 -04:00
Arthur
0511369a8b
[LlamaConfig] Nit: pad token should be None by default (#24958)
* pad token should be None by default

* fix tests

* nits
2023-07-21 14:32:34 +02:00
Joya Chen
f74560d007
Fix missing spaces in system prompt of Llama2 tokenizer (#24930)
* Update tokenization_llama.py

* Update tokenization_llama_fast.py

* Update src/transformers/models/llama/tokenization_llama_fast.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/tokenization_llama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/tokenization_llama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/llama/tokenization_llama_fast.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-07-21 08:28:54 -04:00
Sourab Mangrulkar
f4eb459ef2
fsdp fixes and enhancements (#24980)
* fix fsdp prepare to remove the warnings and fix excess memory usage

* Update training_args.py

* parity for FSDP+XLA

* Update trainer.py
2023-07-21 17:52:48 +05:30
Wonhyeong Seo
ec3dfe5e24
🌐 [i18n-KO] Fixed Korean and English quicktour.md (#24664)
* fix: english/korean quicktour.md

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>

* fix: follow glossary

* 파인튜닝 -> 미세조정

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
2023-07-21 08:19:28 -04:00
Jim Allanson
83f9314d10
fix: cast input pixels to appropriate dtype for image_to_text pipelines (#24947)
* fix: cast input pixels to appropriate dtype for image_to_text tasks

* fix: add casting to pixel inputs of additional models after running copy checks
2023-07-21 08:16:57 -04:00
Sourab Mangrulkar
1c7e5e2368
fix fsdp checkpointing issues (#24926)
* fix fsdp load

* Update trainer.py

* remove saving duplicate state_dict
2023-07-21 12:17:26 +05:30
Apoorv Khandelwal
9ef5256dfb
Fallback for missing attribute Parameter.ds_numel (#24942)
* [trainer] fallback for deepspeed param count

* [trainer] more readable numel count
2023-07-20 15:19:35 -04:00
Benjamin Badger
caf5e369fc
Contrastive Search peak memory reduction (#24120)
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-20 18:46:53 +01:00
Zach Mueller
aa1b09c5d1
Change logic for logging in the examples (#24956)
Change logic
2023-07-20 12:30:10 -04:00
Younes Belkada
89a1f34271
[RWKV] Add Gradient Checkpointing support for RWKV (#24955)
add GC support for RWKV
2023-07-20 18:29:23 +02:00
dependabot[bot]
9f912ef62a
Bump aiohttp from 3.8.1 to 3.8.5 in /examples/research_projects/decision_transformer (#24954)
Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.8.1 to 3.8.5.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/v3.8.5/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.8.1...v3.8.5)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-20 12:17:38 -04:00
Shauray Singh
e75cb0cb3c
fix type annotations for arguments in training_args (#24550)
* testing

* example script

* fix typehinting

* some tests

* make test

* optional update

* Union of arguments

* does this fix the issue

* remove reports

* set default to False

* documentation change

* None support

* does not need None

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574)

Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)"

This reverts commit c5e29d4381.

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* merge

* hacky fix

* fixup

---------

Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-20 10:13:13 -04:00
Shauray Singh
0c41765df4
[DOCS] Example for LogitsProcessor class (#24848)
* make docs

* fixup

* resolved

* remove debugs

* Revert "fixup"

This reverts commit 5e0f636aae.

* prev (ignore)

* fixup broke some files

* remove files

* reverting modeling_reformer

* lang fix
2023-07-20 10:09:40 -04:00
Yih-Dar
35c04596f8
Fix main_input_name in src/transformers/keras_callbacks.py (#24916)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-20 15:01:37 +02:00
Premtim Sa
85514c17d1
Update processing_vision_text_dual_encoder.py (#24950)
Fixing small typo: kwrags -> kwargs
2023-07-20 08:25:38 -04:00
dependabot[bot]
9859806608
Bump pygments from 2.11.2 to 2.15.0 in /examples/research_projects/decision_transformer (#24949)
Bump pygments in /examples/research_projects/decision_transformer

Bumps [pygments](https://github.com/pygments/pygments) from 2.11.2 to 2.15.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.11.2...2.15.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-20 07:43:48 -04:00
Joao Gante
89136ff7f8
Generate: sequence bias can handle same terminations (#24822) 2023-07-20 12:23:17 +01:00
statelesshz
37d8611ac9
replace no_cuda with use_cpu in test_pytorch_examples (#24944)
* replace no_cuda with use_cpu in test_pytorch_examples

* remove codes that never be used

* fix style
2023-07-20 07:09:04 -04:00
Tom Aarsen
79444f370f
Deprecate unused OpenLlama architecture (#24922)
* Resolve typo in check_repo.py

* Specify encoding when opening modeling files

* Deprecate the OpenLlama architecture

* Add disclaimer pointing to Llama

I'm open to different wordings here

* Match the capitalisation of LLaMA
2023-07-20 07:03:24 -04:00
ranchlai
8fd8c8e49e
Add multi-label text classification support to pytorch example (#24770)
* Add text classification example

* set the problem type and finetuning task

* ruff reformated

* fix bug for unseting label_to_id for regression

* update README.md

* fixed finetuning task

* update comment

* check if label exists in feature before removing

* add useful logging
2023-07-20 07:02:44 -04:00
Jungnerd
7381987f90
🌐 [i18n-KO] Translatedtasks/document_question_answering.md to Korean (#24588)
* docs: ko: `document_question_answering.md`

* fix: resolve suggestions

Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>

---------

Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-07-20 06:19:36 -04:00