Younes Belkada
d3ce048c20
[bnb] Add simple check for bnb import ( #24995 )
...
add simple check for bnb
2023-07-21 17:50:52 +02:00
Yih-Dar
f1a1eb4ae1
Fix llama tokenization doctest ( #24990 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-21 16:47:51 +02:00
Sylvain Gugger
a7d213189d
Use main_input_name for include_inputs_for_metrics ( #24993 )
2023-07-21 10:30:17 -04:00
Sylvain Gugger
a6484c89b9
Fix type annotation for deepspeed training arg ( #24988 )
2023-07-21 09:42:05 -04:00
Sylvain Gugger
5b7ffd5492
Avoid importing all models when instantiating a pipeline ( #24960 )
...
* Avoid importing all models when instantiating a pipeline
* Remove sums that don't work
2023-07-21 09:41:56 -04:00
Sylvain Gugger
640e1b6c6f
Remove tokenizers from the doc table ( #24963 )
2023-07-21 09:41:36 -04:00
Arthur
0511369a8b
[LlamaConfig] Nit: pad token should be None by default ( #24958 )
...
* pad token should be None by default
* fix tests
* nits
2023-07-21 14:32:34 +02:00
Joya Chen
f74560d007
Fix missing spaces in system prompt of Llama2 tokenizer ( #24930 )
...
* Update tokenization_llama.py
* Update tokenization_llama_fast.py
* Update src/transformers/models/llama/tokenization_llama_fast.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/tokenization_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/tokenization_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/tokenization_llama_fast.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-07-21 08:28:54 -04:00
Sourab Mangrulkar
f4eb459ef2
fsdp fixes and enhancements ( #24980 )
...
* fix fsdp prepare to remove the warnings and fix excess memory usage
* Update training_args.py
* parity for FSDP+XLA
* Update trainer.py
2023-07-21 17:52:48 +05:30
Wonhyeong Seo
ec3dfe5e24
🌐 [i18n-KO] Fixed Korean and English quicktour.md ( #24664 )
...
* fix: english/korean quicktour.md
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
* fix: follow glossary
* 파인튜닝 -> 미세조정
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
2023-07-21 08:19:28 -04:00
Jim Allanson
83f9314d10
fix: cast input pixels to appropriate dtype for image_to_text pipelines ( #24947 )
...
* fix: cast input pixels to appropriate dtype for image_to_text tasks
* fix: add casting to pixel inputs of additional models after running copy checks
2023-07-21 08:16:57 -04:00
Sourab Mangrulkar
1c7e5e2368
fix fsdp checkpointing issues ( #24926 )
...
* fix fsdp load
* Update trainer.py
* remove saving duplicate state_dict
2023-07-21 12:17:26 +05:30
Apoorv Khandelwal
9ef5256dfb
Fallback for missing attribute Parameter.ds_numel ( #24942 )
...
* [trainer] fallback for deepspeed param count
* [trainer] more readable numel count
2023-07-20 15:19:35 -04:00
Benjamin Badger
caf5e369fc
Contrastive Search peak memory reduction ( #24120 )
...
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-20 18:46:53 +01:00
Zach Mueller
aa1b09c5d1
Change logic for logging in the examples ( #24956 )
...
Change logic
2023-07-20 12:30:10 -04:00
Younes Belkada
89a1f34271
[RWKV] Add Gradient Checkpointing support for RWKV ( #24955 )
...
add GC support for RWKV
2023-07-20 18:29:23 +02:00
dependabot[bot]
9f912ef62a
Bump aiohttp from 3.8.1 to 3.8.5 in /examples/research_projects/decision_transformer ( #24954 )
...
Bump aiohttp in /examples/research_projects/decision_transformer
Bumps [aiohttp](https://github.com/aio-libs/aiohttp ) from 3.8.1 to 3.8.5.
- [Release notes](https://github.com/aio-libs/aiohttp/releases )
- [Changelog](https://github.com/aio-libs/aiohttp/blob/v3.8.5/CHANGES.rst )
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.8.1...v3.8.5 )
---
updated-dependencies:
- dependency-name: aiohttp
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-20 12:17:38 -04:00
Shauray Singh
e75cb0cb3c
fix type annotations for arguments in training_args ( #24550 )
...
* testing
* example script
* fix typehinting
* some tests
* make test
* optional update
* Union of arguments
* does this fix the issue
* remove reports
* set default to False
* documentation change
* None support
* does not need None
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549 )
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments
* Change dict to Dict
* Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574 )
Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549 )"
This reverts commit c5e29d4381.
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549 )
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments
* Change dict to Dict
* merge
* hacky fix
* fixup
---------
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-20 10:13:13 -04:00
Shauray Singh
0c41765df4
[DOCS] Example for LogitsProcessor class ( #24848 )
...
* make docs
* fixup
* resolved
* remove debugs
* Revert "fixup"
This reverts commit 5e0f636aae.
* prev (ignore)
* fixup broke some files
* remove files
* reverting modeling_reformer
* lang fix
2023-07-20 10:09:40 -04:00
Yih-Dar
35c04596f8
Fix main_input_name in src/transformers/keras_callbacks.py ( #24916 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-20 15:01:37 +02:00
Premtim Sa
85514c17d1
Update processing_vision_text_dual_encoder.py ( #24950 )
...
Fixing small typo: kwrags -> kwargs
2023-07-20 08:25:38 -04:00
dependabot[bot]
9859806608
Bump pygments from 2.11.2 to 2.15.0 in /examples/research_projects/decision_transformer ( #24949 )
...
Bump pygments in /examples/research_projects/decision_transformer
Bumps [pygments](https://github.com/pygments/pygments ) from 2.11.2 to 2.15.0.
- [Release notes](https://github.com/pygments/pygments/releases )
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES )
- [Commits](https://github.com/pygments/pygments/compare/2.11.2...2.15.0 )
---
updated-dependencies:
- dependency-name: pygments
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-20 07:43:48 -04:00
Joao Gante
89136ff7f8
Generate: sequence bias can handle same terminations ( #24822 )
2023-07-20 12:23:17 +01:00
statelesshz
37d8611ac9
replace no_cuda with use_cpu in test_pytorch_examples ( #24944 )
...
* replace no_cuda with use_cpu in test_pytorch_examples
* remove codes that never be used
* fix style
2023-07-20 07:09:04 -04:00
Tom Aarsen
79444f370f
Deprecate unused OpenLlama architecture ( #24922 )
...
* Resolve typo in check_repo.py
* Specify encoding when opening modeling files
* Deprecate the OpenLlama architecture
* Add disclaimer pointing to Llama
I'm open to different wordings here
* Match the capitalisation of LLaMA
2023-07-20 07:03:24 -04:00
ranchlai
8fd8c8e49e
Add multi-label text classification support to pytorch example ( #24770 )
...
* Add text classification example
* set the problem type and finetuning task
* ruff reformatted
* fix bug for unsetting label_to_id for regression
* update README.md
* fixed finetuning task
* update comment
* check if label exists in feature before removing
* add useful logging
2023-07-20 07:02:44 -04:00
Jungnerd
7381987f90
🌐 [i18n-KO] Translated tasks/document_question_answering.md to Korean ( #24588 )
...
* docs: ko: `document_question_answering.md`
* fix: resolve suggestions
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
---------
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-07-20 06:19:36 -04:00
Stas Bekman
6112b1c644
[doc] image_processing_vilt.py wrong default documented ( #24931 )
...
[doc] image_processing_vilt.py wrong default
2023-07-19 13:57:40 -07:00
Younes Belkada
ee4250a35f
[Llama2] replace self.pretraining_tp with self.config.pretraining_tp ( #24906 )
...
* add possibility to disable TP
* fixup
* adapt from offline discussions
2023-07-19 14:26:27 +02:00
Travis Cline
3a43794dd6
Fix minor llama2.md model doc typos ( #24909 )
...
Update llama2.md
Fix typos in the llama2 model doc
2023-07-19 08:13:14 -04:00
lee1jun
99c1268e0a
fix typo in BARK_PRETRAINED_MODEL_ARCHIVE_LIST ( #24902 )
...
fix typo in BARK_PRETRAINED_MODEL_ARCHIVE_LIST
suno/barh should be suno/bark
2023-07-19 07:35:04 -04:00
Madhava Jay
aa4afa67f3
Fixed issue where ACCELERATE_USE_CPU="False" results in bool(True) ( #24907 )
...
- This results in cpu mode on Apple Silicon mps
2023-07-19 07:30:01 -04:00
Yih-Dar
243b2ea3fd
Fix test_model_parallelism for FalconModel ( #24914 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-19 13:18:16 +02:00
Eliah Kagan
c035970212
Update tested versions in READMEs ( #24895 )
...
* Update supported Python and PyTorch versions in readme
* Update Python, etc. versions in non-English readmes
These were more out of date than in the English readme. This
updates all the versions the readmes claim the repository is tested
with to the same versions stated in the English readme.
Those versions are current at least in the case of the Python and
PyTorch versions (and less out of date for the others).
* Propagate trailing whitespace fix to model list
This runs "make fix-copies". The only change is the removal of
whitespace. No actual information or wording is changed.
* Update tested TensorFlow to 2.6 in all readmes
Per pinning in setup.py
Unlike Python and PyTorch, the minimum supported TensorFlow version
has not very recently changed, but old versions were listed in all
READMEs.
2023-07-19 07:17:34 -04:00
Yih-Dar
129cb6d523
Avoid some pipeline tasks to use use_cache=True ( #24893 )
...
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-19 09:49:52 +02:00
Zach Mueller
476be08c4a
Check for accelerate env var when doing CPU only ( #24890 )
...
Check for use-cpu
2023-07-18 18:40:37 -04:00
Zach Mueller
a982c0225e
Disable ipex env var if false ( #24885 )
...
Disable ipex if in use
2023-07-18 16:07:02 -04:00
Arthur
07360b6c9c
[Llama2] Add support for Llama 2 ( #24891 )
...
* add llama
* add other readmes
* update padding id in readme
* add link to paper
* fix paths and tokenizer
* more nits
* styling
* fit operation in 2 lines when possible
* nits
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add form
* update readme
* update readme, we don't have a default pad token
* update test and tokenization
* LLaMA instead of Llama
* nits
* add expected text
* add greedy output
* styling
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sequential device map
* skip relevant changes
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-18 15:18:31 -04:00
Yih-Dar
30c172fc20
Separate CircleCI cache between main and pull (or other branches) ( #24886 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-18 21:05:26 +02:00
Hwijeen Ahn
dd49404a89
check if eval dataset is dict ( #24877 )
...
* check if eval dataset is dict
* formatting
2023-07-18 13:33:41 -04:00
Younes Belkada
5c5cb4eeb2
[Blip] Fix blip output name ( #24889 )
...
* fix blip output name
* add property
* oops
* fix failing test
2023-07-18 19:30:27 +02:00
Younes Belkada
a9e067a45c
[InstructBlip] Fix int8/fp4 issues ( #24888 )
...
* fix dtype issue
* revert `.float()`
* fix copies
2023-07-18 19:24:36 +02:00
NielsRogge
3ec10e6c76
Add DINOv2 ( #24016 )
...
* First draft
* More improvements
* Convert patch embedding layer
* Convert all weights
* Make conversion work
* Improve conversion script
* Fix style
* Make all tests pass
* Add image processor to auto mapping
* Add swiglu ffn
* Add image processor to conversion script
* Fix conversion of giant model
* Fix documentation
* Fix style
* Fix tests
* Address comments
* Address more comments
* Remove unused arguments
* Remove more arguments
* Rename parameters
* Include mask token
* Address comments
* Add docstring
* Transfer checkpoints
* Empty commit
2023-07-18 15:34:06 +01:00
Yih-Dar
57da42ad05
Enable ZeroShotAudioClassificationPipelineTests::test_small_model_pt ( #24882 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-18 15:08:53 +02:00
statelesshz
9c875839c0
add ascend npu accelerator support ( #24879 )
...
* Add Ascend NPU accelerator support
* fix style warning
2023-07-18 08:20:32 -04:00
Yih-Dar
f14c7f999d
Fix CircleCI cache ( #24880 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-18 13:45:00 +02:00
Younes Belkada
ca974aff0f
[Docs] Clarify 4bit docs ( #24878 )
...
* clarify 4bit docs
* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2023-07-18 13:39:08 +02:00
Yih-Dar
2ab75add4b
Remove tests/onnx ( #24868 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-17 22:37:28 +02:00
Sylvain Gugger
d561408cc3
Skip Add model like job ( #24865 )
2023-07-17 15:52:04 -04:00
Yih-Dar
870dfc15b2
Skip failing ZeroShotAudioClassificationPipelineTests::test_small_model_pt for now ( #24867 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-17 15:51:50 -04:00