Commit Graph

15053 Commits

Author SHA1 Message Date
wangpeng
af1c864cdc
fix code example in mgp-str doc (#22219)
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
2023-03-17 09:40:06 +00:00
Kevin Turner
33d033d694
fix typos in llama.mdx (#22223)
2023-03-17 08:43:18 +00:00
Yih-Dar
97a3d16a69
Hotfix for natten issue with torch 2.0.0 on CircleCI (#22218)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 23:57:26 +01:00
Yih-Dar
5110e5748e
🔥py38 + torch 2 🔥🔥🔥🚀 (#22204)
* py38 + torch 2

* increment cache versions

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 22:59:23 +01:00
Susnato Dhar
fb366b9a2a
fixes a typo in WhisperFeatureExtractor docs. (#22208)
* fixes a typo

* .
2023-03-16 16:08:05 +00:00
Younes Belkada
da3ba3a167
[XGLM] Add accelerate support for XGLM (#22207)
* add `accelerate` support for XGLM

* fix order
2023-03-16 16:18:05 +01:00
SatyaJandhyalaAtMS
a88a4dae19
Temporarily fix ONNX model exporting error (#21830)
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143

* Reduced column width

* Fix formatting.

* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"

This reverts commit 6e95a108042118d204da447729f3834affa354fc.

* Fix export error.

* Revert "Fix formatting."

This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8.

* Propagated changes made in SwinV2 to Swin2SR
2023-03-16 10:56:26 -04:00
Yih-Dar
4c5c0af7e5
Update tiny model creation script (#22202)
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* make style and quality

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 14:21:58 +01:00
Jason Phang
464d420775
LLaMA Implementation (#21955)
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
2023-03-16 09:01:15 -04:00
Jason Phang
0041be5b3d
LLaMA Implementation (#21955)
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
2023-03-16 09:00:53 -04:00
Baelish03
09922da4a7
Italian Translation of migration.mdx (#22183)
* Translation Italian: migration

* Update migration.mdx

minor fixes

* Update _toctree.yml

* Delete migration.mdx

* Add italian translation of migration.mdx

* Update of migration.mdx translation and toctree
2023-03-16 12:00:07 +00:00
Yih-Dar
52a57f7c7c
Update expected values in MgpstrModelIntegrationTest (#22195)
Update values

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 11:48:52 +00:00
Alara Dirik
1485bd9c02
Fix typo in Align docs (#22199)
Fix align docs typo
2023-03-16 13:41:48 +03:00
Yih-Dar
1c4a9acc73
Fix DeepSpeed CI (#22194)
* Deal with torch-tensorrt

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-16 05:52:40 +01:00
Prathik Rao
7c4999e495
t5 remove data dependency (#22097)
* t5 remove data dependency

* make style

* make fix-copies

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
2023-03-15 16:11:15 -04:00
Anahita Bhiwandiwalla
16121bae5c
Update BridgeTowerForContrastiveLearning (#22145)
* Use return_loss for BridgeTowerForContrastiveLearning, add example

* fix tests

* Update example in BridgeTowerForContrastiveLearning

* Update test_modeling_bridgetower.py

* update model output format

* minor update

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

* make style

---------

Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-15 20:54:38 +01:00
Sylvain Gugger
42ad693b7b
Regression pipeline device (#22190)
* Fix regression in pipeline when device=-1 is passed

* Add regression test
2023-03-15 14:13:38 -04:00
amyeroberts
737681477c
Revert 22152 MaskedImageCompletionOutput changes (#22187)
Revert changes
2023-03-15 18:37:23 +01:00
浮躁的小螃蟹
7b0e2cfdfb
Fix: unfinished_sequences with correct device (#22184)
Fix: unfinished_sequences with correct device 

The original code was causing errors when running torch.jit.trace due to the tensor options being incorrect. I fixed this by using torch.ones to create a tensor with the correct device and dtype. This should resolve the issue with running torch.jit.trace.
2023-03-15 16:27:19 +00:00
Sylvain Gugger
f7329751fe
Run all tests by default (#22162)
2023-03-14 17:30:43 -04:00
Sylvain Gugger
b7036f4912
Load optimizer state on CPU to avoid CUDA OOM (#22159)
2023-03-14 17:30:32 -04:00
Sylvain Gugger
ebdb185bef
v4.28.0.dev0
2023-03-14 13:49:10 -04:00
Sylvain Gugger
c52c5282ef
Revert "Enforce same behavior as PyTorch 2.0 for older versions" (#22163)
Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)"

This reverts commit 1c801d65eb.
2023-03-14 13:45:46 -04:00
Stas Bekman
085bf5c1fe
[trainer] add --optim adamw_torch_fused for pt-2.0+ (#22144)
* [trainer] add --optim adamw_torch_fused

* change optim default

* deal with non-torch

* revert default change; prep; add fp16/amp assert

* typo

* typo
2023-03-14 10:22:03 -07:00
amyeroberts
c6318c3788
to_pil - don't rescale if int and in range 0-255 (#22158)
* Don't rescale if int and in range 0-255

* Raise value error if int values too large

* Update tests/test_image_transforms.py

* Update tests/test_image_transforms.py
2023-03-14 15:43:44 +00:00
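The rule described in the `to_pil` entry above (#22158) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the library's implementation: the helper name `needs_rescale` is hypothetical, and treating every non-integer input as a float in the range 0-1 that needs scaling is an assumption not stated in the commit message.

```python
from typing import Iterable, Union

Number = Union[int, float]


def needs_rescale(pixels: Iterable[Number]) -> bool:
    """Hypothetical helper: decide whether pixel values should be scaled by 255."""
    values = list(pixels)
    # Treat 3.0 the same as 3: what matters is whether each value is whole.
    all_whole = all(float(v).is_integer() for v in values)
    if all_whole:
        if all(0 <= v <= 255 for v in values):
            # Already valid 8-bit intensities: leave them untouched.
            return False
        # Whole-number values outside 0-255 cannot be valid 8-bit pixels.
        raise ValueError("integer pixel values must lie in the range 0-255")
    # Assumption: anything else is a float image in [0, 1] that needs scaling.
    return True
```

Under these assumptions, `needs_rescale([0, 128, 255])` reports that no rescaling is needed, while an input such as `[0, 300]` raises `ValueError` instead of being silently scaled.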
Alara Dirik
3b22bfbc6a
Create MaskedImageCompletionOutput and fix ViT docs (#22152)
* create MaskedImageCompletionOutput

* fix bugs

* fix bugs
2023-03-14 13:55:18 +00:00
Sylvain Gugger
b45192ec47
Fix big model inference for T5 models in float16 (#22095)
* Fix big model inference for T5 models in float16

* Apply suggestions from code review

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Style

* Trigger CI with latest release

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-03-14 09:20:16 -04:00
Nicola Procopio
7f5ad6c35b
Translation Italian: perf_train_cpu and perf_train_cpu_many (#22151)
* added translated files

added perf_train_cpu and perf_train_cpu_many

* updated toctree
2023-03-14 11:09:36 +00:00
Yih-Dar
ff88703501
Update 2 doctest expected values for torch 2.0.0 (#22148)
update values

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-14 09:13:16 +00:00
Alara Dirik
cdddfbffa1
Add ConvNeXT V2 (#21679)
* Add ConvNeXt V2 to transformers
* TF model is separated from the PR to fix issues
2023-03-14 12:08:14 +03:00
Yih-Dar
6c2ad00c46
Move is_pipeline_test_to_skip to specific model test classes (#21999)
* Move `is_pipeline_test_to_skip` to specific model test classes

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-14 10:03:02 +01:00
Arthur
2beabd24f0
[🛠️] Fix-whisper-breaking-changes (#21965)
* temp fix

* temporary fix

* update

* fix tests

* fixup

* update based on review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* update to fix tests

* update docstring

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-03-14 09:23:48 +01:00
MichaelRipa
101a6cd276
docs: New terms and updates to glossary (#21982)
* Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added link to 'Pipeline for inference' tutorial

* Trigger CI

* Update docs/source/en/glossary.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added entry for self supervised learning, added deleted entries + fixed broken links

* Update docs/source/en/glossary.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-13 19:09:37 -04:00
Yih-Dar
ba9e0191de
Prepare daily CI for torch 2.0.0 (#22135)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-13 22:21:15 +01:00
Patrick von Platen
f780557a34
[Safetensors] Add explicit flag to from pretrained (#22083)
* [Safetensors] Add explicit flag to from pretrained

* add test

* remove @

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-13 21:39:06 +01:00
Sylvain Gugger
3a35937ede
Remove backend check for torch.compile (#22140)
* Remove backend enforcement for torch.compile

* Update error

* Update src/transformers/training_args.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Style

---------

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2023-03-13 16:34:00 -04:00
Stas Bekman
618697ef53
[deepspeed docs] Activation Checkpointing (#22099)
* [deepspeed docs] Activation Checkpointing

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update deepspeed.mdx

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-13 12:52:42 -07:00
Stas Bekman
5b85add7d5
[trainer] fix bug in grad accum with multiple epochs (#22098)
* [trainer] fix bug in grad accum

* comment out debug

* fix one-off

* rename counter
2023-03-13 12:51:40 -07:00
Sylvain Gugger
1c801d65eb
Enforce same behavior as PyTorch 2.0 for older versions (#22136)
2023-03-13 15:50:50 -04:00
Joao Gante
e16cbe88ae
Trainer: let generate pick its inputs (#22108)
* Let generate pick its inputs

* fix squad seq2seq example
2023-03-13 19:00:25 +00:00
Younes Belkada
d979cf6efd
[Whiper] add get_input_embeddings to WhisperForAudioClassification (#22133)
* add `get_input_embeddings` to `WhisperForAudioClassification`

* add common tests

* fix another common test

* Update tests/models/whisper/test_modeling_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix style

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-03-13 19:46:01 +01:00
bishmdl76
987972377d
Update configuration_align.py (projected_dim=640) (#22139)
Update configuration_align.py

updated projected_dim=640 from 512 in arguments of AlignConfig
2023-03-13 14:12:12 -04:00
Yih-Dar
54ee56b15b
Add a new script to check model testers' config (#22063)
* Add script

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-13 19:11:19 +01:00
mollerup23
a096eaca65
Adding Type Hints to TF_Pegasus model (#21941)
* Adding Type Hints to TF_Pegasus model

* Updated some parameters per maintainer comments
2023-03-13 15:58:29 +00:00
Sylvain Gugger
6cb5132a7f
Fix doc link for MGP-STR (#22138)
2023-03-13 15:26:50 +00:00
Maria Khalusova
8def252de2
Zero-shot image classification task guide (#22132)
* WIP

* WIP

* manual inference example

* make style

* Apply suggestions from code review

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

---------

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
2023-03-13 10:57:17 -04:00
Karim Foda
e61081e725
Fix gradient checkpointing bug in trocr (#22126)
* Fix gradient checkpointing bug in trocr

* Fix format

* Update src/transformers/models/trocr/modeling_trocr.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-03-13 15:45:47 +01:00
Karim Foda
ef74e7e783
Fix gradient checkpointing bug in LongT5 (#22130)
2023-03-13 14:06:17 +00:00
Karim Foda
c1db6a3bab
Fix gradient checkpointing bug in xmod (#22129)
2023-03-13 15:05:11 +01:00
Younes Belkada
6652e7da0d
[Blip2] skip accelerate test (#22124)
skip accelerate test
2023-03-13 15:03:21 +01:00