12454 Commits

Author SHA1 Message Date
Eli Simhayev
8abe4930d3
[Time-Series] informer model (#21099)
* added informer to gitignore

* added informer to gitignore

* WIP informer2020

* added checking that instantiate works

* added config using gluonTS by kashif

* WIP config

* adding InformerConfig; need to remove FeatureEmbedder

* done InformerConfig, but need to change the names

* Done informer model init. working on enc-dec

* added things to address, after reading again enc-dec in the paper

* done modeling - checking initialization work

* moved enc-dec init to InformerEncoder/Decoder init

* added 'init_std' to config, now model init works!

* WIP conversion script, and added code sources

* WIP conversion script: loading original informer pth works

* WIP conversion script: change defaults in the config

* WIP conversion script: supporting Informer input embedding

* WIP conversion script: added parameters for the informer embed

* WIP conversion script: change dim_feedforward=2048

* WIP conversion script: remove unused args for loading checkpoint

* just cleaning up

* DataEmbedding removed, after thinking with Kashif

* working on forward pass

* WIP forward pass: trying to establish working batch for forward pass

* cleaning and finalizing

* adding HF names and docs

* init after cleaning works

* WIP in tests

* added docs for the informer specific args

* fix style

* undo change

* cleaning informer, now need to work only enc-dec

* initial enc-dec classes

* added encoder and decoder

* added todo

* add todos for conv_layers

* added decoder docs from vanilla

* added encoder docs from vanilla

* remove encoder decoder from the original informer

* removed AttentionLayer from the original paper

* removed TriangularCausalMask, same as decoder_attention_mask

* initial sparse attention

* use conv_layers

* fixed test_config test

* fix parenthesis when iterating zip(layers, conv_layers)

* error found in prob attention, added sizes as comments

* fix sizes

* added proposal for q_reduce indexing, and remove unused

* WIP ProbMask, and changed factor=2 for testing

* remove unused libs for this PR for creating the env

* fix checking the attn_weights.size() after bmm

* Q_reduce: changed from torch.gather to simple slicing

* WIP calculate final attn_output

* finish adding v_aggregated, attn_output ready

* changed tgt_len to u in attention_mask, need to fix the size error

* comment attention_mask for encoder, and fix if cond for v_agg

* added ProbMask support (wip), removed old original code

* finished ProbMask 😃

* Revert "remove unused libs for this PR for creating the env"

This reverts commit 11a081e09e.

* fixes

* make style

* fix initial tests

* fix more tests

* dry

* make style

* remove unused files

* style

* added integration tests

* fix num_static_real_features

* fix header

* remove unused function

* fix example

* fix docs

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/modeling_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixes for reviewer

* use prediction_length from model

* fix style

* fixed informer.mdx

* added to index

* updated readme

* undo

* make fix-copies

* typo

* fix copy

* added Informer to toctree

* in order

* fixed comments

* remove unneeded new lines in docs

* make static real and cat optional

* fix use of distil conv layers

* fixed integration test

* added checkpoint for convlayer

* make fix-copies

* updated from time series model

* make fix-copies

* copy decoder

* fix unit tests

* updated scaling config

* fix integration tests

* IGNORE_NON_TESTED

* IGNORE_NON_AUTO_CONFIGURED

* IGNORE_NON_AUTO_CONFIGURED

* updated check configs

* fix formatting

* undo change from time series

* prediction_length should not be None

* align with the blog: prettify ProbSparse and change attention_factor to sampling_factor

* make style

* make fix-copies

* niels CR: update contributed by

* niels CR: update configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: update kashif -> huggingface

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: `sampling_factor` only relevant when `attention_type`=prob

* make style

* fixed U_part: added multiplication by `L_Q`

* fixed bug: remove `is not None` from `if config.distil`

* fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check

* fix integration tests

* updated model hub

* do not shift as in training

* undo

* fix make-copies

* make fix-copies

* added `if prediction_length is None`

* changed `ProbSparseAttention` to `InformerProbSparseAttention`

* changed `V_sum` -> `v_mean_dim_time`

* changed `ConvLayer` to `InformerConvLayer` and fixed `super()`

* TimeSeriesTransformer->Informer in decoder's Copied from

* more descriptive in ProbSparse

* make style

* fix copied from

* Revert "added `if prediction_length is None`"

This reverts commit b4cbddfa05.

* fixed indent

* use InformerSinusoidalPositionalEmbedding

* make fix-style

* fix from #21860

* fix name

* make fix-copies

* use time series utils

* fix dec num_heads

* docstring

* added time series util doc

* _import_structure

* formatting

* changes from review

* make style

* fix docs

* fix doc

* removed NegativeLogLikelihood

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2023-03-07 21:36:38 +01:00
NielsRogge
dde718e7a6
[DETR and friends] Remove is_timm_available (#21814)
* First draft

* Fix to_dict

* Improve conversion script

* Update config

* Remove timm dependency

* Fix dummies

* Fix typo, add integration test

* Upload 101 model as well

* Remove timm dummies

* Fix style

---------

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2023-03-07 15:19:39 -05:00
Arthur
2156662dea
[TF] Fix creating a PR while pushing in TF framework (#21968)
* add create pr arg

* style

* add test

* fixup

* update test

* last nit fix typo

* add `is_pt_tf_cross_test` marker for the tests
2023-03-07 17:32:08 +01:00
Matt
d128f2ffab
Stop requiring Torch for our TF examples! (#21997)
* Stop requiring Torch for our TF examples!

* Slight tweak to logging in the example itself
2023-03-07 15:54:10 +00:00
Sanchit Gandhi
7c39318136
[Whisper] Add model for audio classification (#21754)
* [Whisper] Add model for audio classification

* make fix-copies

* add to docs

* add docstring

* empty returns

* add code example

* switch to fleurs

* stick everything on one line
2023-03-07 16:20:21 +01:00
Yih-Dar
9402788b34
Skip test_multi_gpu_data_parallel_forward for some model tests (#21991)
skip test_multi_gpu_data_parallel_forward for some model tests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-07 14:23:36 +01:00
Yih-Dar
99c5c6079d
Update notification_service.py (#21992)
* better check

* better check

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-07 14:20:39 +01:00
regisss
10bcbcae30
Remove unneeded casts to bool (#21983)
Remove cast to Bool
2023-03-07 07:35:49 -05:00
NielsRogge
95408e9953
[DETR, YOLOS] Fix device bug (#21974)
* Fix integration test

* Add test

* Add test
2023-03-07 07:34:04 -05:00
Elad Segal
eec46b4f75
Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens (#21959)
* Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens

* fix docs

* Empty commit

* formatting
2023-03-07 11:59:22 +00:00
amyeroberts
4063fd9cba
Add check before int casting for PIL conversion (#21969)
* Add check before int casting for PIL conversion

* Line length

* Tidier logic
2023-03-07 11:14:09 +00:00
Yih-Dar
5b28b78332
Update Jukebox tests (#21984)
* update expected values for jukebox

* update expected values for jukebox

* update expected values for jukebox

* update expected values for jukebox

* update expected values for jukebox

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-07 04:20:14 +01:00
PD Hall
31e3c6c393
docs: improve clarity for language modeling (#21952)
* docs: improve clarity for clm/mlm

* docs: remove incorrect explanation

* docs: remove incorrect explanation

---------

Co-authored-by: pdhall99 <pdhall99>
2023-03-06 13:13:43 -05:00
Karim Foda
0ce5236dd1
Fix gradient checkpointing bug in ESM (#21980)
2023-03-06 17:44:53 +00:00
Karim Foda
de496ef08b
Fix gradient checkpointing bug in Codegen (#21979)
2023-03-06 17:44:31 +00:00
Karim Foda
4a545d18e2
Fix gradient checkpointing bug in BlipText (#21978)
Make Format
2023-03-06 17:43:52 +00:00
Karim Foda
451263b841
Fix gradient checkpointing bug in Blenderbot Small (#21977)
2023-03-06 17:43:25 +00:00
Karim Foda
4f84dedc03
Fix gradient checkpointing bug in BigBird Pegasus (#21976)
2023-03-06 17:42:52 +00:00
Yih-Dar
f2a2616b74
Update expected values for test_xglm_sample (#21975)
update expected values for xglm

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-06 18:07:31 +01:00
Matt
5d8efc79db
Add TF contrastive image text finetuning example (#21939)
* Initial commit

* stash commit

* Add model checkpointing and pushing

* Fix model name inference

* Update README

* Update README

* Remove a couple of Torch references

* Update copyright date

* make fixup

* Update PushToHubCallback args!

* Remove the torch summary

* Add strategy.scope
2023-03-06 16:57:40 +00:00
Yih-Dar
9474abdf47
Use larger atol in torch.allclose for some tests (#21966)
Use larger atol

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-06 17:41:00 +01:00
Aayush Neupane
64d95c44ec
Add missing parameter definition in layoutlm config (#21960)
Four parameters in the `LayoutLM` config were missing definitions; added them (copied from `BertConfig`).
2023-03-06 15:20:11 +00:00
Srimanth Agastyaraju
f3c75f8b44
[Generate] Fix gradient_checkpointing and use_cache bug for BLOOM (#21956)
Step 1 - Change use_cache fix
2023-03-06 14:56:40 +00:00
saswatmeher
934d0b8bdd
Fix bert issue (#21963)
Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>
2023-03-06 14:55:31 +00:00
aws-sangeetha
0bb17295f0
Disable DDP for neuron (#21953)
Disable DDP for neuron

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-42-72.us-west-2.compute.internal>
2023-03-06 09:33:44 -05:00
Arthur
bc33fbf956
[CI] Fix ci (#21940)
* fix `get_proposal_pos_embed`

* fix order

* style

* zero shot simplify test

* add approximate values for zero shot audio classification
2023-03-06 15:22:27 +01:00
Yih-Dar
fcf813417a
Update expected values in XLMProphetNetModelIntegrationTest (#21957)
update values

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-06 09:15:44 +01:00
Batese2001
699a2293cc
Fixed gradient_checkpointing/use_cache bug in blenderbot (#21833)
* Fixed gradient_checkpointing/use_cache bug in blenderbot

* Update modeling_blenderbot.py

* Added back if statement

* Formatted using black
2023-03-04 15:45:53 +00:00
Karim Foda
6feb39b43c
Fix gradient checkpointing bug in Roformer (#21946)
2023-03-04 15:44:33 +00:00
Karim Foda
6386eb9721
Fix gradient checkpointing bug in Rembert (#21945)
2023-03-04 15:44:06 +00:00
Karim Foda
f12c74f51e
Fix gradient checkpointing bug in Pegasus (#21944)
2023-03-04 15:43:32 +00:00
Karim Foda
f932ee61b9
Fix gradient checkpointing bug in OPT (#21943)
2023-03-04 15:42:57 +00:00
bofeng huang
003a7cc608
[Whisper] Fix feature normalization in WhisperFeatureExtractor (#21938)
Fix feature normalization in WhisperFeatureExtractor
2023-03-03 14:21:13 -05:00
Arthur
718e9d777f
[CLAP] Support batched inputs for CLAP. Fixes pipeline issues (#21931)
* fix pipeline

* fix feature_extraction clap

* you can now batch the `is_longer` attribute

* add tests

* fixup

* add expected scores

* comment on is_longer
2023-03-03 18:42:18 +01:00
Victor Muštar
c5fe06c59d
Update README logo (#21933)
2023-03-03 11:57:39 -05:00
Arthur
82aac00e0f
[Flan-UL2] Add-flan-ul2 (#21929)
* add doc and readme

* add model docs

* update toctree and fix copies

* update

* update doc file

* fix

* add FLAN-UL2 to configuration mapping

* fixup

* Apply suggestions from code review

* more clarification

---------

Co-authored-by: younesbelakda <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-03-03 17:57:24 +01:00
substanc3
956ae62139
Fix wrong documentation about DataCollator padding defaults (#21919)
* Fix wrong documentation about DataCollator padding defaults

* Fix styling
2023-03-03 11:51:54 -05:00
Yih-Dar
8c40ba73d8
Avoid failure in check_repo.py due to missing backends (#21930)
* Update utils/check_repo.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update utils/check_repo.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-03 15:34:20 +01:00
Yih-Dar
d4306daea1
Fix AlignModelTest tests (#21923)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-03 14:47:09 +01:00
Zach Nussbaum
c5a1ff9ef0
feat: filter try/except when looking at custom code (#21914)
* feat: filter try/except

* Update src/transformers/dynamic_module_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-03 08:43:59 -05:00
Yih-Dar
02a77fa04c
Cleanup more auto mapping names (#21909)
* fix auto 2

* fix auto 2

* fix task guide issue

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-03 14:43:44 +01:00
Yih-Dar
b05e0bec88
Use large VM for repo_utils_job (#21928)
upgrade to large VM

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-03 14:43:03 +01:00
Yih-Dar
fa9d2ad7ec
Update model_split_percents for WhisperModelTest (#21922)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-03 14:35:08 +01:00
Karim Foda
c82bd37169
Fix gradient checkpointing megatron bert (#21921)
2023-03-03 11:50:21 +00:00
Karim Foda
99a62347fb
Fix gradient checkpointing bug in mvp (#21920)
2023-03-03 11:49:49 +00:00
Karim Foda
e407b5a323
Fix gradient checkpointing bug in MBart (#21918)
2023-03-03 11:49:27 +00:00
Arthur
dcec3277cd
faster forward following what is done for images (#21906)
* faster forward following what is done for images

* add missing licence
2023-03-03 06:18:18 +01:00
Matt
37e0974afc
Fix doctests for TFVisionTextDualEncoder (#21910)
2023-03-03 00:18:11 +00:00
Yih-Dar
9f5bfe1b99
Avoid modeling tests run in pipeline CI jobs (#21911)
* rework is_pipeline_test

* bring back 3 tests

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-02 21:23:06 +01:00
Kashif Rasul
db979f7588
[time series] Add Time series inputs tests (#21846)
* initial test of inputs

* added test for generation

* remove asserts

* fixed test

* Update tests/models/time_series_transformer/test_modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2023-03-02 20:43:35 +01:00