2488 Commits

Author SHA1 Message Date
Joao Gante
a81fe4e1df
Generate: input expansion for any model input (#21624) 2023-02-14 14:16:22 +00:00
Joao Gante
13e03e619d
Generate: filter encoder inputs when its signature does not accept wildcards (#21603) 2023-02-14 10:46:46 +00:00
Joao Gante
56b03c96b8
Fix TF CTC tests (#21606) 2023-02-13 21:23:00 +00:00
Yih-Dar
cbecf121cd
Fix env. variable type issue in testing (#21609)
* fix env issue

* fix env issue

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-13 20:53:26 +01:00
Joao Gante
fa4bdb0a40
Generate: correct default model input creation for decoder-only models (#21580) 2023-02-13 17:04:49 +00:00
Yih-Dar
edc1e734bf
Fix Blip-2 CI (#21595)
* use fp16

* use fp16

* use fp16

* use fp16

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-13 16:44:27 +01:00
Younes Belkada
1666c42f0b
[bnb] Let's make the daily CI green 🍏 (#21597)
* fix bnb slow test

* make fixup
2023-02-13 16:18:50 +01:00
Joao Gante
24273268b7
Generate: Fix flaky indexing error in test_constrained_beam_search_generate_dict_output (#21561) 2023-02-13 15:12:07 +00:00
Joao Gante
4be75e9728
CI: skip failing TF hubert test (#21601)
skip test
2023-02-13 09:34:23 -05:00
Joao Gante
eb6c59bc78
Generate: TF supports multiple eos tokens (#21571) 2023-02-13 12:24:22 +00:00
amyeroberts
cb56590111
Replace input_values_processing with unpack_inputs (#21502)
* Replace input_values_processing with unpack_inputs

* Skip test failing with OOM

* Update tests
2023-02-10 18:19:39 +00:00
Stas Bekman
2f5507580b
[from_pretrained] extend torch_dtype="auto" to look up config.torch_dtype first, expand docs (#21524)
* [from_pretrained] expand on torch_dtype entry

* fold 4 into 1

* style

* support torch_dtype='config' plus tests

* style

* oops

* fold config into auto, fix bug

* fix check

* better log

* better log

* clean up
2023-02-10 09:09:21 -08:00
Shubhamai
9e40bba6ba
[Tests] Improve flax test_attention_outputs (#21486)
improving flax tests
2023-02-10 11:31:49 -05:00
Patrick von Platen
b20147a3c8
[Variant] Make sure variant files are not incorrectly deleted (#21562)
* [Variant] Make sure variant files are not incorrectly deleted

* Apply suggestions from code review

* fix
2023-02-10 15:44:51 +01:00
Jannis Vamvas
b0d539ccad
Add X-MOD (#20939)
* Add X-MOD to Readme

* Add documentation for X-MOD

* Implement X-MOD

* Fix formatting of X-MOD docs

* Change signature of X-MOD forward methods to use lang_ids

* Minor changes

* Rebase with main and run make fix-copies

* Make suggested changes to docstrings

* Improve code readability

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Fix code style

* Conversion script: Remove asserts and type annotations

* Remove _TOKENIZER_FOR_DOC

* XMOD -> Xmod

* Update copyright note

* Fix doctests

* Fix docstring

* Add integration test for FillMaskPipeline

* Revert "Add integration test for FillMaskPipeline"

This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f.

* Add end-to-end integration test for mask fill

* make style

* Rebase with main and make fix-copies

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-02-10 15:32:06 +01:00
Quentin Meeus
5b72b3412b
Remove CLI spams with Whisper FeatureExtractor (#21267)
* Remove CLI spams with Whisper FeatureExtractor

The Whisper feature extractor representation includes the MEL filters, a list of lists that is represented as ~16,000 lines. This needlessly spams the command line. I added a `__repr__` method that replaces this list with the string "<array of shape (80, 201)>".

* Remove mel_filters from to_dict output

Credits to @ArthurZucker

* remove unused import

* update feature extraction tests for the changes in to_dict
2023-02-10 09:15:16 -05:00
Katie Le
21a2d900ec
Added with torch.no_grad() to Camembert integration test (#21544)
add with torch.no_grad() to Camembert integration test

Co-authored-by: Bibi <Bibi@katies-mac.local>
2023-02-10 10:58:29 +01:00
Younes Belkada
f83942684d
[pipeline] A simple fix for half-precision & 8bit models (#21479)
* v1 fix

* adapt from suggestions

* make style

* fix tests

* add gpu tests

* update docs

* fix other tests

* Apply suggestions from code review

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

* better fix

* make fixup

* better example

* revert changes

* proposal

* more elegant solution

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-10 10:26:17 +01:00
Sylvain Gugger
97d3390fc8
Skip failing test for now 2023-02-09 20:11:26 -05:00
Katie Le
23c146c38b
Added with torch.no_grad() to XLM-Roberta integration test (#21547)
* added with torch.no_grad() to the integration tests and applied make style

* added with torch.no_grad() to xlm roberta forward pass

---------

Co-authored-by: Bibi <Bibi@katies-mac.local>
2023-02-09 21:49:54 +01:00
Sylvain Gugger
04b2f13c37
🚨🚨🚨 Enforce single model initialization (#21431)
* Enforce single model initialization

* Add OneFormer example for problem 3

* Do it the Stas way

* Actually rename the uses...

* Rewrite test

* Try to change the test this way

* Fix all init slow/fast tests

* Break connection

* Fix more tests

* Fix test for initialization

* Remove custom test

* Quality

* Fix last failing tests

* The end?
2023-02-09 15:46:26 -05:00
Sylvain Gugger
2020ac4bd6
Fix from_pretrained API with config and state_dict (#21542) 2023-02-09 15:44:02 -05:00
NielsRogge
d7f1e7c009
Add BLIP-2 (#21441)
* First draft

* More improvements

* More improvements

* Improve conversion script

* Convert all weights

* Make forward pass work

* Make logits match

* More improvements

* More improvements

* More improvements

* Use get_input_embeddings

* Improve some more

* Improve model tests

* Improve model tests

* More improvements

* Fix processor

* Update files

* Update prepare_inputs_for_generation

* More improvements

* Fix copies

* More fixes

* Make fixup

* More improvements

* Add support for seq2seq language model

* More improvements

* Fix test

* More improvements

* Improve conversion script

* Remove some todo's

* Fix README's

* Improve conversion script

* Fix generation

* Fix style and remove Blip2Model

* Fix model outputs

* More improvements

* Set eos_token_id in config

* Fix quality

* Small improvements

* Add processor tests

* More improvements

* Apply suggestions

* Apply suggestions

* Add integration test

* Update image URL

* Add integration test

* Fix model_type

* Update style

* Improve docs

* Add doc tests

* Fix copies

* Remove tests which are passing

* Improve some more

* Add tests for seq2seq language models

* Minor fix

* Convert more checkpoints

* finalize CI

* Fix blip and blip2 processors

* add `accelerate` support for `blip2`

* clean up

* make style

* Update conversion script

* Update conversion script some more

* Update organization

* revert toc file

* add blip-2 to toc file

* Some more improvements

* Fix docstring

* Improve docs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-02-09 16:52:11 +01:00
Joao Gante
0d33381fad
Tag tests as slow (#21537)
begone slow tests
2023-02-09 14:46:15 +00:00
Joao Gante
2edf9a857b
Generate: TF .generate() can now be exported with dynamic length (#21474) 2023-02-09 12:52:30 +00:00
Joao Gante
e69f9715eb
Generate: make TF .generate() signature == PT .generate() signature (#21525) 2023-02-09 11:10:13 +00:00
Motoki Wu
9960506cbe
Fix multiple eos_token_ids in model.generate(...) (#21461)
* add tests with multiple eos_token_ids

* make math.prod instead of sum

* make fixup

* fix long and also use np.prod since math.prod does not exist in Python < 3.8

* make fixup

* add prod util

* use prod util instead of np.prod

* make fixup

* previous .long location

* use tensor ops

* remove prod

* remove prod

* update device

* make fixup

* fix none
2023-02-08 13:48:46 -05:00
Stas Bekman
8ea994d3c5
[tests] add missing report_to none (#21505)
[tests] report_to none
2023-02-08 09:32:40 -08:00
Joao Gante
1d9c26a4b8
Generate: TF compute_transition_scores (#21341) 2023-02-08 16:36:43 +00:00
Guillaume Klein
ca905ba28e
Exclude the madeup words from M2M100Tokenizer.vocab_size (#20976) 2023-02-08 09:19:06 -05:00
Katie Le
cc1d0685b3
Wrap RemBert integration test forward passes with torch.no_grad() (#21503)
added with torch.no_grad() to the integration tests and applied make style

Co-authored-by: Bibi <Bibi@katies-mac.local>
2023-02-08 14:00:52 +01:00
Adrian Sager La Ganga
a3034c7004
Add inverse sqrt learning rate scheduler (#21495)
* added inverse sqrt lr scheduler

* Updated get_scheduler in src/transformers/optimization.py

* Updated src/transformers/__init__.py

* Added inverse sqrt lr scheduler test

* Updated docs/source/en/main_classes/optimizer_schedules.mdx

* Ran style and quality scripts

* Fix get_inverse_sqrt_schedule docstring

* Comment implementation URL
2023-02-07 15:00:50 -05:00
Stas Bekman
b9af152efb
[tokenizer] sanitize saved config (#21483)
* [tokenizer] sanitize saved config

* rm config["name_or_path"] test
2023-02-07 10:51:45 -08:00
Sylvain Gugger
67d074874d
Cleanup quality (#21493)
* Remove mentions of flake8/isort

* Clean up inits

* Deal with all other inits

* Last special rule for dummy files
2023-02-07 12:27:31 -05:00
Arthur
9e7f84a556
[OPT] Adds GPT2TokenizerFast to the list of tokenizer to use for OPT. (#20823)
* Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)),

* skip failing test

* Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)),

* skip failing test
2023-02-07 17:35:28 +01:00
Joao Gante
1e4cf8bb44
Generate: TF can now generate from embeddings in encoder-decoder models (#21475) 2023-02-07 11:18:23 +00:00
Arthur
12eb528b5a
[CI] Remove past in favor of past_key_values (#21443)
* fix past renamed to past_key_value

* update more `past` that were skipped

* fixup

* remove changes made to rag

* refactor `_reorder_cache` to use `past_key_values`

* fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache
2023-02-07 09:51:35 +01:00
Sylvain Gugger
cc8407522a
Fix epoch number when resuming training (#21478) 2023-02-06 19:34:34 -05:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting (#21480)
* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies
2023-02-06 18:10:56 -05:00
Joao Gante
4943331015
Generate: TF can now accept custom logits processors (#21454) 2023-02-06 15:44:47 +00:00
Yih-Dar
0db5d911fc
Fix SpeechT5ForSpeechToSpeechIntegrationTests device issue (#21460)
* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-06 10:43:07 +01:00
Yih-Dar
59d5edef34
Avoid flaky generation sampling tests (#21445)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-03 22:01:25 +01:00
Matthijs Hollemans
e4bacf6614
[WIP] add SpeechT5 model (#18922)
* make SpeechT5 model by copying Wav2Vec2

* add paper to docs

* whoops added docs in wrong file

* remove SpeechT5Tokenizer + put CTC back in the name

* remove deprecated class

* remove unused docstring

* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead

* remove classes we don't need right now

* initial stab at speech encoder prenet

* add more speech encoder prenet stuff

* improve SpeechEncoderPrenet

* add encoder (not finished yet)

* add relative position bias to self-attention

* add encoder CTC layers

* fix formatting

* add decoder from BART, doesn't work yet

* make it work with generate loop

* wrap the encoder into a speech encoder class

* wrap the decoder in a text decoder class

* changed my mind

* changed my mind again ;-)

* load decoder weights, make it work

* add weights for text decoder postnet

* add SpeechT5ForCTC model that uses only the encoder

* clean up EncoderLayer and DecoderLayer

* implement _init_weights in SpeechT5PreTrainedModel

* cleanup config + Encoder and Decoder

* add head + cross attention masks

* improve doc comments

* fixup

* more cleanup

* more fixup

* TextDecoderPrenet works now, thanks Kendall

* add CTC loss

* add placeholders for other pre/postnets

* add type annotation

* fix freeze_feature_encoder

* set padding tokens to 0 in decoder attention mask

* encoder attention mask downsampling

* remove features_pen calculation

* disable the padding tokens thing again

* fixup

* more fixup

* code review fixes

* rename encoder/decoder wrapper classes

* allow checkpoints to be loaded into SpeechT5Model

* put encoder into wrapper for CTC model

* clean up conversion script

* add encoder for TTS model

* add speech decoder prenet

* add speech decoder post-net

* attempt to reconstruct the generation loop

* add speech generation loop

* clean up generate_speech

* small tweaks

* fix forward pass

* enable always dropout on speech decoder prenet

* sort declaration

* rename models

* fixup

* fix copies

* more fixup

* make consistency checker happy

* add Seq2SeqSpectrogramOutput class

* doc comments

* quick note about loss and labels

* add HiFi-GAN implementation (from Speech2Speech PR)

* rename file

* add vocoder to TTS model

* improve vocoder

* working on tokenizer

* more better tokenizer

* add CTC tokenizer

* fix decode and batch_decode in CTC tokenizer

* fix processor

* two processors and feature extractors

* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2

* cleanup

* more cleanup

* even more fixup

* notebooks

* fix log-mel spectrograms

* support reduction factor

* fixup

* shift spectrograms to right to create decoder inputs

* return correct labels

* add labels for stop token prediction

* fix doc comments

* fixup

* remove SpeechT5ForPreTraining

* more fixup

* update copyright headers

* add usage examples

* add SpeechT5ProcessorForCTC

* fixup

* push unofficial checkpoints to hub

* initial version of tokenizer unit tests

* add slow test

* fix failing tests

* tests for CTC tokenizer

* finish CTC tokenizer tests

* processor tests

* initial test for feature extractors

* tests for spectrogram feature extractor

* fixup

* more fixup

* add decorators

* require speech for tests

* modeling tests

* more tests for ASR model

* fix imports

* add fake tests for the other models

* fixup

* remove jupyter notebooks

* add missing SpeechT5Model tests

* add missing tests for SpeechT5ForCTC

* add missing tests for SpeechT5ForTextToSpeech

* sort tests by name

* fix Hi-Fi GAN tests

* fixup

* add speech-to-speech model

* refactor duplicate speech generation code

* add processor for SpeechToSpeech model

* add usage example

* add tests for speech-to-speech model

* fixup

* enable gradient checkpointing for SpeechT5FeatureEncoder

* code review

* push_to_hub now takes repo_id

* improve doc comments for HiFi-GAN config

* add missing test

* add integration tests

* make number of layers in speech decoder prenet configurable

* rename variable

* rename variables

* add auto classes for TTS and S2S

* REMOVE CTC!!!

* S2S processor does not support save/load_pretrained

* fixup

* these models are now in an auto mapping

* fix doc links

* rename HiFiGAN to HifiGan, remove separate config file

* REMOVE auto classes

* there can be only one

* fixup

* replace assert

* reformat

* feature extractor can process input and target at same time

* update checkpoint names

* fix commit hash
2023-02-03 12:43:46 -05:00
Yih-Dar
197e7ce911
Fix device issue in a ConvBertModelTest test (#21438)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-03 15:12:28 +01:00
Joao Gante
f21af26279
🚨🚨 Generate: standardize beam search behavior across frameworks (#21368) 2023-02-03 10:24:02 +00:00
Yih-Dar
a6d8a149a8
Fix some pipeline tests (#21401)
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-02 19:03:31 +01:00
Younes Belkada
8298e4ec02
[bnb] Fine-tuning HF 8-bit models (#21290)
* force `memory_efficient_backward=True`

* enhancements

- trainer support
- add new flag

* some changes

- internal changes in `Trainer`
- small refactor

* make quality

* Fixes

- add new testing util
- add new test
- change test in Trainer

* fix CI test

* educate users on how to ft 8bit models

* more checks

* fix `logger` error

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* adapt from review

* fix

* add comment

* use return instead

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-02 16:39:23 +01:00
Clémentine Fourrier
67a3920d85
Fix Graphormer test suite (#21419)
* [FIX] path for Graphormer checkpoint

* [FIX] Test suite for graphormer

* [FIX] Update graphormer default num_classes
2023-02-02 16:29:13 +01:00
Joel Lamy-Poirier
e006ab51ac
Add the GeLU activation from pytorch with the tanh approximation (#21345)
* gelu_python_tanh

* rename

* Version check, add test

* Pr comment
2023-02-02 09:33:04 -05:00
Joao Gante
92ce53aab8
Generate: decoder-only models can generate with inputs_embeds (#21405) 2023-02-01 21:50:38 +00:00