Commit Graph

10886 Commits

Author SHA1 Message Date
Lysandre
10100979ed Dev version 2022-10-10 17:25:40 -04:00
Partho
df2f28120d
wrap forward passes with torch.no_grad() (#19412) 2022-10-10 15:04:10 -04:00
Partho
5f5e264a12
wrap forward passes with torch.no_grad() (#19413) 2022-10-10 15:03:46 -04:00
Partho
c6a928cadb
wrap forward passes with torch.no_grad() (#19414) 2022-10-10 15:03:24 -04:00
Partho
d739a707d9
wrap forward passes with torch.no_grad() (#19416) 2022-10-10 15:03:09 -04:00
Partho
870a9542be
wrap forward passes with torch.no_grad() (#19438) 2022-10-10 14:54:54 -04:00
Partho
692c5be74e
wrap forward passes with torch.no_grad() (#19439) 2022-10-10 14:54:36 -04:00
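The six commits above apply the same pattern across model tests: running inference under torch.no_grad() so no autograd graph is built. A minimal sketch of the pattern (model and inputs are illustrative, not the actual test code):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("Hello world", return_tensors="pt")

# Pure forward pass: skip gradient tracking to save memory and time.
with torch.no_grad():
    outputs = model(**inputs)
```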
Yih-Dar
a7bc4221c0
fix (#19469)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-10 14:35:23 -04:00
Mikail Duzenli
25cfd911d0
Fixed a non-working hyperlink in the README.md file (#19434)
* Fixed a non-working hyperlink in the README.md file

The hyperlink to the community notebooks was outdated.

* Fixing missing double slash in hyperlink
2022-10-10 12:57:28 -04:00
Bartosz Szmelczynski
9df953a855
Fix misspelled word in docstring (#19415) 2022-10-10 17:33:57 +01:00
Shivang Mishra
d866b4858a
Generate: corrected exponential_decay_length_penalty type hint (#19376) 2022-10-10 17:32:03 +01:00
amyeroberts
4dd784c32f
Fix momentum and epsilon values (#19454)
The momentum values for PyTorch and TensorFlow batch normalization layers are not equivalent. The TensorFlow value should be (1 - pytorch_momentum) in order to ensure the correct updates are applied to the running mean and running variance calculations. We wouldn't observe a difference when loading a pretrained model and performing inference, but evaluation outputs would change after some training steps.
2022-10-10 15:17:41 +01:00
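To make the relationship concrete: PyTorch updates running statistics as running = (1 - momentum) * running + momentum * batch_stat, while Keras uses moving = momentum * moving + (1 - momentum) * batch_stat, so the two momenta must be complements. A minimal sketch (values are illustrative):

```python
import tensorflow as tf

pytorch_momentum = 0.1  # PyTorch nn.BatchNorm2d default
tf_layer = tf.keras.layers.BatchNormalization(
    momentum=1.0 - pytorch_momentum,  # 0.9 gives the equivalent running-stats update
    epsilon=1e-5,                     # match PyTorch's default eps
)
```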
Stefano Bosisio
b0b962ccca
Add Italian translation for add_new_model.mdx (#18713)
* fix conflicts

* start translating

* proofread

* add toc

* fix errors and typos
2022-10-10 10:12:40 -04:00
Kaiyu Yang
e150c4e2fe
Fix the error message in run_t5_mlm_flax.py (#19282) 2022-10-10 14:51:11 +01:00
amyeroberts
e3f028f3af
Add TF whisper (#19378)
* simplify loop

* add feature extractor

* add model

* start conversion

* add dropout

* initial commit of test files

* conversion for all models

* update processor for correct padding

* update feature extraction

* update integration test logits match

* fmt: off for the logits

* on the fly mel bank

* small nit

* update test

* update tokenizer

* nit feature extraction

* update

* update tokenizer test

* add logit processor and update tokenizer to get suppress tokens

* style

* clean convert

* revert to original modeling tf utils

* Update

* update

* nit

* clean convert file

* update tests and nits

* quality

* slow generation test

* ffn_dim to allow customization

* update readme

* add to toctreee

* start fixing integration tests

* update tests and code

* fix feature extractor

* fix config tests common

* update code to fix tests

* fix feature exctractor

* nit feature extraction

* update test for new feature extractor

* style

* add abstract

* large logits with custom decoder input ids

* wrap around is_torch_available

* fix feature extractor

* correct logits for whisper small.en

* nit

* fix encoder_attention_mask

* some fixes

* remove unnecessary inputs

* nits

* add normalizer file

* update test tokenization

* fix attention mask not defined

* fix generate

* remove useless encoder attention mask

* update test modeling whisper

* update config to add second non-suppress tokens

* nits on feature extractor

* nit for test tokenizers

* update tests

* update tests

* update tokenization test

* fixup

* invalidated hf token. Clean convert openai to whisper

* fix logit tests

* fixup

* Add model to README

* Fix doc tests

* clean merge

* revert toc_tree changes

* remove useless LogitProcessor

* Update whisper.mdx

* update config file doc

* update configuration docstring

* update test tokenization

* update test tokenization

* update tokenization whisper
Added copied from where needed

* update feature extraction

* nit test name

* style

* quality

* remove get suppress tokens and update non_speech tokens global variables

* Update src/transformers/models/whisper/feature_extraction_whisper.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* clean modeling whisper and test
Removed the attention mask arguments that are deprecated

* fix large test

* Add multilingual audio test, and translate test

* style

* fix large multilingual test

* nits

* add copied from for attention layer

* remove attention masks in doc

* add english normalizer

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update tokenization test

* remove copied from in whisper attention : no bias in k_proj only

* wrap around dependencies in english normalizer

* style

* correct import generation logits

* for now, wrap feature extractor with torch

* remove torch dependencies for feature extraction and style

* Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixup

* nit

* update logits

* style

* nit

* nits and fix final tests

* add `is_more_itertools_available` to utils

* quality

* add begin suppress tokens, suppress tokens to generate args and config

* clean SuppressTokensLogitsProcessor in generation logits

* Nit naming

* add suppressTokensAtBegin

* update tests, suppress tokens to None or correct values

* nit and style

* update RAG to fit test and generate_logit

* add copy-pasted statement on english normalizer

* add arguments to config_common_kwargs

* Update src/transformers/generation_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/generation_logits_process.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* revert changes based on reviews

* update doc and nits

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* more nits

* last nits

* update test configuration common

* add BART name in decoder attention mask documentation

* Update src/transformers/models/whisper/modeling_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* style

* nit

* nit

* add english.json file to git

* nits on documentation

* nit

* nits

* last styling

* add main toctree file

* remove sentence piece dependency

* clean init file

* fix tokenizer that has no dependencies on sentencepiece

* update whisper init file, nit

* remove english.json file

* add get decoder prompt id

* All weights loading

* Remove hanging pdb

* Fixup and tidy up

* Use same copied from as PT model

* Remove whitespace changes

* Remove torch references

* Tie embeddings

* Remove logits processor input to generate

* Update logit values

* revert changes and add forced logit processor

* nit

* clean normalizer

* remove protected

* Add logit processors and update generation code & tests

* Some tidy up

* Update docstring

* update

* update based on review

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update to reflect changes on the PT model branch

* Tidy up

* Remove extra whitespace

* Fix test - make input ids small enough we can append

* Include upstream changes on main

* PR comments - add batch tests, remove comments & defaults

* Fix model output imports

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/whisper/test_modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update docstring example

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove changes to adjust_logits_during_generation function

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Tidy up imports that don't require TF

* Update tests - skip and no more skip

* Update tests/generation/test_generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Add training flags

* Add (skipped) XLA generation tests

* Add embedding correctness test

* Add constant ids for generation tests

* Make logits finding a bit tidier

* Remove unused args

* xla generation enabled

* Don't skip XLA tests anymore

* Fix tests - add position ids to expected signature and update rag generation

* Undo method reorder

* Remove added whitespace

* Remove copy-paste gradient checkpoint ref

* Remove

* Trigger CI - (issue with refs when pulling)

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2022-10-10 14:48:17 +01:00
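For reference, a hedged usage sketch of the TF port added above (checkpoint name and dataset are illustrative, mirroring the usual Whisper doc pattern rather than this PR's exact tests):

```python
from datasets import load_dataset
from transformers import TFWhisperForConditionalGeneration, WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny.en")
model = TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
inputs = processor(ds[0]["audio"]["array"], sampling_rate=16000, return_tensors="tf")

# The encoder consumes log-mel features; generate() returns token ids to decode.
generated_ids = model.generate(inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```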
APAVOU Clément
af69360bf9
Add OPTForQuestionAnswering (#19402)
* Add `OPTForQuestionAnswering`

- added `OPTForQuestionAnswering` class based on `BloomForQuestionAnswering`
- added `OPTForQuestionAnswering` in common tests
- all common tests pass
- make fixup done

* added docstrings for OPTForQuestionAnswering

* Fix docstrings for OPTForQuestionAnswering
2022-10-10 09:30:59 -04:00
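A hedged sketch of the new head in use (checkpoint is illustrative; like BloomForQuestionAnswering, it returns start/end logits for extractive QA):

```python
import torch
from transformers import AutoTokenizer, OPTForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = OPTForQuestionAnswering.from_pretrained("facebook/opt-350m")

inputs = tokenizer("Who wrote it?", "It was written by Ada.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Highest-scoring start/end positions delimit the predicted answer span.
start = outputs.start_logits.argmax(-1).item()
end = outputs.end_logits.argmax(-1).item()
print(tokenizer.decode(inputs.input_ids[0, start : end + 1]))
```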
Aritra Roy Gosthipaty
ba71bf4cae
fix: renamed variable (#18850)
The sequence_masked variable is actually the part of the sequence that is kept unmasked for the encoder. This commit renames the variable.
2022-10-10 09:26:36 -04:00
Ryan Chan
4824741c4c
Remove dependency of Roberta in Blenderbot (#19411)
* Remove dependency of Roberta in Blenderbot

* Move Copied from statements to each method of the Roberta classes

* Remove copied from line for mask_token.setter

* update output from example in docs
2022-10-10 09:25:22 -04:00
Mohit Sharma
3080bb4754
Add onnx support for VisionEncoderDecoder (#19254)
* Add onnx support for VisionEncoderDecoder

* Add onnx support for VisionEncoderDecoder

* Removed unused import

* Rename encoder hidden state

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update docstrings and removed redundant code

* Added test function for enc-dec models

* Update doc string text

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* fixed code style

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-10 09:20:19 -04:00
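A sketch of driving the new export from the command line (a hedged guess at usage: the "vision2seq-lm" feature name and the checkpoint are assumptions, not taken from the PR diff):

```python
import subprocess

# The transformers.onnx CLI picks the ONNX config registered for the model
# type and writes the exported graph(s) into the output directory.
subprocess.run(
    [
        "python", "-m", "transformers.onnx",
        "--model=nlpconnect/vit-gpt2-image-captioning",
        "--feature=vision2seq-lm",
        "onnx_out/",
    ],
    check=True,
)
```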
Lysandre Debut
298f6a98c2
Stop relying on huggingface_hub's private methods (#19392)
* Leverage hfh for move cache

* Style
2022-10-10 15:19:33 +02:00
wei zhao
7d5ce6802e
Fix typo in image-classification/README.md (#19424)
Fixes link typos in the following entries:
PyTorch version, Trainer
PyTorch version, no Trainer
2022-10-10 09:16:58 -04:00
Rak Alexey
c523a86929
fix MarianMT conversion to ONNX (#19287)
* fix MarianMT conversion to ONNX

* Update src/transformers/onnx/convert.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update src/transformers/onnx/convert.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-10 09:11:29 -04:00
Darío Hereñú
3410705730
Fixed duplicated line (paragraph #83) Documentation: @sgugger (#19436)
* Fixed duplicated line (paragraph #83) @omarespejel @sgugger

* Fixed Datasets `map` naming (paragraph 42)
2022-10-10 09:08:34 -04:00
Darío Hereñú
83dc49b69b
Backtick fixed (paragraph 68) (#19440) 2022-10-10 08:47:14 -04:00
Druhin Abrol
1241a4993b
remove RobertaConfig inheritance from MarkupLMConfig (#19404)
* remove RobertaConfig inheritance from MarkupLMConfig

* Update src/transformers/models/markuplm/configuration_markuplm.py

fixed typo in docstring

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-10 08:44:59 -04:00
Matt
4107445a0f
Fix repo names for ESM tests (#19451) 2022-10-10 13:20:00 +01:00
Yih-Dar
cbb8a37929
Skip BloomEmbeddingTest.test_embeddings for PyTorch < 1.10 (#19261)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-10 10:05:30 +02:00
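The skip itself is a standard version gate; a generic sketch of the pattern (condition and message are illustrative, not the repo's exact code):

```python
import unittest

import torch
from packaging import version


class BloomEmbeddingTest(unittest.TestCase):
    # Skip on older PyTorch rather than fail on behaviour it doesn't have.
    @unittest.skipUnless(
        version.parse(torch.__version__) >= version.parse("1.10"),
        "test requires PyTorch >= 1.10",
    )
    def test_embeddings(self):
        ...
```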
Yih-Dar
8b6bba54a7
Fix ViTMSNForImageClassification doctest (#19275)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-10 09:51:30 +02:00
Sylvain Gugger
d92e22d1f2
Remove ref to is_pipeline_test 2022-10-07 21:38:07 -04:00
Sylvain Gugger
9ac586b3c8
Rework pipeline tests (#19366)
* Rework pipeline tests

* Try to fix Flax tests

* Try to put it before

* Use a new decorator instead

* Remove ignore marker since it doesn't work

* Filter pipeline tests

* Woopsie

* Use the fitlered list

* Clean up and fake modif

* Remove init

* Revert fake modif
2022-10-07 18:01:58 -04:00
Alara Dirik
983451a13e
Improve and fix ImageSegmentationPipeline (#19367)
- Fixes the image segmentation pipeline test failures caused by changes to the postprocessing methods of supported models
- Updates the ImageSegmentationPipeline tests
- Improves docs, adds 'task' argument to optionally perform semantic, instance or panoptic segmentation
2022-10-07 23:34:41 +03:00
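A hedged usage sketch (checkpoint is illustrative; the new argument for selecting semantic/instance/panoptic postprocessing is named per the commit message above and may differ in the final API):

```python
from transformers import pipeline

segmenter = pipeline("image-segmentation", model="facebook/detr-resnet-50-panoptic")
results = segmenter("http://images.cocodataset.org/val2017/000000039769.jpg")
for r in results:
    # Each prediction carries a label, a confidence score, and a PIL mask.
    print(r["label"], round(r["score"], 3))
```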
Vishwas
de4d71ea07
Removed Bert dependency from BertGeneration code base. (#19370)
* Copied all the code required from transformers.models.bert.modeling_bert to here

* Fixed styling issues

* Reformatted copied names with model-specific names.

* Reverted BertEncoder part as there is already a class called BertGenerationEncoder

* Added prefixes in missing places.

Co-authored-by: vishwaspai <vishwas.pai@emplay.net>
2022-10-07 13:45:24 -04:00
mustapha ajeghrir
34e0cc6d86
Make Camembert TF version independent from Roberta (#19364)
* camembert tf version independent

* fixup

* fixup, all working

* remove comments

* Adding copied from roberta

Co-authored-by: Mustapha AJEGHRIR <mustapha.ajeghrir@kleegroup.com>
2022-10-07 13:42:24 -04:00
Blip blop
7418a48e34
Removed Bert interdependency in tokenization_electra.py (#19356)
* Copied from BertTokenizer() in tokenization_bert

* Added BasicTokenizer and WordPieceTokenizer classes

* Update src/transformers/models/electra/tokenization_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Added copied-from comments for BasicTokenizer and WordPieceTokenizer

* Updated the comments for the tokenizer classes

* Update src/transformers/models/electra/tokenization_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/electra/tokenization_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Formatted tokenization_electra with `make style`

* Fix repo inconsistencies

* Update src/transformers/models/electra/tokenization_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Set the logger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-07 12:24:04 -04:00
Infrared1029
6ef16f2b67
Remove Dependency between Bart and LED (slow/fast) (#19408)
* removed dependency from bart(slow)

* removed dependency from bart(slow)

* adding copying comments (copied from bart to led)

* updated led docstring

* updated led docstring

* removed dependency from Bart (fast)

* replaced bart with LED in docstrings

* comply with flake8

* added more copy comments

* fixing copying comments

* added comments back

* fix copy comments

* fixing copied from comments

* fixing copied from comments
2022-10-07 12:19:50 -04:00
Patrick von Platen
06514b3e1a
Clip device map (#19409)
* add first generation tutorial

* uP

* [Clip] Add text model to device map
2022-10-07 18:19:15 +02:00
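The device-map fix matters for big-model loading; a hedged sketch of what it enables (requires accelerate; checkpoint is illustrative):

```python
from transformers import CLIPModel

# With the text model registered in CLIP's device map, "auto" placement can
# shard the checkpoint across available devices without erroring.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32", device_map="auto")
```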
harry7337
c2b83d540e
Removed Bert and XML Dependency from Herbert (#19410)
Co-authored-by: harry7337 <hari.8jan@gmail.com>
2022-10-07 11:49:09 -04:00
Ryan Chan
e6fc2016ad
Remove dependency of Bert from Squeezebert tokenizer (#19403)
* Remove dependency of Bert from Squeezebert tokenizer

* run style corrections

* update copies from BertTokenizers

* Update changes and style to Squeezebert files

* update copies for bert-fast
2022-10-07 11:32:55 -04:00
Arthur
994b7a4eea
update attention mask handling (#19385)
* update feature extractor params

* update attention mask handling
2022-10-07 16:54:08 +02:00
Dean Wyatte
a26d71d6ae
Export TensorFlow models to ONNX with dynamic input shapes (#19255)
* validate onnx models with a different input geometry than saved with

* only test working features for now

* simpler test skipping

* rm TODO

* expose batch_size/seq_length on vit

* skip certain name, feature, framework parameterizations known to fail validation

* Trigger CI

* Trigger CI
2022-10-07 10:53:03 -04:00
David Yang
5fef17f490
Copy BertTokenizer dependency into retribert tokenizer (#19371) 2022-10-07 10:14:00 -04:00
ddobokki
fa4bcd5274
edit: cast attention_mask to long in DataCollatorCTCWithPadding (#19369)
* edit: casting attention_mask to long in DataCollatorCTCWithPadding

* edit: casting attention_mask to long in DataCollatorCTCWithPadding
2022-10-07 10:05:48 -04:00
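The gist of the change, as a hedged standalone sketch (not the collator verbatim): padding can yield a float attention mask, while CTC models expect an integer one, so the collator casts it.

```python
import torch

batch = {"attention_mask": torch.tensor([[1.0, 1.0, 0.0]])}  # float mask after padding
batch["attention_mask"] = batch["attention_mask"].to(torch.long)  # cast as in the fix
```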
Amrit Sahu
e9a49babee
[WIP] Add ZeroShotObjectDetectionPipeline (#18445) (#18930)
* Add ZeroShotObjectDetectionPipeline (#18445)

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add batching for ZeroShotObjectDetectionPipeline

* Fix doc-string ZeroShotObjectDetectionPipeline

* Fix output format: ZeroShotObjectDetectionPipeline
2022-10-07 10:00:19 -04:00
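A hedged usage sketch for the new pipeline (checkpoint and labels are illustrative; OWL-ViT is the backing model family per the PR):

```python
from transformers import pipeline

detector = pipeline("zero-shot-object-detection", model="google/owlvit-base-patch32")
predictions = detector(
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    candidate_labels=["cat", "remote control"],
)
for p in predictions:
    # Each hit has a free-text label, a confidence score, and a bounding box.
    print(p["label"], round(p["score"], 3), p["box"])
```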
Omar Sanseviero
331ea019d7
Remove unneded words from audio-related feature extractors (#19405) 2022-10-07 15:52:52 +02:00
Sourab Mangrulkar
56af8df359
HF <-> megatron checkpoint reshaping and conversion for GPT (#19317)
* HF <-> megatron checkpoint conversion handling reshaping from different tensor and parallel sizes

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* add docstrings and 🐛 fixes

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-07 19:16:55 +05:30
Thomas
41ec5d0ced
Added type hints for TF: TransfoXL (#19380)
* Added type hints for TF: TransfoXL
* Added type hints for TF: TransfoXL

* Change type hints for training

* Change type hints for training
2022-10-07 14:44:58 +01:00
h
b29ebdf4d8
removes prophet config dependencies from xlm-prophet (#19400) 2022-10-07 09:26:23 -04:00
Bibhabasu Mohapatra
e162cebfa3
add ONNX support for swin transformer (#19390)
* swin transformer onnx support

* Updated image dimensions to be dynamic

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-07 09:23:24 -04:00
IMvision12
969534af4b
Added Type hints for XLM TF (#19333)
* Update modeling_tf_xlm.py

* Updates

* Update src/transformers/models/xlm/modeling_tf_xlm.py

* Update src/transformers/models/xlm/modeling_tf_xlm.py

* Update src/transformers/models/xlm/modeling_tf_xlm.py

* Update src/transformers/models/xlm/modeling_tf_xlm.py

* Update src/transformers/models/xlm/modeling_tf_xlm.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-10-07 13:44:50 +01:00
Zachary Mueller
46fd04b481
Fix gather for metrics (#19389) 2022-10-07 08:36:05 -04:00