Commit Graph

180 Commits

Author SHA1 Message Date
Matt
5f3ea66bc0
Add TF port of BLIP (#22090)
* Initial commit

* more stash commit

* Yet another stash commit

* yet more stash commit

* Mostly working except for docs / repo consistency

* Stop importing model list from torch file

* Add TF BLIP models to docs

* Add auto classes

* Move get_text_features and get_image_features

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/blip/test_modeling_tf_blip_text.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use channels_last convolutions in TF (better performance + compatibility)

* Remove _shape function

* Move multi-line statement to one line in PT + TF

* Specify tf.keras.layers instead of importing from it

* Remove test_gradient_checkpointing and empty test_training methods

* move some multi-line statements to one line

* Update docstring for generate

* Remove pruned heads set

* Remove self.seq_len_dim

* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states

* ensure original model follows config in more cases

* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!

* Add training args throughout the models and layers

* make fixup

* Fix docstring for inputs_embeds

* Add docstring for is_decoder

* Add docstrings to text models

* Remove redundant computation

* Add unpack_inputs / keras_serializable

* Add modeling_tf_blip to doctests

* Add config classes for keras serialization

* Changes to allow model porting with pt-to-tf

* Quick fix to decoder head and test tweaks

* Revert an issue with masking the embeddings outputs

* Allow missing keys in some equivalence tests (for unused layers)

* Add tf-pt equivalence tests back in

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Refactor invert_attention_mask out into tf_utils

* Re-enable cross-tests on the PT side too

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-04 16:05:22 +01:00
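
A minimal captioning sketch with the TF classes this PR introduces; the checkpoint id mirrors the existing PyTorch BLIP docs, and the image URL is only an example:

```python
import requests
from PIL import Image
from transformers import BlipProcessor, TFBlipForConditionalGeneration

# Load the processor and the newly added TF captioning model
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = TFBlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image and generate a caption
inputs = processor(images=image, return_tensors="tf")
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```
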
Lucain
923110b74f
Remove set_access_token usage + fail tests if FutureWarning (#22051)
* Remove set_access_token usage + fail tests if FutureWarning

* do not fail on FutureWarning in CI

---------

Co-authored-by: testbot <lucainp@hf.co>
2023-03-09 09:23:48 -05:00
Arthur
2156662dea
[TF] Fix creating a PR while pushing in TF framework (#21968)
* add create pr arg

* style

* add test

* fixup

* update test

* last nit fix typo

* add `is_pt_tf_cross_test` marker for the tests
2023-03-07 17:32:08 +01:00
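
A short sketch of the new argument (the repo id is a placeholder, and pushing requires a Hub login):

```python
from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("bert-base-uncased")

# create_pr=True opens a pull request on the Hub repository instead of
# committing directly to its main branch.
model.push_to_hub("my-user/my-tf-model", create_pr=True)
```
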
Matt
acfb714bdf
Improve TF weight loading, especially PT crossloading (#21792)
* First commit for the improved PT-TF weight loading

* Remove workarounds from TFEncoderDecoder tests

* Allow a custom weight renaming function in from_pretrained and use that to clean up EncoderDecoder

* make fixup

* First attempt at visionencoderdecoder

* Disable tensorfloat32 in tests to get consistent outputs

* Quick fix to tf_vision_encoder_decoder tests

* make fixup

* Update Blenderbot tests

* Remove unused arg in modeling_tf_opt

* load_tf_sharded_weights had strict=True! This meant transfer learning was impossible, so I'm setting it to False.

* Support prefixes when loading sharded TF checkpoints

* make fixup

* Add test to load sharded models with a weight prefix

* Fix sharded weight loading test

* Add a test for transfer from a sharded checkpoint

* make fixup

* Add test to check that crossloading from PT with a prefix works

* Refactor from_pretrained in the encoderdecoder classes

* Refactor from_pretrained in the encoderdecoder classes

* missmatched -> mismatched

* Explicitly check for None

* No comments showing my very impressive and attractive knowledge of Py3.9+

* Disable TF32 across all TF tests
2023-02-28 18:41:34 +00:00
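
A minimal crossloading sketch; `from_pt=True` forces the PyTorch checkpoint to be loaded into the TF architecture, which is the code path this PR improves:

```python
from transformers import TFAutoModel

# Cross-load a PyTorch checkpoint into the TF model class.
model = TFAutoModel.from_pretrained("bert-base-uncased", from_pt=True)
```
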
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting (#21480)
* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies
2023-02-06 18:10:56 -05:00
Joao Gante
19d67bfecb
Generate: fix TF XLA tests on models with max_position_embeddings or max_target_positions (#21389) 2023-01-31 15:49:34 +00:00
Joao Gante
1eda4a4102
Generate: save generation config with the models' .save_pretrained() (#21264) 2023-01-23 16:21:44 +00:00
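
A sketch of the behaviour this enables, assuming a checkpoint with TF weights on the Hub:

```python
from transformers import GenerationConfig, TFAutoModelForCausalLM

model = TFAutoModelForCausalLM.from_pretrained("gpt2")
model.generation_config = GenerationConfig(max_new_tokens=32, do_sample=True, top_k=50)

# save_pretrained now also writes generation_config.json next to the weights,
# so the generation defaults travel with the checkpoint.
model.save_pretrained("./my-gpt2")
```
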
Yih-Dar
c8849583ad
Make test_save_pretrained_signatures slow test (#21105)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-18 10:43:05 +01:00
Joao Gante
a6c850e4f4
Generate: TF uses GenerationConfig as the basis for .generate() parametrization (#20994) 2023-01-04 18:23:20 +00:00
NielsRogge
11745b4e45
[Tests] Improve test_attention_outputs (#20701)
* Improve tests

* Improve TF tests

* Apply suggestion

* Fix test

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-14 14:41:40 +01:00
Arthur
84c9bf7421
cross platform from_pretrained (#20538)
* add support for `from_pt`

* add tf_flax utility file

* Update src/transformers/modeling_tf_flax_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove flax related modifications

* add test

* remove FLAX related commits

* fixup

* remove safetensor todos

* revert deletion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-05 16:56:17 +01:00
Matt
de4159a318
More TF int dtype fixes (#20384)
* Add a test to ensure int dummy inputs are int64

* Move the test into the existing int64 test and update a lot of existing dummies

* Fix remaining dummies

* Fix remaining dummies

* Test for int64 serving sigs as well

* Update core tests to use tf.int64

* Add better messages to the assertions

* Update all serving sigs to int64

* More sneakily hidden tf.int32s

* Add an optional int32 signature in save_pretrained

* make fixup

* Add Amy's suggestions

* Switch all serving sigs back to tf.int32

* Switch all dummies to tf.int32

* Adjust tests to check for tf.int32 instead of tf.int64

* Fix base dummy_inputs dtype

* Start casting to tf.int32 in input_processing

* Change dtype for unpack_inputs test

* Add proper tf.int32 test

* Make the alternate serving signature int64
2022-11-28 13:24:44 +00:00
Joao Gante
f2e7d270ec
Generate: shorter XLA contrastive search tests (#20354) 2022-11-22 11:47:12 +00:00
Joao Gante
0f78529f98
Generate: general TF XLA contrastive search tests are now slow tests (#20277)
* move contrastive search test to slow
2022-11-17 12:34:46 +00:00
Joao Gante
2062c28552
TF: add test for PushToHubCallback (#20231)
* test hub tf callback

* create repo before cloning it
2022-11-17 12:33:44 +00:00
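
A usage sketch for the callback under test (the repo id is a placeholder, pushing requires a Hub login, and the two-example dataset is only for illustration):

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from transformers.keras_callbacks import PushToHubCallback

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.compile(optimizer="adam")  # no loss passed: the model uses its internal loss

features = dict(tokenizer(["good movie", "bad movie"], return_tensors="tf", padding=True))
dataset = tf.data.Dataset.from_tensor_slices((features, tf.constant([1, 0]))).batch(2)

# Pushes the model and tokenizer to the Hub on each save.
callback = PushToHubCallback(output_dir="./ckpts", tokenizer=tokenizer,
                             hub_model_id="my-user/my-model")
model.fit(dataset, epochs=1, callbacks=[callback])
```
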
Matt
26ec7928d0
Slightly alter Keras dummy loss (#20232)
* Slightly alter Keras dummy loss

* Slightly alter Keras dummy loss

* Add sample weight to test_keras_fit

* Fix test_keras_fit for datasets

* Skip the sample_weight stuff for models where the model tester has no batch_size
2022-11-15 16:58:43 +00:00
Joao Gante
f270b960d6
Generate: move generation_*.py src files into generation/*.py (#20096)
* move generation_*.py src files into generation/*.py

* populate generation.__init__ with lazy loading

* move imports and references from generation.xxx.object to generation.object
2022-11-09 15:34:08 +00:00
Joao Gante
a0f8674303
Generate: TF contrastive search with XLA support (#20050)
* Add contrastive search
2022-11-07 10:54:29 +00:00
Sylvain Gugger
6c24443ff5
Safetensors tf (#19900)
* Wip

* Add safetensors support for TensorFlow

* First tests

* Add final test for now

* Retrigger CI like this

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-10-27 15:56:29 -04:00
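
For context, a sketch of the safetensors TF backend this support builds on (tensor names here are arbitrary):

```python
import tensorflow as tf
from safetensors.tensorflow import save_file, load_file

# Write a dict of named tensors in the safetensors format...
weights = {"embeddings/weight": tf.zeros((10, 4)), "dense/kernel": tf.ones((4, 4))}
save_file(weights, "model.safetensors")

# ...and read it back; from_pretrained support sits on top of this.
restored = load_file("model.safetensors")
print(restored.keys())
```
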
David Yang
a23819ed6a
Clean up deprecation warnings (#19654)
* Clean up deprecation warnings

Notes:
Changed some strings in tests to raw strings, which changes the literal content of the strings as they reach whatever code consumes them.
Changed/removed test cases for `past` in the `past`/`past_key_values` switch, due to the warning of its impending removal

* Add PILImageResampling abstraction for PIL.Image.Resampling
2022-10-18 13:34:47 -04:00
Arthur
2bd2de62c9
Sharding fails in TF when absolute scope was modified if . in layer name (#19124)
* simplify loop

* fix layer map split

* update

* update for special variables

* add rag test

* fixup

* revert change: for next PR
2022-10-14 18:34:33 +02:00
Dimitre Oliveira
df8faba4db
Enabling custom TF signature draft (#19249)
* Custom TF signature draft

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Adding tf signature tests

* Fixing signature check and adding asserts

* fixing model load path

* Adjusting signature tests

* Formatting file

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Dimitre Oliveira <dimitreoliveira@Dimitres-MacBook-Air.local>
2022-10-11 10:56:08 +01:00
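
A sketch of a custom serving signature built with core TF APIs; the exact integration into save_pretrained is what this PR drafts, and the input name, shape, and dtype here are illustrative:

```python
import tensorflow as tf
from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("bert-base-uncased")

# Custom serving signature: fixed input name, dynamic batch and sequence dims.
@tf.function(input_signature=[tf.TensorSpec((None, None), tf.int32, name="input_ids")])
def serving_fn(input_ids):
    outputs = model(input_ids)
    return {"last_hidden_state": outputs.last_hidden_state}

tf.saved_model.save(model, "./export", signatures={"serving_default": serving_fn})
```
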
amyeroberts
e3f028f3af
Add TF whisper (#19378)
* simplify loop

* add feature extractor

* add model

* start conversion

* add dropout

* initial commit of test files

* conversion for all models

* update processor for correct padding

* update feature extraction

* update integration test logits match

* fmt: off for the logits

* on the fly mel bank

* small nit

* update test

* update tokenizer

* nit feature extraction

* update

* update tokenizer test

* add logits processor and update tokenizer to get suppress tokens

* style

* clean convert

* revert to original modeling tf utils

* Update

* update

* nit

* clean convert file

* update tests and nits

* quality

* slow generation test

* ffn_dim to allow customization

* update readme

* add to toctree

* start fixing integration tests

* update tests and code

* fix feature extractor

* fix config tests common

* update code to fix tests

* fix feature extractor

* nit feature extraction

* update test for new feature extractor

* style

* add abstract

* large logits with custom decoder input ids

* wrap around is_torch_available

* fix feature extractor

* correct logits for whisper small.en

* nit

* fix encoder_attention_mask

* some fixes

* remove unnecessary inputs

* nits

* add normalizer file

* update test tokenization

* fix attention mask not defined

* fix generate

* remove useless encoder attention mask

* update test modeling whisper

* update config to add second non-suppress tokens

* nits on feature extractor

* nit for test tokenizers

* update tests

* update tests

* update tokenization test

* fixup

* invalidated hf token. Clean convert openai to whisper

* fix logit tests

* fixup

* Add model to README

* Fix doc tests

* clean merge

* revert toc_tree changes

* remove useless LogitProcessor

* Update whisper.mdx

* update config file doc

* update configuration docstring

* update test tokenization

* update test tokenization

* update tokenization whisper
Added copied from where needed

* update feature extraction

* nit test name

* style

* quality

* remove get suppress tokens and update non_speech tokens global variables

* Update src/transformers/models/whisper/feature_extraction_whisper.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* clean modeling whisper and test
Removed the attention mask arguments that are deprecated

* fix large test

* Add multilingual audio test, and translate test

* style

* fix large multilingual test

* nits

* add copied from for attention layer

* remove attention masks in doc

* add english normalizer

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update tokenization test

* remove copied from in whisper attention: no bias in k_proj only

* wrap around dependencies in english normalizer

* style

* correct import generation logits

* for now, wrap feature extractor with torch

* remove torch dependencies for feature extraction and style

* Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixup

* nit

* update logits

* style

* nit

* nits and fix final tests

* add `is_more_itertools_available` to utils

* quality

* add begin suppress tokens, suppress tokens to generate args and config

* clean SuppressTokensLogitsProcessor in generation logits

* Nit naming

* add suppressTokensAtBegin

* update tests, suppress tokens to None or correct values

* nit and style

* update RAG to fit test and generate_logit

* add copy-pasted statement on English normalizer

* add arguments to config_common_kwargs

* Update src/transformers/generation_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/generation_logits_process.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* revert changes based on reviews

* update doc and nits

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* more nits

* last nits

* update test configuration common

* add BART name in decoder attention mask documentation

* Update src/transformers/models/whisper/modeling_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* style

* nit

* nit

* add english.json file to git

* nits on documentation

* nit

* nits

* last styling

* add main toctree file

* remove sentence piece dependency

* clean init file

* fix tokenizer that has no dependencies on sentencepiece

* update whisper init file, nit

* remove english.json file

* add get decoder prompt id

* All weights loading

* Remove hanging pdb

* Fixup and tidy up

* Use same copied from as PT model

* Remove whitespace changes

* Remove torch references

* Tie embeddings

* Remove logits processor input to generate

* Update logit values

* revert changes and add forced logit processor

* nit

* clean normalizer

* remove protected

* Add logit processors and update generation code & tests

* Some tidy up

* Update docstring

* update

* update based on review

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update to reflect changes on the PT model branch

* Tidy up

* Remove extra whitespace

* Fix test - make input ids small enough we can append

* Include upstream changes on main

* PR comments - add batch tests, remove comments & defaults

* Fix model output imports

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/whisper/test_modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update docstring example

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove changes to adjust_logits_during_generation function

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Tidy up imports that don't require TF

* Update tests - skip and no more skip

* Update tests/generation/test_generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Add training flags

* Add (skipped) XLA generation tests

* Add embedding correctness test

* Add constant ids for generation tests

* Make logits finding a bit tidier

* Remove unused args

* xla generation enabled

* Don't skip XLA tests anymore

* Fix tests - add position ids to expected signature and update rag generation

* Undo method reorder

* Remove added whitespace

* Remove copy-paste gradient checkpoint ref

* Remove

* Trigger CI - (issue with refs when pulling)

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2022-10-10 14:48:17 +01:00
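
A minimal transcription sketch with the TF classes this PR adds; the silent placeholder array stands in for a real 16 kHz waveform:

```python
import numpy as np
from transformers import WhisperProcessor, TFWhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny.en")
model = TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")

audio = np.zeros(16000, dtype=np.float32)  # one second of silence as a stand-in
inputs = processor(audio, sampling_rate=16000, return_tensors="tf")

generated_ids = model.generate(input_features=inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```
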
Aritra Roy Gosthipaty
0dc7b3a785
[TensorFlow] Adding GroupViT (#18020)
* chore: initial commit

* chore: adding util methods

yet to work on the nn.functional.interpolate port with align_corners=True

* chore: refactor the utils

* used tf.compat.v1.image.resize to align the F.interpolate function
* added type hints to the method signatures
* added references to the gists where one-to-one alignment of torch and tf has been shown

* chore: adding the layers

* chore: porting all the layers from torch to tf

This is the initial draft, nothing is tested yet.

* chore: aligning the layers with reference to tf clip

* chore: aligning the modules

* added demarcation comments
* added copied and adapted from comments

* chore: aligning with CLIP

* chore: wrangling the layers to keep it tf compatible

* chore: aligning the names of the layers for porting

* chore: style changes

* chore: adding docs and inits

* chore: adding tfp dependencies

the code is taken from TAPAS

* chore: initial commit for testing

* chore: aligning the vision embeddings with the vit implementation

* chore: changing model prefix

* chore: fixing the name of the model and the layer normalization test case

* chore: every test passes but the slow ones

* chore: fix style and integration test

* chore: moving comments below decorators

* chore: make fixup and fix-copies changes

* chore: adding the Vision and Text Model to check_repo

* chore: modifying the prefix name to align it with the torch implementation

* chore: fix typo in configuration

* chore: changing the name of the model variable

* chore: adding segmentation flag

* chore: gante's review

* chore: style refactor

* chore: amy review

* chore: adding shape_list to parts that have been copied from other snippets

* chore: init batchnorm with torch defaults

* chore: adding shape_list to pass the tests

* test fix: adding seed as 0

* set seed

* chore: changing the straight-through trick to fix negative dimensions

* chore: adding a dimension to the loss

* chore: adding reviewers and contributors names to the docs

* chore: added changes after review

* chore: code quality fixup

* chore: fixing the segmentation snippet

* chore: adding  to the layer calls

* chore: changing int32 to int64 for inputs of serving

* chore: review changes

* chore: style changes

* chore: remove from_pt=True

* fix: repo consistency

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-29 10:48:04 +01:00
Sylvain Gugger
ca485e562b
Add tests for legacy load by url and fix bugs (#19078) 2022-09-16 23:20:02 +02:00
Joao Gante
658010c739
TF: tests for (de)serializable models with resized tokens (#19013)
* resized models that we can actually load

* separate embeddings check

* add test for embeddings out of bounds

* add fake slows
2022-09-16 16:38:08 +01:00
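
A sketch of the round trip these tests cover:

```python
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TFAutoModel.from_pretrained("bert-base-uncased")

# Grow the vocabulary and resize the embeddings to match...
tokenizer.add_tokens(["<new_token>"])
model.resize_token_embeddings(len(tokenizer))

# ...then make sure the resized model (de)serializes cleanly.
model.save_pretrained("./resized")
reloaded = TFAutoModel.from_pretrained("./resized")
```
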
Matt
2322eb8e2f
Update serving signatures and make sure we actually use them (#19034)
* Override save() to use the serving signature as the default

* Replace int32 with int64 in all our serving signatures

* Remember one very important line so as not to break every test at once

* Dtype fix for TFLED

* dtype fix for shift_tokens_right in general

* Dtype fixes in mBART and RAG

* Fix dtypes for test_unpack_inputs

* More dtype fixes

* Yet more mBART + RAG dtype fixes

* Yet more mBART + RAG dtype fixes

* Add a check that the model actually has a serving method
2022-09-15 14:34:22 +01:00
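
A sketch of exporting and inspecting the serving signature (directory layout as produced by saved_model=True):

```python
import tensorflow as tf
from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("bert-base-uncased")

# saved_model=True additionally exports a TF SavedModel using the model's
# serving signature, written under ./export/saved_model/1.
model.save_pretrained("./export", saved_model=True)

loaded = tf.saved_model.load("./export/saved_model/1")
print(loaded.signatures["serving_default"].structured_input_signature)
```
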
Joao Gante
00cbadb870
RFC: Replace custom TF embeddings by Keras embeddings (#18939) 2022-09-10 11:34:49 +01:00
Matt
660e0b97bd
Fix train_step, test_step and tests for CLIP (#18684)
* Fix train_step and test_step, correctly enable CLIP fit test

* Stop using get_args on older Python versions

* Don't use get_origin either

* UnionType is actually even newer, don't use that either

* Apply the same fix to test_loss_computation

* Just realized I was accidentally skipping a bunch of tests!

* Fix test_loss_computation for models without separable labels

* Fix scalar losses in test_step and train_step

* Stop committing your breakpoints

* Fix Swin loss shape

* Fix Tapas loss shape

* Shape fixes for TAPAS, DeIT, HuBERT and ViTMAE

* Add loss computation to TFMobileBertForPreTraining

* make fixup and move copied from statement

* make fixup and move copied from statement

* Correct copied from

* Add labels and next_sentence_label inputs to TFMobileBERT

* Make sure total_loss is always defined

* Update tests/test_modeling_tf_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fix copied from

* Ensure CTC models get labels in tests

* Ensure CTC models get labels in tests

* Fix tests for vit_mae

* Fix tests for vit_mae

* Fix tests for vit_mae

* Reduce batch size for wav2vec2 testing because it was causing OOM

* Skip some TAPAS tests that are failing

* Skip a failing HuBERT test

* make style

* Fix mobilebertforpretraining test

* Skip Wav2Vec2 tests that use huge amounts of mem

* Skip keras_fit for Wav2Vec2 as well

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2022-09-09 20:01:02 +01:00
Ankur Goyal
2ef7742117
Add DocumentQuestionAnswering pipeline (#18414)
* [WIP] Skeleton of VisualQuestionAnsweringPipeline extended to support LayoutLM-like models

* Fixup

* Use the full encoding

* Basic refactoring to DocumentQuestionAnsweringPipeline

* Cleanup

* Improve args, docs, and implement preprocessing

* Integrate OCR

* Refactor question_answering pipeline

* Use refactored QA code in the document qa pipeline

* Fix tests

* Some small cleanups

* Use a string type annotation for Image.Image

* Update encoding with image features

* Wire through the basic docs

* Handle invalid response

* Handle empty word_boxes properly

* Docstring fix

* Integrate Donut model

* Fixup

* Incorporate comments

* Address comments

* Initial incorporation of tests

* Address Comments

* Change assert to ValueError

* Comments

* Wrap `score` in float to make it JSON serializable

* Incorporate AutoModelForDocumentQuestionAnswering changes

* Fixup

* Rename postprocess function

* Fix auto import

* Applying comments

* Improve docs

* Remove extra assets and add copyright

* Address comments

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-07 13:38:49 -04:00
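
A usage sketch for the new pipeline (it needs an OCR backend such as pytesseract, and the file name is a placeholder):

```python
from transformers import pipeline

doc_qa = pipeline("document-question-answering", model="impira/layoutlm-document-qa")

# The pipeline runs OCR over the document image, then extractive QA on the words.
print(doc_qa(image="invoice.png", question="What is the invoice number?"))
```
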
Joao Gante
9196f48b95
Generate: validate model_kwargs on TF (and catch typos in generate arguments) (#18651) 2022-09-02 16:25:26 +01:00
Joao Gante
6e016634f1
Generate: smaller TF serving test (#18840) 2022-09-01 10:53:39 +01:00
Lucain
169b8cde47
Fix mock in test_cached_files_are_used_when_internet_is_down (#18804) 2022-08-29 15:56:08 +02:00
Sylvain Gugger
5cd4032368
Use new huggingface_hub tools for download models (#18438)
* Draft new cached_file

* Initial draft for config and model

* Small fixes

* Fix first batch of tests

* Look in cache when internet is down

* Fix last tests

* Bad black, not fixing all quality errors

* Make diff smaller

* Implement change for TF and Flax models

* Add tokenizer and feature extractor

* For compatibility with main

* Add utils to move the cache and auto-do it at first use.

* Quality

* Deal with empty commit shas

* Deal with empty etag

* Address review comments
2022-08-05 10:12:40 -04:00
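
A sketch of the new helper:

```python
from transformers.utils import cached_file

# Downloads config.json from the Hub on first use, then serves it from the
# local cache (including when the network is down).
config_path = cached_file("bert-base-uncased", "config.json")
print(config_path)
```
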
Yih-Dar
14928921e2
Add TF_MODEL_FOR_SEMANTIC_SEGMENTATION_MAPPING (#18469)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-04 20:41:15 +02:00
nlpcat
fc1d841b2d
change shape to support dynamic batch input in tf.function XLA generate for tf serving (#18372)
* change shape to support dynamic batch input in tf.generate

* add tests

Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
2022-08-04 11:26:11 +01:00
Daniel Suess
8fb7c908c8
Fix failing tests for XLA generation in TF (#18298)
* Fix failing test_xla_generate_slow tests

* Fix failing speech-to-text xla_generate tests
2022-08-03 09:45:15 -04:00
Sylvain Gugger
01db72abd4
Rewrite push_to_hub to use upload_files (#18366)
* Rewrite push_to_hub to use upload_files

* Adapt the doc a bit

* Address review comments and clean doc
2022-08-01 12:07:30 -04:00
amyeroberts
8e8384663d
Update serving code to enable saved_model=True (#18153)
* Add serving_output and serving methods to some vision models

* Add serving outputs for DeiT

* Don't convert hidden states - differing shapes

* Make saveable

* Fix up

* Make swin saveable

* Add in tests

* Fix funnel tests (can't convert to tensor)

* Fix numpy call

* Tidy up a bit

* Add in hidden states - resnet

* Remove numpy

* Fix failing tests - tensor shape and skipping tests

* Remove duplicated function

* PR comments - formatting and var names

* PR comments
Add suggestions made by Joao Gante:
* Use tf.shape instead of shape_list
* Use @tooslow decorator on tests
* Simplify some of the logic

* PR comments
Address Yih-Dar Shieh's comments - make tensor names consistent and types float

* Types consistent with docs; disable test on swin (slow)

* CI trigger

* Change input_features to float32

* Add serving_output for segformer

* Fixup

Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-22 18:05:38 +01:00
Yih-Dar
6561fbcc6e
Update TF(Vision)EncoderDecoderModel PT/TF equivalence tests (#18073)
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-18 15:29:14 +02:00
Joao Gante
20509ab0e0
TF: unpack_inputs decorator independent from main_input_name (#18110) 2022-07-13 10:43:41 +01:00
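
A sketch of where the decorator sits, on a hypothetical minimal model class:

```python
from transformers import PretrainedConfig
from transformers.modeling_tf_utils import TFPreTrainedModel, unpack_inputs

class TFToyModel(TFPreTrainedModel):  # hypothetical model, for illustration only
    config_class = PretrainedConfig

    @unpack_inputs
    def call(self, input_ids=None, attention_mask=None, training=False):
        # Arguments arrive here normalized whether the caller passed a dict,
        # keyword arguments, or positional tensors; after this PR the decorator
        # inspects the call signature instead of relying on main_input_name.
        return input_ids
```
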
Matt
96d833b211
Return scalar losses instead of per-sample means (#18013)
* Return scalar losses instead of per-sample means

* Make loss shape (1,) instead of scalar

* Allow scalar losses in test_loss_computation

* Allow scalar losses in test_loss_computation

* Allow scalar losses in test_loss_computation

* Remove XLA loss function for RAG
2022-07-04 17:26:19 +01:00
Matt
d6cec45801
XLA train step fixes (#17973)
* Copy inputs to train and test step before modifying them, as this breaks things

* Add XLA tests, fix our loss functions to be XLA-compatible

* make fixup

* Update loss computation test to expect vector of per-sample losses

* Patch loss for TFLED

* Patch loss for TFAlbert

* Add a tf_legacy_loss config flag that enables old loss functions

* Stop using config.get() because it's not a dict

* Skip loss computation test for RAG because its loss is very strange and I'm afraid to rewrite it

* make fixup

* Add XLA-compatible RAG loss

* Fix dtype of loss mask for TFAlbert

* Fix test for XLNet too because it overrides the default one

* make fixup

* Fix config test

* No more depending on GPU NaN behaviour

* Add test, avoid potential zero division

* Fix test item assignment

* Fix loss computation masking test

* make fixup

* Fix dtype bugs
2022-07-01 19:11:14 +01:00
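
A sketch of the new opt-out flag:

```python
from transformers import AutoConfig, TFAutoModelForSequenceClassification

# tf_legacy_loss=True restores the old (non-XLA-compatible) loss functions
# for code that depended on their exact behaviour.
config = AutoConfig.from_pretrained("bert-base-uncased", tf_legacy_loss=True)
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-uncased", config=config)
```
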
Joao Gante
91e1f24ef3
CLI: convert sharded PT models (#17959)
* sharded conversion; add flag to control max hidden error

* better hidden name matching

* Add test: load TF from PT shards

* fix test (PT data must be local)
2022-06-30 16:51:03 +01:00
Joao Gante
e6d27ca5c8
TF: XLA beam search + most generation-compatible models are now also XLA-generate-compatible (#17857)
* working beam search 🎉

* XLA generation compatible with ALL classes

* add xla generation slow test
2022-06-29 12:41:01 +01:00
Matt
1a7ef3349f
Fix broken test for models with batchnorm (#17841)
* Fix tests that broke when models used batchnorm

* Initializing the model twice does not actually...
...give you the same weights each time.
I am good at machine learning.

* Fix speed regression
2022-06-23 15:59:53 +01:00
Arthur
7cced021fa
TF Sharded (#17713)
* initial commit

* update modeling tf utils

* quality

* clean and update args

* update

* remove potential bug

* code quality

* update

* update max shard

* update tests for sharding from pretrained

* fix remaining test

* make style

* h5py if tf available

* update and fix test

* fix test

* style

* modified push_to_hub to support sharding for TF

* quick fix

* update code

* merge branch main and style

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update based on reviews

* update doc

* update and style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update based on reviews

* fix typo

* style

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-21 18:01:08 +02:00
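
A sketch of sharded TF saving and loading (the shard size here is arbitrary):

```python
from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("bert-base-uncased")

# Split the h5 checkpoint into ~200MB shards plus an index file...
model.save_pretrained("./sharded", max_shard_size="200MB")

# ...which from_pretrained reassembles transparently.
reloaded = TFAutoModel.from_pretrained("./sharded")
```
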
Lysandre Debut
6a5272b205
Prepare transformers for v0.8.0 huggingface-hub release (#17716)
* Prepare CI for v0.8.0

* pin hfh (revert before merge)

* Revert "pin hfh (revert before merge)"

This reverts commit a0103140e1.

* Test rc3

* Test latest rc

* Unpin to the RC

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-06-21 11:51:18 -04:00
Joao Gante
132402d752
TF: BART compatible with XLA generation (#17479)
* Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus
2022-06-20 11:07:46 +01:00
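
A sketch of XLA generation as enabled for the BART family (the model id and lengths are illustrative; XLA wants padded, static shapes to avoid retracing):

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = TFAutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Compile generate once; pad inputs to a fixed length so the compiled
# function is reused across calls.
xla_generate = tf.function(model.generate, jit_compile=True)
inputs = tokenizer(
    ["The quick brown fox jumps over the lazy dog."],
    return_tensors="tf", padding="max_length", max_length=64,
)
output_ids = xla_generate(**inputs, max_new_tokens=20)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```
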