* Add XLA torchrun support
* Clarify that DDP doesn't work with the torch.distributed XLA backend yet
* Enable DDP with torchrun and XLA (now available in PT-XLA 1.13)
* Add check for AWS Neuron availability and AWS Neuron specific compiler flag
* Change the new test's name to TestTrainerDistributedNeuronCore
* Remove "assert" and replace raised exception
* Remove compiler flag as it is optional. If needed, will be another PR.
* Use TORCHELASTIC_RUN_ID to determine whether torchrun is used
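A minimal sketch of the torchrun detection mentioned above: torchrun (TorchElastic) exports `TORCHELASTIC_RUN_ID` to every worker it spawns, so its presence can be used as the signal. The helper name is illustrative, not the function added by this PR.

```python
import os


def is_torchrun_launch() -> bool:
    """Illustrative helper: torchrun sets TORCHELASTIC_RUN_ID in each worker's
    environment, so its presence indicates the script was started via torchrun."""
    return os.environ.get("TORCHELASTIC_RUN_ID") is not None
```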
* `blip` support for training
* remove labels creation
* remove unneeded `decoder_input_ids` creation
* final changes
- add colab link to documentation
- reduction = mean for loss
* fix nits
* update link
* clearer error message
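A hedged sketch of the training usage these `blip` commits enable: the caller passes `labels` explicitly (rather than having them created inside the model) and the returned loss is mean-reduced. The checkpoint name and the exact argument set are assumptions for illustration.

```python
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

# Assumed checkpoint for illustration; any BLIP captioning checkpoint should behave similarly.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example.jpg")
inputs = processor(images=image, text="a photo of", return_tensors="pt")

# The caller supplies labels explicitly (here, the caption token ids);
# the model returns a mean-reduced language-modeling loss.
outputs = model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()
```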
* Add epsilon- and eta-sampling.
Add epsilon- and eta-sampling, following the official code from https://github.com/john-hewitt/truncation-sampling and adapting it to be more configurable, as required by Hugging Face Transformers.
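A hedged usage sketch of the configurability mentioned above: the truncation thresholds are exposed through `generate()` / `GenerationConfig` as `epsilon_cutoff` and `eta_cutoff`. The model name and cutoff values below are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The meaning of life is", return_tensors="pt")

# Epsilon-sampling: drop tokens whose probability is below a fixed cutoff.
out_eps = model.generate(**inputs, do_sample=True, epsilon_cutoff=3e-4, max_new_tokens=20)

# Eta-sampling: entropy-dependent cutoff, following the truncation-sampling paper.
out_eta = model.generate(**inputs, do_sample=True, eta_cutoff=3e-4, max_new_tokens=20)

print(tokenizer.decode(out_eps[0], skip_special_tokens=True))
```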
* Add unit tests for epsilon- and eta-sampling.
* Black: fix code formatting.
* Fix docstring spacing.
* Clean up newlines.
* Fix implementation bugs and their associated tests.
* Remove epsilon- and eta-sampling parameters from PretrainedConfig.
* Clarify and clean up the documentation.
* Remove parameters for PretrainedConfig test.
* add draft logit processor
* add template functions
* update timestamp processor parameters
* draft script
* simplify code
* cleanup
* fixup and clean
* update pipeline
* style
* clean up previous idea
* add tokenization utils
* update tokenizer and asr output
* fit whisper type
* style and update test
* clean test
* style test
* update tests
* update error test
* update code (not based on review yet)
* update tokenization
* update asr pipeline
* update code
* cleanup and update test
* fmt
* remove text verification
* cleanup
* cleanup
* add model test
* update tests
* update code add docstring
* update code and add docstring
* fix pipeline tests
* Small update.
* Fixup.
* Tmp.
* More support.
* Making `forced_decoder_ids` non mandatory for users to set.
* update and fix first bug
* properly process sequence right after merge if last
* todo
* allow list inputs + compute begin index better
* start adding tests
* add the 3 edge cases
* style
* format sequences
* fixup
* update
* update
* style
* test passes, edge cases should be good
* update last value
* remove Trie
* update tests and expected values
* handle bigger chunk_length
* clean tests a bit
* refactor chunk iter and clean pipeline
* update tests
* style
* refactor chunk iter and clean pipeline
* update
* resolve comments
* Apply suggestions from code review
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* take stride right into account
* update test expected values
* Update code based on review
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
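A hedged sketch of the pipeline behaviour this series of Whisper-timestamp commits works toward: segment timestamps are requested via `return_timestamps=True` on the ASR pipeline. The checkpoint name and audio path are illustrative.

```python
from transformers import pipeline

# Illustrative checkpoint; any Whisper checkpoint should behave similarly.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny", chunk_length_s=30)

# return_timestamps=True asks the pipeline to decode Whisper's timestamp tokens
# and return per-chunk segments alongside the transcribed text.
result = asr("sample.flac", return_timestamps=True)
print(result["text"])
print(result["chunks"])  # e.g. [{"timestamp": (0.0, 5.2), "text": "..."}, ...]
```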
* Fixing #20783
* Update src/transformers/pipelines/base.py
* Fixing some tests.
* Fixup.
* Remove ffmpeg dep + a bit more relaxed for bigbird QA precision.
* Better dataset.
* Prevent failing on TF.
* Better condition. We can't use `can_use_iterator` since we cannot use it
directly.
* Add StopIdStoppingCriteria
* add a working test for stop id criteria
* add to global scope
* add stop_ids to generate
* add pipeline test
* use tokenizer encode in test
* add test to generation utils
* reformat
* fixup
* make-fix-copies
* rename to stop_token_id
* use stop_tokens instead
* add to text to text generation
* make fixup
* make repo-consistency
* Add support for list of ints for eos_token_id inside generation/utils.py
* Instead of having if elses, cast the eos_token_id into a List[int]
* Add List[int] support for logits_process.py
* add List[int] for beam_search.py
* add List[int] for forced_eos_token_id
* revert stop token id stopping criteria changes
* make fixup
* fix tests
* add eos_token_id to generation/utils.py and added tests test_utils.py
* add eos_token_id type hints and fix for pad tokens
* add comments
* remove some prints and remove forced false test
* fix
* put back test_stop_sequence_stopping_criteria
* remove unused import and make fixup
* add a none check
* update docstring
* add more docstring for list ints
* make fixup
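A hedged sketch of the `List[int]` support described in these commits: `eos_token_id` may now be a single int or a list of ints, so generation stops at whichever listed token is produced first. The model and token ids below are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The quick brown fox", return_tensors="pt")

# Stop on either the model's EOS token or a newline token, whichever comes first.
newline_id = tokenizer.encode("\n")[0]
out = model.generate(**inputs, max_new_tokens=40, eos_token_id=[tokenizer.eos_token_id, newline_id])
print(tokenizer.decode(out[0]))
```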
* add torch_dtype attribute to Pipeline
* Use torch_dtype to cast input tensor type in AutomaticSpeechRecognitionPipeline
* Fix code quality
* Add TextGenerationPipeline fp16 test
* Fix code quality
* Remove useless require in tests
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
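A hedged example of the new `torch_dtype` attribute on `Pipeline`: input tensors are cast to the requested dtype before the forward pass. As noted later in this log, fp16 compute generally needs a GPU; the checkpoint name is illustrative.

```python
import torch
from transformers import pipeline

# Pass torch_dtype at construction time; the ASR pipeline casts its input
# features to this dtype before running the model.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-tiny",
    torch_dtype=torch.float16,
    device=0,  # fp16 compute generally requires a GPU
)
print(asr.torch_dtype)  # torch.float16
result = asr("sample.flac")
```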
* torch.jit._state
* Fix past CI
* Fix for perceiver
* Fix REALM
* Fix for Bloom
* Fix for SwinModel
* Fix for TrajectoryTransformerModel
* Fix for test_wav2vec2_with_lm
* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Supporting `fp16` for asr pipeline
* Adding test.
* Style.
* Oops.
* Flake8 update ?
* Fixing flake8 ?
* Revert "Flake8 update ?"
This reverts commit 0b917fcb52.
* Style (accidentally deleted flake8 F401).
* Move to a bigger test (no small whisper model, and s2t doesn't seem to
accept torch_dtype=fp16).
Also we need to use a GPU to actually compute on fp16.
* Using BatchFeature capability.
* Add support for binary segmentation
* Fix loss calculation and add test
* Remove space
* use fstring
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
* Copy RoBERTa
* formatting
* implement RoBERTa with prelayer normalization
* update test expectations
* add documentation
* add conversion script for DinkyTrain weights
* update checkpoint repo
Unfortunately, the original checkpoints assume a hacked RoBERTa model
* add to RoBERTa-PreLayerNorm docs to toc
* run utils/check_copies.py
* lint files
* remove unused import
* fix check_repo reporting wrongly a test is missing
* fix import error, caused by rebase
* run make fix-copies
* add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS
* Fix documentation <Facebook> -> Facebook
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup: Fix documentation <Facebook> -> Facebook
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add missing Flax header
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* expected_slice -> EXPECTED_SLICE
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update copies after rebase
* add missing copied from statements
* make fix-copies
* make prelayernorm explicit in code
* fix checkpoint path for the original implementation
* add flax integration tests
* improve docs
* update utils/documentation_tests.txt
* lint files
* Remove Copyright notice
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make fix-copies
* Remove EXPECTED_SLICE calculation comments
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* generate from config mvp
* fix failing tests
* max_time test
* Load default gen config at model load time; Update docs
* further documentation; add tests
* adapt rag to the new structure
* handle models not instantiated with from_pretrained (like in tests)
* better default generation config
* add can_generate fn
* handle legacy use case of ad hoc model config changes
* initialize gen config from config in individual methods, if gen config is none
* fix _get_decoder_start_token_id when called outside GenerationMixin
* correct model config load order (set attr > model config > decoder config)
* update rag to match latest changes
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* load gen config from model config in model.from_pretrained
* fix can_generate fn
* handle generate calls without a previous from_pretrained (e.g. tests)
* add legacy behavior (and a warning)
* lower logger severity
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
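A hedged sketch of the generation-config flow introduced here: a default `GenerationConfig` is attached to the model at `from_pretrained` time, and an explicit config can still be passed per call. The model name and values are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The default generation config is loaded along with the model.
print(model.generation_config)

# An explicit config (built here, or loaded with GenerationConfig.from_pretrained)
# can be passed per call and overrides the model's default.
gen_config = GenerationConfig(max_new_tokens=20, do_sample=True, top_p=0.9)
inputs = tokenizer("Hello, my name is", return_tensors="pt")
out = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```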
* Add templates for gpt-sw3
* Add templates for gpt-sw3
* Added sentencepiece tokenizer
* intermediate commit with many changes
* fixed conflicts
* Init commit for tokenization port
* Tokenization progress
* Remove fast tokenizer
* Clean up and rename spm.model -> spiece.model
* Remove TF -> PT conversion script template, Clean up Megatron -> PT script
* Optimize encode & decode performance
* added new attention
* added new attention
* attention for gpt-sw3 working
* attention good
* Cache is now working
* fixed attention mask so that it works with causal attention
* fixed badbmm bug for cpu and caching
* updated config with correct parameters
* Refactor and leave optimizations as separate functions to avoid breaking expected functionality
* Fix special tokens mapping for both tokenizers
* cleaning up of code and comments
* HF compatible attention outputs
* Tokenizer now passing tests, add documentation
* Update documentation
* reverted back to base implementation after checking that it is identical to pretrained model
* updated gpt-sw3 config
* updated conversion script
* aligned parameters with gpt-sw3 config
* changed default scale_attn_by_inverse_layer_idx to true
* removed flag from conversion script
* added temporary model path
* reverted back to functioning convert script
* small changes to default config
* updated tests for gpt-sw3
* make style, make quality, minor cleanup
* Change local paths to testing online repository
* Change name: GptSw3 -> GPTSw3
* Remove GPTSw3TokenizerFast references
* Use official model repository and add more model sizes
* Added reference to 6.7b model
* Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel
* Remove pointers to non-existing TFGPTSw3
* Add GPTSw3 to docs/_toctree.yml
* Remove TF artifacts from GPTSw3 in __init__ files
* Update README:s with 'make fix-copies'
* Add 20b model to archive list
* Add documentation for GPT-Sw3
* Fix typo in documentation for GPT-Sw3
* Do 'make fix-copies' again after having updated docs
* Fix some typos in docs
* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Resolve comments from PR feedback
* Resolve more comments from PR feedback, also set use_cache=True in convert script
* Add '# Copied from' comments for GPTSw3 modeling
* Set 'is_parallelizable = False'
* Remove '# Copied from' where code was modified and add 'with x->y' when appropriate
* Remove parallelize in mdx
* make style, make quality
* Update GPTSw3Config default values and corresponding documentation
* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available
* Make style, make quality
* Add dummy object for GPTSw3Tokenizer via 'make fix-copies'
* make fix-copies
* Remove GPTSw3 modeling classes
* make style, make quality
* Add GPTSw3 auto-mappings for other GPT2 heads
* Update docs/source/en/model_doc/gpt-sw3.mdx
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove old TODO-comment
* Add example usage to GPTSw3Tokenizer docstring
* make style, make quality
* Add implementation details and example usage to gpt-sw3.mdx
Co-authored-by: JoeyOhman <joeyoh@kth.se>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* read to load
* base functionality
* revert init
* fix dummy data
* moving right along
* moving right along
* finally
* cleanup
* pull out comment
* add test
* update docstring for main class
* flake comments and rewriting copies from `make repo-consistency`
* remove irrelevant differences/accidental spaces
* put copies back after space removals
* mid
* final test pass
* stray comment
* update test file
* update test file
* fixup
* black
* missed
* black missed one more
* style
* add doc update
* fix order of output class
* comment
* Revert "comment"
This reverts commit 03f86b6948.
* remove redundant function, and redundant reshape
* move change out of common
* style
* put common spaces back
* reorder kwargs in output
* doc style
* add `dpt-hybrid` support
* refactor
* final changes, all tests pass
* final cleanups
* final changes
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix docstring
* fix typo
* change `vit_hybrid` to `hybrid`
* replace dataclass
* add docstring
* move dataclasses
* fix test
* add `PretrainedConfig` support for `backbone_config`
* fix docstring
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* remove `embedding_type` and replace it by `is_hybrid`
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix cuda OOM by using single Prior
* only send to device when used
* use custom model
* Skip the big slow test
* Update tests/models/jukebox/test_modeling_jukebox.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* biogpt initial commit
* updated init
* fix faster decoding with use_cache
* 1. fix input_ids and input_embeds with correct device
2. added _keys_to_ignore_on_load_missing
3. updated prepare_inputs_for_generation
* add activation_dropout and scale_embedding
* replace fsmt attention with bart attention
* added test
* run make fix-copies
* doc init and fix build
* updated README with proper information
* 1. added tips to docs
2. updated BioGptTokenizer func
* 1. added tokenizer test
2. refactor tokenizer
* make fixup
* add biogpt fairseq to hf converter
* updated layer names to be more similar to the original checkpoints
* config update doc string and set defaults
* added "#copied" from bart model and
updated doc strings
* enable model_input_names in tokenizer
* 1. positional embedding depending on attention_mask
2. added attention mask to prepare for generation
* added test to verify past and generation
* BioGptLMHeadModel -> BioGptForCausalLM
* fix typo
* tokenization and test
Copyright and updated assertion
* updated Copyright and
one func at time in line
* Copyright updates and
minor doc fix
* replace assertion with ValueError
* rm extra space
* added code syntax
* revert comment position change
* add tokenizer to auto
* updated doc string
* tokenizer doc string update
* biogpt hub model update to microsoft/biogpt
* make fixup
* remove comment to fix flake8 5.0.4 vs 6 error
* Fixed test_saved_model_extended
* Fix TFGPT2 tests
* make fixup
* Make sure keras-nlp utils are available for type hinting too
* Update src/transformers/testing_utils.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* make fixup
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* add minimal working gpt2 tokenizer
* graph mode and output equivalence tests working
* not today tensorflow. serialization test passing!
* fix style, documentation, docstrings and all that jazz
* passing consistency checks
* move keras nlp to tf dependencies
* fix tf modeling utils and gpt2 attention to enable compiling
* fix (I hope) keras nlp dependencies
* revert changes on generation
* remove debug prints
* remove redundant tf dummy objects
* add from config, get config and max length settings to address review
* let flake8 ignore the error on distillation, you are welcome
* test from config
* add padding test
* address sgugger review
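A hedged sketch of the in-graph tokenizer added here (built on `keras-nlp`, so it can be compiled and serialized with the rest of a TF graph). The class name `TFGPT2Tokenizer` and `from_pretrained` are from this work; everything else below is illustrative.

```python
import tensorflow as tf
from transformers import TFGPT2Tokenizer

# In-graph tokenizer built on keras-nlp: tokenization runs inside the TF graph,
# so it survives tf.function compilation and SavedModel serialization.
tokenizer = TFGPT2Tokenizer.from_pretrained("gpt2")


@tf.function
def tokenize(text):
    return tokenizer(text)


outputs = tokenize(tf.constant(["Hello there", "General Kenobi"]))
print(outputs["input_ids"])
```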
* Add Donut image processor
* Update src/transformers/image_transforms.py
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* Fix docstrings
* Full var names in docstring
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* First draft
* Fix backwards compatibility
* More fixes
* More fixes
* Make backbone more general
* Improve backbone
* Improve test
* Fix config checkpoint
* Address comments
* Use model_type
* Address more comments
* Fix special model names
* Remove MaskFormerSwinModel and MaskFormerSwinPreTrainedModel from main init
* Fix typo
* Update backbone
* Apply suggestion
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Add hidden states and attentions to backbone outputs
* Update ResNet
* Fix more tests
* Debug test
* Fix test_determinism
* Fix test_save_load
* Remove file
* Disable fx tests
* Test
* Add fx support for backbones
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Add a test to ensure int dummy inputs are int64
* Move the test into the existing int64 test and update a lot of existing dummies
* Fix remaining dummies
* Fix remaining dummies
* Test for int64 serving sigs as well
* Update core tests to use tf.int64
* Add better messages to the assertions
* Update all serving sigs to int64
* More sneakily hidden tf.int32s
* Add an optional int32 signature in save_pretrained
* make fixup
* Add Amy's suggestions
* Switch all serving sigs back to tf.int32
* Switch all dummies to tf.int32
* Adjust tests to check for tf.int32 instead of tf.int64
* Fix base dummy_inputs dtype
* Start casting to tf.int32 in input_processing
* Change dtype for unpack_inputs test
* Add proper tf.int32 test
* Make the alternate serving signature int64
* change the way sentinel tokens can be retrieved
* Fix line length for doc string
* Fix line length for doc string
* Add a stronger test for T5 tokenization
* Format file changes
* Make a stronger test for filtering sentinel tokens
* fix file format issues
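A hedged illustration of the retrieval change: the sentinel tokens are exposed through helper methods on the T5 tokenizer rather than by manually filtering special tokens. The method names below reflect the current API but should be treated as an assumption in the context of this commit.

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Sentinel tokens (<extra_id_0>, <extra_id_1>, ...) retrieved via the tokenizer's
# helpers instead of filtering the additional_special_tokens list by hand.
sentinels = tokenizer.get_sentinel_tokens()
sentinel_ids = tokenizer.get_sentinel_token_ids()
print(sentinels[:3], sentinel_ids[:3])
```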
* Optimizes DonutProcessor token2json method for speed
* Applies black formatting
* Updates Donut pretrained model name in test file
* remaining pytorch type hints (#20217)
* Update modeling_flava.py
* Update modeling_markuplm.py
* Update modeling_glpn.py
* Update modeling_roc_bert.py
* Update modeling_segformer.py
* Update modeling_tapas.py
* Update modeling_tapas.py
* Update modeling_tapas.py
* Update modeling_tapas.py
* Update modeling_trocr.py
* Update modeling_videomae.py
* Update modeling_videomae.py
* Update modeling_videomae.py
* Update modeling_yolos.py
* Update modeling_wav2vec2.py
* Update modeling_jukebox.py
* Update modeling_jukebox.py
* Update modeling_jukebox.py
* Update modeling_jukebox.py
* Data collator for token classification pads labels column when receives pytorch tensors (#20244)
* token cls data_collator pads labels column
* remove walrus operator for code quality
* remove redundant space
* remove comment that was fixed
* PR comments fix
Co-authored-by: Alexander Markov <amarkov.me@gmail.com>
* [Doctest] Add configuration_deformable_detr.py (#20273)
* Update configuration_deformable_detr.py comment
* Add DeformableDetrConfig to documentation_tests.txt
* Fix summarization script (#20286)
* [DOCTEST] Fix the documentation of RoCBert (#20142)
* update part of the doc
* add temp values, fix part of the doc
* add template outputs
* add correct models and outputss
* style
* fixup
* [bnb] Let's warn users when saving 8-bit models (#20282)
* add warning on 8-bit models
- added tests
- added wrapper
* move to a private attribute
- remove wrapper
- changed `save_pretrained` method
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Adding `zero-shot-object-detection` pipeline doctest. (#20274)
* Adding `zero-shot-object-detection` pipeline doctest.
* Remove nested_simplify.
* Adding doctest for `object-detection` pipeline. (#20258)
* Adding doctest for `object-detection` pipeline.
* Removed nested_simplify.
* Image transforms functionality used instead (#20278)
* Image transforms functionality used instead
* Import torch
* Import rather than copy
* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py
* TF: add test for `PushToHubCallback` (#20231)
* test hub tf callback
* create repo before cloning it
* Generate: general TF XLA contrastive search tests are now slow tests (#20277)
* move contrastive search test to slow
* Fixing the doctests failures. (#20294)
* Fixing the doctests failures.
* Fixup.
* set the default cache_enable to True, aligned with the default value in pytorch cpu/cuda amp autocast (#20289)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Add docstrings for canine model (#19457)
* Add docstrings for canine model
* Update CanineForTokenClassification
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Add AutoBackbone + ResNetBackbone (#20229)
* Add ResNetBackbone
* Define channels and strides as property
* Remove file
* Add test for backbone
* Update BackboneOutput class
* Remove strides property
* Fix docstring
* Add backbones to SHOULD_HAVE_THEIR_OWN_PAGE
* Fix auto mapping name
* Add sanity check for out_features
* Set stage names based on depths
* Update to tuple
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
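A hedged usage sketch of the backbone API added here: `out_features` selects which stages are returned as feature maps, and the output carries `feature_maps` (plus optional hidden states / attentions). The checkpoint and stage names are illustrative.

```python
import torch
from transformers import AutoBackbone

# Stage names follow the "stem", "stage1", ..., "stage4" convention; out_features
# is sanity-checked against them (see the commits above).
backbone = AutoBackbone.from_pretrained("microsoft/resnet-50", out_features=["stage2", "stage4"])

pixel_values = torch.randn(1, 3, 224, 224)
outputs = backbone(pixel_values)

for name, feature_map in zip(backbone.out_features, outputs.feature_maps):
    print(name, feature_map.shape)
```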
* Add missing report button for Example test (#20293)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* refactor test (#20300)
- simplifies the device checking test
* [Tiny model creation] deal with `ImageProcessor` (#20298)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Fix BlenderBot misleading doc (#20301)
* fix the doc to specify that add_prefix_space = False
* add correct expected output
* remove two tokens that should not be suppressed (#20302)
* [ASR Examples] Update README for Whisper (#20230)
* [ASR Examples] Update README for seq2seq
* add language info
* add training results
* re-word
* Add padding image transformation (#19838)
* Add padding transformation
* Add in upstream changes
* Update tests & docs
* Code formatting tuples in docstring
* Pin TensorFlow (#20313)
* Pin to the right version...
* Also pin TensorFlow CPU
* Add AnyPrecisionAdamW optimizer (#18961)
* Add AnyPrecisionAdamW optimizer
* Add optim_args argument to TrainingArgs
* Add tests for AnyPrecisionOptimizer
* Change AnyPrecisionAdam default params to float32
* Move default_anyprecision_kwargs in trainer test
* Rename AnyPrecisionAdamW
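A hedged sketch of how the new optimizer is selected through `TrainingArguments`, via the `optim` / `optim_args` pair added in these commits; the exact option string and keyword arguments below are assumptions for illustration.

```python
from transformers import TrainingArguments

# optim selects the AnyPrecision AdamW implementation (requires torchdistx);
# optim_args forwards extra keyword arguments to the optimizer constructor.
args = TrainingArguments(
    output_dir="out",
    optim="adamw_anyprecision",
    optim_args="use_kahan_summation=True,momentum_dtype=bfloat16",
)
```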
* [Proposal] Breaking change `zero-shot-object-detection` for improved consistency. (#20280)
* [Proposal] Breaking change `zero-shot-object-detection` for improved
consistency.
This is a proposal to modify the output of `zero-shot-object-detection`
to provide better alignment with other pipelines.
The output is now strictly the same as `object-detection` whereas before
it would output lists of lists.
The name `candidate_labels` is used throughout for consistency with
other `zero-shot` pipelines.
The pipeline is changed to `ChunkPipeline` to support batching cleanly.
This removes all the lists and list-of-lists shenanigans; it's now a
matter of the base pipeline handling all this, not this specific one.
**Breaking change**: It removes support for complex calls such as `pipe(images=[image1, image2],
text_queries=[candidates1, candidates2])` in favour of only
`pipe([{"image": image1, "candidate_labels": candidates1}, {"image": image2, "candidate_labels": candidates2}])`
when dealing with lists and/or datasets.
We could keep them, but it would add a lot of complexity to the code
base; since the pipeline is rather young, I'd rather break it to keep the
code simpler, but we can revert this.
**Breaking change**: The name of the argument is now `image` instead of
`images`, since by default it expects only 1 image. This is revertible
like the previous one.
**Breaking change**: The output types are now simplified and flattened:
`pipe(inputs) == [{**object1}, {**object2}]`
instead of the previous
`pipe(inputs) == [[{**object1}, {**object1}], [{**object2}]]`
Where the different instances would be grouped by candidate labels
within lists.
IMHO this is not really desirable, since it would output empty lists and
only adds superfluous indirection compared to
`zero-shot-object-detection`.
It is relatively change-free in terms of the results; it does change the
computation, however, since the batching is now handled by the pipeline
itself. It **did** change the results for the small models, so there
seems to be a real difference in how the models handle this.
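A hedged sketch of the new calling convention described above (a single `image` plus `candidate_labels`, a list of dicts for batching, and a flat list of dicts as output); the checkpoint name and file paths are illustrative.

```python
from transformers import pipeline

detector = pipeline("zero-shot-object-detection", model="google/owlvit-base-patch32")

# Single sample: one image and its candidate labels.
result = detector(image="street.jpg", candidate_labels=["car", "bicycle", "person"])
# -> flat list of dicts: [{"score": ..., "label": ..., "box": {...}}, ...]

# Batched / dataset-style call: a list of {"image": ..., "candidate_labels": ...} dicts.
results = detector(
    [
        {"image": "street.jpg", "candidate_labels": ["car", "bicycle"]},
        {"image": "park.jpg", "candidate_labels": ["dog", "bench"]},
    ]
)
```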
* Fixing the doctests.
* Behind is_torch_available.
* Fix flakey test with seed (#20318)
* Pin TF 2.10.1 for Push CI (#20319)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Remove double brackets (#20307)
* remove double brackets
* oops get other bracket
* TF: future proof our keras imports (#20317)
* future proof our tf code
* parse tf versions
* Add Neighborhood Attention Transformer (NAT) and Dilated NAT (DiNAT) models (#20219)
* Add DiNAT
* Adds DiNAT + tests
* Minor fixes
* Added HF model
* Add natten to dependencies.
* Cleanup
* Minor fixup
* Reformat
* Optional NATTEN import.
* Reformat & add doc to _toctree
* Reformat (finally)
* Dummy objects for DiNAT
* Add NAT + minor changes
Adds NAT as its own independent model + docs, tests
Adds NATTEN to ext deps to ensure ci picks it up.
* Remove natten from `all` and `dev-torch` deps, add manual pip install to ci tests
* Minor fixes.
* Fix READMEs.
* Requested changes to docs + minor fixes.
* Requested changes.
* Add NAT/DiNAT tests to layoutlm_job
* Correction to Dinat doc.
* Requested changes.
* organize pipelines by modality (#20306)
* Fix torch device issues (#20304)
* fix device issue
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Generate: add generation config class (#20218)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* translate zh quicktour(#20095) (#20181)
* zh quicktour(#20095)
* add zh to doc workflow
* remove untranslation from toctree
Co-authored-by: BeifangSusu <BeifangSusu@bfss.com>
* Add Spanish translation of serialization.mdx (#20245)
* Update _toctree and clone original content
* Translate first three sections
* Add more translated chapters. Only 3 more left.
* Finish translation
* Run style from doc-builder
* Address recommended changes from reviewer
* Add LayerScale to NAT/DiNAT (#20325)
* Add LayerScale to NAT/DiNAT.
Completely dropped the ball on LayerScale in the original PR (#20219).
This is just an optional argument in both models, and is only activated for larger variants in order to provide training stability.
* Add LayerScale to NAT/DiNAT.
Minor error fixed.
Co-authored-by: Ali Hassani <ahassanijr@gmail.com>
* [Switch Transformers] Fix failing slow test (#20346)
* run slow test on GPU
* remove unnecessary device assignment
* use `torch_device` instead
* fix: "BigSicence" typo in docs (#20331)
* add MobileNetV1 model (#17799)
* add model files etc for MobileNetV2
rename files for MobileNetV1
initial implementation of MobileNetV1
fix conversion script
cleanup
write docs
tweaks
fix conversion script
extract hidden states
fix test cases
make fixup
fixup it all
remove main from doc link
fixes
fix tests
fix up
use google org
fix weird assert
* fixup
* use google organization for checkpoints
* Generate: `model_kwargs` can also be an input to `prepare_inputs_for_generation` (#20353)
* Update Special Language Tokens for PLBART (#19980)
* Update Special Language Tokens for PLBART
* fix format
* making mapping for language codes and updating tests
* fix format
* fix consistency
* add assert to both tokenizer tests.
* fix format
* Update src/transformers/models/plbart/tokenization_plbart.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* improving readability, setting self.tgt_lang
* fixing
* readability
Co-authored-by: jordiclive <jordiclive19@imperial.ac.uk>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add resources (#20296)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Enhance HfArgumentParser functionality and ease of use (#20323)
* Enhance HfArgumentParser
* Fix type hints for older python versions
* Fix and add tests (+formatting)
* Add changes
* doc-builder formatting
* Remove unused import "Call"
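A brief sketch of the dataclass-driven parsing these commits enhance; the dataclass fields shown are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Optional

from transformers import HfArgumentParser


@dataclass
class RunArguments:
    model_name_or_path: str = field(metadata={"help": "Checkpoint to load."})
    learning_rate: float = field(default=5e-5, metadata={"help": "Peak learning rate."})
    push_to_hub: Optional[bool] = field(default=False)


parser = HfArgumentParser(RunArguments)
# Parses sys.argv (e.g. --model_name_or_path gpt2 --learning_rate 3e-5) into the dataclass.
(run_args,) = parser.parse_args_into_dataclasses()
```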
* Add Audio Spectrogram Transformer (#19981)
* First draft
* Make conversion script work
* Add id2label mapping, run code quality
* Fix copies
* Add first draft of feature extractor
* Update conversion script to use feature extractor
* Make more tests pass
* Add docs
* update input_features to input_values + pad by default to max length
* Fix doc tests
* Add feature extractor tests
* Add proper padding/truncation to feature extractor
* Add support for conversion of all audioset checkpoints
* Improve docs and extend conversion script
* Fix README
* Rename spectogram to spectrogram
* Fix copies
* Add integration test
* Remove dummy conv
* Update to ast
* Update organization
* Fix init
* Rename model to AST
* Add require_torchaudio annotator
* Move import of ASTFeatureExtractor under a is_speech_available
* Fix rebase
* Add pipeline config
* Update name of classifier head
* Rename time_dimension and frequency_dimension for clarity
* Remove print statement
* Fix pipeline test
* Fix pipeline test
* Fix index table
* Fix init
* Fix conversion script
* Rename to ForAudioClassification
* Fix index table
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
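A hedged inference sketch for the model added here; the checkpoint name and the dummy input follow common usage but are assumptions for illustration.

```python
import torch
from transformers import ASTFeatureExtractor, ASTForAudioClassification

# Illustrative AudioSet checkpoint.
feature_extractor = ASTFeatureExtractor.from_pretrained("MIT/ast-finetuned-audioset-10-10-0.4593")
model = ASTForAudioClassification.from_pretrained("MIT/ast-finetuned-audioset-10-10-0.4593")

waveform = torch.zeros(16000).numpy()  # 1 s of silence at 16 kHz, stand-in for real audio
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```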
* Add inference section to task guides (#18781)
* 📝 start adding inference section to task guides
* ✨ make style
* 📝 add multiple choice
* add rest of inference sections
* make style
* add compute_metric, push_to_hub, pipeline
* make style
* add updated sequence and token classification
* make style
* make edits in token classification
* add audio classification
* make style
* add asr
* make style
* add image classification
* make style
* add summarization
* make style
* add translation
* make style
* add multiple choice
* add language modeling
* add qa
* make style
* review and edits
* apply reviews
* make style
* fix call to processor
* apply audio reviews
* update to better asr model
* make style
* Fix toctree for Section 3 in Spanish Documentation (#20360)
* Order and group topics in the right section
* Translate "Computer Vision"
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: IMvision12 <88665786+IMvision12@users.noreply.github.com>
Co-authored-by: Alexander Markov <almarkv@yandex.ru>
Co-authored-by: Alexander Markov <amarkov.me@gmail.com>
Co-authored-by: Saad Mahmud <shuvro.mahmud79@gmail.com>
Co-authored-by: Zachary Mueller <muellerzr@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: raghavanone <115454562+raghavanone@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: atturaioe <76523524+atturaioe@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Ali Hassani <68103095+alihassanijr@users.noreply.github.com>
Co-authored-by: BFSS <31245245+bfss@users.noreply.github.com>
Co-authored-by: BeifangSusu <BeifangSusu@bfss.com>
Co-authored-by: Ian C <7807897+donelianc@users.noreply.github.com>
Co-authored-by: Ali Hassani <ahassanijr@gmail.com>
Co-authored-by: Raj Rajhans <me@rajrajhans.com>
Co-authored-by: Matthijs Hollemans <mail@hollance.com>
Co-authored-by: Jordan Clive <jordan.clive19@imperial.ac.uk>
Co-authored-by: jordiclive <jordiclive19@imperial.ac.uk>
Co-authored-by: Konstantin Dobler <konstantin.j.dobler@gmail.com>
Adds an image-guided object detection method to the OwlViTForObjectDetection class, as described in the original paper. One-shot / image-guided object detection enables users to use a query image to search for similar objects in the input image.
Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com>
* Slightly alter Keras dummy loss
* Slightly alter Keras dummy loss
* Add sample weight to test_keras_fit
* Fix test_keras_fit for datasets
* Skip the sample_weight stuff for models where the model tester has no batch_size
* allow loading projection in text and vision model
* begin tests
* finish test for CLIPTextModelTest
* style
* add slow tests
* add new classes for projection heads
* remove with_projection
* add in init
* add in doc
* fix tests
* fix some more tests
* fix copies
* fix docs
* remove leftover from fix-copies
* add the head models in IGNORE_NON_AUTO_CONFIGURED
* fix docstr
* fix tests
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add docstr for models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
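A hedged sketch of the new standalone projection-head classes: they return the projected embeddings (`text_embeds` / `image_embeds`) without requiring the full `CLIPModel`. The checkpoint name is illustrative.

```python
import torch
from transformers import AutoTokenizer, CLIPTextModelWithProjection

tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-base-patch32")
text_model = CLIPTextModelWithProjection.from_pretrained("openai/clip-vit-base-patch32")

inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
with torch.no_grad():
    text_embeds = text_model(**inputs).text_embeds  # projected embeddings, shape (2, projection_dim)
print(text_embeds.shape)
```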
* Try PT1.13 by removing torch scatter
* Skip failing tests
* Style
* Remove testing extras for repo utils
* Try with all decorators
* Try to wipe the cache
* Fix all tests?
* Try this way
* Fix comma
* Update to main
* Try with less deps
* Quality
* add `accelerate` support for `ViT` family
- add `_no_split_modules`
- manually cast to the right `dtype`: to change
* enable `float16` for `deit`
* fix `make fixup`
* add `slow` test for `fp16` inference
* another safety check
* Update src/transformers/models/deit/modeling_deit.py
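A hedged sketch of what `accelerate` support enables for the ViT family: big-model loading with `device_map="auto"` and half-precision weights. The checkpoint name is illustrative, and fp16 inference generally needs a GPU.

```python
import torch
from transformers import ViTForImageClassification

# _no_split_modules lets accelerate shard the model across available devices;
# device_map="auto" (and the fp16 cast) relies on that support.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    device_map="auto",
    torch_dtype=torch.float16,
)
```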
* update relative positional embedding
* make fix copies
* add `use_cache` to list of arguments
* fixup
* one-line function
* add `test_decoder_model_past_with_large_inputs_relative_pos_emb`
* add relative pos embedding test for more models
* style
* Fix ImageSegmentationPipelineTests
* Use 0.9
* no zip
* links to show images
* links to show images
* rebase
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* add model files etc for MobileNetV2
* rename files for MobileNetV1
* initial implementation of MobileNetV1
* fix conversion script
* cleanup
* write docs
* tweaks
* fix conversion script
* extract hidden states
* fix test cases
* make fixup
* fixup it all
* rename V1 to V2
* fix checkpoints
* fixup
* implement first block + weight conversion
* add remaining layers
* add output stride and dilation
* fixup
* add tests
* add deeplabv3+ head
* a bit of fixup
* finish deeplab conversion
* add link to doc
* fix issue with JIT trace
in_height and in_width would be Tensor objects during JIT trace, which caused Core ML conversion to fail on the remainder op. By making them ints, the result of the padding calculation becomes a constant value.
* cleanup
* fix order of models
* fix rebase error
* remove main from doc link
* add image processor
* remove old feature extractor
* fix converter + other issues
* fixup
* fix unit test
* add to onnx tests (but these appear broken now)
* add post_process_semantic_segmentation
* use google org
* remove unused imports
* move args
* replace weird assert
* Apply fix
* Fix test
* Remove another argument which is not used
* Fix pipeline test
* Add argument back, add deprecation warning
* Add warning add other location
* Use warnings instead
* Add num_channels to config
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
* Adding support for LayoutLMvX variants for `object-detection`.
* Revert bogus `layoutlm` feature extractor which does not exist (it was a V2 model).
* Updated condition.
* Handling the comments.
* move generation_*.py src files into generation/*.py
* populate generation.__init__ with lazy loading
* move imports and references from generation.xxx.object to generation.object
* Attempting to test automatically the `_keys_to_ignore`.
* Style.
* First fix pass.
* Moving test on its own.
* Another batch.
* Second round removing BatchNorm
* Fixing layoutlmv{2,3} + support older Python.
* Disable missing-keys warning.
* Removing dodgy additions.
* Big pass.
* mbart.
* More corrections.
* Fixup.
* Updating test_correct_missing_keys
* Add escape hatch for when the head has no extra params, so it doesn't need
the missing keys check.
* Fixing test.
* Greener.
* Green ! (except for weird splinter bug).
* Adding a test about `named_parameters` usage.
* Shorten message.
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* After rebase modifications.
* More explicit condition checking.
* Fixing slow tests issues.
* Remove extra pdb.
* Remove print.
* Attempt to make failure consistent + fixing roc_bert.
* Removing the seed (all tests passing with it).
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add first draft
* Update conversion script
* Improve conversion script
* Improve conversion script some more
* Add conditional embeddings
* Add initial decoder
* Fix activation function of decoder
* Make decoder outputs match original implementation
* Make decoder outputs match original implementation
* Add more copied from statements
* Improve model outputs
* Fix auto tokenizer file
* Fix more tests
* Add test
* Improve README and docs, improve conditional embeddings
* Fix more tests
* Remove print statements
* Remove initial embeddings
* Improve conversion script
* Add interpolation of position embeddings
* Finish addition of interpolation of position embeddings
* Add support for refined checkpoint
* Fix refined checkpoint
* Remove unused parameter
* Improve conversion script
* Add support for training
* Fix conversion script
* Add CLIPSegFeatureExtractor
* Fix processor
* Fix CLIPSegProcessor
* Fix conversion script
* Fix most tests
* Fix equivalence test
* Fix README
* Add model to doc tests
* Use better variable name
* Convert other checkpoint as well
* Update config, add link to paper
* Add docs
* Update organization
* Replace base_model_prefix with clip
* Fix base_model_prefix
* Fix checkpoint of config
* Fix config checkpoint
* Remove file
* Use logits for output
* Fix tests
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
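A hedged inference sketch for the model added in this series of commits; the checkpoint name, image path, and post-processing are illustrative.

```python
import torch
from PIL import Image
from transformers import CLIPSegForImageSegmentation, CLIPSegProcessor

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

image = Image.open("kitchen.jpg")
prompts = ["a cup", "a spoon"]

# One text prompt per copy of the image; the model returns one logit map per prompt.
inputs = processor(text=prompts, images=[image] * len(prompts), padding=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape (num_prompts, H, W)
masks = torch.sigmoid(logits)
```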
* Add test for SentencePiece not adding special tokens to strings
* Add SentencePieceStringConversionMixin to fix issue 15003
* Fix conversion from tokens to string for most SentencePiece tokenizers
Tokenizers fixed:
- AlbertTokenizer
- BarthezTokenizer
- CamembertTokenizer
- FNetTokenizer
- M2M100Tokenizer
- MBart50Tokenizer
- PegasusTokenizer
- Speech2TextTokenizer
* Fix MarianTokenizer, adjust SentencePiece test to accommodate vocab
* Fix DebertaV2Tokenizer
* Ignore LayoutXLMTokenizer in SentencePiece string conversion test
* Run 'make style' and 'make quality'
* Clean convert_tokens_to_string test
Instead of explicitly ignoring LayoutXLMTokenizer in the test,
override the test in LayoutLMTokenizationTest and do nothing in it.
* Remove commented out code
* Improve robustness of convert_tokens_to_string test
Instead of comparing lengths of re-tokenized text and input_ids,
check that converting all special tokens to string yields a string
with all special tokens.
* Inline and remove SentencePieceStringConversionMixin
The convert_tokens_to_string method is now implemented
in each relevant SentencePiece tokenizer.
* Run 'make style' and 'make quality'
* Revert removal of space in convert_tokens_to_string
* Remove redundant import
* Revert test text to original
* Uncomment the lowercasing of the reverse_text variable
* Mimic Rust tokenizer behavior for tokenizers
- Albert
- Barthez
- Camembert
- MBart50
- T5
* Fix accidentally skipping test in wrong tokenizer
* Add test for equivalent Rust and slow tokenizer behavior
* Override _decode in BigBirdTokenizer to mimic Rust behavior
* Override _decode in FNetTokenizer to mimic Rust behavior
* Override _decode in XLNetTokenizer to mimic Rust behavior
* Remove unused 're' import
* Update DebertaV2Tokenizer to mimic Rust tokenizer
* Deberta tokenizer now behaves like Albert and its `convert_tokens_to_string` is not tested.
* Ignore problematic tests in Deberta V2
* Add comment on why the Deberta V2 tests are skipped
* initial commit
* First draft that gets outputs without crashing!
* Add all the ported openfold dependencies
* testing
* Restructure config files for ESMFold
* Debugging to find output discrepancies
* Mainly style
* Make model runnable without extra deps
* Remove utils and merge them to the modeling file
* Use correct gelu and remove some debug prints
* More cleanup
* Update esm docs
* Update conversion script to support ESMFold properly
* Port some top-level changes from ESMFold repo
* Expand EsmFold docstrings
* Make attention_mask optional (default to all 1s)
* Add inference test for ESMFold
* Use config and not n kwargs
* Add modeling output class
* Remove einops
* Remove chunking in ESM FFN
* Update tests for ESMFold
* Quality
* Repo consistency
* Remove tree dependency from ESMFold
* make fixup
* Add an error in case my structure map function breaks later
* Remove needless code
* Stop auto-casting the LM to float16 so CPU tests pass
* Stop auto-casting the LM to float16 so CPU tests pass
* Final test updates
* Split test file
* Copyright and quality
* Unpin PyTorch to see built doc
* Fix config file to_dict() method
* Add some docstrings to the output
* Skip TF checkpoint tests for ESM until we reupload those
* make fixup
* More docstrings
* Unpin to get even with main
* Flag example to write
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
* Support segformer fx
* Add fx_compatible attribute to test_modeling_segformer.py
* Update glpn model (fx support)
glpn model was copied from segformer.
* Update utils/fx.py | add semantic-segmentation
for SegformerForSemanticSegmentation model
* Fix minor import order(isort)
* Add random input generation for segformer fx
Co-authored-by: noelbird <lduldu00228@gmail.com>
* Wip
* Add safetensors support for TensorFlow
* First tests
* Add final test for now
* Retrigger CI like this
* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>