* No more Tuple, List, Dict
* make fixup
* More style fixes
* Docstring fixes with regex replacement
* Trigger tests
* Redo fixes after rebase
* Fix copies
* [test all]
* update
* [test all]
* update
* [test all]
* make style after rebase
* Patch the hf_argparser test
* Patch the hf_argparser test
* style fixes
* style fixes
* style fixes
* Fix docstrings in Cohere test
* [test all]
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
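A minimal sketch of what "No more Tuple, List, Dict" amounts to, assuming the entry refers to replacing the `typing` aliases with the built-in generics available since Python 3.9 (PEP 585). The helper `count_lengths` is hypothetical and is not taken from this change.

```python
# Hypothetical helper, for illustration only (not from the repository).
# Before: from typing import Dict, List, Tuple
#         def count_lengths(batch: List[str]) -> Tuple[Dict[str, int], int]: ...
# After: built-in generics, no typing import needed (Python 3.9+).


def count_lengths(batch: list[str]) -> tuple[dict[str, int], int]:
    """Return a mapping from each string to its length, plus the summed length."""
    lengths = {text: len(text) for text in batch}
    total = sum(len(text) for text in batch)
    return lengths, total
```

The annotations behave the same for type checkers; only the spelling changes.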
* added the configuration for sam_hq
* added the modeling for sam_hq
* added the sam hq mask decoder with hq features
* added the code for the samhq
* added the code for the samhq
* added the code for the samhq
* Delete src/transformers/models/sam_hq/modelling_sam_hq.py
* added the code for the samhq
* added the code for the samhq
* added the changes for the modeling
* added the code for sam hq for image processing
* added code for the sam hq model
* added the required changes
* added the changes
* added the key mappings for the sam hq
* adding the working code of samhq
* added the required files
* adding the pt object
* added the push to hub account
* added the args for the sam mask decoder
* added the args for the sam hq vision config
* added some more documentation
* removed the unnecessary spaces
* all required changes
* removed the image processor
* added the required file
* added the changes for the checkcopies
* added the code for modular file
* added the changes for the __init__ file
* added the code for the interm embeds
* added the code for sam hq
* added the changes for modular file
* added the test file
* added the changes required
* added the changes required
* added the code for the
* added the cl errors
* added the changes
* added the required changes
* added some code
* added the code for the removing image processor
* added the test dimensions
* added the code for the removing extra used variables
* added the code for modular file hf_mlp for a better name
* removed abbreviation in core functionality
* removed abbreviation in core functionality
* .contiguous() method is often used to ensure that the tensor is stored in a contiguous block of memory
* added the code which is after make fixup
* added some test for the intermediate embeddings test
* added the code for the torch support in sam hq
* added the code for the updated modular file
* added the changes for documentations as mentioned
* removed the heading
* add the changes for the code
* first mentioned issue resolved
* added the changes code to processor
* added the easy loading to init file
* added the changes to code
* added the code to changes
* added the code to work
* added the code for sam hq
* added the code for sam hq
* added the code for the point pad value
* added the small test for the image embeddings and intermediate embedding
* added the code
* added the code
* added the code for the tests
* added the code
* added the code for the processor file
* added the code
* added the code
* added the code
* added the code
* added the code
* added the code for tests and some checks
* added some code
* added the code
* added the code
* added some code
* added some code
* added the changes for required
* added the code
* added the code
* added the code
* added the code
* added the code
* added the code
* added the code
* added the code
* added the code
* added the code
* added some changes
* added some changes
* removed spaces and quality checks
* added some code
* added some code
* added some code
* added code quality checks
* added the checks for quality checks
* added some code which fixes test_inference_mask_generation_no_point
* added code for the test_inference_mask_generation_one_point_one_bb
* added code for the test_inference_mask_generation_one_point_one_bb_zero
* added code for the test_inference_mask_generation_one_box
* added some code in modelling for testing
* added some code which sorts masks with high score
* added some code
* added some code
* added some code for the move KEYS_TO_MODIFY_MAPPING
* added some code for the unsqueeze removal
* added some code for the unsqueeze removal
* added some code
* added some code
* add some code
* added some code
* added some code
* added some testing values changed
* added changes to code in sam hq for readability purpose
* added pre commit checks
* added the fix samvisionmodel for compatibility
* added the changes made on sam by cyyever
* fixed the tests for samhq
* added some code
* added some code related to init file issue during merge conflicts
* removed the merge conflicts
* added changes mentioned by arthur and molbap
* added changes mentioned by arthur and molbap
* solving quality checks
* added the changes for input clearly
* added the changes
* added changes in mask generation file regarding model inputs and sam hq kwargs in processor file
* added changes in processor file
* added the setUp -> setUpClass conversion
* added the code mentioned for processor
* added changes for the code
* added some code
* added some code
* added some code
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Add MLCD model
* Update codes for auto-mapping
* Add test scripts for MLCD
* Update doc for MLCD model
* Fix import error
* Fix import error
* Fix CI error for attention_outputs
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix CI error for initialization
* Fix code style for CI
* Fix code style for CI
* Reformat codes and docs for CI test
* Reformat codes and docs for CI test
* Remove unused attributes for CI test
* Fix style for CI test
* List MLCD in flash_attn doc
* Fix: typos, modulars, refactors from suggestions
* Refactoring convert_mlcd_weights_to_hf.py from suggestions
* Fix: docs conflicts
* Fix error for CI test
* Fix style for CI test
* Add integration test for MLCD
* Refactoring by class inheritance
* Fix: refactor attention interface, adjust codes
* Fix: merging conflicts
* Fix: merging conflicts
* Fix: style for CI test
* Fix: style for CI test
* Fix: set test_resize_embeddings to be False
* Fix: initializer for CI test
* Fix: conflicts, CI test, warning and refactoring
* Fix: merging conflicts
* Refactor
* Update docs
* Fix mistakes
* Remove unused args and fix multi-gpu error
* Revert position_embeddings
* Solve conflicts
* Solve conflicts
* Remove dummy
* Update _init_weights
* Update _init_weights
* Update _init_weights for CI test
* update for fixes
* more fixes
* fix dynamic cache?
* style
* fix both training and generating. Eager seems alright
* dynamic does not work
* fix most cases, use_cache or not, eager or not, no default cache (ex: not training but you want to get cache states)
* should be final fixes
* fix more stuff no cat
* style
* fix
* style
* final style
* quality
* fix
* revert
* add init and base image processing functions
* add add_fast_image_processor to transformers-cli
* add working fast image processor clip
* add fast image processor to doc, working tests
* remove "to be implemented" SigLip
* fix unprotected import
* fix unprotected vision import
* update ViTImageProcessorFast
* increase threshold slow fast equivalence
* add fast img blip
* add fast class in tests with cli
* improve cli
* add fast image processor convnext
* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision
* add device kwarg to ImagesKwargs for fast processing on cuda
* cleanup
* fix unprotected import
* group images by sizes and add batch processing
* Add batch equivalence tests, skip when center_crop is used
* cleanup
* update init and cli
* fix-copies
* refactor convnext, cleanup base
* fix
* remove patching mixins, add piped torchvision transforms for ViT
* fix unbatched processing
* fix f strings
* protect imports
* change llava onevision to class transforms (test)
* fix convnext
* improve formatting (following Pavel review)
* fix handling device arg
* improve cli
* fix
* fix inits
* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs
* uniformize qwen2_vl fast
* fix docstrings
* add fast image processor llava
* remove min_pixels max_pixels from accepted size
* nit
* nit
* refactor fast image processors docstrings
* cleanup and remove fast class transforms
* update add fast image processor transformers cli
* cleanup docstring
* uniformize pixtral fast and make _process_image explicit
* fix prepare image structure llava next/onevision
* Use typed kwargs instead of explicit args
* nit fix import Unpack
* clearly separate pops and gets in base preprocess. Use explicit typed kwargs
* make qwen2_vl preprocess arguments hashable
* Standardize image-text-to-text-models-output
add post_process_image_text_to_text to chameleon and cleanup
Fix legacy kwarg behavior and deprecation warning
add post_process_image_text_to_text to qwen2_vl and llava_onevision
Add post_process_image_text_to_text to idefics3, mllama, pixtral processor
* nit var name post_process_image_text_to_text udop
* nit fix deprecation warnings
* Add image-text-to-text pipeline
* add support for image url in chat template for pipeline
* Reformat to be fully compatible with chat templates
* Add tests chat template
* Fix imports and tests
* Add pipeline tag
* change logic handling of single prompt and multiple images
* add pipeline mapping to models
* fix batched inference
* fix tests
* Add manual batching for preprocessing
* Fix outputs with nested images
* Add support for all common processing kwargs
* Add default padding when multiple text inputs (batch size>1)
* nit change version deprecation warning
* Add support for text only inference
* add chat_template warnings
* Add pipeline tests and add copied from post process function
* Fix batched pipeline tests
* nit
* Fix pipeline tests blip2
* remove unnecessary max_new_tokens
* revert processing kosmos2 and remove unnecessary max_new_tokens
* fix pipeline tests idefics
* Force try loading processor if pipeline supports it
* revert load_processor change
* hardcode loading only processor
* remove unnecessary try except
* skip imagetexttotext tests for kosmos2 as tiny model causes problems
* Make code clearer
* Address review comments
* remove preprocessing logic from pipeline
* fix fuyu
* add BC resize fuyu
* Move post_process_image_text_to_text to ProcessorMixin
* add guard in post_process
* fix zero shot object detection pipeline
* add support for generator input in pipeline
* nit
* change default image-text-to-text model to llava onevision
* fix owlv2 size dict
* Change legacy deprecation warning to only show when True
* more precise name
* better docstrings
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
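The "group images by sizes and add batch processing" entry in the fast image processor commits above can be sketched roughly as follows. This is an illustrative sketch only: it assumes each image is described by a `(height, width)` tuple, and the names `group_indices_by_size` and `process_in_groups` are hypothetical, not the library's API.

```python
from collections import defaultdict


def group_indices_by_size(sizes):
    """Group positions of same-sized images so each group can be handled as one batch."""
    groups = defaultdict(list)
    for index, size in enumerate(sizes):
        groups[size].append(index)
    return dict(groups)


def process_in_groups(items, sizes, batch_fn):
    """Apply batch_fn to each same-size group, then restore the original order."""
    results = [None] * len(items)
    for indices in group_indices_by_size(sizes).values():
        outputs = batch_fn([items[i] for i in indices])
        for i, output in zip(indices, outputs):
            results[i] = output
    return results
```

In a real processor `batch_fn` would presumably stack same-shaped tensors and transform them in one call; here it is any function mapping a list to a list of equal length.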
* tmp
* skip files not in the diff
* use git.Repo instead of an external subprocess
* add tiny change to confirm that the diff is working on pushed changes
* add make quality task
* more profesh main commit reference
* Remove ConversationalPipeline and Conversation object, as they have been deprecated for some time and are due for removal
* Update not-doctested.txt
* Fix JA and ZH docs
* Fix JA and ZH docs some more
* Fix JA and ZH docs some more
* Duplicate swiftformer
* Convert SwiftFormerPatchEmbedding
* Convert SwiftFormerEmbeddings
* Convert TFSwiftFormerMlp
* Convert TFSwiftFormerConvEncoder
* Convert TFSwiftFormerLocalRepresentation
* convert TFSwiftFormerEncoderBlock
* Convert SwiftFormerStage
* Convert SwiftFormerEncoder
* Add TFSwiftFormerPreTrainedModel
* Convert SwiftFormerForImageClassification
* Add kwargs and start drop path
* Fix syntax
* Change Model class name
* Add TFSwiftFormer to __init__
* Duplicate test_modeling_swiftformer
* First test conversions
* Change require_torch to require_tf
* Add exports to swiftformer __init__
* Add TFSwiftFormerModel wrapper
* Fix __init__ and run black
* Remove docstring from MainLayer, fix padding
* Use keras.layers.Activation on keras.Sequential
* Fix swiftformer exports
* Fix activation layer from config
* Remove post_inits
* Use tf.keras.layers.ZeroPadding2D
* Convert torch normalize
* Change tf test input shape
* Fix softmax and reduce_sum
* Convert expand_dims and repeat
* Add missing reshape and transpose
* Simplify TFSwiftFormerEncoderBlock.call
* Fix mismatch in patch embeddings
* Fix expected output shape to match channels last
* Fix swiftformer typo
* Disable test_onnx
* Fix TFSwiftFormerForImageClassification call
* Add unpack inputs
* Convert flatten(2).mean(-1)
* Change vision dummy inputs (to be reviewed)
* Change test_forward_signature to use .call
* Fix @unpack_inputs
* Set return_tensors="tf" and rename class
* Rename wrongly named patch_embeddings layer
* Add serving_output and change dummy_input shape
* Make dimensions BCHW and transpose inside embedding layer
* Change SwiftFormerEncoderBlock
* Fix ruff problems
* Add image size to swiftformer config
* Change transpose to MainLayer and use -1 for reshape
* Remove serving_outputs and dummy_inputs
* Remove test_initialization test from tf model
* Make Sequential component a separate layer
* Fix layers' names
* Transpose encoder outputs
* Fix tests and check if hidden states is not None
* Fix TFSwiftFormerForImageClassification
* Run make fixup
* Run make fix-copies
* Update modeling_tf_auto
* Update docs
* Fix modeling auto mapping
* Update modeling_tf_swiftformer docs
* Fill image_size doc and type
* Add reduction=None to loss computation
* Update docs
* make style
* Debug: Delete the tip to see if that changes anything
* Re-add tip
* Remove add_code_sample_docstrings
* Remove unused import
* Get the debug to actually tell us the problem it has with the docs
* Try a substitution to match the PyTorch file?
* Add swiftformer to ignore list
* Add build() methods
* Update copyright year
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Remove FIXME comment
* Remove from_pt
* Update copyright year
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Rename one-letter variables
* Remove FIXMEs related to momentum
* Remove old TODO comment
* Remove outstanding FIXME comments
* Get dropout rate from config
* Add specific dropout config for MLP
* Add convencoder dropout to config
* Pass config to SwiftFormerDropPath layer
* Fix drop_path variable name and add Adapted from comment
* Run ruff
* Removed copied from comment
* Run fix copies
* Change drop_path to identity to match pt
* Cleanup build() methods and move to new keras imports
* Update docs/source/en/model_doc/swiftformer.md
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Raise error if drop_path_rate > 0.0
* Apply suggestions from code review
Replace (self.dim), with self.dim,
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Remove drop_path function
* Add training to TFSwiftFormerEncoder
* Set self.built = True last
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Should have been added to previous commit
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Change default_feature_extractor to default_image_processor
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Import Keras from modeling_tf_utils
* Remove relative import
* Run ruff --fix
* Move import keras to tf_available
* Add copied from comment to test_forward_signature
* Reduce batch size and num_labels
* Extract loss logic to hf_compute_loss
* Run ruff format
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* first commit
* correct default value non causal
* update config and modeling code
* update converting checkpoint
* clean modeling and fix tests
* make style
* add new config parameters to docstring
* fix copied from statements
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* make position_embeddings_type docstrings clearer
* clean converting script
* remove function not used
* clean modeling file
* apply suggestion for test file + add convert script to not_doctested
* modify tests according to review - cleaner logic and more tests
* Apply nit suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add checker of valid position embeddings type
* instantiate new layer norm layer with the right eps
* fix freeze_feature_encoder since it can be None in some cases
* add test same output in convert script
* restore wav2vec2conformer and add new model
* create processor and FE + clean
* add new model code
* fix convert script and set default config parameters
* correct model id paths
* make style
* make fix-copies and cleaning files
* fix copied from statements
* complete .md and fix copies
* clean convert script argument defaults
* fix config parameters docstrings
* fix config docstring
* add copied from and enrich FE tests
* fix copied from and repo-consistency
* add autotokenizer
* make test input length shorter and change docstring code
* fix docstrings and copied from
* add add_adapter to ASR training example
* make testing of adapters more robust
* adapt to multi adapter layers
* refactor input_values->input_features and remove w2v2-bert feature extractor
* remove pretraining model
* remove deprecated features and useless lines
* add copied from and ignore statements to modeling tests
* remove pretraining model #2
* change import in convert script
* change default in convert script
* update readme and remove useless line
* Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* refactor BERT to Bert for consistency
* remove useless ignore copy statement
* add persistent to buffer in rotary
* add eps in LayerNorm init and remove copied from
* add adapter activation parameters and add copied from statements
* Fix copied statements and add unittest.skip reasons
* add copied statement in test_processor
* refactor processor
* make style
* replace numpy random by torch rand
* remove expected output CTC
* improve converting script with processor class
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove gumbel class
* remove tests related to previously deleted class
* Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* correct typos
* remove unused parameters
* update processor to take both text and audio
* update checkpoints
* update expected output and add ctc expected output
* add label_attention_mask
* replace pt with np in processor tests
* fix typo
* revert to behaviour with labels_attention_mask
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix TF Regnet docstring
* Fix TF Regnet docstring
* Make a change to the PyTorch Regnet too to make sure the CI is checking it
* Add skips for TFRegnet
* Update error message for docstring checker
* Remove ErnieConfig, ErnieMConfig check_docstrings
* Run fix_and_overwrite for ErnieConfig, ErnieMConfig
* Replace <fill_type> and <fill_docstring> in configuration_ernie, configuration_ernie_m.py with type and docstring values
---------
Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>
* Copies `modeling_flax_gpt_neo.py` to start
* MLP Block. WIP Attention and Block
* Adds Flax implementation of `LlamaMLP`
Validated with in-file test.
Some slight numeric differences, but assuming it isn't an issue
* Adds `FlaxLlamaRMSNorm` layer
`flax.linen` includes `RMSNorm` layer but not necessarily in all
versions. Hence, we add in-file.
* Adds FlaxLlamaAttention
Copied from GPT-J as it has efficient caching implementation as well as
rotary embeddings.
Notice numerically different, but not by a huge amount. Needs
investigating
* Adds `FlaxLlamaDecoderLayer`
numerically inaccurate, debugging..
* debugging rotary mismatch
gptj uses interleaved whilst llama uses contiguous
i think they match now but still final result is wrong.
maybe drop back to just debugging attention layer?
* fixes bug with decoder layer
still somewhat numerically inaccurate, but close enough for now
* adds markers for what to implement next
the structure here diverges a lot from the PT version.
not a big fan of it, but just get something working for now
* implements `FlaxLlamaBlockCollection`
tolerance must be higher than expected, kinda disconcerting
* Adds `FlaxLlamaModule`
equivalent PyTorch model is `LlamaModel`
yay! a language model🤗
* adds `FlaxLlamaForCausalLMModule`
equivalent to `LlamaForCausalLM`
still missing returning dict or tuple, will add later
* start porting pretrained wrappers
realised it probably needs return dict as a prereq
* cleanup, quality, style
* readds `return_dict` and model output named tuples
* (tentatively) pretrained wrappers work 🔥
* fixes numerical mismatch in `FlaxLlamaRMSNorm`
seems `jax.lax.rsqrt` does not match `torch.sqrt`.
manually computing `1 / jax.numpy.sqrt` results in matching values.
* [WIP] debugging numerics
* numerical match
I think issue was accidental change of backend. forcing CPU fixes test.
We expect some mismatch on GPU.
* adds in model and integration tests for Flax Llama
summary of failing:
- mul invalid combination of dimensions
- one numerical mismatch
- bf16 conversion (maybe my local backend issue)
- params are not FrozenDict
* adds missing TYPE_CHECKING import and `make fixup`
* adds back missing docstrings
needs review on quality of docstrings, not sure what is required.
Furthermore, need to check if `CHECKPOINT_FOR_DOC` is valid. See TODO
* commenting out equivalence test as can just use common
* debugging
* Fixes bug where mask and pos_ids were swapped in pretrained models
This results in all tests passing now 🔥
* cleanup of modeling file
* cleanup of test file
* Resolving simpler review comments
* addresses more minor review comments
* fixing introduced pytest errors from review
* wip additional slow tests
* wip tests
need to grab a GPU machine to get real logits for comparison
otherwise, slow tests should be okay
* `make quality`, `make style`
* adds slow integration tests
- checking logits
- checking hidden states
- checking generation outputs
* `make fix-copies`
* fix mangled function following `make fix-copies`
* adds missing type checking imports
* fixes missing parameter checkpoint warning
* more finegrained 'Copied from' tags
avoids issue of overwriting `LLAMA_INPUTS_DOCSTRING`
* swaps import guards
??? how did these get swapped initially?
* removing `inv_freq` again as pytorch version has now removed
* attempting to get CI to pass
* adds doc entries for llama flax models
* fixes typo in __init__.py imports
* adds back special equivalence tests
these come from the gpt neo flax tests. there is special behaviour for these models that needs to override the common version
* overrides tests with dummy to see if CI passes
need to fill in these tests later
* adds my contribution to docs
* `make style; make quality`
* replaces random masking with fixed to work with flax version
* `make quality; make style`
* Update src/transformers/models/llama/modeling_flax_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_flax_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_flax_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_flax_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_flax_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_flax_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* updates `x`->`tensor` in `rotate_half`
* addresses smaller review comments
* Update docs/source/en/model_doc/llama.md
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* adds integration test class
* adds `dtype` to rotary embedding to cast outputs
* adds type to flax llama rotary layer
* `make style`
* `make fix-copies`
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* applies suggestions from review
* Update modeling_flax_llama.py
* `make fix-copies`
* Update tests/models/llama/test_modeling_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_flax_llama.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fixes shape mismatch in FlaxLlamaMLP
* applies some suggestions from reviews
* casts attn output logits to f32 regardless of dtype
* adds attn bias using `LlamaConfig.attention_bias`
* adds Copied From comments to Flax Llama test
* mistral and persimmon test change -copy from llama
* updates docs index
* removes Copied from in tests
it was preventing `make fix-copies` from succeeding
* quality and style
* ignores FlaxLlama input docstring
* adds revision to `_CHECKPOINT_FOR_DOC`
* repo consistency and quality
* removes unused import
* removes copied from from Phi test
now diverges from llama tests following FlaxLlama changes
* adds `_REAL_CHECKPOINT_FOR_DOC`
* removes refs from pr tests
* reformat to make ruff happy
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
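The "debugging rotary mismatch" entry above notes that GPT-J uses an interleaved rotary layout whilst Llama uses a contiguous one. A plain-Python sketch of the two rotation helpers, written for illustration on flat lists rather than arrays and not copied from the modeling files (the function names are assumptions):

```python
def rotate_half_contiguous(x):
    """Llama-style layout: split into halves (x1, x2) and return (-x2, x1)."""
    half = len(x) // 2
    return [-value for value in x[half:]] + list(x[:half])


def rotate_every_two_interleaved(x):
    """GPT-J-style layout: map each adjacent pair (a, b) to (-b, a)."""
    rotated = []
    for i in range(0, len(x), 2):
        rotated.extend([-x[i + 1], x[i]])
    return rotated


features = [1, 2, 3, 4]
print(rotate_half_contiguous(features))        # [-3, -4, 1, 2]
print(rotate_every_two_interleaved(features))  # [-2, 1, -4, 3]
```

The two helpers pair up different elements, so weights trained with one layout give different attention outputs when run through the other, which is consistent with the numerical mismatch described in the log.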
* add working conversion script
* first non-working version of modeling code
* update modeling code (working)
* make style
* make fix-copies
* add config docstrings
* add config to ignore docstrings formatting due to unconventional markdown
* fix copies
* fix generation num_return_sequences
* enrich docs
* add and fix tests beside integration tests
* update integration tests
* update repo id
* add tie weights and make style
* correct naming in .md
* fix imports and so on
* correct docstrings
* fix fp16 speech forward
* fix speechencoder attention
* make style
* fix copied from
* rename SeamlessM4Tv2-v2 to SeamlessM4Tv2
* Apply suggestions on configuration
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove useless public models
* fix private models + better naming for T2U models
* clean speech encoder relative position embeddings
* refactor chunk attention
* add docstrings to chunk attention method
* improve naming and docstrings
* rename some attention variables + add temperature sampling in T2U model
* rename DOCSTRINGS variable names
* make style + remove 2 useless config parameters
* enrich model card
* remove any attention_head reference + fix temperature in T2U
* new fmt and make style
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* rename spkr_id->speaker_id and change docstrings of get_char_input_ids
* simplify v2attention
* make style
* Update seamless_m4t_v2.md
* update code and tests with last update
* update repo ids
* fill article name, abstract and authors
* update not_doctested and slow_doc tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add type annotations to TFConvNextDropPath
* Use tf.debugging.assert_equal for TFConvNextEmbeddings shape check
* Add TensorFlow implementation of ConvNeXTV2
* check_docstrings: add TFConvNextV2Model to exclusions
TFConvNextV2Model and TFConvNextV2ForImageClassification have docstrings
which are equivalent to their PyTorch cousins, but a parsing issue prevents them
from passing the test.
Adding exclusions for these two classes as discussed in #25558.
* [docstring] Fix docstring for AltCLIPVisionConfig, AltCLIPTextConfig + cleaned some docstring
* Removed entries from check_docstring.py
* Removed entries from check_docstring.py
* Removed entry from check_docstring.py
* [docstring] Fix docstring for AltCLIPTextConfig, AltCLIPVisionConfig and AltCLIPConfig
* add `MaskGenerationPipeline` in docs
* Update __init__.py
* fix repo consistency and clarify docstring
* add on check docstrings
* actually we do have a tf sam
* oops
* first raw commit
* still POC
* tentative convert script
* almost working speech encoder conversion scripts
* intermediate code for encoder/decoders
* add modeling code
* first version of speech encoder
* make style
* add new adapter layer architecture
* add adapter block
* add first tentative config
* add working speech encoder conversion
* base model convert works now
* make style
* remove unnecessary classes
* remove unnecessary functions
* add modeling code speech encoder
* rework logics
* forward pass of sub components works
* add modeling codes
* some config modifs and modeling code modifs
* save WIP
* new edits
* same output speech encoder
* correct attention mask
* correct attention mask
* fix generation
* new generation logics
* erase comments
* make style
* fix typo
* add some descriptions
* new state
* clean imports
* add tests
* make style
* make beam search and num_return_sequences>1 work
* correct edge case issue
* correct SeamlessM4TConformerSamePadLayer copied from
* replace ACT2FN relu by nn.relu
* remove unnecessary return variable
* move back a class
* change name conformer_attention_mask ->conv_attention_mask
* better nit code
* add some Copied from statements
* small nits
* small nit in dict.get
* rename t2u model -> conditionalgeneration
* ongoing refactoring of structure
* update models architecture
* remove SeamlessM4TMultiModal classes
* add tests
* adapt tests
* some non-working code for vocoder
* add seamlessM4T vocoder
* remove buggy line
* fix some hifigan related bugs
* remove hifigan specific config
* change
* add WIP tokenization
* add seamlessM4T working tokenizer
* update tokenization
* add tentative feature extractor
* Update converting script
* update working FE
* refactor input_values -> input_features
* update FE
* changes in generation, tokenizer and modeling
* make style and add t2u_decoder_input_ids
* add intermediate outputs for ToSpeech models
* add vocoder to speech models
* update valueerror
* update FE with languages
* add vocoder convert
* update config docstrings and names
* update generation code and configuration
* remove todos and update config.pad_token_id to generation_config.pad_token_id
* move block vocoder
* remove unnecessary code and uniformize tospeech code
* add feature extractor import
* make style and fix some copies from
* correct consistency + make fix-copies
* add processor code
* remove comments
* add fast tokenizer support
* correct pad_token_id in M4TModel
* correct config
* update tests and codes + make style
* make some suggested corrections - correct comments and change naming
* rename some attributes
* rename some attributes
* remove unnecessary sequential
* remove option to use dur predictor
* nit
* refactor hifigan
* replace normalize_mean and normalize_var with do_normalize + save lang ids to generation config
* add tests
* change tgt_lang logic
* update generation ToSpeech
* add support import SeamlessM4TProcessor
* fix generate
* make tests
* update integration tests, add option to only return text and update tokenizer fast
* fix wrong function call
* update import and convert script
* update integration tests + update repo id
* correct paths and add first test
* update how new attention masks are computed
* update tests
* take first care of batching in vocoder code
* add batching with the vocoder
* add waveform lengths to model outputs
* make style
* add generate kwargs + forward kwargs of M4TModel
* add docstrings forward methods
* reformat docstrings
* add docstrings t2u model
* add another round of modeling docstrings + reformat speaker_id -> spkr_id
* make style
* fix check_repo
* make style
* add seamlessm4t to toctree
* correct check_config_attributes
* write config docstrings + some modifs
* make style
* add docstrings tokenizer
* add docstrings to processor, fe and tokenizers
* make style
* write first version of model docs
* fix FE + correct FE test
* fix tokenizer + add correct integration tests
* fix most tokenization tests
* make style
* correct most processor test
* add generation tests and fix num_return_sequences > 1
* correct integration tests - still one left
* make style
* correct position embedding
* change numbeams to 1
* refactor some modeling code and correct one test
* make style
* correct typo
* refactor intermediate fnn
* refactor feedforward conformer
* make style
* remove comments
* make style
* fix tokenizer tests
* make style
* correct processor tests
* make style
* correct S2TT integration
* Apply suggestions from Sanchit code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* correct typo
* replace torch.nn->nn + make style
* change Output naming (waveforms -> waveform) and ordering
* nit renaming and formatting
* remove return None when not necessary
* refactor SeamlessM4TConformerFeedForward
* nit typo
* remove almost copied from comments
* add a copied from comment and remove an unnecessary dropout
* remove inputs_embeds from speechencoder
* remove backward compatibility function
* reformat class docstrings for a few components
* remove unnecessary methods
* split over 2 lines something hard to read
* make style
* replace two steps offset by one step as suggested
* nice typo
* move warnings
* remove useless lines from processor
* make generation non-standard test more robust
* remove torch.inference_mode from tests
* split integration tests
* enrich md
* rename control_symbol_vocoder_offset->vocoder_offset
* clean convert file
* remove tgt_lang and src_lang from FE
* change generate docstring of ToText models
* update generate docstring of tospeech models
* unify how to deal with text_decoder_input_ids
* add default spkr_id
* unify tgt_lang for t2u_model
* simplify tgt_lang verification
* remove a todo
* change config docstring
* make style
* simplify t2u_tgt_lang_id
* make style
* enrich/correct comments
* enrich .md
* correct typo in docstrings
* add torchaudio dependency
* update tokenizer
* make style and fix copies
* modify SeamlessM4TConverter with new tokenizer behaviour
* make style
* correct small typo docs
* fix import
* update docs and add requirement to tests
* add convert_fairseq2_to_hf in utils/not_doctested.txt
* update FE
* fix imports and make style
* remove torchaudio in FE test
* add seamless_m4t.md to utils/not_doctested.txt
* nits and change the way docstring dataset is loaded
* move checkpoints from ylacombe/ to facebook/ orga
* refactor warning/error to be in the 119 line width limit
* round overly precise floats
* add stereo audio behaviour
* refactor .md and make style
* enrich docs with more precise architecture description
* readd undocumented models
* make fix-copies
* apply some suggestions
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* correct bug from previous commit
* refactor a parameter allowing to clean the code + some small nits
* clean tokenizer
* make style and fix
* make style
* clean tokenizers arguments
* add precisions for some tests
* move docs from not_tested to slow
* modify tokenizer according to last comments
* add copied from statements in tests
* correct convert script
* correct parameter docstring style
* correct tokenization
* correct multi gpus
* make style
* clean modeling code
* make style
* add copied from statements
* add copied statements
* add support with ASR pipeline
* remove file added inadvertently
* fix docstrings seamlessM4TModel
* add seamlessM4TConfig to OBJECTS_TO_IGNORE due to unconventional markdown
* add seamlessm4t to assisted generation ignored models
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove docstrings CodeGen from objects_to_ignore
* autofix codegen docstrings
* fill in the missing types and docstrings
* fixup
* change descriptions to be in a separate line
* apply docstring suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* update n_ctx description in CodeGenConfig
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Remove ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig from check_docstrings
* Run fix_and_overwrite for ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig
* Replace <fill_type> and <fill_docstring> in configuration_chinese_clip.py, image_processing_chinese_clip.py with type and docstring values
---------
Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>
* Remove BertGenerationTokenizer from objects to ignore
The file BertGenerationTokenizer is removed from
objects to ignore as a first step to fix docstring.
* Docstrings fix for BertGenerationTokenizer
Docstring fix is generated for BertGenerationTokenizer
by using check_docstrings.py.
* Fix docstring for BertGenerationTokenizer
Added sep_token type and docstring in BertGenerationTokenizer.
* Remove CanineConfig from check_docstrings
* Run fix_and_overwrite for CanineConfig
* Replace <fill_type> and <fill_docstring> in configuration_canine.py with type and docstring values
---------
Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>