transformers/utils
Yoach Lacombe d2cdefb9ec
Add new meta w2v2-conformer BERT-like model (#28165)
* first commit

* correct default value non causal

* update config and modeling code

* update converting checkpoint

* clean modeling and fix tests

* make style

* add new config parameters to docstring

* fix copied from statements

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* make position_embeddings_type docstrings clearer

* clean converting script

* remove function not used

* clean modeling file

* apply suggestion for test file + add convert script to not_doctested

* modify tests according to review - cleaner logic and more tests

* Apply nit suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add checker of valid position embeddings type

* instantiate new layer norm layer with the right eps

* fix freeze_feature_encoder since it can be None in some cases

* add test same output in convert script

* restore wav2vec2conformer and add new model

* create processor and FE + clean

* add new model code

* fix convert script and set default config parameters

* correct model id paths

* make style

* make fix-copies and cleaning files

* fix copied from statements

* complete .md and fix copies

* clean convert script argument defaults

* fix config parameters docstrings

* fix config docstring

* add copied from and enrich FE tests

* fix copied from and repo-consistency

* add autotokenizer

* make test input length shorter and change docstring code

* fix docstrings and copied from

* add add_adapter to ASR training example

* make testing of adapters more robust

* adapt to multi adapter layers

* refactor input_values->input_features and remove w2v2-bert feature extractor

* remove pretraining model

* remove deprecated features and useless lines

* add copied from and ignore statements to modeling tests

* remove pretraining model #2

* change import in convert script

* change default in convert script

* update readme and remove useless line

* Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* refactor BERT to Bert for consistency

* remove useless ignore copy statement

* add persistent to buffer in rotary

* add eps in LayerNorm init and remove copied from

* add adapter activation parameters and add copied from statements

* Fix copied statements and add unittest.skip reasons

* add copied statement in test_processor

* refactor processor

* make style

* replace numpy random by torch rand

* remove expected output CTC

* improve converting script with processor class

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove gumbel class

* remove tests related to previously deleted class

* Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* correct typos

* remove unused parameters

* update processor to take both text and audio

* update checkpoints

* update expected output and add ctc expected output

* add label_attention_mask

* replace pt with np in processor tests

* fix typo

* revert to behaviour with labels_attention_mask

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-18 13:37:34 +00:00
test_module AutoImageProcessor (#20111) 2022-11-08 19:54:41 +00:00
tf_ops Check TF ops for ONNX compliance (#10025) 2021-02-15 07:55:10 -05:00
add_pipeline_model_mapping_to_test.py A script to add/update pipeline_model_mapping systematically (#22180) 2023-04-06 18:08:14 +02:00
check_build.py Clean up CUDA kernels (#23455) 2023-05-18 14:14:43 -04:00
check_config_attributes.py Add FastSpeech2Conformer (#23439) 2024-01-03 18:01:06 +00:00
check_config_docstrings.py [Check] Fix config docstring (#26222) 2023-09-18 19:58:01 +02:00
check_copies.py Add SigLIP (#26522) 2024-01-08 18:17:16 +01:00
check_doc_toc.py Doc checks (#25408) 2023-08-10 10:53:22 +02:00
check_docstrings.py Add new meta w2v2-conformer BERT-like model (#28165) 2024-01-18 13:37:34 +00:00
check_doctest_list.py Avoid many failing tests in doctesting (#27262) 2023-11-03 12:47:07 +01:00
check_dummies.py Doc checks (#25408) 2023-08-10 10:53:22 +02:00
check_inits.py Make using safetensors files automated. (#27571) 2023-12-01 15:51:10 +01:00
check_model_tester.py Add a new script to check model testers' config (#22063) 2023-03-13 19:11:19 +01:00
check_repo.py improve dev setup comments and hints (#28495) 2024-01-15 18:36:40 +00:00
check_self_hosted_runner.py Tiny fix for check_self_hosted_runner.py (#24052) 2023-06-06 18:17:41 +02:00
check_support_list.py Fix the check of models supporting FA/SDPA not run (#28202) 2023-12-22 12:56:11 +01:00
check_table.py Add SigLIP (#26522) 2024-01-08 18:17:16 +01:00
check_task_guides.py More utils doc (#25457) 2023-08-17 07:58:35 +02:00
check_tf_ops.py Check TF ops for ONNX compliance (#10025) 2021-02-15 07:55:10 -05:00
create_dummy_models.py Update tiny model creation script (#27674) 2023-11-28 10:05:34 +01:00
custom_init_isort.py More utils doc (#25457) 2023-08-17 07:58:35 +02:00
download_glue_data.py Raise exceptions instead of asserts (#13907) 2021-10-07 12:44:23 +05:30
extract_warnings.py Make Slack CI reporting stronger (#21823) 2023-02-28 17:12:44 +01:00
get_ci_error_statistics.py Show diff between 2 CI runs on Slack reports (#22798) 2023-04-19 19:27:37 +02:00
get_github_job_time.py Make Slack CI reporting stronger (#21823) 2023-02-28 17:12:44 +01:00
get_modified_files.py exclude deleted files in the fixup script (#21436) 2023-02-03 12:57:02 -05:00
get_previous_daily_ci.py Fix a minor bug in CI slack report (#22906) 2023-04-21 20:36:35 +02:00
get_test_info.py Add a utility file to get information from test files (#21856) 2023-03-01 17:53:29 +01:00
not_doctested.txt Add new meta w2v2-conformer BERT-like model (#28165) 2024-01-18 13:37:34 +00:00
notification_service_doc_tests.py Fix slack report failing for doctest (#27042) 2023-10-30 10:48:24 +01:00
notification_service.py Fix notification_service.py (#27903) 2023-12-08 14:55:02 +01:00
past_ci_versions.py (Re-)Enable Nightly + Past CI (#22393) 2023-03-30 21:06:35 +02:00
print_env.py Print more library versions in CI (#17384) 2022-06-02 10:24:16 +02:00
release.py More utils doc (#25457) 2023-08-17 07:58:35 +02:00
slow_documentation_tests.txt Add SeamlessM4T v2 (#27779) 2023-11-30 20:24:43 +01:00
sort_auto_mappings.py More utils doc (#25457) 2023-08-17 07:58:35 +02:00
tests_fetcher.py Trigger corresponding pipeline tests if tests/utils/tiny_model_summary.json is modified (#27693) 2023-11-28 17:21:21 +01:00
update_metadata.py Update processor mapping for hub snippets (#27477) 2023-11-14 20:05:54 +00:00
update_tiny_models.py Update tiny model summary file for recent models (#22637) 2023-04-06 22:52:59 +02:00