* PoC for a ProcessorMixin class
* Documentation
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Roll out to other processors
* Add base feature extractor class in init
* Use args and kwargs
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add wrapper classes
* convert inner layers to tf
* Add TF Encoder and Decoder layers
* TFSpeech2Text models
* Loadable model
* TF model with same outputs as PT model
* test skeleton
* correct tests and run the fixup
* correct attention expansion
* TFSpeech2Text pask_key_values with TF format
* use_cache = False for PT models if labels is passed
* Fix for BigBirdPegasusForConditionalGeneration
* add warning if users specify use_cache=True
* Use logger.warning instead of warnings.warn
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* electra is added to onnx supported model
* add google/electra-base-generator for test onnx module
Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
* Change the way tracing happens, enabling dynamic axes out of the box
* Update the tests and modeling xlnet
* Add the non recoding of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).
* Comments and making tracing work for gpt-j and xlnet
* Refactore things related to num_choices (and batch_size, sequence_length)
* Update fx to work on PyTorch 1.10
* Postpone autowrap_function feature usage for later
* Add copyrights
* Remove unnecessary file
* Fix issue with add_new_model_like
* Apply suggestions
* Wav2Vec2 models must either throw or deal with add_apater
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add pre-add_adapter backwards compatibility
* Add pre-add_adapter backwards compatibility
* Fix issue in tests/test_modeling_wav2vec2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
# Add support for W&B hyperparameter sweep
This PR:
* allows using wandb for running hyperparameter search.
* The runs are visualized on W&B sweeps dashboard
* This supports runnning sweeps on parallel devices, all reporting to the same central dashboard.
### Usage
**To run new a hyperparameter search:**
```
trainer.hyperparameter_search(
backend="wandb",
project="transformers_sweep", # name of the project
n_trials=5,
metric="eval/loss", # metric to be optimized, default 'eval/loss'. A warning is raised if the passed metric is not found
)
```
This outputs a sweep id. Eg. `my_project/sweep_id`
**To run sweeps on parallel devices:**
Just pass sweep id which you want to run parallel
```
trainer.hyperparameter_search(
backend="wandb",
sweep_id = "my_project/sweep_id"
)
```
* Adding support for `microphone` streaming within pipeline.
- Uses `ffmpeg` to get microphone data.
- Makes sure alignment is made to `size_of_sample`.
- Works by sending `{"raw": ..data.., "stride": (n, left, right),
"partial": bool}`
directly to the pipeline enabling to stream partial results and still
get inference.
- Let's `partial` information flow through the pipeline to enable caller
to get it back and choose to display text or not.
- The striding reconstitution is bound to have errors since CTC does not
keep previous state. Currently most of the errors are we don't know if
there's a space or not between two chunks.
Since we have some left striding info, we could use that during decoding
to choose what to do with those spaces and even extra letters maybe (if
the stride is long enough, it's bound to cover at least a few symbols)
Fixing tests.
Protecting with `require_torch`.
`raw_ctc` support for nicer demo.
Post rebase fixes.
Revamp to split raw_mic_data from it's live chunking.
- Requires a refactor to make everything a bit cleaner.
Automatic resampling.
Small fix.
Small fix.
* Post rebase fix (need to let super handle more logic, reorder args.)
* Update docstrings
* Docstring format.
* Remove print.
* Prevent flow of `input_values`.
* Fixing `stride` too.
* Fixing the PR by removing `raw_ctc`.
* Better docstrings.
* Fixing init.
* Update src/transformers/pipelines/audio_utils.py
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update tests/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Quality.
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Add torchvision's resize
* Rename torch_resize to default_to_square
* Apply suggestions from code review
* Add support for default_to_square and tuple of length 1