transformers/tests
Matthijs Hollemans 4ece3b9433
add VITS model (#24085)
* add VITS model

* let's vits

* finish TextEncoder (mostly)

* rename VITS to Vits

* add StochasticDurationPredictor

* add flow model

* add generator

* correctly set vocab size

* add tokenizer

* remove processor & feature extractor

* add PosteriorEncoder

* add missing weights to SDP

* also convert LJSpeech and VCTK checkpoints

* add training stuff in forward

* add placeholder tests for tokenizer

* add placeholder tests for model

* starting cleanup

* let the great renaming begin!

* use config

* global_conditioning

* more cleaning

* renaming variables

* more renaming

* more renaming

* it never ends

* reticulating the splines

* more renaming

* HiFi-GAN

* doc strings for main model

* fixup

* fix-copies

* don't make it a PreTrainedModel

* fixup

* rename config options

* remove training logic from forward pass

* simplify relative position

* use actual checkpoint

* style

* PR review fixes

* more review changes

* fixup

* more unit tests

* fixup

* fix doc test

* add integration test

* improve tokenizer tests

* add tokenizer integration test

* fix tests on GPU (gave OOM)

* conversion script can handle repos from hub

* add conversion script for all MMS-TTS checkpoints

* automatically create a README for the converted checkpoint

* small changes to config

* push README to hub

* only show uroman note for checkpoints that need it

* remove conversion script because code formatting breaks the readme

* make WaveNet layers configurable

* rename variables

* simplifying the math

* output attentions and hidden states

* remove VitsFlip in flow model

* also got rid of the other flip

* fix tests

* rename more variables

* rename tokenizer, add phonemization

* raise error when phonemizer missing

* re-order config docstrings to match method

* change config naming

* remove redundant str -> list

* fix copyright: vits authors -> kakao enterprise

* (mean, log_variances) -> (prior_mean, prior_log_variances)

* if return dict -> if not return dict

* speed -> speaking rate

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update fused tanh sigmoid

* reduce dims in tester

* audio -> output_values

* audio -> output_values in tuple out

* fix return type

* fix return type

* make _unconstrained_rational_quadratic_spline a function

* all nn's to accept a config

* add spectro to output

* move {speaking rate, noise scale, noise scale duration} to config

* path -> attn_path

* idxs -> valid idxs -> padded idxs

* output values -> waveform

* use config for attention

* make generation work

* harden integration test

* add spectrogram to dict output

* tokenizer refactor

* make style

* remove 'fake' padding token

* harden tokenizer tests

* run norm test

* fprop / save tests deterministic

* move uroman to tokenizer as much as possible

* better logger message

* fix vivit imports

* add uroman integration test

* make style

* up

* matthijs -> sanchit-gandhi

* fix tokenizer test

* make fix-copies

* fix dict comprehension

* fix config tests

* fix model tests

* make outputs consistent with reverse/not reverse

* fix key concat

* more model details

* add author

* return dict

* speaker error

* labels error

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vits/convert_original_checkpoint.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove uromanize

* add docstrings

* add docstrings for tokenizer

* upper-case skip messages

* fix return dict

* style

* finish tests

* update checkpoints

* make style

* remove doctest file

* revert

* fix docstring

* fix tokenizer

* remove uroman integration test

* add sampling rate

* fix docs / docstrings

* style

* add sr to model output

* fix outputs

* style / copies

* fix docstring

* fix copies

* remove sr from model outputs

* Update utils/documentation_tests.txt

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add sr as allowed attr

---------

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-01 10:50:06 +01:00
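Taken together, the commits above add a VITS text-to-speech model (plus a tokenizer with optional phonemization and converted MMS-TTS / LJSpeech / VCTK checkpoints) whose forward pass returns a waveform and a spectrogram, with speaking rate and noise scales read from the config. A minimal usage sketch, assuming the class names VitsModel / VitsTokenizer and a converted MMS-TTS checkpoint published as facebook/mms-tts-eng:

```python
# Sketch only: class and checkpoint names are assumptions inferred from the
# commit log above, not taken from the PR diff itself.
import torch
from transformers import VitsModel, VitsTokenizer

tokenizer = VitsTokenizer.from_pretrained("facebook/mms-tts-eng")
model = VitsModel.from_pretrained("facebook/mms-tts-eng")

inputs = tokenizer("Hello from the VITS test suite.", return_tensors="pt")

with torch.no_grad():
    # Generation is stochastic; noise_scale, noise_scale_duration and
    # speaking_rate live in model.config rather than the forward signature.
    outputs = model(**inputs)

waveform = outputs.waveform  # audio samples at model.config.sampling_rate
print(waveform.shape, model.config.sampling_rate)
```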
benchmark [Test refactor 1/5] Per-folder tests reorganization (#15725) 2022-02-23 15:46:28 -05:00
bettertransformer Add methods to PreTrainedModel to use PyTorch's BetterTransformer (#21259) 2023-04-27 11:03:42 +02:00
deepspeed 🚨🚨🚨 [Refactor] Move third-party related utility files into integrations/ folder 🚨🚨🚨 (#25599) 2023-08-25 17:13:34 +02:00
extended [tests] switch to torchrun (#22712) 2023-04-12 08:25:45 -07:00
fixtures [WIP] add SpeechT5 model (#18922) 2023-02-03 12:43:46 -05:00
generation Generate: general test for decoder-only generation from inputs_embeds (#25687) 2023-08-23 19:17:01 +01:00
models add VITS model (#24085) 2023-09-01 10:50:06 +01:00
optimization Make schedulers picklable by making lr_lambda fns global (#21768) 2023-03-02 12:08:43 -05:00
peft_integration [PEFT] Fix PeftConfig save pretrained when calling add_adapter (#25738) 2023-08-25 08:19:11 +02:00
pipelines Save image_processor while saving pipeline (ImageSegmentationPipeline) (#25884) 2023-08-31 16:08:20 +02:00
quantization 🚨🚨🚨 [Refactor] Move third-party related utility files into integrations/ folder 🚨🚨🚨 (#25599) 2023-08-25 17:13:34 +02:00
repo_utils Document check copies (#25291) 2023-08-04 14:56:29 +02:00
sagemaker Avoid invalid escape sequences, use raw strings (#22936) 2023-04-25 09:17:56 -04:00
tokenization [PreTrainedTokenizerFast] Keep properties from fast tokenizer (#25053) 2023-07-25 18:45:01 +02:00
tools Add support for for loops in python interpreter (#24429) 2023-06-26 09:58:14 -04:00
trainer Make training args fully immutable (#25435) 2023-08-15 11:47:47 -04:00
utils Support loading base64 images in pipelines (#25633) 2023-08-29 19:24:24 +01:00
__init__.py GPU text generation: Moved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
test_backbone_common.py Add ViTDet (#25524) 2023-08-29 10:03:52 +01:00
test_configuration_common.py Deal with nested configs better in base class (#25237) 2023-08-04 14:56:09 +02:00
test_configuration_utils.py Deal with nested configs better in base class (#25237) 2023-08-04 14:56:09 +02:00
test_feature_extraction_common.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_feature_extraction_utils.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_image_processing_common.py Input data format (#25464) 2023-08-16 17:45:02 +01:00
test_image_processing_utils.py Run hub tests (#24807) 2023-07-13 15:25:45 -04:00
test_image_transforms.py Add input_data_format argument, image transforms (#25462) 2023-08-11 15:09:31 +01:00
test_modeling_common.py [resize_embedding] Introduce pad_to_multiple_of and guidance (#25088) 2023-08-17 17:00:32 +02:00
test_modeling_flax_common.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_modeling_flax_utils.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_modeling_tf_common.py Skip test_onnx_runtime_optimize for now (#25560) 2023-08-17 11:23:16 +02:00
test_modeling_tf_utils.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
test_modeling_utils.py Generate: Load generation config when device_map is passed (#25413) 2023-08-10 10:54:26 +01:00
test_pipeline_mixin.py Add Text-To-Speech pipeline (#24952) 2023-08-17 17:34:47 +01:00
test_sequence_feature_extraction_common.py Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
test_tokenization_common.py [split_special_tokens] Add support for split_special_tokens argument to encode (#25081) 2023-08-18 13:26:27 +02:00
test_tokenization_utils.py Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00