transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-24 14:58:56 +06:00

Arindam Jati b242d0f297 [Time series] Add PatchTSMixer (#26247)	2023-12-05 15:31:35 +01:00
..
internal	translate internal folder files to chinese (#27638)	2023-12-04 10:04:28 -08:00
main_classes	[docs] Quantization (#27641)	2023-11-28 08:41:47 -08:00
model_doc	[Time series] Add PatchTSMixer (#26247)	2023-12-05 15:31:35 +01:00
tasks	Translate `en/tasks` folder docs to Japanese 🇯🇵 (#27098)	2023-12-04 14:10:54 -08:00
_config.py	[`Styling`] stylify using ruff (#27144)	2023-11-16 17:43:19 +01:00
_redirects.yml	Extended semantic segmentation to image segmentation (#27039)	2023-11-23 15:58:21 +00:00
_toctree.yml	[Time series] Add PatchTSMixer (#26247)	2023-12-05 15:31:35 +01:00
accelerate.md	Fix typos (#25936)	2023-09-04 11:15:12 +01:00
add_new_model.md	Update add_new_model.md (#26365)	2023-09-25 12:58:11 +02:00
add_new_pipeline.md	Update add_new_pipeline.md (#26197)	2023-09-19 00:41:16 +02:00
add_tensorflow_model.md	Remove `utils/documentation_tests.txt` (#26213)	2023-09-18 13:33:01 +02:00
attention.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
autoclass_tutorial.md	Update autoclass_tutorial.md (#25929)	2023-09-04 11:16:49 +01:00
benchmarks.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
bertology.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
big_models.md	Fix typos (#25936)	2023-09-04 11:15:12 +01:00
chat_templating.md	Update chat template warnings/guides (#27634)	2023-11-27 18:40:10 +00:00
community.md	Update community.md (#25928)	2023-09-04 11:16:34 +01:00
contributing.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
create_a_model.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
custom_models.md	Reorder the code on the Hub to explicit that sharing on the Hub isn't a requirement (#27691)	2023-11-27 09:38:18 +01:00
custom_tools.md	[doc] Always call it Agents for consistency (#25958)	2023-09-05 12:27:20 +01:00
debugging.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
fast_tokenizers.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
generation_strategies.md	[docs] navigation improvement between text gen pipelines and text gen params (#26477)	2023-09-29 09:43:39 +02:00
glossary.md	[docs] Performance docs refactor p.2 (#26791)	2023-10-24 13:10:06 -04:00
hpo_train.md	Remove-auth-token (#27060)	2023-11-13 14:20:54 +01:00
index.md	[Time series] Add PatchTSMixer (#26247)	2023-12-05 15:31:35 +01:00
installation.md	[docs] Update offline mode docs (#26478)	2023-09-29 09:42:21 +02:00
llm_tutorial_optimization.md	Generate: Update docs regarding reusing `past_key_values` in `generate` (#27612)	2023-11-21 10:48:14 +00:00
llm_tutorial.md	Generate: update basic llm tutorial (#26937)	2023-10-19 16:53:28 +01:00
model_memory_anatomy.md	Fix typos (#25936)	2023-09-04 11:15:12 +01:00
model_sharing.md	Fix typos (#25936)	2023-09-04 11:15:12 +01:00
model_summary.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
multilingual.md	Fix typo in example code (#25583)	2023-08-18 07:58:59 +02:00
notebooks.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
pad_truncation.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
peft.md	[`Peft`] `modules_to_save` support for peft integration (#27466)	2023-11-14 10:32:57 +01:00
perf_hardware.md	docs: replace torch.distributed.run by torchrun (#27528)	2023-11-27 16:26:33 +00:00
perf_infer_cpu.md	[docs] Update CPU/GPU inference docs (#26881)	2023-10-31 09:44:51 -07:00
perf_infer_gpu_one.md	Flash Attention 2 support for RoCm (#27611)	2023-12-04 21:52:17 +09:00
perf_torch_compile.md	Fix rendering for `torch.compile()` docs (#25432)	2023-08-10 13:25:00 +02:00
perf_train_cpu_many.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_cpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_gpu_many.md	docs: replace torch.distributed.run by torchrun (#27528)	2023-11-27 16:26:33 +00:00
perf_train_gpu_one.md	Reflect RoCm support in the documentation (#27636)	2023-11-25 00:59:17 +09:00
perf_train_special.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_tpu_tf.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_tpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
performance.md	[docs] Update CPU/GPU inference docs (#26881)	2023-10-31 09:44:51 -07:00
perplexity.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
philosophy.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
pipeline_tutorial.md	[ASR Pipe] Improve docs and error messages (#26476)	2023-09-29 18:32:37 +01:00
pipeline_webserver.md	Suggestions on Pipeline_webserver (#25570)	2023-08-18 10:17:44 +02:00
pr_checks.md	Docstring check (#26052)	2023-10-04 15:13:37 +02:00
preprocessing.md	Broken links fixed related to datasets docs (#27569)	2023-11-17 13:44:09 -08:00
quantization.md	Faster generation using AWQ + Fused modules (#27411)	2023-12-05 12:14:45 +01:00
quicktour.md	[TYPO] fix typo/format in quicktour.md (#25519)	2023-08-16 08:03:23 +02:00
run_scripts.md	docs: replace torch.distributed.run by torchrun (#27528)	2023-11-27 16:26:33 +00:00
sagemaker.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
serialization.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
task_summary.md	Fix doctest (#25031)	2023-07-25 22:10:06 +02:00
tasks_explained.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
testing.md	Device agnostic testing (#25870)	2023-10-24 16:49:26 +02:00
tf_xla.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tflite.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tokenizer_summary.md	Fix typo: Roberta -> RoBERTa (#25302)	2023-08-03 14:17:30 -07:00
torchscript.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
training.md	Fix semantic error in evaluation section (#27675)	2023-11-24 12:41:16 +01:00
transformers_agents.md	[doc] Always call it Agents for consistency (#25958)	2023-09-05 12:27:20 +01:00
troubleshooting.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00