transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-15 02:28:24 +06:00

Eli Simhayev 8abe4930d3 [Time-Series] informer model (#21099) * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works!
* WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when itearting zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask 😃 * Revert "remove unused libs for this PR for creating the env" This reverts commit `11a081e09e`. 
* fixes * make style * fix initial tests * fix more tests * dry * make style * remove unused files * style * added integration tests * fix num_static_real_features * fix header * remove unused function * fix example * fix docs * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/modeling_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixes for reviewer * use prediction_length from model * fix style * fixed informer.mdx * added to index * updated readme * undo * make fix-copies * typo * fix copy * added Informer to toctree * in order * fixed comments * remove unneeded new lines in docs * make static real and cat optional * fix use of distil conv layers * fixed integration test * added checkpoint for convlayer * make fix-copies * updated from time series model * make fix-copies * copy decoder * fix unit tests * updated scaling config * fix integration tests * IGNORE_NON_TESTED * IGNORE_NON_AUTO_CONFIGURED * IGNORE_NON_AUTO_CONFIGURED * updated check configs * fix formatting * undo change from time series * prediction_length should not be None * aliign with the blog: prettify ProbSparse and change attention_factor to sampling_factor * make style * make fix-copies * niels CR: update contributed by * niels CR: update configuration_informer.py 
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: update kashif -> huggingface Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: `sampling_factor` only relevant when `attention_type`=prob * make style * fixed U_part: added multiplication by `L_Q` * fixed bug: remove `is not None` from `if config.distil` * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check * fix integration tests * updated model hub * do not shift as in training * undo * fix make-copies * make fix-copies * added `if prediction_length is None` * changed `ProbSparseAttention` to `InformerProbSparseAttention` * changed `V_sum` -> `v_mean_dim_time` * changed `ConvLayer` to `InformerConvLayer` and fixed `super()` * TimeSeriesTansformer->Informer in decoder's Copied from * more descriptive in ProbSparse * make style * fix coped from * Revert "added `if prediction_length is None`" This reverts commit `b4cbddfa05`. * fixed indent * use InformerSinusoidalPositionalEmbedding * make fix-style * fix from #21860 * fix name * make fix-copies * use time series utils * fix dec num_heads * docstring * added time series util doc * _import_structure * formatting * changes from review * make style * fix docs * fix doc * removed NegativeLogLikelihood --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>		2023-03-07 21:36:38 +01:00
internal	[Time-Series] informer model (#21099)	2023-03-07 21:36:38 +01:00
main_classes	[Time-Series] informer model (#21099)	2023-03-07 21:36:38 +01:00
model_doc	[Time-Series] informer model (#21099)	2023-03-07 21:36:38 +01:00
tasks	[Whisper] Add model for audio classification (#21754)	2023-03-07 16:20:21 +01:00
_config.py	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
_toctree.yml	[Time-Series] informer model (#21099)	2023-03-07 21:36:38 +01:00
accelerate.mdx	✨ update to use interlibrary links instead of Markdown (#18500)	2022-08-08 10:53:52 -05:00
add_new_model.mdx	🚨🚨🚨 Enforce single model initialization (#21431)	2023-02-09 15:46:26 -05:00
add_new_pipeline.mdx	Spanish translation of asr.mdx and add_new_pipeline.mdx (#20569)	2022-12-12 09:23:23 -05:00
add_tensorflow_model.mdx	docs: Resolve many typos in the English docs (#20088)	2022-11-07 09:19:04 -05:00
attention.mdx	Refactor model summary (#21408)	2023-02-15 10:35:14 -08:00
autoclass_tutorial.mdx	Update doc examples feature extractor -> image processor (#20501)	2022-11-30 14:50:55 +00:00
benchmarks.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
bertology.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
big_models.mdx	docs: Resolve many typos in the English docs (#20088)	2022-11-07 09:19:04 -05:00
community.mdx	Fix en documentation typos (#21799)	2023-02-27 08:36:36 +01:00
contributing.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
converting_tensorflow_models.mdx	Docs - Guide to add a new TensorFlow model (#19256)	2022-09-30 20:30:38 +01:00
create_a_model.mdx	Documentation code sample fixes (#21302)	2023-01-25 11:33:39 -05:00
custom_models.mdx	Replace awkward timm link with the expected one (#20109)	2022-11-07 13:57:39 -05:00
debugging.mdx	Spanish translation of the file debugging.mdx (#20566)	2022-12-12 10:38:56 -05:00
fast_tokenizers.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
generation_strategies.mdx	Add: An introductory guide for text generation (#21090)	2023-01-17 12:23:22 -05:00
glossary.mdx	Update doc examples feature extractor -> image processor (#20501)	2022-11-30 14:50:55 +00:00
hpo_train.mdx	update doc for perf_train_cpu_many (#19506)	2022-10-11 22:54:19 -04:00
index.mdx	[Time-Series] informer model (#21099)	2023-03-07 21:36:38 +01:00
installation.mdx	Move cache folder to huggingface/hub for consistency with hf_hub (#18492)	2022-08-05 13:14:00 -04:00
migration.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
model_sharing.mdx	Fix `PushToHubCallback` import in Share a model docs (#21457)	2023-02-06 09:26:22 -05:00
model_summary.mdx	Refactor model summary (#21408)	2023-02-15 10:35:14 -08:00
multilingual.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
notebooks.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
pad_truncation.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
perf_hardware.mdx	[WIP] [doc] performance/scalability revamp (#15723)	2022-05-16 13:36:41 +02:00
perf_infer_cpu.mdx	add doc for (#20525)	2022-12-01 16:52:13 +01:00
perf_infer_gpu_many.mdx	add doc for (#20525)	2022-12-01 16:52:13 +01:00
perf_infer_gpu_one.mdx	[`Doc`] Fix int8 docs (#21487)	2023-02-07 15:09:27 +01:00
perf_infer_special.mdx	Improve performance docs (#17750)	2022-06-23 14:51:54 +02:00
perf_train_cpu_many.mdx	update cpu related doc (#20444)	2022-11-28 08:54:35 -05:00
perf_train_cpu.mdx	Add perf numbers for perf_train_cpu (#20974)	2023-02-06 09:20:43 -05:00
perf_train_gpu_many.mdx	Fix Typo in Docs for GPU (#20509)	2022-11-30 10:41:18 -05:00
perf_train_gpu_one.mdx	Migrate torchdynamo to torch.compile (#20634)	2022-12-08 11:18:52 -05:00
perf_train_special.mdx	Fix Typo in Docs for GPU (#20509)	2022-11-30 10:41:18 -05:00
perf_train_tpu_tf.mdx	Typos/fixes to link syntax (#21450)	2023-02-07 15:19:19 +00:00
perf_train_tpu.mdx	Fix Typo in Docs for GPU (#20509)	2022-11-30 10:41:18 -05:00
performance.mdx	Fix Typo in Docs for GPU (#20509)	2022-11-30 10:41:18 -05:00
perplexity.mdx	Fix incorrect size of input for 1st strided window length in `Perplexity of fixed-length models` (#18906)	2022-09-06 15:20:12 -04:00
philosophy.mdx	Update doc examples feature extractor -> image processor (#20501)	2022-11-30 14:50:55 +00:00
pipeline_tutorial.mdx	[`pipeline`] A simple fix for half-precision & 8bit models (#21479)	2023-02-10 10:26:17 +01:00
pipeline_webserver.mdx	Update quality tooling for formatting (#21480)	2023-02-06 18:10:56 -05:00
pr_checks.mdx	Cleanup quality (#21493)	2023-02-07 12:27:31 -05:00
preprocessing.mdx	Updates to computer vision section of the Preprocess doc (#21181)	2023-01-19 08:43:36 -05:00
quicktour.mdx	Fix 2 quicktour file doctest (#21742)	2023-02-23 09:41:28 +01:00
run_scripts.mdx	Just re-reading the whole doc every couple of months 😬 (#18489)	2022-08-06 09:38:55 +02:00
sagemaker.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
serialization.mdx	Add EfficientNet (#21563)	2023-02-20 16:37:11 +03:00
task_summary.mdx	Remove trailing 'extractive' word from en documentation (#21594)	2023-02-13 10:09:00 -05:00
tasks_explained.mdx	Update task summary (#21067)	2023-02-02 11:41:27 -08:00
testing.mdx	[`tests`] add `accelerate` marker (#21743)	2023-02-27 12:33:34 +01:00
tf_xla.mdx	Rewrite a couple of lines in the TF XLA doc (#21177)	2023-01-18 17:53:05 +00:00
tokenizer_summary.mdx	Update tokenizer_summary.mdx (#20135)	2022-11-15 01:18:13 +01:00
torchscript.mdx	Breakup export guide (#19271)	2022-10-03 13:18:29 -07:00
training.mdx	Fix code example in training tutorial (#21201)	2023-01-20 07:38:15 -08:00
troubleshooting.mdx	Removed BLIP mention from the troubleshooting guide (#21872)	2023-03-01 08:26:25 -05:00