mirror of https://github.com/huggingface/transformers.git
synced 2025-07-15 02:28:24 +06:00
* added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding InformerConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works! * WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when iterating zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask 😃 * Revert "remove unused libs for this PR for creating the env" This reverts commit |
||
---|---|---|
internal | ||
main_classes | ||
model_doc | ||
tasks | ||
_config.py | ||
_toctree.yml | ||
accelerate.mdx | ||
add_new_model.mdx | ||
add_new_pipeline.mdx | ||
add_tensorflow_model.mdx | ||
attention.mdx | ||
autoclass_tutorial.mdx | ||
benchmarks.mdx | ||
bertology.mdx | ||
big_models.mdx | ||
community.mdx | ||
contributing.md | ||
converting_tensorflow_models.mdx | ||
create_a_model.mdx | ||
custom_models.mdx | ||
debugging.mdx | ||
fast_tokenizers.mdx | ||
generation_strategies.mdx | ||
glossary.mdx | ||
hpo_train.mdx | ||
index.mdx | ||
installation.mdx | ||
migration.mdx | ||
model_sharing.mdx | ||
model_summary.mdx | ||
multilingual.mdx | ||
notebooks.md | ||
pad_truncation.mdx | ||
perf_hardware.mdx | ||
perf_infer_cpu.mdx | ||
perf_infer_gpu_many.mdx | ||
perf_infer_gpu_one.mdx | ||
perf_infer_special.mdx | ||
perf_train_cpu_many.mdx | ||
perf_train_cpu.mdx | ||
perf_train_gpu_many.mdx | ||
perf_train_gpu_one.mdx | ||
perf_train_special.mdx | ||
perf_train_tpu_tf.mdx | ||
perf_train_tpu.mdx | ||
performance.mdx | ||
perplexity.mdx | ||
philosophy.mdx | ||
pipeline_tutorial.mdx | ||
pipeline_webserver.mdx | ||
pr_checks.mdx | ||
preprocessing.mdx | ||
quicktour.mdx | ||
run_scripts.mdx | ||
sagemaker.mdx | ||
serialization.mdx | ||
task_summary.mdx | ||
tasks_explained.mdx | ||
testing.mdx | ||
tf_xla.mdx | ||
tokenizer_summary.mdx | ||
torchscript.mdx | ||
training.mdx | ||
troubleshooting.mdx |