<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Model outputs

All models have outputs that are instances of subclasses of [`~utils.ModelOutput`]. Those are
data structures containing all the information returned by the model, but that can also be used as tuples or
dictionaries.

Let's see how this looks in an example:

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
labels = torch.tensor([1]).unsqueeze(0)  # Batch size 1
outputs = model(**inputs, labels=labels)
```

The `outputs` object is a [`~modeling_outputs.SequenceClassifierOutput`]. As we can see in the
documentation of that class below, it has an optional `loss`, a `logits`, an optional `hidden_states`, and an
optional `attentions` attribute. Here we have the `loss` since we passed along `labels`, but we don't have
`hidden_states` and `attentions` because we didn't pass `output_hidden_states=True` or
`output_attentions=True`.
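
If we do want those attributes, we can request them at call time. A minimal sketch continuing the example above
(the `verbose_outputs` name is ours, and the exact tuple lengths depend on the model's configuration):

```python
verbose_outputs = model(**inputs, labels=labels, output_hidden_states=True, output_attentions=True)

# Both attributes are now populated instead of being None:
print(len(verbose_outputs.hidden_states))  # one tensor per layer, plus the embedding output
print(len(verbose_outputs.attentions))  # one attention-weight tensor per layer
```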

You can access each attribute as you would usually do, and if that attribute has not been returned by the model, you
will get `None`. Here for instance `outputs.loss` is the loss computed by the model, and `outputs.attentions` is
`None`.
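
Because unrequested attributes come back as `None`, it is common to guard on them before use; a tiny sketch:

```python
# Only backpropagate if a loss was actually computed (i.e. labels were passed):
if outputs.loss is not None:
    outputs.loss.backward()
```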

When considering our `outputs` object as a tuple, it only considers the attributes that don't have `None` values.
Here, for instance, it has two elements, `loss` then `logits`, so

```python
outputs[:2]
```

will return the tuple `(outputs.loss, outputs.logits)` for instance.
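
The same `None`-filtered view is available as a plain tuple through the `to_tuple` method documented below; a small
sketch with the `outputs` from above:

```python
# to_tuple() returns the attributes that are not None, in declaration order
loss, logits = outputs.to_tuple()
```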

When considering our `outputs` object as a dictionary, it only considers the attributes that don't have `None`
values. Here, for instance, it has two keys that are `loss` and `logits`.
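
Dictionary-style access works alongside attribute access; a small sketch, again with the `outputs` from above:

```python
outputs["loss"]  # the same tensor as outputs.loss
list(outputs.keys())  # ['loss', 'logits']
```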

We document here the generic model outputs that are used by more than one model type. Specific output types are
documented on their corresponding model page.

## ModelOutput

[[autodoc]] utils.ModelOutput
    - to_tuple

## BaseModelOutput

[[autodoc]] modeling_outputs.BaseModelOutput

## BaseModelOutputWithPooling

[[autodoc]] modeling_outputs.BaseModelOutputWithPooling

## BaseModelOutputWithCrossAttentions

[[autodoc]] modeling_outputs.BaseModelOutputWithCrossAttentions

## BaseModelOutputWithPoolingAndCrossAttentions

[[autodoc]] modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions

## BaseModelOutputWithPast

[[autodoc]] modeling_outputs.BaseModelOutputWithPast

## BaseModelOutputWithPastAndCrossAttentions

[[autodoc]] modeling_outputs.BaseModelOutputWithPastAndCrossAttentions

## Seq2SeqModelOutput

[[autodoc]] modeling_outputs.Seq2SeqModelOutput

## CausalLMOutput

[[autodoc]] modeling_outputs.CausalLMOutput

## CausalLMOutputWithCrossAttentions

[[autodoc]] modeling_outputs.CausalLMOutputWithCrossAttentions

## CausalLMOutputWithPast

[[autodoc]] modeling_outputs.CausalLMOutputWithPast

## MaskedLMOutput

[[autodoc]] modeling_outputs.MaskedLMOutput

## Seq2SeqLMOutput

[[autodoc]] modeling_outputs.Seq2SeqLMOutput

## NextSentencePredictorOutput

[[autodoc]] modeling_outputs.NextSentencePredictorOutput

## SequenceClassifierOutput

[[autodoc]] modeling_outputs.SequenceClassifierOutput

## Seq2SeqSequenceClassifierOutput

[[autodoc]] modeling_outputs.Seq2SeqSequenceClassifierOutput

## MultipleChoiceModelOutput

[[autodoc]] modeling_outputs.MultipleChoiceModelOutput

## TokenClassifierOutput

[[autodoc]] modeling_outputs.TokenClassifierOutput

## QuestionAnsweringModelOutput

[[autodoc]] modeling_outputs.QuestionAnsweringModelOutput

## Seq2SeqQuestionAnsweringModelOutput

[[autodoc]] modeling_outputs.Seq2SeqQuestionAnsweringModelOutput

## Seq2SeqSpectrogramOutput

[[autodoc]] modeling_outputs.Seq2SeqSpectrogramOutput

## SemanticSegmenterOutput

[[autodoc]] modeling_outputs.SemanticSegmenterOutput

## ImageClassifierOutput

[[autodoc]] modeling_outputs.ImageClassifierOutput

## ImageClassifierOutputWithNoAttention

[[autodoc]] modeling_outputs.ImageClassifierOutputWithNoAttention

## DepthEstimatorOutput

[[autodoc]] modeling_outputs.DepthEstimatorOutput

## Wav2Vec2BaseModelOutput

[[autodoc]] modeling_outputs.Wav2Vec2BaseModelOutput

## XVectorOutput

[[autodoc]] modeling_outputs.XVectorOutput

## Seq2SeqTSModelOutput

[[autodoc]] modeling_outputs.Seq2SeqTSModelOutput

## Seq2SeqTSPredictionOutput

[[autodoc]] modeling_outputs.Seq2SeqTSPredictionOutput

## SampleTSPredictionOutput

[[autodoc]] modeling_outputs.SampleTSPredictionOutput

## TFBaseModelOutput

[[autodoc]] modeling_tf_outputs.TFBaseModelOutput

## TFBaseModelOutputWithPooling

[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPooling

## TFBaseModelOutputWithPoolingAndCrossAttentions

[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPoolingAndCrossAttentions

## TFBaseModelOutputWithPast

[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPast

## TFBaseModelOutputWithPastAndCrossAttentions

[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPastAndCrossAttentions

## TFSeq2SeqModelOutput

[[autodoc]] modeling_tf_outputs.TFSeq2SeqModelOutput

## TFCausalLMOutput

[[autodoc]] modeling_tf_outputs.TFCausalLMOutput

## TFCausalLMOutputWithCrossAttentions

[[autodoc]] modeling_tf_outputs.TFCausalLMOutputWithCrossAttentions

## TFCausalLMOutputWithPast

[[autodoc]] modeling_tf_outputs.TFCausalLMOutputWithPast

## TFMaskedLMOutput

[[autodoc]] modeling_tf_outputs.TFMaskedLMOutput

## TFSeq2SeqLMOutput

[[autodoc]] modeling_tf_outputs.TFSeq2SeqLMOutput

## TFNextSentencePredictorOutput

[[autodoc]] modeling_tf_outputs.TFNextSentencePredictorOutput

## TFSequenceClassifierOutput

[[autodoc]] modeling_tf_outputs.TFSequenceClassifierOutput

## TFSeq2SeqSequenceClassifierOutput

[[autodoc]] modeling_tf_outputs.TFSeq2SeqSequenceClassifierOutput

## TFMultipleChoiceModelOutput

[[autodoc]] modeling_tf_outputs.TFMultipleChoiceModelOutput

## TFTokenClassifierOutput

[[autodoc]] modeling_tf_outputs.TFTokenClassifierOutput

## TFQuestionAnsweringModelOutput

[[autodoc]] modeling_tf_outputs.TFQuestionAnsweringModelOutput

## TFSeq2SeqQuestionAnsweringModelOutput

[[autodoc]] modeling_tf_outputs.TFSeq2SeqQuestionAnsweringModelOutput

## FlaxBaseModelOutput

[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutput

## FlaxBaseModelOutputWithPast

[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutputWithPast

## FlaxBaseModelOutputWithPooling

[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutputWithPooling

## FlaxBaseModelOutputWithPastAndCrossAttentions

[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutputWithPastAndCrossAttentions

## FlaxSeq2SeqModelOutput

[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqModelOutput

## FlaxCausalLMOutputWithCrossAttentions

[[autodoc]] modeling_flax_outputs.FlaxCausalLMOutputWithCrossAttentions

## FlaxMaskedLMOutput

[[autodoc]] modeling_flax_outputs.FlaxMaskedLMOutput

## FlaxSeq2SeqLMOutput

[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqLMOutput

## FlaxNextSentencePredictorOutput

[[autodoc]] modeling_flax_outputs.FlaxNextSentencePredictorOutput

## FlaxSequenceClassifierOutput

[[autodoc]] modeling_flax_outputs.FlaxSequenceClassifierOutput

## FlaxSeq2SeqSequenceClassifierOutput

[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqSequenceClassifierOutput

## FlaxMultipleChoiceModelOutput

[[autodoc]] modeling_flax_outputs.FlaxMultipleChoiceModelOutput

## FlaxTokenClassifierOutput

[[autodoc]] modeling_flax_outputs.FlaxTokenClassifierOutput

## FlaxQuestionAnsweringModelOutput

[[autodoc]] modeling_flax_outputs.FlaxQuestionAnsweringModelOutput

## FlaxSeq2SeqQuestionAnsweringModelOutput

[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqQuestionAnsweringModelOutput