mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 13:20:12 +06:00

[generate] move SinkCache to a custom_generate repo (#38399 )

remove sink cache

2025-06-02 12:13:30 +02:00


Utilities for Generation utilities-for-generation

This page lists all the utility functions used by [~generation.GenerationMixin.generate].

Generate Outputs generate-outputs

The output of [~generation.GenerationMixin.generate] is an instance of a subclass of [~utils.ModelOutput]. This output is a data structure containing all the information returned by [~generation.GenerationMixin.generate], and it can also be used as a tuple or a dictionary.

Here is an example:

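from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the pretrained GPT-2 tokenizer and language model.
tokenizer = GPT2Tokenizer.from_pretrained("openai-community/gpt2")
model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")

inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")
# return_dict_in_generate=True returns a ModelOutput instead of a bare tensor;
# output_scores=True additionally keeps the prediction scores of each step.
generation_output = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)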

The generation_output object is a [~generation.GenerateDecoderOnlyOutput]. As the documentation of that class below shows, it has the following attributes:

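sequences: the generated sequences of tokens
scores (optional): the prediction scores of the language modeling head at each generation step
hidden_states (optional): the hidden states of the model at each generation step
attentions (optional): the attention weights of the model at each generation step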

Because we passed output_scores=True, scores is included; because we did not pass output_hidden_states=True or output_attentions=True, hidden_states and attentions are not.

Each attribute can be accessed as usual, and if the model did not return that attribute, you get None. For example, generation_output.scores holds all the prediction scores generated by the language modeling head, and generation_output.attentions is None.

When the generation_output object is used as a tuple, it only keeps the attributes that are not None. Here, for instance, its first two elements are sequences and then scores, so:

generation_output[:2]

The code above returns the tuple (generation_output.sequences, generation_output.scores).

When the generation_output object is used as a dictionary, it only keeps the attributes that are not None. Here, for instance, its keys include sequences and scores.
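The tuple and dictionary behavior described above can be illustrated without loading a model. The following is a minimal, dependency-free sketch: ToyOutput is a hypothetical stand-in written for this example, not the library's ModelOutput class, and it only mimics the "drop None attributes" rule.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class ToyOutput:
    """Hypothetical stand-in for a ModelOutput subclass (not the library class)."""

    sequences: Any = None
    scores: Optional[Any] = None
    attentions: Optional[Any] = None
    hidden_states: Optional[Any] = None

    def to_dict(self):
        # Keep only the attributes that were populated, in declaration order.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }

    def to_tuple(self):
        # The tuple view is the dictionary view without the keys.
        return tuple(self.to_dict().values())

    def __getitem__(self, key):
        # String keys act like a dictionary lookup; ints and slices index the tuple view.
        if isinstance(key, str):
            return self.to_dict()[key]
        return self.to_tuple()[key]


# scores was requested, attentions and hidden_states were not.
out = ToyOutput(sequences=[[1, 2, 3]], scores=[[0.1, 0.9]])

first_two = out[:2]          # tuple-style access skips the None attributes
keys = list(out.to_dict())   # dictionary-style access only lists populated keys
```

Attribute access still works for fields that were not returned: out.attentions is simply None, while it does not appear in either the tuple or the dictionary view.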

All output types are documented here.

PyTorch transformers.generation.GenerateDecoderOnlyOutput

autodoc generation.GenerateDecoderOnlyOutput

autodoc generation.GenerateEncoderDecoderOutput

autodoc generation.GenerateBeamDecoderOnlyOutput

autodoc generation.GenerateBeamEncoderDecoderOutput

TensorFlow transformers.generation.TFGreedySearchEncoderDecoderOutput

autodoc generation.TFGreedySearchEncoderDecoderOutput

autodoc generation.TFGreedySearchDecoderOnlyOutput

autodoc generation.TFSampleEncoderDecoderOutput

autodoc generation.TFSampleDecoderOnlyOutput

autodoc generation.TFBeamSearchEncoderDecoderOutput

autodoc generation.TFBeamSearchDecoderOnlyOutput

autodoc generation.TFBeamSampleEncoderDecoderOutput

autodoc generation.TFBeamSampleDecoderOnlyOutput

autodoc generation.TFContrastiveSearchEncoderDecoderOutput

autodoc generation.TFContrastiveSearchDecoderOnlyOutput

FLAX transformers.generation.FlaxSampleOutput

autodoc generation.FlaxSampleOutput

autodoc generation.FlaxGreedySearchOutput

autodoc generation.FlaxBeamSearchOutput

LogitsProcessor logitsprocessor

A [LogitsProcessor] is used to modify the prediction scores of the language model head during generation.
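To make the idea concrete, here is a small sketch of how score-modifying processors can be chained. It is an illustration in plain Python lists rather than tensors: ToyTemperature, ToyTopK, and ToyProcessorList are hypothetical names invented for this example. They follow the same calling convention as the classes listed below (called with input_ids and scores, returning the modified scores) but are not the library's implementations.

```python
import math
from typing import List

Scores = List[List[float]]  # one row of vocabulary scores per sequence in the batch


class ToyTemperature:
    """Hypothetical processor: divides every score by a temperature."""

    def __init__(self, temperature: float):
        if temperature <= 0:
            raise ValueError("temperature must be strictly positive")
        self.temperature = temperature

    def __call__(self, input_ids: List[List[int]], scores: Scores) -> Scores:
        return [[s / self.temperature for s in row] for row in scores]


class ToyTopK:
    """Hypothetical processor: keeps the k highest scores per row, masks the rest."""

    def __init__(self, top_k: int, filter_value: float = -math.inf):
        if top_k < 1:
            raise ValueError("top_k must be a positive integer")
        self.top_k = top_k
        self.filter_value = filter_value

    def __call__(self, input_ids: List[List[int]], scores: Scores) -> Scores:
        processed = []
        for row in scores:
            k = min(self.top_k, len(row))
            # The k-th largest score is the cut-off; ties with it are kept.
            threshold = sorted(row, reverse=True)[k - 1]
            processed.append([s if s >= threshold else self.filter_value for s in row])
        return processed


class ToyProcessorList(list):
    """Applies each processor in order, feeding the output of one into the next."""

    def __call__(self, input_ids: List[List[int]], scores: Scores) -> Scores:
        for processor in self:
            scores = processor(input_ids, scores)
        return scores


processors = ToyProcessorList([ToyTemperature(0.5), ToyTopK(2)])
filtered = processors([[0]], [[1.0, 3.0, 2.0, 0.5]])
```

Masked positions receive negative infinity, so a softmax over the processed row would assign them zero probability. The order of the list matters: here the temperature is applied before the top-k filter.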

PyTorch transformers.AlternatingCodebooksLogitsProcessor

autodoc AlternatingCodebooksLogitsProcessor - call

autodoc ClassifierFreeGuidanceLogitsProcessor - call

autodoc EncoderNoRepeatNGramLogitsProcessor - call

autodoc EncoderRepetitionPenaltyLogitsProcessor - call

autodoc EpsilonLogitsWarper - call

autodoc EtaLogitsWarper - call

autodoc ExponentialDecayLengthPenalty - call

autodoc ForcedBOSTokenLogitsProcessor - call

autodoc ForcedEOSTokenLogitsProcessor - call

autodoc HammingDiversityLogitsProcessor - call

autodoc InfNanRemoveLogitsProcessor - call

autodoc LogitNormalization - call

autodoc LogitsProcessor - call

autodoc LogitsProcessorList - call

autodoc MinLengthLogitsProcessor - call

autodoc MinNewTokensLengthLogitsProcessor - call

autodoc MinPLogitsWarper - call

autodoc NoBadWordsLogitsProcessor - call

autodoc NoRepeatNGramLogitsProcessor - call

autodoc PrefixConstrainedLogitsProcessor - call

autodoc RepetitionPenaltyLogitsProcessor - call

autodoc SequenceBiasLogitsProcessor - call

autodoc SuppressTokensAtBeginLogitsProcessor - call

autodoc SuppressTokensLogitsProcessor - call

autodoc TemperatureLogitsWarper - call

autodoc TopKLogitsWarper - call

autodoc TopPLogitsWarper - call

autodoc TypicalLogitsWarper - call

autodoc UnbatchedClassifierFreeGuidanceLogitsProcessor - call

autodoc WhisperTimeStampLogitsProcessor - call

autodoc WatermarkLogitsProcessor - call

TensorFlow transformers.TFForcedBOSTokenLogitsProcessor

autodoc TFForcedBOSTokenLogitsProcessor - call

autodoc TFForcedEOSTokenLogitsProcessor - call

autodoc TFForceTokensLogitsProcessor - call

autodoc TFLogitsProcessor - call

autodoc TFLogitsProcessorList - call

autodoc TFLogitsWarper - call

autodoc TFMinLengthLogitsProcessor - call

autodoc TFNoBadWordsLogitsProcessor - call

autodoc TFNoRepeatNGramLogitsProcessor - call

autodoc TFRepetitionPenaltyLogitsProcessor - call

autodoc TFSuppressTokensAtBeginLogitsProcessor - call

autodoc TFSuppressTokensLogitsProcessor - call

autodoc TFTemperatureLogitsWarper - call

autodoc TFTopKLogitsWarper - call

autodoc TFTopPLogitsWarper - call

FLAX transformers.FlaxForcedBOSTokenLogitsProcessor

autodoc FlaxForcedBOSTokenLogitsProcessor - call

autodoc FlaxForcedEOSTokenLogitsProcessor - call

autodoc FlaxForceTokensLogitsProcessor - call

autodoc FlaxLogitsProcessor - call

autodoc FlaxLogitsProcessorList - call

autodoc FlaxLogitsWarper - call

autodoc FlaxMinLengthLogitsProcessor - call

autodoc FlaxSuppressTokensAtBeginLogitsProcessor - call

autodoc FlaxSuppressTokensLogitsProcessor - call

autodoc FlaxTemperatureLogitsWarper - call

autodoc FlaxTopKLogitsWarper - call

autodoc FlaxTopPLogitsWarper - call

autodoc FlaxWhisperTimeStampLogitsProcessor - call

StoppingCriteria transformers.StoppingCriteria

A [StoppingCriteria] is used to decide when generation should stop (other than on the EOS token). This feature is only available in the PyTorch implementation.
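The role of a stopping criterion can be sketched with a toy decoding loop. Everything below is an assumption made for illustration: ToyMaxLength, ToyStopSequence, ToyCriteriaList, and toy_generate are hypothetical names, the "model" is just a function that returns the next token id, and the criteria return a single Python bool instead of operating on tensors.

```python
from typing import Callable, List


class ToyMaxLength:
    """Hypothetical criterion: stop once the sequence reaches max_length tokens."""

    def __init__(self, max_length: int):
        self.max_length = max_length

    def __call__(self, input_ids: List[List[int]], scores=None) -> bool:
        # All rows of a batch have the same length, so checking the first is enough.
        return len(input_ids[0]) >= self.max_length


class ToyStopSequence:
    """Hypothetical criterion: stop once every row ends with the given token ids."""

    def __init__(self, stop_ids: List[int]):
        self.stop_ids = list(stop_ids)

    def __call__(self, input_ids: List[List[int]], scores=None) -> bool:
        n = len(self.stop_ids)
        return all(row[-n:] == self.stop_ids for row in input_ids)


class ToyCriteriaList(list):
    """Generation stops as soon as any criterion in the list is satisfied."""

    def __call__(self, input_ids: List[List[int]], scores=None) -> bool:
        return any(criterion(input_ids, scores) for criterion in self)


def toy_generate(prompt: List[int], next_token: Callable[[List[int]], int], criteria) -> List[int]:
    """Append tokens one at a time until the criteria say to stop."""
    input_ids = [list(prompt)]
    while not criteria(input_ids):
        input_ids[0].append(next_token(input_ids[0]))
    return input_ids[0]


def count_up(ids: List[int]) -> int:
    # Stand-in for a model: always emits the previous token id plus one.
    return ids[-1] + 1


stopped_by_sequence = toy_generate([5], count_up, ToyCriteriaList([ToyMaxLength(10), ToyStopSequence([7, 8])]))
stopped_by_length = toy_generate([5], count_up, ToyCriteriaList([ToyMaxLength(3)]))
```

In the first call the stop sequence fires before the length limit is reached; in the second call only the length limit applies.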

autodoc StoppingCriteria - call

autodoc StoppingCriteriaList - call

autodoc MaxLengthCriteria - call

autodoc MaxTimeCriteria - call

autodoc StopStringCriteria - call

autodoc EosTokenCriteria - call

Constraint transformers.Constraint

A [Constraint] is used to force the generated output to include specific tokens or sequences. This feature is only available in the PyTorch implementation.
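As a rough illustration of what "forcing a sequence" involves, the sketch below tracks progress through a required phrase. ToyPhrasalConstraint is a hypothetical class written for this example and is not the library's PhrasalConstraint. It also ignores model scores entirely, whereas constrained beam search weighs forced tokens against the model's predictions.

```python
from typing import List, Optional


class ToyPhrasalConstraint:
    """Hypothetical constraint: the output must contain token_ids as a contiguous run."""

    def __init__(self, token_ids: List[int]):
        if not token_ids:
            raise ValueError("token_ids must not be empty")
        self.token_ids = list(token_ids)
        self.position = 0  # number of phrase tokens matched so far

    @property
    def completed(self) -> bool:
        return self.position == len(self.token_ids)

    def remaining(self) -> int:
        """Number of tokens still needed to fulfil the constraint."""
        return len(self.token_ids) - self.position

    def advance(self) -> Optional[int]:
        """Token that would make progress next, or None once fulfilled."""
        return None if self.completed else self.token_ids[self.position]

    def update(self, token_id: int) -> bool:
        """Record a generated token; return True if it made progress."""
        if self.completed:
            return False
        if token_id == self.token_ids[self.position]:
            self.position += 1
            return True
        # A mismatch breaks the phrase, so progress starts over (a simplification).
        self.position = 0
        return False


# Greedy "forced" decoding: always emit the token the constraint asks for.
constraint = ToyPhrasalConstraint([42, 7])
generated = [1, 2]
while not constraint.completed:
    token = constraint.advance()
    constraint.update(token)
    generated.append(token)

# A wrong token in the middle of the phrase resets the progress.
interrupted = ToyPhrasalConstraint([42, 7])
interrupted.update(42)
interrupted.update(9)
```

After the loop, generated ends with the required phrase, while interrupted is back to needing both tokens.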

autodoc Constraint

autodoc PhrasalConstraint

autodoc DisjunctiveConstraint

autodoc ConstraintListState

BeamSearch transformers.BeamScorer

autodoc BeamScorer - process - finalize

autodoc BeamSearchScorer - process - finalize

autodoc ConstrainedBeamSearchScorer - process - finalize

Streamers transformers.TextStreamer

autodoc TextStreamer

autodoc TextIteratorStreamer

Caches transformers.Cache

autodoc Cache - update

autodoc CacheConfig - update

autodoc QuantizedCacheConfig - validate

autodoc DynamicCache - update - get_seq_length - reorder_cache - to_legacy_cache - from_legacy_cache

autodoc QuantizedCache - update - get_seq_length

autodoc QuantoQuantizedCache

autodoc HQQQuantizedCache

autodoc OffloadedCache - update - prefetch_layer - evict_previous_layer

autodoc StaticCache - update - get_seq_length - reset

autodoc OffloadedStaticCache - update - get_seq_length - reset

autodoc HybridCache - update - get_seq_length - reset

autodoc SlidingWindowCache - update - reset

autodoc EncoderDecoderCache - get_seq_length - to_legacy_cache - from_legacy_cache - reset - reorder_cache

autodoc MambaCache - update_conv_state - update_ssm_state - reset

Watermark Utils transformers.WatermarkDetector

autodoc WatermarkDetector - call
