Mirror of https://github.com/huggingface/transformers.git · synced 2025-07-31 02:02:21 +06:00
fix docstring
This commit is contained in:
parent 17b10ce339
commit 46ca0da436
@@ -1314,25 +1314,6 @@ class SlidingWindowCache(Cache):
    55, 56, 57, 58, 59, 60, 61, 62, 63,  0])

We overwrite the cache using these indices, and then we always write at `cache_position` (clamped to `sliding_window`).
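As a rough illustration of that clamped write position (a standalone sketch, not code from this commit; the window size of 64 is assumed to match the indices shown above):

```python
import torch

# Assumed window size, matching the wrap-around indices printed above (... 62, 63, 0).
sliding_window = 64

# Absolute positions of the tokens being written in this step (values chosen for illustration).
cache_position = torch.arange(60, 70)

# New key/value states are written at cache_position clamped to the window,
# so writes never index past the fixed-size cache buffer.
write_pos = cache_position.clamp(max=sliding_window - 1)
print(write_pos)  # tensor([60, 61, 62, 63, 63, 63, 63, 63, 63, 63])
```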
Parameters:
    config (`PretrainedConfig`):
        The configuration file defining the shape-related attributes required to initialize the static cache.
    max_batch_size (`int`):
        The maximum batch size with which the model will be used. Note that a new instance must be
        instantiated if a smaller batch size is used.
    max_cache_len (`int`, *optional*):
        The maximum sequence length with which the model will be used.
    device (`torch.device` or `str`, *optional*):
        The device on which the cache should be initialized. If you're using more than one computation device,
        you should pass the `layer_device_map` argument instead.
    dtype (`torch.dtype`, *optional*, defaults to `torch.float32`):
        The default `dtype` to use when initializing the layer.
    layer_device_map (`Optional[dict[int, Union[str, torch.device, int]]]`, *optional*):
        Mapping between the layers and their devices. This is required when you are manually initializing the
        cache and the model is split between different GPUs. You can check which layers are mapped to which
        device via the associated device map: `model.hf_device_map`.

Example:
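A minimal usage sketch for the example section, in the doctest style these docstrings use; the checkpoint name and generation length are illustrative assumptions, and the constructor arguments follow the parameters listed above:

```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, SlidingWindowCache

>>> # Checkpoint chosen for illustration only; any model with sliding-window attention works.
>>> model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
>>> tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

>>> inputs = tokenizer(text="My name is Mistral", return_tensors="pt")

>>> # Prepare a cache instance and pass it to the model's forward call.
>>> max_generated_length = inputs.input_ids.shape[1] + 10
>>> past_key_values = SlidingWindowCache(
...     config=model.config,
...     max_batch_size=1,
...     max_cache_len=max_generated_length,
...     device=model.device,
...     dtype=model.dtype,
... )
>>> outputs = model(**inputs, past_key_values=past_key_values, use_cache=True)
>>> outputs.past_key_values  # access the cache filled with key/values from the forward pass
SlidingWindowCache()
```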