<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Load pretrained instances with an AutoClass
With so many different Transformer architectures, it can be challenging to create one for your checkpoint. As part of the 🤗 Transformers core philosophy of making the library easy, simple, and flexible to use, an `AutoClass` automatically infers and loads the correct architecture from a given checkpoint. The `from_pretrained()` method lets you quickly load a pretrained model for any architecture so you don't have to devote time and resources to training a model from scratch. Producing this type of checkpoint-agnostic code means that if your code works for one checkpoint, it will work with another checkpoint - as long as it was trained for a similar task - even if the architecture is different.

<Tip>

Remember, architecture refers to the skeleton of the model and checkpoints are the weights for a given architecture. For example, [BERT](https://huggingface.co/bert-base-uncased) is an architecture, while `bert-base-uncased` is a checkpoint. Model is a general term that can mean either architecture or checkpoint.

</Tip>

In this tutorial, learn to:

* Load a pretrained tokenizer.
* Load a pretrained feature extractor.
* Load a pretrained processor.
* Load a pretrained model.

## AutoTokenizer

Nearly every NLP task begins with a tokenizer. A tokenizer converts your input into a format that can be processed by the model.

Load a tokenizer with [`AutoTokenizer.from_pretrained`]:
```py
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```
Then tokenize your input as shown below:
```py
>>> sequence = "In a hole in the ground there lived a hobbit."
>>> print(tokenizer(sequence))
{'input_ids': [101, 1999, 1037, 4920, 1999, 1996, 2598, 2045, 2973, 1037, 7570, 10322, 4183, 1012, 102],
 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
```
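The tokenizer returns the `input_ids`, `token_type_ids`, and `attention_mask` the model expects. As a quick sanity check, you can map the `input_ids` back to text with the tokenizer's `decode` method. This is only an illustrative sketch reusing the `tokenizer` and `sequence` from above; the special tokens shown (`[CLS]`, `[SEP]`) depend on the checkpoint:

```py
>>> # Recover readable text from the token ids (special tokens come from the checkpoint).
>>> encoded = tokenizer(sequence)
>>> tokenizer.decode(encoded["input_ids"])
'[CLS] in a hole in the ground there lived a hobbit. [SEP]'
```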
## AutoFeatureExtractor

For audio and vision tasks, a feature extractor processes the audio signal or image into the correct input format.

Load a feature extractor with [`AutoFeatureExtractor.from_pretrained`]:
```py
>>> from transformers import AutoFeatureExtractor

>>> feature_extractor = AutoFeatureExtractor.from_pretrained(
...     "ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition"
... )
```
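To see what a feature extractor produces, you can call it on a raw waveform. The snippet below is only a sketch: it uses a silent dummy array in place of real audio and assumes the checkpoint expects 16 kHz input (the usual rate for Wav2Vec2-style models):

```py
>>> import numpy as np

>>> # One second of silence standing in for a real recording (illustrative only).
>>> dummy_audio = np.zeros(16000, dtype=np.float32)
>>> inputs = feature_extractor(dummy_audio, sampling_rate=16000, return_tensors="np")
>>> inputs["input_values"].shape
(1, 16000)
```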
## AutoProcessor

Multimodal tasks require a processor that combines two types of preprocessing tools. For example, the [LayoutLMV2](model_doc/layoutlmv2) model requires a feature extractor to handle images and a tokenizer to handle text; a processor combines both of them.

Load a processor with [`AutoProcessor.from_pretrained`]:
```py
>>> from transformers import AutoProcessor

>>> processor = AutoProcessor.from_pretrained("microsoft/layoutlmv2-base-uncased")
```
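Under the hood, the processor is a thin wrapper around the two preprocessing tools it combines. As a rough illustration (the attribute and class names below reflect the LayoutLMv2 processor and may differ for other checkpoints or library versions), you can inspect its parts:

```py
>>> # The processor bundles an image feature extractor and a text tokenizer.
>>> print(type(processor.feature_extractor).__name__)  # e.g. LayoutLMv2FeatureExtractor
>>> print(type(processor.tokenizer).__name__)  # e.g. LayoutLMv2TokenizerFast
```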
## AutoModel

<frameworkcontent>
<pt>
Finally, the `AutoModelFor` classes let you load a pretrained model for a given task (see [here](model_doc/auto) for a complete list of available tasks). For example, load a model for sequence classification with [`AutoModelForSequenceClassification.from_pretrained`]:
```py
>>> from transformers import AutoModelForSequenceClassification

>>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
```
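To check that a tokenizer and model work together, you can run a quick forward pass. This is only a minimal sketch: `distilbert-base-uncased` was not fine-tuned for classification, so its classification head is randomly initialized and the logits are not meaningful predictions:

```py
>>> import torch

>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
>>> inputs = tokenizer("In a hole in the ground there lived a hobbit.", return_tensors="pt")
>>> with torch.no_grad():
...     outputs = model(**inputs)
>>> outputs.logits.shape
torch.Size([1, 2])
```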
Easily reuse the same checkpoint to load an architecture for a different task:
```py
>>> from transformers import AutoModelForTokenClassification

>>> model = AutoModelForTokenClassification.from_pretrained("distilbert-base-uncased")
```
Generally, we recommend using the `AutoTokenizer` class and the `AutoModelFor` class to load pretrained instances of models. This will ensure you load the correct architecture every time. In the next [tutorial](preprocessing), learn how to use your newly loaded tokenizer, feature extractor and processor to preprocess a dataset for fine-tuning.
</pt>
<tf>
Finally, the `TFAutoModelFor` classes let you load a pretrained model for a given task (see [here](model_doc/auto) for a complete list of available tasks). For example, load a model for sequence classification with [`TFAutoModelForSequenceClassification.from_pretrained`]:
```py
>>> from transformers import TFAutoModelForSequenceClassification

>>> model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
```
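As with PyTorch, you can sanity-check the TensorFlow model with a quick forward pass. Again, this is only a sketch, and the randomly initialized classification head means the logits are not meaningful predictions:

```py
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
>>> inputs = tokenizer("In a hole in the ground there lived a hobbit.", return_tensors="tf")
>>> outputs = model(inputs)
>>> outputs.logits.shape
TensorShape([1, 2])
```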
Easily reuse the same checkpoint to load an architecture for a different task:
```py
>>> from transformers import TFAutoModelForTokenClassification

>>> model = TFAutoModelForTokenClassification.from_pretrained("distilbert-base-uncased")
```
Generally, we recommend using the `AutoTokenizer` class and the `TFAutoModelFor` class to load pretrained instances of models. This will ensure you load the correct architecture every time. In the next [tutorial](preprocessing), learn how to use your newly loaded tokenizer, feature extractor and processor to preprocess a dataset for fine-tuning.
</tf>
</frameworkcontent>