<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Sharing custom models

The 🤗 Transformers library is designed to be easily extensible. Every model is fully coded in a given subfolder
of the repository with no abstraction, so you can easily copy a modeling file and tweak it to your needs.

Once you are happy with those tweaks and have trained a model you want to share with the community, a few simple
steps let you push to the Model Hub not only your model's weights, but also the code it relies on, so that anyone
in the community can use it, even if it's not present in the 🤗 Transformers library.

This also applies to configurations and tokenizers (support for feature extractors and processors is coming soon).
## Sending the code to the Hub

First, make sure your model is fully defined in a `.py` file. It can rely on relative imports to some other files as
long as all the files are in the same directory (we don't support submodules for this feature yet). For instance,
let's say you have a `modeling.py` file and a `configuration.py` file in a folder of the current working directory
named `awesome_model`, where the modeling file defines an `AwesomeModel` and the configuration file an `AwesomeConfig`:
```
.
└── awesome_model
    ├── __init__.py
    ├── configuration.py
    └── modeling.py
```
The `__init__.py` can be empty; it's just there so that Python detects that `awesome_model` can be used as a module.
Here is an example of what the configuration file could look like:
```py
from transformers import PretrainedConfig


class AwesomeConfig(PretrainedConfig):
    model_type = "awesome"

    def __init__(self, attribute=1, hidden_size=42, **kwargs):
        self.attribute = attribute
        self.hidden_size = hidden_size
        super().__init__(**kwargs)
```
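
As a quick sanity check (not part of the files you will ship), the custom configuration behaves like any other
🤗 Transformers configuration, so it can be saved and reloaded; the `"awesome-config"` folder name below is just an
example:

```py
from awesome_model.configuration import AwesomeConfig

config = AwesomeConfig(attribute=2)
config.save_pretrained("awesome-config")  # writes a config.json with model_type "awesome"
reloaded = AwesomeConfig.from_pretrained("awesome-config")
assert reloaded.attribute == 2
```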
and the modeling file could have content like this:
```py
import torch

from transformers import PreTrainedModel

from .configuration import AwesomeConfig


class AwesomeModel(PreTrainedModel):
    config_class = AwesomeConfig
    base_model_prefix = "base"

    def __init__(self, config):
        super().__init__(config)
        self.linear = torch.nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, x):
        return self.linear(x)
```
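
Here is a minimal sanity check, assuming the folder layout above, to confirm the toy model runs end to end:

```py
import torch

from awesome_model.configuration import AwesomeConfig
from awesome_model.modeling import AwesomeModel

config = AwesomeConfig(hidden_size=42)
model = AwesomeModel(config)

inputs = torch.randn(1, 42)  # a batch of one vector with `hidden_size` features
outputs = model(inputs)
print(outputs.shape)  # torch.Size([1, 42])
```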
`AwesomeModel` should subclass [`PreTrainedModel`] and `AwesomeConfig` should subclass [`PretrainedConfig`]. The
easiest way to achieve this is to copy the modeling and configuration files of the model closest to the one you're
coding, then tweak them.
<Tip warning={true}>
If you copy modeling files from the library, you will need to replace all the relative imports at the top of the file
with imports from the `transformers` package.
</Tip>
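
For example, a modeling file copied from inside the library might begin with relative imports like the ones sketched
below (the exact import lines depend on the model you copy):

```py
# Inside the library, the file uses relative imports such as:
# from ...activations import ACT2FN
# from ...modeling_utils import PreTrainedModel

# In your copy, rewrite them as absolute imports from the package:
from transformers import PreTrainedModel
from transformers.activations import ACT2FN
```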
Note that you can re-use (or subclass) an existing configuration/model.
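
For instance, here is a minimal sketch of subclassing an existing model instead of writing one from scratch (the
`AwesomeBert*` names are hypothetical):

```py
from transformers import BertConfig, BertModel


class AwesomeBertConfig(BertConfig):
    model_type = "awesome-bert"


class AwesomeBertModel(BertModel):
    config_class = AwesomeBertConfig
```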
To share your model with the community, follow these steps. First, import the custom objects:
```py
from awesome_model.configuration import AwesomeConfig
from awesome_model.modeling import AwesomeModel
```
Then tell the library that you want to copy the code files of those objects when using the `save_pretrained`
method, and register them properly with a given auto class (especially for models). Just run:
```py
AwesomeConfig.register_for_auto_class()
AwesomeModel.register_for_auto_class("AutoModel")
```
Note that there is no need to specify an auto class for the configuration (there is only one auto class for them,
[`AutoConfig`]) but it's different for models. Your custom model could be suitable for sequence classification (in
which case you should do `AwesomeModel.register_for_auto_class("AutoModelForSequenceClassification")`) or any other
task, so you have to specify which one of the auto classes is the correct one for your model.
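
For instance, if your modeling file also defined a (hypothetical) `AwesomeModelForSequenceClassification`, the
registration could look like this:

```py
AwesomeConfig.register_for_auto_class()
AwesomeModel.register_for_auto_class("AutoModel")
AwesomeModelForSequenceClassification.register_for_auto_class("AutoModelForSequenceClassification")
```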
Next, just create the config and model as you would for any other model in the library:
```py
config = AwesomeConfig()
model = AwesomeModel(config)
```
then train your model. Alternatively, you can load a checkpoint you have already trained into your model.
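
For the latter, here is a minimal sketch (the checkpoint path is hypothetical):

```py
model = AwesomeModel.from_pretrained("path/to/your/checkpoint")
```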
Once everything is ready, you just have to do:
```py
model.save_pretrained("save_dir")
```
which will not only save the model weights and the configuration in JSON format, but also copy the modeling and
configuration `.py` files into this folder, so you can directly upload the result to the Hub.
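
Assuming a PyTorch checkpoint, `save_dir` should then contain something like:

```
save_dir
├── config.json
├── configuration.py
├── modeling.py
└── pytorch_model.bin
```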
If you have already logged in to Hugging Face with
```bash
huggingface-cli login
```
or in a notebook with
```py
from huggingface_hub import notebook_login
notebook_login()
```
you can push your model and its code to the Hub with the following:
```py
model.push_to_hub("model-identifier")
```
See the [sharing tutorial](model_sharing) for more information on the `push_to_hub` method.
## Using a model with custom code
You can use any configuration, model, or tokenizer with custom code files in its repository with the auto classes and
the `from_pretrained` method. The only requirement is that you pass an extra argument to confirm you have read the
online code and trust the author of that model; this avoids executing malicious code on your machine:
```py
from transformers import AutoModel
model = AutoModel.from_pretrained("model-checkpoint", trust_remote_code=True)
```
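
The same flag works for configurations and tokenizers with custom code, for example:

```py
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("model-checkpoint", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("model-checkpoint", trust_remote_code=True)
```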
It is also strongly encouraged to pass a commit hash as a `revision` to make sure the author of the model did not
update the code with some malicious new lines (unless you fully trust the author):
```py
commit_hash = "b731e5fae6d80a4a775461251c4388886fb7a249"
model = AutoModel.from_pretrained("model-checkpoint", trust_remote_code=True, revision=commit_hash)
```
Note that when browsing the commit history of the model repo on the Hub, there is a button to easily copy the commit
hash of any commit.
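
If you prefer to fetch a hash programmatically, one option is a sketch like the one below using `huggingface_hub`
(it grabs the latest commit hash, so only pin it after you have reviewed the code at that commit):

```py
from huggingface_hub import HfApi

# Latest commit hash of the repo ("model-checkpoint" stands in for the real repo id)
commit_hash = HfApi().model_info("model-checkpoint").sha
```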