Model sharing rst (#8439)

* Update RST
* Finer details
* Re-organize
* Style
2025-07-31 10:12:23 +06:00 · 2020-11-10 08:35:11 -05:00 · 2020-11-10 08:35:11 -05:00 · 9cebee38ad
commit 9cebee38ad
parent ad2303a401
1 changed files with 72 additions and 67 deletions
--- a/docs/source/model_sharing.rst
+++ b/docs/source/model_sharing.rst
@@ -18,39 +18,65 @@ done something similar on your task, either using the model directly in your own
 :class:`~.transformers.Trainer`/:class:`~.transformers.TFTrainer` class. Let's see how you can share the result on the
 `model hub <https://huggingface.co/models>`__.

+Model versioning
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since version v3.5.0, the model hub has built-in model versioning based on git and git-lfs. It is based on the paradigm
+that one model *is* one repo.
+
+This allows:
+
+- built-in versioning
+- access control
+- scalability
+
+This is built around *revisions*, which are a way to pin a specific version of a model, using a commit hash, tag or
+branch.
+
+For instance:
+
+.. code-block::
+
+    >>> tokenizer = AutoTokenizer.from_pretrained(
+    ...     "julien-c/EsperBERTo-small",
+    ...     revision="v2.0.1",  # tag name, or branch name, or commit hash
+    ... )
+
 Basic steps
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-.. 
-    When #5258 is merged, we can remove the need to create the directory.
+To upload a model, you first need to create a git repo. This repo will live on the model hub, allowing users to
+clone it and you (and your organization members) to push to it. Start by making sure you are logged in with
+``transformers-cli``.

-First, pick a directory with the name you want your model to have on the model hub (its full name will then be
-`username/awesome-name-you-picked` or `organization/awesome-name-you-picked`) and create it with either
+Open a terminal and run the following command. Run it in the virtual environment where you installed 🤗
+Transformers, since the :obj:`transformers-cli` command comes from the library.

 .. code-block::

-    mkdir path/to/awesome-name-you-picked
+    transformers-cli login

-or in python
+
+Once you are logged in with your model hub credentials, you can start building your repositories. To create a repo:

 .. code-block::

-    import os
-    os.makedirs("path/to/awesome-name-you-picked")
+    transformers-cli repo create your-model-name

-then you can save your model and tokenizer with:
+This creates a repo on the model hub, which can be cloned. You can then add files to or remove files from that repo
+as you would with any other git repo.

 .. code-block::

-    model.save_pretrained("path/to/awesome-name-you-picked")
-    tokenizer.save_pretrained("path/to/awesome-name-you-picked")
+    git clone https://huggingface.co/username/your-model-name

-Or, if you're using the Trainer API
+    # Then commit as usual
+    cd your-model-name
+    echo "hello" >> README.md
+    git add . && git commit -m "Update from $USER"

-.. code-block::
+We intentionally keep the wrapping around git thin, so that the workflow stays intuitive and easy to use.

-    trainer.save_model("path/to/awesome-name-you-picked")
-    tokenizer.save_pretrained("path/to/awesome-name-you-picked")

 Make your model work on all frameworks
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -71,13 +97,13 @@ or removing TF. For instance, if you trained a :class:`~transformers.DistilBertF

 .. code-block::

-    from transformers import TFDistilBertForSequenceClassification
+    >>> from transformers import TFDistilBertForSequenceClassification

 and if you trained a :class:`~transformers.TFDistilBertForSequenceClassification`, try to type

 .. code-block::

-    from transformers import DistilBertForSequenceClassification
+    >>> from transformers import DistilBertForSequenceClassification

 This will give back an error if your model does not exist in the other framework (something that should be pretty rare
 since we're aiming for full parity between the two frameworks). In this case, skip this and go to the next step.
@@ -87,20 +113,20 @@ model class:

 .. code-block::

-    tf_model = TFDistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_pt=True)
-    tf_model.save_pretrained("path/to/awesome-name-you-picked")
+    >>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_pt=True)
+    >>> tf_model.save_pretrained("path/to/awesome-name-you-picked")

 and if you trained your model in TensorFlow and have to create a PyTorch version, adapt the following code to your
 model class:

 .. code-block::

-    pt_model = DistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_tf=True)
-    pt_model.save_pretrained("path/to/awesome-name-you-picked")
+    >>> pt_model = DistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_tf=True)
+    >>> pt_model.save_pretrained("path/to/awesome-name-you-picked")

 That's all there is to it!

-Check the directory before uploading
+Check the directory before pushing to the model hub
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Make sure there are no garbage files in the directory you'll upload. It should only have:
@@ -116,62 +142,46 @@ Make sure there are no garbage files in the directory you'll upload. It should o

 Other files can safely be deleted.

-Upload your model with the CLI
+
+Uploading your files
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Now go in a terminal and run the following command. It should be in the virtual environment where you installed 🤗
-Transformers, since that command :obj:`transformers-cli` comes from the library.
+Once the repo is cloned, you can add the model, configuration and tokenizer files. For instance, saving the model and
+tokenizer files:

 .. code-block::

-    transformers-cli login
+    >>> model.save_pretrained("path/to/repo/clone/your-model-name")
+    >>> tokenizer.save_pretrained("path/to/repo/clone/your-model-name")

-Then log in using the same credentials as on huggingface.co. To upload your model, just type
+Or, if you're using the Trainer API:

 .. code-block::

-    transformers-cli upload path/to/awesome-name-you-picked/
+    >>> trainer.save_model("path/to/repo/clone/your-model-name")

-This will upload the folder containing the weights, tokenizer and configuration we prepared in the previous section.
-
-By default you will be prompted to confirm that you want these files to be uploaded. If you are uploading multiple
-models and need to script that process, you can add `-y` to bypass the prompt. For example:
+You can then add these files to the staging area and verify that they have been correctly staged with the
+``git status`` command:

 .. code-block::

-    transformers-cli upload -y path/to/awesome-name-you-picked/
+    git add --all
+    git status

-
-If you want to upload a single file (a new version of your model, or the other framework checkpoint you want to add),
-just type:
+Finally, the files should be committed:

 .. code-block::

-    transformers-cli upload path/to/awesome-name-you-picked/that-file 
+    git commit -m "First version of the your-model-name model and tokenizer."

-or
+And pushed to the remote:

 .. code-block::

-   transformers-cli upload path/to/awesome-name-you-picked/that-file --filename awesome-name-you-picked/new_name
+    git push

-if you want to change its filename.
+This will upload the folder containing the weights, tokenizer and configuration we have just prepared.

-This uploads the model to your personal account. If you want your model to be namespaced by your organization name
-rather than your username, add the following flag to any command:
-
-.. code-block::
-
-    --organization organization_name
-
-so for instance:
-
-.. code-block::
-
-    transformers-cli upload path/to/awesome-name-you-picked/ --organization organization_name
-
-Your model will then be accessible through its identifier, which is, as we saw above,
-`username/awesome-name-you-picked` or `organization/awesome-name-you-picked`.

 Add a model card
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -203,20 +213,15 @@ Anyone can load it from code:

 .. code-block::

-    tokenizer = AutoTokenizer.from_pretrained("namespace/awesome-name-you-picked")
-    model = AutoModel.from_pretrained("namespace/awesome-name-you-picked")
+    >>> tokenizer = AutoTokenizer.from_pretrained("namespace/awesome-name-you-picked")
+    >>> model = AutoModel.from_pretrained("namespace/awesome-name-you-picked")

-Additional commands
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-You can list all the files you uploaded on the hub like this:
+You may specify a revision by using the ``revision`` argument of the ``from_pretrained`` method:

 .. code-block::

-    transformers-cli s3 ls
-
-You can also delete unneeded files with
-
-.. code-block::
-
-    transformers-cli s3 rm awesome-name-you-picked/filename
+    >>> tokenizer = AutoTokenizer.from_pretrained(
+    ...     "julien-c/EsperBERTo-small",
+    ...     revision="v2.0.1",  # tag name, or branch name, or commit hash
+    ... )
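
The upload steps from "Uploading your files" (stage, check, commit, push) could also be scripted from Python. The following is only a minimal sketch and not part of 🤗 Transformers: the helper names ``upload_commands`` and ``run_upload`` are hypothetical, and it assumes that ``git`` is installed and that the repo has already been cloned and filled with the saved model files.

```python
import subprocess
from pathlib import Path
from typing import List


def upload_commands(commit_message: str) -> List[List[str]]:
    """Return the git commands from the upload steps, in order (hypothetical helper)."""
    return [
        ["git", "add", "--all"],                  # stage every new or changed file
        ["git", "status"],                        # show what has been staged
        ["git", "commit", "-m", commit_message],  # record the staged files
        ["git", "push"],                          # send the commit to the remote
    ]


def run_upload(repo_dir: str, commit_message: str) -> None:
    """Run each command inside the cloned repo, stopping at the first failure."""
    repo_path = Path(repo_dir)
    # Assumption: the directory is an existing clone, so it contains a .git entry.
    if not (repo_path / ".git").exists():
        raise ValueError(f"{repo_dir} is not a git repository")
    for command in upload_commands(commit_message):
        # check=True raises CalledProcessError if a git command fails.
        subprocess.run(command, cwd=repo_path, check=True)
```

Calling ``run_upload("path/to/repo/clone/your-model-name", "First version of the your-model-name model and tokenizer.")`` would then run the same four commands shown in the section, one after the other.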