Mirror of https://github.com/huggingface/transformers.git, synced 2025-07-03 12:50:06 +06:00
Fix typos (#25936)
* fix typo * fix typo * fix typo * fix typos * fix typos * fix typo * fix typo * fix typo * fix typos * fix typo * fix typo * fix typo * fix typos * fix typos
parent b1d475f6d2
commit 0f0e1a2c2b
@@ -133,4 +133,4 @@ accelerate launch train.py
 >>> notebook_launcher(training_function)
 ```

-For more information about 🤗 Accelerate and it's rich features, refer to the [documentation](https://huggingface.co/docs/accelerate).
+For more information about 🤗 Accelerate and its rich features, refer to the [documentation](https://huggingface.co/docs/accelerate).
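For context on the hunk above, here is a minimal, hedged sketch of launching a training function from a notebook with 🤗 Accelerate; the body of `training_function` and the `num_processes` value are placeholders, only `notebook_launcher` itself comes from the documentation being fixed:

```python
from accelerate import notebook_launcher


def training_function():
    # hypothetical placeholder: build the model, dataloaders and optimizer here,
    # wrap them with accelerate.Accelerator, then run the training loop
    print("running one training process")


# spawns training_function across the available devices (num_processes is an assumption)
notebook_launcher(training_function, num_processes=2)
```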
@@ -361,7 +361,7 @@ We expect that every model added to 🤗 Transformers passes a couple of integra
 model and the reimplemented version in 🤗 Transformers have to give the exact same output up to a precision of 0.001!
 Since it is normal that the exact same model written in different libraries can give a slightly different output
 depending on the library framework, we accept an error tolerance of 1e-3 (0.001). It is not enough if the model gives
-nearly the same output, they have to be the almost identical. Therefore, you will certainly compare the intermediate
+nearly the same output, they have to be almost identical. Therefore, you will certainly compare the intermediate
 outputs of the 🤗 Transformers version multiple times against the intermediate outputs of the original implementation of
 *brand_new_bert* in which case an **efficient** debugging environment of the original repository is absolutely
 important. Here is some advice is to make your debugging environment as efficient as possible.
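To make the 1e-3 tolerance concrete, here is a hedged sketch of the kind of check such an integration test performs; the saved tensor files are placeholders, not the actual *brand_new_bert* test fixtures:

```python
import torch

# placeholders: logits produced by the original repository and by the 🤗 Transformers port
# for the exact same input
original_output = torch.load("original_logits.pt")
ported_output = torch.load("transformers_logits.pt")

# the two implementations must agree within an absolute tolerance of 1e-3
max_diff = (original_output - ported_output).abs().max()
assert torch.allclose(original_output, ported_output, atol=1e-3), f"max diff: {max_diff}"
```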
@@ -56,7 +56,7 @@ you might recall from our [general overview of 🤗 Transformers](add_new_model#
 that we are an opinionated bunch - the ease of use of 🤗 Transformers relies on consistent design choices. From
 experience, we can tell you a few important things about adding TensorFlow models:

-- Don't reinvent the wheel! More often that not, there are at least two reference implementations you should check: the
+- Don't reinvent the wheel! More often than not, there are at least two reference implementations you should check: the
 PyTorch equivalent of the model you are implementing and other TensorFlow models for the same class of problems.
 - Great model implementations survive the test of time. This doesn't happen because the code is pretty, but rather
 because the code is clear, easy to debug and build upon. If you make the life of the maintainers easy with your
@@ -101,7 +101,7 @@ TensorFlow-related pull request.

 **2. Prepare transformers dev environment**

-Having selected the model architecture, open an draft PR to signal your intention to work on it. Follow the
+Having selected the model architecture, open a draft PR to signal your intention to work on it. Follow the
 instructions below to set up your environment and open a draft PR.

 1. Fork the [repository](https://github.com/huggingface/transformers) by clicking on the 'Fork' button on the
@@ -328,7 +328,7 @@ That's it! 🎉
 ## Debugging mismatches across ML frameworks 🐛

 At some point, when adding a new architecture or when creating TensorFlow weights for an existing architecture, you
-might come across errors compaining about mismatches between PyTorch and TensorFlow. You might even decide to open the
+might come across errors complaining about mismatches between PyTorch and TensorFlow. You might even decide to open the
 model architecture code for the two frameworks, and find that they look identical. What's going on? 🤔

 First of all, let's talk about why understanding these mismatches matters. Many community members will use 🤗
@@ -351,7 +351,7 @@ ingredient here is patience. Here is our suggested workflow for when you come ac
 that you'll have to venture into the source implementation of said instruction. In some cases, you might find an
 issue with a reference implementation - don't abstain from opening an issue in the upstream repository.

-In some cases, in dicussion with the 🤗 Transformers team, we might find that the fixing the mismatch is infeasible.
+In some cases, in discussion with the 🤗 Transformers team, we might find that fixing the mismatch is infeasible.
 When the mismatch is very small in the output layers of the model (but potentially large in the hidden states), we
 might decide to ignore it in favor of distributing the model. The `pt-to-tf` CLI mentioned above has a `--max-error`
 flag to override the error message at weight conversion time.
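As an illustration, a hedged sketch of how such a PyTorch/TensorFlow mismatch can be measured by hand; the checkpoint name is a placeholder, and the official conversion check is the `pt-to-tf` CLI mentioned in the hunk:

```python
import numpy as np
from transformers import AutoTokenizer, AutoModel, TFAutoModel

checkpoint = "bert-base-uncased"  # placeholder; use the architecture you are debugging
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
text = "Debugging cross-framework mismatches"

pt_model = AutoModel.from_pretrained(checkpoint)
tf_model = TFAutoModel.from_pretrained(checkpoint, from_pt=True)  # TF weights created from the PyTorch ones

pt_out = pt_model(**tokenizer(text, return_tensors="pt")).last_hidden_state.detach().numpy()
tf_out = tf_model(**tokenizer(text, return_tensors="tf")).last_hidden_state.numpy()

# this maximum absolute difference is the quantity a --max-error threshold would be judged against
print("max abs difference:", np.abs(pt_out - tf_out).max())
```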
@@ -23,11 +23,11 @@ from PyTorch is:
 2. Load your pretrained weights.
 3. Put those pretrained weights in your random model.

-Step 1 and 2 both require a full version of the model in memory, which is not a problem in most cases, but if your model starts weighing several GigaBytes, those two copies can make you got our of RAM. Even worse, if you are using `torch.distributed` to launch a distributed training, each process will load the pretrained model and store these two copies in RAM.
+Step 1 and 2 both require a full version of the model in memory, which is not a problem in most cases, but if your model starts weighing several GigaBytes, those two copies can make you get out of RAM. Even worse, if you are using `torch.distributed` to launch a distributed training, each process will load the pretrained model and store these two copies in RAM.

 <Tip>

-Note that the randomly created model is initialized with "empty" tensors, which take the space in memory without filling it (thus the random values are whatever was in this chunk of memory at a given time). The random initialization following the appropriate distribution for the kind of model/parameters instatiated (like a normal distribution for instance) is only performed after step 3 on the non-initialized weights, to be as fast as possible!
+Note that the randomly created model is initialized with "empty" tensors, which take the space in memory without filling it (thus the random values are whatever was in this chunk of memory at a given time). The random initialization following the appropriate distribution for the kind of model/parameters instantiated (like a normal distribution for instance) is only performed after step 3 on the non-initialized weights, to be as fast as possible!

 </Tip>

@@ -120,4 +120,4 @@ If you want to directly load such a sharded checkpoint inside a model without us

 Sharded checkpoints reduce the memory usage during step 2 of the workflow mentioned above, but in order to use that model in a low memory setting, we recommend leveraging our tools based on the Accelerate library.

-Please read the following guide for more information: [Large model loading using Accelerate](./main_classes/model#large-model-loading)
+Please read the following guide for more information: [Large model loading using Accelerate](./main_classes/model#large-model-loading)
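As a hedged illustration of the low-memory path this guide points to, the `from_pretrained` options backed by Accelerate can skip the full random copy; the checkpoint name below is only an example:

```python
from transformers import AutoModelForCausalLM

# low_cpu_mem_usage avoids materializing a full randomly-initialized copy first;
# device_map="auto" lets Accelerate place the (possibly sharded) weights across available devices
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",  # example checkpoint, not prescribed by the guide
    low_cpu_mem_usage=True,
    device_map="auto",
)
```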
@@ -10,7 +10,7 @@ This page regroups resources around 🤗 Transformers developed by the community

 | Resource | Description | Author |
 |:----------|:-------------|------:|
-| [Hugging Face Transformers Glossary Flashcards](https://www.darigovresearch.com/huggingface-transformers-glossary-flashcards) | A set of flashcards based on the [Transformers Docs Glossary](glossary) that has been put into a form which can be easily learnt/revised using [Anki ](https://apps.ankiweb.net/) an open source, cross platform app specifically designed for long term knowledge retention. See this [Introductory video on how to use the flashcards](https://www.youtube.com/watch?v=Dji_h7PILrw). | [Darigov Research](https://www.darigovresearch.com/) |
+| [Hugging Face Transformers Glossary Flashcards](https://www.darigovresearch.com/huggingface-transformers-glossary-flashcards) | A set of flashcards based on the [Transformers Docs Glossary](glossary) that has been put into a form which can be easily learned/revised using [Anki ](https://apps.ankiweb.net/) an open source, cross platform app specifically designed for long term knowledge retention. See this [Introductory video on how to use the flashcards](https://www.youtube.com/watch?v=Dji_h7PILrw). | [Darigov Research](https://www.darigovresearch.com/) |

 ## Community notebooks:

@@ -209,7 +209,7 @@ Easily reuse this checkpoint for another task by switching to a different model
 The last base class you need before using a model for textual data is a [tokenizer](main_classes/tokenizer) to convert raw text to tensors. There are two types of tokenizers you can use with 🤗 Transformers:

 - [`PreTrainedTokenizer`]: a Python implementation of a tokenizer.
-- [`PreTrainedTokenizerFast`]: a tokenizer from our Rust-based [🤗 Tokenizer](https://huggingface.co/docs/tokenizers/python/latest/) library. This tokenizer type is significantly faster - especially during batch tokenization - due to it's Rust implementation. The fast tokenizer also offers additional methods like *offset mapping* which maps tokens to their original words or characters.
+- [`PreTrainedTokenizerFast`]: a tokenizer from our Rust-based [🤗 Tokenizer](https://huggingface.co/docs/tokenizers/python/latest/) library. This tokenizer type is significantly faster - especially during batch tokenization - due to its Rust implementation. The fast tokenizer also offers additional methods like *offset mapping* which maps tokens to their original words or characters.

 Both tokenizers support common methods such as encoding and decoding, adding new tokens, and managing special tokens.

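A short sketch of the *offset mapping* feature mentioned for the fast tokenizer; the checkpoint name is an example, and `return_offsets_mapping` is only available on the fast (Rust-backed) tokenizer:

```python
from transformers import AutoTokenizer

# AutoTokenizer returns the fast tokenizer for this checkpoint by default
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoding = tokenizer("Hello world!", return_offsets_mapping=True)

# each (start, end) pair maps a token back to a character span in the original text
print(encoding.tokens())
print(encoding["offset_mapping"])
```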
@@ -341,7 +341,7 @@ model. This is different from pushing the code to the Hub in the sense that user
 get the custom models (contrarily to automatically downloading the model code from the Hub).

 As long as your config has a `model_type` attribute that is different from existing model types, and that your model
-classes have the right `config_class` attributes, you can just add them to the auto classes likes this:
+classes have the right `config_class` attributes, you can just add them to the auto classes like this:

 ```py
 from transformers import AutoConfig, AutoModel, AutoModelForImageClassification
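The hunk cuts the code block short; as a hedged sketch of the registration pattern the sentence describes, using hypothetical `NewModel*` classes rather than the file's actual example:

```python
from transformers import (
    AutoConfig,
    AutoModel,
    AutoModelForImageClassification,
    PretrainedConfig,
    PreTrainedModel,
)


# hypothetical custom classes; real ones would implement the actual architecture
class NewModelConfig(PretrainedConfig):
    model_type = "new-model"


class NewModel(PreTrainedModel):
    config_class = NewModelConfig


class NewModelForImageClassification(PreTrainedModel):
    config_class = NewModelConfig


AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
AutoModelForImageClassification.register(NewModelConfig, NewModelForImageClassification)
```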
@@ -92,7 +92,7 @@ sequences that start with a lower probability initial tokens and would've been i
 - `do_sample`: if set to `True`, this parameter enables decoding strategies such as multinomial sampling, beam-search
 multinomial sampling, Top-K sampling and Top-p sampling. All these strategies select the next token from the probability
 distribution over the entire vocabulary with various strategy-specific adjustments.
-- `num_return_sequences`: the number of sequence candidates to return for each input. This options is only available for
+- `num_return_sequences`: the number of sequence candidates to return for each input. This option is only available for
 the decoding strategies that support multiple sequence candidates, e.g. variations of beam search and sampling. Decoding
 strategies like greedy search and contrastive search return a single output sequence.

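To ground these two parameters, a hedged sketch of a `generate()` call that combines them; the checkpoint name is an example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "gpt2"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("The best thing about open source is", return_tensors="pt")

# do_sample=True enables a sampling strategy, and num_return_sequences asks for 3 candidates per input
outputs = model.generate(**inputs, do_sample=True, num_return_sequences=3, max_new_tokens=20)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```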
@@ -146,7 +146,7 @@ one for summarization with beam search). You must have the right Hub permissions

 ## Streaming

-The `generate()` supports streaming, through its `streamer` input. The `streamer` input is compatible any instance
+The `generate()` supports streaming, through its `streamer` input. The `streamer` input is compatible with any instance
 from a class that has the following methods: `put()` and `end()`. Internally, `put()` is used to push new tokens and
 `end()` is used to flag the end of text generation.

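For instance, the library's `TextStreamer` implements this `put()`/`end()` interface and prints tokens as they are produced; the checkpoint name below is an example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

checkpoint = "gpt2"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("An increasing sequence: one,", return_tensors="pt")
streamer = TextStreamer(tokenizer)

# generate() pushes new tokens to the streamer via put() and signals completion via end()
model.generate(**inputs, streamer=streamer, max_new_tokens=20)
```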
@@ -301,7 +301,7 @@ the `num_beams` greater than 1, and set `do_sample=True` to use this decoding st
 The diverse beam search decoding strategy is an extension of the beam search strategy that allows for generating a more diverse
 set of beam sequences to choose from. To learn how it works, refer to [Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models](https://arxiv.org/pdf/1610.02424.pdf).
 This approach has three main parameters: `num_beams`, `num_beam_groups`, and `diversity_penalty`.
-The diversily penalty ensures the outputs are distinct across groups, and beam search is used within each group.
+The diversity penalty ensures the outputs are distinct across groups, and beam search is used within each group.


 ```python
@@ -367,7 +367,7 @@ To enable assisted decoding, set the `assistant_model` argument with a model.
 ['Alice and Bob are sitting in a bar. Alice is drinking a beer and Bob is drinking a']
 ```

-When using assisted decoding with sampling methods, you can use the `temperarure` argument to control the randomness
+When using assisted decoding with sampling methods, you can use the `temperature` argument to control the randomness
 just like in multinomial sampling. However, in assisted decoding, reducing the temperature will help improving latency.

 ```python
@@ -187,7 +187,7 @@ The model head refers to the last layer of a neural network that accepts the raw

 ### image patch

-Vision-based Transformers models split an image into smaller patches which are linearly embedded, and then passed as a sequence to the model. You can find the `patch_size` - or resolution - of the model in it's configuration.
+Vision-based Transformers models split an image into smaller patches which are linearly embedded, and then passed as a sequence to the model. You can find the `patch_size` - or resolution - of the model in its configuration.

 ### inference

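As a quick illustration of where `patch_size` lives, a hedged sketch reading it from a vision checkpoint's configuration; the checkpoint name is an example:

```python
from transformers import AutoConfig

# ViT-Base with 16x16 pixel patches; any vision checkpoint with a patch embedding works similarly
config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
print(config.patch_size)  # 16 for this checkpoint
```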
@@ -27,7 +27,7 @@ This tutorial will show you how to:

 * Generate text with an LLM
 * Avoid common pitfalls
-* Next steps to help you get the most out your LLM
+* Next steps to help you get the most out of your LLM

 Before you begin, make sure you have all the necessary libraries installed:

@@ -184,7 +184,7 @@ GPU memory occupied: 14949 MB.
 We see that already a relatively small batch size almost fills up our GPU's entire memory. However, a larger batch size
 can often result in faster model convergence or better end performance. So ideally we want to tune the batch size to our
 model's needs and not to the GPU limitations. What's interesting is that we use much more memory than the size of the model.
-To understand a bit better why this is the case let's have look at a model's operations and memory needs.
+To understand a bit better why this is the case let's have a look at a model's operations and memory needs.

 ## Anatomy of Model's Operations

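For reference, a hedged sketch of one way to take such a memory reading with plain PyTorch; note this reports memory allocated and reserved by PyTorch tensors, which can differ from the total occupied figure quoted in the guide:

```python
import torch


def print_gpu_memory():
    # memory currently allocated to tensors and reserved by the caching allocator, in MB
    allocated_mb = torch.cuda.memory_allocated() / 1024**2
    reserved_mb = torch.cuda.memory_reserved() / 1024**2
    print(f"allocated: {allocated_mb:.0f} MB, reserved: {reserved_mb:.0f} MB")


print_gpu_memory()
```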
@@ -95,7 +95,7 @@ Specify `from_pt=True` to convert a checkpoint from PyTorch to TensorFlow:
 >>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_pt=True)
 ```

-Then you can save your new TensorFlow model with it's new checkpoint:
+Then you can save your new TensorFlow model with its new checkpoint:

 ```py
 >>> tf_model.save_pretrained("path/to/awesome-name-you-picked")
@@ -201,7 +201,7 @@ Or perhaps you'd like to add the TensorFlow version of your fine-tuned PyTorch m
 >>> tf_model.push_to_hub("my-awesome-model")
 ```

-Now when you navigate to the your Hugging Face profile, you should see your newly created model repository. Clicking on the **Files** tab will display all the files you've uploaded to the repository.
+Now when you navigate to your Hugging Face profile, you should see your newly created model repository. Clicking on the **Files** tab will display all the files you've uploaded to the repository.

 For more details on how to create and upload files to a repository, refer to the Hub documentation [here](https://huggingface.co/docs/hub/how-to-upstream).

@@ -25,7 +25,7 @@ Apprentissage automatique de pointe pour [PyTorch](https://pytorch.org/), [Tenso
 🗣️ **Audio**: reconnaissance automatique de la parole et classification audio.<br>
 🐙 **Multimodalité**: système de question-réponse avec des tableaux ou images, reconnaissance optique de caractères, extraction d'information depuis des documents scannés et classification de vidéo.

-🤗 Transformers prend en charge l'interopérabilité entre PyTorch, TensorFlow et JAX. Cela permet d'utiliser un framework différent à chaque étape de la vie d'un modèle, par example entraîner un modèle en trois lignes de code avec un framework, et le charger pour l'inférence avec un autre. Les modèles peuvent également être exportés dans un format comme ONNX et TorchScript pour être déployés dans des environnements de production.
+🤗 Transformers prend en charge l'interopérabilité entre PyTorch, TensorFlow et JAX. Cela permet d'utiliser un framework différent à chaque étape de la vie d'un modèle, par exemple entraîner un modèle en trois lignes de code avec un framework, et le charger pour l'inférence avec un autre. Les modèles peuvent également être exportés dans un format comme ONNX et TorchScript pour être déployés dans des environnements de production.

 Rejoignez la communauté grandissante sur le [Hub](https://huggingface.co/models), le [forum](https://discuss.huggingface.co/) ou [Discord](https://discord.com/invite/JfAtkvEtRb) dès aujourd'hui !

@@ -407,4 +407,4 @@ Le tableau ci-dessous représente la prise en charge actuelle dans la bibliothè
 | YOLOS | ❌ | ❌ | ✅ | ❌ | ❌ |
 | YOSO | ❌ | ❌ | ✅ | ❌ | ❌ |

-<!-- End table-->
+<!-- End table-->
@@ -60,7 +60,7 @@ Le [`pipeline`] est le moyen le plus simple d'utiliser un modèle pré-entraîn
 | Traduction | Traduit du texte d'un langage à un autre | Texte | pipeline(task="translation") |
 | Classification d'image | Attribue une catégorie à une image | Image | pipeline(task="image-classification") |
 | Segmentation d'image | Attribue une catégorie à chaque pixel d'une image (supporte la segmentation sémantique, panoptique et d'instance) | Image | pipeline(task="image-segmentation") |
-| Détection d'objects | Prédit les délimitations et catégories d'objects dans une image | Image | pipeline(task="object-detection") |
+| Détection d'objets | Prédit les délimitations et catégories d'objets dans une image | Image | pipeline(task="object-detection") |
 | Classification d'audio | Attribue une catégorie à un fichier audio | Audio | pipeline(task="audio-classification") |
 | Reconnaissance automatique de la parole | Extrait le discours d'un fichier audio en texte | Audio | pipeline(task="automatic-speech-recognition") |
 | Question réponse visuels | Etant données une image et une question, répond correctement à une question sur l'image | Modalités multiples | pipeline(task="vqa") |
@@ -99,7 +99,7 @@ Le [`pipeline`] peut aussi itérer sur un jeu de données entier pour n'importe
 >>> speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
 ```

-Chargez un jeu de données audio (voir le 🤗 Datasets [Quick Start](https://huggingface.co/docs/datasets/quickstart#audio) pour plus de détails) sur lequel vous souhaitez itérer. Pour cet example, nous chargons le jeu de données [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) :
+Chargez un jeu de données audio (voir le 🤗 Datasets [Quick Start](https://huggingface.co/docs/datasets/quickstart#audio) pour plus de détails) sur lequel vous souhaitez itérer. Pour cet exemple, nous chargeons le jeu de données [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) :

 ```py
 >>> from datasets import load_dataset, Audio
@@ -155,7 +155,7 @@ Utilisez [`TFAutoModelForSequenceClassification`] et [`AutoTokenizer`] pour char
 </tf>
 </frameworkcontent>

-Specifiez le modèle et le tokenizer dans le [`pipeline`], et utilisez le `classifier` sur le texte en français :
+Spécifiez le modèle et le tokenizer dans le [`pipeline`], et utilisez le `classifier` sur le texte en français :

 ```py
 >>> classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
@@ -418,7 +418,7 @@ En fonction de votre tâche, vous passerez généralement les paramètres suivan
 >>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
 ```

-2. [`TrainingArguments`] contient les hyperparamètres du modèle que vous pouvez changer comme le taux d'apprentissage, la taille due l'échantillon, et le nombre d'époques pour s'entraîner. Les valeurs par défaut sont utilisées si vous ne spécifiez pas d'hyperparamètres d'apprentissage :
+2. [`TrainingArguments`] contient les hyperparamètres du modèle que vous pouvez changer comme le taux d'apprentissage, la taille de l'échantillon, et le nombre d'époques pour s'entraîner. Les valeurs par défaut sont utilisées si vous ne spécifiez pas d'hyperparamètres d'apprentissage :

 ```py
 >>> from transformers import TrainingArguments
@@ -547,4 +547,4 @@ Tous les modèles sont des modèles standard [`tf.keras.Model`](https://www.tens

 ## Et après ?

-Maintenant que vous avez terminé la visite rapide de 🤗 Transformers, consultez nos guides et apprenez à faire des choses plus spécifiques comme créer un modèle personnalisé, finetuner un modèle pour une tâche, et comment entraîner un modèle avec un script. Si vous souhaitez en savoir plus sur les concepts fondamentaux de 🤗 Transformers, jetez un œil à nos guides conceptuels !
+Maintenant que vous avez terminé la visite rapide de 🤗 Transformers, consultez nos guides et apprenez à faire des choses plus spécifiques comme créer un modèle personnalisé, finetuner un modèle pour une tâche, et comment entraîner un modèle avec un script. Si vous souhaitez en savoir plus sur les concepts fondamentaux de 🤗 Transformers, jetez un œil à nos guides conceptuels !