mirror of
https://github.com/huggingface/transformers.git
synced 2025-07-05 13:50:13 +06:00

<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# The Transformer model family

Since its introduction in 2017, the [original Transformer](https://arxiv.org/abs/1706.03762) model has inspired many new and exciting models that extend beyond natural language processing (NLP) tasks. There are models for [predicting the folded structure of proteins](https://huggingface.co/blog/deep-learning-with-proteins), [training a cheetah to run](https://huggingface.co/blog/train-decision-transformers), and [time series forecasting](https://huggingface.co/blog/time-series-transformers). With so many Transformer variants available, it is easy to lose sight of the bigger picture. What all these models have in common is that they are based on the original Transformer architecture. Some models use only the encoder or only the decoder, while others use both. This gives a useful taxonomy for categorizing and examining the high-level differences between models in the Transformer family, and it will help you understand Transformers you haven't encountered before.

If you aren't familiar with the original Transformer model or need a refresher, check out the [How do Transformers work](https://huggingface.co/course/chapter1/4?fw=pt) chapter of the Hugging Face course.


<div align="center">
<iframe width="560" height="315" src="https://www.youtube.com/embed/H39Z_720T5s" title="YouTube video player"
frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope;
picture-in-picture" allowfullscreen></iframe>
</div>

## Computer vision

<iframe style="border: 1px solid rgba(0, 0, 0, 0.1);" width="1000" height="450" src="https://www.figma.com/embed?embed_host=share&url=https%3A%2F%2Fwww.figma.com%2Ffile%2FacQBpeFBVvrDUlzFlkejoz%2FModelscape-timeline%3Fnode-id%3D0%253A1%26t%3Dm0zJ7m2BQ9oe0WtO-1" allowfullscreen></iframe>

### Convolutional network

For a long time, convolutional networks (CNNs) were the dominant paradigm for computer vision tasks, until the [Vision Transformer](https://arxiv.org/abs/2010.11929) demonstrated its scalability and efficiency. Even so, some of a CNN's best qualities, such as translation invariance, are so powerful (especially for certain tasks) that some Transformers incorporate convolutions in their architecture. [ConvNeXt](model_doc/convnext) goes the other way and adopts design choices from Transformers to modernize a CNN. For example, ConvNeXt uses non-overlapping sliding windows to split an image into patches and a larger kernel to increase its global receptive field. ConvNeXt also makes several layer design choices that improve memory efficiency and performance, so it competes favorably with Transformers!

### Encoder[[cv-encoder]]

The [Vision Transformer (ViT)](model_doc/vit) opened the door to computer vision tasks without convolutions. ViT uses a standard Transformer encoder, but its main breakthrough was how it treats an image: it splits the image into fixed-size patches and uses them, like tokens, to create embeddings. ViT capitalized on the Transformer's efficient architecture to demonstrate results competitive with the CNNs of the time while requiring fewer resources to train. ViT was soon followed by other vision models that can also handle dense vision tasks such as segmentation and detection.

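
The patch-splitting step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not ViT's actual implementation: the function name `patchify`, the nested-list image format, and the 4x4 toy image are all assumptions made for the example. A real ViT would additionally project each flattened patch to an embedding and add position information.

```python
def patchify(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch row by row; each patch plays the role of a token.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Hypothetical 4x4 single-channel image with pixel values 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = patchify(image, patch_size=2)
print(len(patches))  # 4
print(patches[0])    # [0, 1, 4, 5]
```

Each entry of `patches` is one flattened patch, so the sequence length seen by the encoder is the number of patches rather than the number of pixels.
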

One of these models is the [Swin](model_doc/swin) Transformer. It builds hierarchical feature maps (like a CNN and unlike ViT) from smaller patches and merges them with neighboring patches in deeper layers. Attention is computed only within a local window, and the window is shifted between attention layers to create connections that help the model learn better. Because the Swin Transformer can produce hierarchical feature maps, it is a good candidate for dense prediction tasks such as segmentation and detection. [SegFormer](model_doc/segformer) also uses a Transformer encoder to build hierarchical feature maps, but it adds a simple multilayer perceptron (MLP) decoder on top to combine all the feature maps and make a prediction.

Other vision models, such as BeIT and ViTMAE, drew inspiration from BERT's pretraining objective. [BeIT](model_doc/beit) is pretrained by *masked image modeling (MIM)*: the image patches are randomly masked, and the image is also tokenized into visual tokens. BeIT is trained to predict the visual tokens corresponding to the masked patches. [ViTMAE](model_doc/vitmae) has a similar pretraining objective, except that it must predict pixels instead of visual tokens. What is unusual is that 75% of the image patches are masked! The decoder reconstructs the pixels from the masked tokens and the encoded patches. After pretraining, the decoder is thrown away and the encoder is ready to be used in downstream tasks.

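
To make the 75% masking ratio concrete, here is a small sketch of how patch indices could be split into visible and masked sets. The function name `random_patch_mask`, the fixed seed, and the figure of 196 patches (a 14x14 grid) are assumptions chosen for illustration; this is not ViTMAE's own code.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=0):
    """Return (visible, masked) patch indices for one image."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Assumed setup: 196 patches per image, 75% of them hidden.
visible, masked = random_patch_mask(num_patches=196)
print(len(visible), len(masked))  # 49 147
```

With this split, only the visible quarter of the patches would be fed to the encoder, which is one reason such a high masking ratio keeps pretraining cheap.
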

### Decoder[[cv-decoder]]

Decoder-only vision models are rare because most vision models rely on an encoder to learn an image representation. For use cases such as image generation, however, the decoder is a natural fit, as we have seen with text generation models like GPT-2. [ImageGPT](model_doc/imagegpt) uses the same architecture as GPT-2, but instead of predicting the next token in a sequence, it predicts the next pixel in an image. In addition to image generation, ImageGPT can also be fine-tuned for image classification.

### Encoder-decoder[[cv-encoder-decoder]]

Vision models commonly use an encoder (also known as a backbone) to extract important image features before passing them to a Transformer decoder. [DETR](model_doc/detr) has a pretrained backbone, but it also uses the complete Transformer encoder-decoder architecture for object detection. The encoder learns image representations and combines them in the decoder with object queries (each object query is a learned embedding that focuses on a region or object in the image). DETR predicts the bounding box coordinates and class label for each object query.

## Natural language processing

<iframe style="border: 1px solid rgba(0, 0, 0, 0.1);" width="1000" height="450" src="https://www.figma.com/embed?embed_host=share&url=https%3A%2F%2Fwww.figma.com%2Ffile%2FUhbQAZDlpYW5XEpdFy6GoG%2Fnlp-model-timeline%3Fnode-id%3D0%253A1%26t%3D4mZMr4r1vDEYGJ50-1" allowfullscreen></iframe>

### Encoder[[nlp-encoder]]

[BERT](model_doc/bert) is an encoder-only Transformer that randomly masks some of the tokens in the input so that the model cannot simply see them. The pretraining objective is to predict the masked tokens based on the surrounding context. This allows BERT to make full use of the left and right context to learn a deeper and richer representation of the input. However, there was still room for improvement in BERT's pretraining strategy. [RoBERTa](model_doc/roberta) improved on it by introducing a new pretraining recipe that trains for longer and on larger batches, randomly masks tokens at each epoch instead of only once during preprocessing, and removes the next-sentence prediction objective.

The dominant strategy for improving performance is to increase the model size, but training large models is computationally expensive. One way to reduce the computational cost is to use a smaller model such as [DistilBERT](model_doc/distilbert). DistilBERT uses [knowledge distillation](https://arxiv.org/abs/1503.02531), a compression technique, to create a smaller version of BERT while keeping nearly all of its language understanding capabilities.

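
As a rough sketch of the idea behind knowledge distillation, the snippet below computes a soft-target loss: the student is penalized when its temperature-softened output distribution differs from the teacher's. The function names, the temperature of 2.0, and the toy logits are assumptions for illustration only, and this page does not say how DistilBERT's full training objective is composed.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature that softens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


# Made-up logits over three classes.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers a different class

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
print(loss_close < loss_far)  # True
```

A student that mimics the teacher's whole distribution, not just its top prediction, receives the lower loss, which is what lets a smaller model inherit much of the larger model's behavior.
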

However, most Transformer models continued to trend toward more parameters, which led to new models focused on improving training efficiency. [ALBERT](model_doc/albert) reduces memory consumption by lowering the number of parameters in two ways: it splits the large vocabulary embedding into two smaller matrices, and it allows layers to share parameters. [DeBERTa](model_doc/deberta) added a disentangled attention mechanism in which a word and its position are encoded separately in two vectors. Attention is computed from these separate vectors rather than from a single vector containing both the word and position embeddings. [Longformer](model_doc/longformer) also focused on making attention more efficient, especially for processing documents with long sequence lengths. It uses a combination of local windowed attention (attention computed only within a fixed-size window around each token) and global attention (only for specific task tokens, such as `[CLS]` for classification) to create a sparse attention matrix instead of a full attention matrix.

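
The sparse pattern described for Longformer can be pictured as a boolean matrix. The sketch below builds one under simple assumptions: the function name `sparse_attention_mask`, the window size, and the choice of position 0 as the global token are hypothetical and only meant to show the shape of the pattern, not Longformer's actual implementation.

```python
def sparse_attention_mask(seq_len, window, global_positions=()):
    """Build mask[i][j], True when token i may attend to token j."""
    half = window // 2
    # Local windowed attention: each token sees neighbors within `half` positions.
    mask = [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]
    for g in set(global_positions):
        for k in range(seq_len):
            mask[g][k] = True  # the global token attends to every position
            mask[k][g] = True  # every position attends to the global token
    return mask


# Assumed toy setup: 8 tokens, window of 2, position 0 acting like `[CLS]`.
mask = sparse_attention_mask(seq_len=8, window=2, global_positions=[0])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

allowed = sum(sum(row) for row in mask)
print(allowed, "of", 8 * 8, "entries are computed")
```

The printed grid shows a diagonal band plus one full row and column. As the sequence grows, the band grows linearly while a full attention matrix would grow quadratically.
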

### Decoder[[nlp-decoder]]

[GPT-2](model_doc/gpt2) is a decoder-only Transformer that predicts the next word in a sequence. It masks the tokens to the right so that the model cannot "cheat" by looking ahead. By pretraining on a massive body of text, GPT-2 became very good at generating text, even if that text is only sometimes accurate. However, GPT-2 lacked the bidirectional context of BERT's pretraining, which made it unsuitable for certain tasks. [XLNET](model_doc/xlnet) combines the best of BERT's and GPT-2's pretraining objectives by using a permutation language modeling (PLM) objective that allows it to learn bidirectionally.

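
Masking the tokens to the right is commonly done with a causal (lower-triangular) mask applied to the attention scores before the softmax. The following is a minimal sketch of that idea with made-up scores; the helper names `causal_mask` and `masked_softmax` are assumptions for this example and not GPT-2's own code.

```python
import math


def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores, allowed):
    """Softmax over one row of attention scores, ignoring disallowed positions."""
    masked = [s if ok else float("-inf") for s, ok in zip(scores, allowed)]
    peak = max(masked)
    exps = [math.exp(s - peak) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


mask = causal_mask(4)
# Attention weights for position 1, which may only look at positions 0 and 1.
weights = masked_softmax([0.2, 1.5, 0.3, 0.9], mask[1])
print(weights)
```

Positions 2 and 3 receive a weight of exactly zero, so nothing from the future can leak into the prediction for position 1.
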

After GPT-2, language models grew even larger and are now known as *large language models (LLMs)*. When pretrained on a large enough dataset, LLMs can demonstrate few-shot or even zero-shot learning. [GPT-J](model_doc/gptj) is an LLM with 6B parameters trained on 400B tokens. GPT-J was followed by [OPT](model_doc/opt), the largest of which has 175B parameters and was trained on 180B tokens. [BLOOM](model_doc/bloom) was released around the same time; the largest model in the family has 176B parameters and was trained on 366B tokens in 46 languages and 13 programming languages.

### Encoder-decoder[[nlp-encoder-decoder]]

[BART](model_doc/bart) keeps the original Transformer architecture but changes the pretraining objective to *text infilling* corruption, in which some text spans are replaced with a single `mask` token. The decoder predicts the uncorrupted tokens (future tokens are masked) and uses the encoder's hidden states to help it. [Pegasus](model_doc/pegasus) is similar to BART, but it masks entire sentences instead of text spans. In addition to masked language modeling, Pegasus is pretrained by gap sentence generation (GSG). The GSG objective masks whole sentences that are important to a document and replaces them with a `mask` token. The decoder must generate the output from the remaining sentences. [T5](model_doc/t5) is a more unusual model that casts all NLP tasks as a text-to-text problem using specific prefixes. For example, the prefix `Summarize:` indicates a summarization task. T5 is pretrained with supervised training (GLUE and SuperGLUE) and self-supervised training (randomly sampling and dropping out 15% of the tokens).

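
Text infilling is easiest to see on a toy sentence. In the sketch below, a whole span is collapsed into one mask token, so the model must also work out how many tokens are missing. The function name `infill_span`, the literal `<mask>` string, and the hand-picked span are assumptions for illustration. In actual pretraining the span positions and lengths would be sampled rather than fixed.

```python
def infill_span(tokens, start, length, mask_token="<mask>"):
    """Replace tokens[start:start + length] with a single mask token."""
    return tokens[:start] + [mask_token] + tokens[start + length:]


tokens = "the quick brown fox jumps over the lazy dog".split()
corrupted = infill_span(tokens, start=1, length=3)
print(corrupted)
# ['the', '<mask>', 'jumps', 'over', 'the', 'lazy', 'dog']
```

The encoder would see `corrupted`, while the decoder is trained to reproduce the original `tokens`.
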

## Audio

<iframe style="border: 1px solid rgba(0, 0, 0, 0.1);" width="1000" height="450" src="https://www.figma.com/embed?embed_host=share&url=https%3A%2F%2Fwww.figma.com%2Ffile%2Fvrchl8jDV9YwNVPWu2W0kK%2Fspeech-and-audio-model-timeline%3Fnode-id%3D0%253A1%26t%3DmM4H8pPMuK23rClL-1" allowfullscreen></iframe>

### Encoder[[audio-encoder]]

[Wav2Vec2](model_doc/wav2vec2) uses a Transformer encoder to learn speech representations directly from raw audio waveforms. It is pretrained with a contrastive task in which it must identify the true speech representation among a set of false ones. [HuBERT](model_doc/hubert) is similar to Wav2Vec2 but has a different training process. Target labels are created by a clustering step in which segments of similar audio are assigned to a cluster, and each cluster becomes a hidden unit. The hidden unit is mapped to an embedding to make a prediction.

### Encoder-decoder[[audio-encoder-decoder]]

[Speech2Text](model_doc/speech_to_text) is a speech model designed for automatic speech recognition (ASR) and speech translation. The model accepts log mel-filter bank features extracted from the audio waveform and is pretrained to generate a transcript or translation autoregressively. [Whisper](model_doc/whisper) is also an ASR model, but unlike many other speech models, it is pretrained on a massive amount of ✨ labeled ✨ audio transcription data to achieve zero-shot performance. A large part of the dataset contains non-English languages, so Whisper can also be used for low-resource languages. Structurally, Whisper is similar to Speech2Text. The audio signal is converted to a log-mel spectrogram, which is encoded by the encoder. The decoder generates the transcript autoregressively from the encoder's hidden states and the previous tokens.

## Multimodal

<iframe style="border: 1px solid rgba(0, 0, 0, 0.1);" width="1000" height="450" src="https://www.figma.com/embed?embed_host=share&url=https%3A%2F%2Fwww.figma.com%2Ffile%2FcX125FQHXJS2gxeICiY93p%2Fmultimodal%3Fnode-id%3D0%253A1%26t%3DhPQwdx3HFPWJWnVf-1" allowfullscreen></iframe>

### Encoder[[mm-encoder]]

[VisualBERT](model_doc/visual_bert) is a multimodal model for vision-language tasks released after BERT. It combines BERT with a pretrained object detection system that extracts image features into visual embeddings, which are passed to BERT alongside the text embeddings. VisualBERT predicts the masked text based on the unmasked text, and it must also predict whether the text is aligned with the image. When ViT was released, [ViLT](model_doc/vilt) adopted it as the way to obtain image embeddings. The image embeddings are processed jointly with the text embeddings. From there, ViLT is pretrained by image-text matching, masked language modeling, and whole word masking.

[CLIP](model_doc/clip) takes a different approach and makes a pair prediction of (`image`, `text`). An image encoder (ViT) and a text encoder (Transformer) are jointly trained on a dataset of (`image`, `text`) pairs to maximize the similarity between the image and text embeddings of each (`image`, `text`) pair. After pretraining, CLIP can be used to predict the text given an image or vice versa. [OWL-ViT](model_doc/owlvit) uses CLIP as its backbone for zero-shot object detection. After pretraining, an object detection head is added to make a set prediction over (`class`, `bounding box`) pairs.

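
The matching step that a CLIP-style model relies on can be sketched as a cosine-similarity table between image and text embeddings. The embeddings below are made-up three-dimensional vectors and the helper names are assumptions for this example. Real embeddings come from the trained encoders and have many more dimensions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_matrix(image_embeds, text_embeds):
    """sims[i][j] is the cosine similarity between image i and text j."""
    images = [normalize(v) for v in image_embeds]
    texts = [normalize(v) for v in text_embeds]
    return [[sum(a * b for a, b in zip(img, txt)) for txt in texts] for img in images]


# Hypothetical embeddings: image 0 pairs with text 0, image 1 with text 1.
image_embeds = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]
text_embeds = [[1.0, 0.0, 0.1], [0.1, 0.1, 0.9]]

sims = similarity_matrix(image_embeds, text_embeds)
best_text = [max(range(len(row)), key=row.__getitem__) for row in sims]
print(best_text)  # [0, 1]
```

Training pushes the diagonal of `sims` (the true pairs) up relative to the off-diagonal entries. At inference time, picking the highest-scoring text for an image is what makes zero-shot prediction possible.
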

### Encoder-decoder[[mm-encoder-decoder]]

Optical character recognition (OCR) is a text recognition task that typically involves several components to understand the image and generate the text. [TrOCR](model_doc/trocr) simplifies this process with an end-to-end Transformer. The encoder is a ViT-style model that processes the image as fixed-size patches. The decoder accepts the encoder's hidden states and generates text autoregressively. [Donut](model_doc/donut) is a more general visual document understanding model that does not rely on OCR-based approaches. It uses a Swin Transformer as the encoder and multilingual BART as the decoder. Donut is pretrained to read text by predicting the next word based on the image and text annotations. The decoder generates a token sequence given a prompt, and the prompt is represented by a special token for each downstream task. For example, document parsing has a special `parsing` token that is combined with the encoder's hidden states to parse the document into a structured output format (JSON).

## Reinforcement learning

<iframe style="border: 1px solid rgba(0, 0, 0, 0.1);" width="1000" height="450" src="https://www.figma.com/embed?embed_host=share&url=https%3A%2F%2Fwww.figma.com%2Ffile%2FiB3Y6RvWYki7ZuKO6tNgZq%2Freinforcement-learning%3Fnode-id%3D0%253A1%26t%3DhPQwdx3HFPWJWnVf-1" allowfullscreen></iframe>

### Decoder[[rl-decoder]]

The Decision Transformer and the Trajectory Transformer cast states, actions, and rewards as a sequence modeling problem. The [Decision Transformer](model_doc/decision_transformer) generates a series of actions that lead to a desired future return, based on returns-to-go, past states, and actions. For the last *K* timesteps, each of the three modalities is converted into token embeddings and processed by a GPT-like model to predict a future action token. The [Trajectory Transformer](model_doc/trajectory_transformer) also tokenizes the states, actions, and rewards and processes them with a GPT architecture. Unlike the Decision Transformer, which focuses on reward conditioning, the Trajectory Transformer generates future actions with beam search.

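
Returns-to-go are simply the sum of rewards still to come from each timestep. The sketch below computes them and flattens a short trajectory into one sequence. The function names, the toy rewards, and the per-timestep ordering (return, state, action) are assumptions made for this illustration rather than details stated on this page.

```python
def returns_to_go(rewards):
    """rtg[t] is the sum of rewards from timestep t to the end of the episode."""
    rtg = []
    running = 0.0
    for reward in reversed(rewards):
        running += reward
        rtg.append(running)
    return rtg[::-1]


def interleave(rtg, states, actions):
    """Flatten a trajectory into (kind, value) tokens, three per timestep."""
    sequence = []
    for r, s, a in zip(rtg, states, actions):
        sequence += [("return", r), ("state", s), ("action", a)]
    return sequence


# Hypothetical three-step episode.
rewards = [1.0, 0.0, 2.0]
rtg = returns_to_go(rewards)
print(rtg)  # [3.0, 2.0, 2.0]

sequence = interleave(rtg, ["s0", "s1", "s2"], ["a0", "a1", "a2"])
print(sequence[:3])  # [('return', 3.0), ('state', 's0'), ('action', 'a0')]
```

Conditioning on a return-to-go is what lets the model be steered at inference time: asking for a higher desired return changes the sequence prefix and therefore the actions the model predicts.
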