English | 简䜓䞭文 | 繁體䞭文 | 한국얎 | Español | 日本語 | हिà€šà¥à€Šà¥€ | Русский | Português | à°€à±†à°²à±à°—à± | Français | Deutsch | Tiếng Việt | العربية | اردو |
State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow

🀗 Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

These models can be applied to:
- 📝 Text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages.
- 🖌 Images, for tasks like image classification, object detection, and segmentation.
- 🗣 Audio, for tasks like speech recognition and audio classification.

Transformer models can also perform tasks that combine several modalities, such as table question answering, optical character recognition, information extraction from scanned documents, video classification, and visual question answering.

🀗 Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets, and then share them with the community on our model hub. At the same time, each Python module defining an architecture is fully standalone and can be modified to enable quick research experiments.

🀗 Transformers is backed by the three most popular deep learning libraries, Jax, PyTorch and TensorFlow, with a seamless integration between them. It's straightforward to train your models with one before loading them for inference with the other.
Online demos

You can test most of our models directly on their pages from the model hub. We also offer private model hosting, versioning, and an inference API for public and private models.

Here are a few examples:
In Natural Language Processing:
- Masked word completion with BERT
- Named Entity Recognition with Electra
- Text generation with GPT-2
- Natural Language Inference with RoBERTa
- Summarization with BART
- Question answering with DistilBERT
- Translation with T5
In Computer Vision:
- Image classification with ViT
- Object detection with DETR
- Semantic segmentation with SegFormer
- Panoptic segmentation with DETR
In Audio:

In Multimodal tasks:

Write With Transformer, built by the Hugging Face team, is the official demo of this repository's text generation capabilities.

If you are looking for custom support from the Hugging Face team

Quick tour

To immediately use a model on a given input (text, image, audio, ...), we provide the pipeline API. Pipelines group together a pretrained model with the preprocessing that was used during that model's training. Here is how to quickly use a pipeline to classify positive versus negative texts:
```python
>>> from transformers import pipeline

# Allocate a pipeline for sentiment-analysis
>>> classifier = pipeline('sentiment-analysis')
>>> classifier('We are very happy to introduce pipeline to the transformers repository.')
[{'label': 'POSITIVE', 'score': 0.9996980428695679}]
```
The second line of code downloads and caches the pretrained model used by the pipeline, while the third evaluates it on the given text. Here, the answer is "positive" with a confidence of 99.97%.
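Because the pipeline returns plain Python dicts, post-processing its results takes only ordinary Python. A minimal sketch (the `confident_label` helper below is hypothetical, not part of the library) that keeps a prediction only when it clears a confidence threshold:

```python
# Hypothetical helper for filtering pipeline results by confidence.
def confident_label(result, threshold=0.8):
    """Return the top label if its score clears the threshold, else None."""
    best = max(result, key=lambda r: r["score"])
    return best["label"] if best["score"] >= threshold else None

# The same shape as the pipeline output shown above:
sample = [{"label": "POSITIVE", "score": 0.9996980428695679}]
print(confident_label(sample))  # POSITIVE
```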
Many tasks have a pretrained pipeline ready to go, in NLP but also in computer vision and speech. For example, we can easily extract the objects detected in an image:
```python
>>> import requests
>>> from PIL import Image
>>> from transformers import pipeline

# Download an image with cute cats
>>> url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png"
>>> image_data = requests.get(url, stream=True).raw
>>> image = Image.open(image_data)

# Allocate a pipeline for object detection
>>> object_detector = pipeline('object-detection')
>>> object_detector(image)
[{'score': 0.9982201457023621,
  'label': 'remote',
  'box': {'xmin': 40, 'ymin': 70, 'xmax': 175, 'ymax': 117}},
 {'score': 0.9960021376609802,
  'label': 'remote',
  'box': {'xmin': 333, 'ymin': 72, 'xmax': 368, 'ymax': 187}},
 {'score': 0.9954745173454285,
  'label': 'couch',
  'box': {'xmin': 0, 'ymin': 1, 'xmax': 639, 'ymax': 473}},
 {'score': 0.9988006353378296,
  'label': 'cat',
  'box': {'xmin': 13, 'ymin': 52, 'xmax': 314, 'ymax': 470}},
 {'score': 0.9986783862113953,
  'label': 'cat',
  'box': {'xmin': 345, 'ymin': 23, 'xmax': 640, 'ymax': 368}}]
```
Here, we get a list of the objects detected in the image, each with a bounding box and a confidence score. The original image is shown on the left, with the predictions displayed on the right:
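Since the detections come back as a list of dicts, you can filter and aggregate them with standard Python. A small sketch using the scores and labels from the output above, boxes omitted (the `count_labels` helper is illustrative, not a library function):

```python
from collections import Counter

# Scores and labels from the object-detection output shown above (boxes omitted).
detections = [
    {"score": 0.9982, "label": "remote"},
    {"score": 0.9960, "label": "remote"},
    {"score": 0.9955, "label": "couch"},
    {"score": 0.9988, "label": "cat"},
    {"score": 0.9987, "label": "cat"},
]

def count_labels(detections, threshold=0.9):
    """Count how many objects of each label were detected above a confidence threshold."""
    return Counter(d["label"] for d in detections if d["score"] >= threshold)

print(count_labels(detections))  # two remotes, two cats, one couch
```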
You can learn more about the tasks supported by the pipeline API in this tutorial.
In addition to pipeline, all it takes is three lines of code to download and use any pretrained model on your given task. Here is the PyTorch version:
```python
>>> from transformers import AutoTokenizer, AutoModel

>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = AutoModel.from_pretrained("google-bert/bert-base-uncased")

>>> inputs = tokenizer("Hello world!", return_tensors="pt")
>>> outputs = model(**inputs)
```
And here is the equivalent code for TensorFlow:
```python
>>> from transformers import AutoTokenizer, TFAutoModel

>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-uncased")

>>> inputs = tokenizer("Hello world!", return_tensors="tf")
>>> outputs = model(**inputs)
```
The tokenizer is responsible for all the preprocessing the pretrained model expects, and can be called directly on a single string (as in the examples above) or on a list. It outputs a dictionary that you can use in downstream code, or simply pass directly to your model using the ** argument-unpacking operator.
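The dictionary-to-keyword-arguments step relies on nothing more than Python's ** operator. A self-contained sketch (using hand-written stand-ins for the tokenizer output and the model, since the real ones require downloading weights):

```python
# A stand-in for the dict a tokenizer returns (real values are framework tensors).
fake_batch = {"input_ids": [[101, 7592, 999, 102]], "attention_mask": [[1, 1, 1, 1]]}

def fake_model(input_ids=None, attention_mask=None):
    """A stand-in for a model: it receives the same keyword arguments a real one would."""
    return {"sequence_length": len(input_ids[0])}

# ** unpacks the dict into keyword arguments, exactly as in model(**inputs) above.
outputs = fake_model(**fake_batch)
print(outputs)  # {'sequence_length': 4}
```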
The model itself is a regular PyTorch nn.Module or a TensorFlow tf.keras.Model (depending on your backend), which you can use as usual. This tutorial explains how to integrate such a model into a classic PyTorch or TensorFlow training loop, or how to use our Trainer API to quickly fine-tune it on a new dataset.
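Because the model is a plain nn.Module, a classic training loop needs nothing Transformers-specific. A minimal sketch with a tiny stand-in module and synthetic data (to avoid downloading weights; a real fine-tune would swap in the pretrained model and its tokenized batches):

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny stand-in module; a pretrained Transformers model plugs into the same loop.
model = nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Synthetic batch; in practice this would be tokenizer output and labels.
x = torch.randn(64, 4)
y = x.sum(dim=1, keepdim=True)

with torch.no_grad():
    initial_loss = loss_fn(model(x), y).item()

for _ in range(100):
    optimizer.zero_grad()            # reset gradients
    loss = loss_fn(model(x), y)      # forward pass
    loss.backward()                  # backpropagate
    optimizer.step()                 # update parameters

print(f"loss: {initial_loss:.4f} -> {loss.item():.4f}")
```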
Why should I use transformers?

- Easy-to-use state-of-the-art models:
  - High performance on natural language understanding and generation, computer vision, and audio tasks.
  - Low barrier to entry for educators and practitioners.
  - Few user-facing abstractions, with just three classes to learn.
  - A unified API for using all our pretrained models.
- Lower compute costs, smaller carbon footprint:
  - Researchers can share trained models instead of always retraining.
  - Practitioners can reduce compute time and production costs.
  - Dozens of architectures with over 60,000 pretrained models across all modalities.
- Choose the right framework for every part of a model's lifetime:
  - Train state-of-the-art models in 3 lines of code.
  - Move a single model between TF2.0/PyTorch/JAX frameworks at will.
  - Seamlessly pick the right framework for training, evaluation, and production.
- Easily customize a model or an example to your needs:
  - We provide examples for each architecture to reproduce the results published by the original authors.
  - Model internals are exposed as consistently as possible.
  - Model files can be used independently of the library, enabling quick experiments.
Why shouldn't I use transformers?

- This library is not a modular toolbox of building blocks for neural nets. The code in the model files is deliberately not refactored with additional abstractions, so that researchers can quickly iterate on each model without diving into extra abstractions or files.
- The training API is not intended to work on any model; it is optimized to work with the models provided by the library. For generic machine learning loops, you should use another library (possibly Accelerate).
- While we strive to present as many use cases as possible, the scripts in our examples folder are just that: examples. It is expected that they won't work out of the box on your specific problem, and that you will need to change a few lines of code to adapt them to your needs.
Installation

With pip

This repository is tested on Python 3.9+, Flax 0.4.1+, PyTorch 2.1+, and TensorFlow 2.6+.

You should install 🀗 Transformers in a virtual environment. If you're unfamiliar with Python virtual environments, check out the user guide.

First, create a virtual environment with the version of Python you're going to use and activate it.

Then, you will need to install at least one of Flax, PyTorch, or TensorFlow. Please refer to the TensorFlow installation page, the PyTorch installation page, and/or the Flax and Jax installation pages for the specific installation command for your platform.

When one of those backends has been installed, 🀗 Transformers can be installed using pip as follows:

```shell
pip install transformers
```

If you'd like to play with the examples, or need the bleeding edge of the code and can't wait for a new release, you must install the library from source.
With conda

🀗 Transformers can be installed using conda as follows:

```shell
conda install conda-forge::transformers
```

NOTE: Installing transformers from the huggingface channel is deprecated.

Follow the installation pages of Flax, PyTorch, or TensorFlow to see how to install them with conda.

NOTE: On Windows, you may be prompted to activate Developer Mode in order to benefit from caching. If this is not an option for you, please let us know in this issue.
Model architectures

All the model checkpoints provided by 🀗 Transformers are seamlessly integrated from the huggingface.co model hub, where they are uploaded directly by users and organizations.

Current number of checkpoints:

🀗 Transformers currently provides the following architectures: see here for a high-level summary of each of them.

To check whether each model has an implementation in Flax, PyTorch, or TensorFlow, or has an associated tokenizer backed by the 🀗 Tokenizers library, refer to this table.

These implementations have been tested on several datasets (see the example scripts) and should match the performance of the original implementations. You can find more details on performance in the Examples section of the documentation.
Learn more

| Section | Description |
|---|---|
| Documentation | Full API documentation and tutorials |
| Task summary | Tasks supported by 🀗 Transformers |
| Preprocessing tutorial | Using the Tokenizer class to prepare data for the models |
| Training and fine-tuning | Using the models provided by 🀗 Transformers in a PyTorch/TensorFlow training loop and with the Trainer API |
| Quick tour: Fine-tuning/usage scripts | Example scripts for fine-tuning models on a wide range of tasks |
| Model sharing and uploading | Upload and share your fine-tuned models with the community |
| Migration | Migrate to 🀗 Transformers from pytorch-transformers or pytorch-pretrained-bert |
Citation

We now have a paper you can cite for the 🀗 Transformers library:
```bibtex
@inproceedings{wolf-etal-2020-transformers,
    title = "Transformers: State-of-the-Art Natural Language Processing",
    author = "Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = oct,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
    pages = "38--45"
}
```