<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# GPT-NeoX-Japanese

## Overview

We introduce GPT-NeoX-Japanese, an autoregressive language model for Japanese, trained on the basis of [https://github.com/EleutherAI/gpt-neox](https://github.com/EleutherAI/gpt-neox).
Japanese is a unique language, with a large vocabulary and a combination of hiragana, katakana, and kanji writing scripts.
To address this distinct structure of the Japanese language, we use a [special sub-word tokenizer](https://github.com/tanreinama/Japanese-BPEEncoder_V2). We are very grateful to *tanreinama* for open-sourcing this incredibly helpful tokenizer.
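
As a minimal sketch of what the tokenizer does (the exact sub-word split depends on the released vocabulary, so no specific pieces are shown), it can be inspected directly:

```python
>>> from transformers import GPTNeoXJapaneseTokenizer

>>> tokenizer = GPTNeoXJapaneseTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")

>>> # Mixed-script Japanese text ("Humans and AI cooperate") is split into sub-words
>>> tokens = tokenizer.tokenize("人とAIが協調する")

>>> # Convert the tokens to ids and decode them back to text
>>> ids = tokenizer.convert_tokens_to_ids(tokens)
>>> text = tokenizer.decode(ids)
```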

Following the recommendations from Google's research on [PaLM](https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html), we have removed bias parameters from the transformer blocks, achieving better model performance. Please refer to [this article](https://medium.com/ml-abeja/training-a-better-gpt-2-93b157662ae4) for details.
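
The idea can be sketched in plain PyTorch (a hypothetical, simplified illustration, not the actual modeling code; the layer names and hidden size are arbitrary):

```python
import torch.nn as nn

hidden_size = 2560

# Dense projections inside a transformer block are created without bias
# terms, following the PaLM recommendation referenced above.
query_key_value = nn.Linear(hidden_size, 3 * hidden_size, bias=False)
attention_output = nn.Linear(hidden_size, hidden_size, bias=False)
mlp_up = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
mlp_down = nn.Linear(4 * hidden_size, hidden_size, bias=False)
```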

Development of the model was led by [Shinya Otani](https://github.com/SO0529), [Takayoshi Makabe](https://github.com/spider-man-tm), [Anuj Arora](https://github.com/Anuj040), and [Kyo Hattori](https://github.com/go5paopao) from [ABEJA, Inc.](https://www.abejainc.com/). For more information on this model-building activity, please refer to [this blog post (ja)](https://tech-blog.abeja.asia/entry/abeja-gpt-project-202207).

### Generation

The `generate()` method can be used to generate text with the GPT-NeoX-Japanese model.

```python
>>> from transformers import GPTNeoXJapaneseForCausalLM, GPTNeoXJapaneseTokenizer

>>> model = GPTNeoXJapaneseForCausalLM.from_pretrained("abeja/gpt-neox-japanese-2.7b")
>>> tokenizer = GPTNeoXJapaneseTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")

>>> prompt = "人とAIが協調するためには、"  # "In order for humans and AI to cooperate, "

>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids

>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )

>>> gen_text = tokenizer.batch_decode(gen_tokens, skip_special_tokens=True)[0]

>>> print(gen_text)
人とAIが協調するためには、AIと人が共存し、AIを正しく理解する必要があります。
```
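
The sampled continuation above reads roughly "In order for humans and AI to cooperate, AI and humans must coexist, and AI must be understood correctly." Since this is a standard causal language model checkpoint, it should also work through the high-level `pipeline` API; a minimal sketch (the sampling settings are illustrative):

```python
>>> from transformers import pipeline

>>> generator = pipeline("text-generation", model="abeja/gpt-neox-japanese-2.7b")
>>> generator("人とAIが協調するためには、", max_length=50, do_sample=True)
```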

## Documentation resources

- [Causal language modeling task guide](./tasks/language_modeling)

## GPTNeoXJapaneseConfig

[[autodoc]] GPTNeoXJapaneseConfig

## GPTNeoXJapaneseTokenizer

[[autodoc]] GPTNeoXJapaneseTokenizer

## GPTNeoXJapaneseModel

[[autodoc]] GPTNeoXJapaneseModel
    - forward

## GPTNeoXJapaneseForCausalLM

[[autodoc]] GPTNeoXJapaneseForCausalLM
    - forward