mirror of
https://github.com/huggingface/transformers.git
synced 2025-08-02 19:21:31 +06:00
Render custom tool docs a bit better (#23269)
* Try on a couple of blocks to see * Build the doc please * Build the doc please * Build the doc please * add more * Finish with all * Style
This commit is contained in:
parent
42017d82ba
commit
eb5b5ce641
@ -54,7 +54,7 @@ The prompt is structured broadly into four parts.
|
||||
|
||||
To better understand each part, let's look at a shortened version of how the `run` prompt can look like:
|
||||
|
||||
````
|
||||
````text
|
||||
I will ask you to perform a task, your job is to come up with a series of simple commands in Python that will perform the task.
|
||||
[...]
|
||||
You can print intermediate results if it makes sense to do so.
|
||||
@ -101,7 +101,7 @@ The second part (the bullet points below *"Tools"*) is dynamically added upon ca
|
||||
exactly as many bullet points as there are tools in `agent.toolbox` and each bullet point consists of the name
|
||||
and description of the tool:
|
||||
|
||||
```
|
||||
```text
|
||||
- <tool.name>: <tool.description>
|
||||
```
|
||||
|
||||
@ -115,7 +115,7 @@ print(f"- {document_qa.name}: {document_qa.description}")
|
||||
```
|
||||
|
||||
which gives:
|
||||
```
|
||||
```text
|
||||
- document_qa: This is a tool that answers a question about a document (pdf). It takes an input named `document` which should be the document containing the information, as well as a `question` that is the question about the document. It returns a text that contains the answer to the question.
|
||||
```
|
||||
|
||||
@ -143,7 +143,7 @@ executable code in practice.
|
||||
|
||||
Let's have a look at one example:
|
||||
|
||||
````
|
||||
````text
|
||||
Task: "Identify the oldest person in the `document` and create an image showcasing the result as a banner."
|
||||
|
||||
I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.
|
||||
@ -166,7 +166,7 @@ The prompt examples are curated by the Transformers team and rigorously evaluate
|
||||
to ensure that the agent's prompt is as good as possible to solve real use cases of the agent.
|
||||
|
||||
The final part of the prompt corresponds to:
|
||||
```
|
||||
```text
|
||||
Task: "Draw me a picture of rivers and lakes"
|
||||
|
||||
I will use the following
|
||||
@ -187,7 +187,7 @@ exactly in the same way it was previously done in the examples.
|
||||
Without going into too much detail, the chat template has the same prompt structure with the
|
||||
examples having a slightly different style, *e.g.*:
|
||||
|
||||
````
|
||||
````text
|
||||
[...]
|
||||
|
||||
=====
|
||||
@ -225,8 +225,8 @@ to past exchanges as is done *e.g.* above by the user's input of "I tried **this
|
||||
previously generated code of the agent.
|
||||
|
||||
Upon running `.chat`, the user's input or *task* is cast into an unfinished example of the form:
|
||||
```
|
||||
Human: <user-input>\n\nAssistent:
|
||||
```text
|
||||
Human: <user-input>\n\nAssistant:
|
||||
```
|
||||
which the agent completes. Contrary to the `run` command, the `chat` command then appends the completed example
|
||||
to the prompt, thus giving the agent more context for the next `chat` turn.
|
||||
@ -254,7 +254,7 @@ agent.run("Show me a tree", return_code=True)
|
||||
|
||||
gives:
|
||||
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tool: `image_segmenter` to create a segmentation mask for the image.
|
||||
|
||||
@ -269,7 +269,8 @@ are present in the tool's name and description. Let's have a look.
|
||||
```py
|
||||
agent.toolbox["image_generator"].description
|
||||
```
|
||||
```
|
||||
|
||||
```text
|
||||
'This is a tool that creates an image according to a prompt, which is a text description. It takes an input named `prompt` which contains the image description and outputs an image.
|
||||
```
|
||||
|
||||
@ -280,7 +281,7 @@ agent.run("Create an image of a tree", return_code=True)
|
||||
```
|
||||
|
||||
gives:
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tool `image_generator` to generate an image of a tree.
|
||||
|
||||
@ -307,7 +308,7 @@ used a lot for image generation tasks, *e.g.*
|
||||
agent.run("Make an image of a house and a car", return_code=True)
|
||||
```
|
||||
returns
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tools `image_generator` to generate an image of a house and `image_transformer` to transform the image of a car into the image of a house.
|
||||
|
||||
@ -322,9 +323,11 @@ to understand the difference between `image_generator` and `image_transformer` a
|
||||
|
||||
We can help the agent here by changing the tool name and description of `image_transformer`. Let's instead call it `modifier`
|
||||
to disassociate it a bit from "image" and "prompt":
|
||||
```
|
||||
```py
|
||||
agent.toolbox["modifier"] = agent.toolbox.pop("image_transformer")
|
||||
agent.toolbox["modifier"].description = agent.toolbox["modifier"].description.replace("transforms an image according to a prompt", "modifies an image")
|
||||
agent.toolbox["modifier"].description = agent.toolbox["modifier"].description.replace(
|
||||
"transforms an image according to a prompt", "modifies an image"
|
||||
)
|
||||
```
|
||||
|
||||
Now "modify" is a strong cue to use the new image processor which should help with the above prompt. Let's run it again.
|
||||
@ -334,7 +337,7 @@ agent.run("Make an image of a house and a car", return_code=True)
|
||||
```
|
||||
|
||||
Now we're getting:
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tools: `image_generator` to generate an image of a house, then `image_generator` to generate an image of a car.
|
||||
|
||||
@ -350,7 +353,7 @@ which is definitely closer to what we had in mind! However, we want to have both
|
||||
agent.run("Create image: 'A house and car'", return_code=True)
|
||||
```
|
||||
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tool: `image_generator` to generate an image.
|
||||
|
||||
@ -389,7 +392,7 @@ of the tools, it has available to it as well as correctly insert the user's prom
|
||||
</Tip>
|
||||
|
||||
Similarly, one can overwrite the `chat` prompt template. Note that the `chat` mode always uses the following format for the exchanges:
|
||||
```
|
||||
```text
|
||||
Human: <<task>>
|
||||
|
||||
Assistant:
|
||||
@ -441,7 +444,7 @@ print(f"Name: '{controlnet_transformer.name}'")
|
||||
```
|
||||
|
||||
gives
|
||||
```
|
||||
```text
|
||||
Description: 'This is a tool that transforms an image with ControlNet according to a prompt.
|
||||
It takes two inputs: `image`, which should be the image to transform, and `prompt`, which should be the prompt to use to change it. It returns the modified image.'
|
||||
Name: 'image_transformer'
|
||||
@ -457,7 +460,7 @@ agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder",
|
||||
|
||||
This command should give you the following info:
|
||||
|
||||
```
|
||||
```text
|
||||
image_transformer has been replaced by <transformers_modules.diffusers.controlnet-canny-tool.bd76182c7777eba9612fc03c0
|
||||
8718a60c0aa6312.image_transformation.ControlNetTransformationTool object at 0x7f1d3bfa3a00> as provided in `additional_tools`
|
||||
```
|
||||
@ -480,7 +483,7 @@ You can always have a look at the toolbox that is currently available to the age
|
||||
print("\n".join([f"- {a}" for a in agent.toolbox.keys()]))
|
||||
```
|
||||
|
||||
```
|
||||
```text
|
||||
- document_qa
|
||||
- image_captioner
|
||||
- image_qa
|
||||
@ -518,7 +521,7 @@ Let's transform the image into a beautiful winter landscape:
|
||||
image = agent.run("Transform the image: 'A frozen lake and snowy forest'", image=image)
|
||||
```
|
||||
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tool: `image_transformer` to transform the image.
|
||||
|
||||
@ -536,7 +539,7 @@ By default the image processing tool returns an image of size 512x512 pixels. Le
|
||||
image = agent.run("Upscale the image", image)
|
||||
```
|
||||
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tool: `image_upscaler` to upscale the image.
|
||||
|
||||
@ -657,7 +660,7 @@ agent.run(
|
||||
)
|
||||
```
|
||||
which outputs the following:
|
||||
```
|
||||
```text
|
||||
==Code generated by the agent==
|
||||
model = model_download_counter(task="text-to-video")
|
||||
print(f"The model with the most downloads is {model}.")
|
||||
@ -738,7 +741,7 @@ agent.run("Generate an image of the `prompt` after improving it.", prompt="A rab
|
||||
```
|
||||
|
||||
The model adequately leverages the tool:
|
||||
```
|
||||
```text
|
||||
==Explanation from the agent==
|
||||
I will use the following tools: `StableDiffusionPromptGenerator` to improve the prompt, then `image_generator` to generate an image according to the improved prompt.
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user