<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Custom Tools and Prompts

<Tip>

If you are not aware of what tools and agents are in the context of transformers, we recommend you read the
[Transformers Agents](transformers_agents) page first.

</Tip>

<Tip warning={true}>

Transformers Agents is an experimental API which is subject to change at any time. Results returned by the agents
can vary as the APIs or underlying models are prone to change.

</Tip>

Creating and using custom tools and prompts is paramount to empowering the agent and having it perform new tasks.
In this guide we'll take a look at:

- How to customize the prompt
- How to use custom tools
- How to create custom tools

## Customizing the prompt

As explained in [Transformers Agents](transformers_agents), agents can run in [`~Agent.run`] and [`~Agent.chat`] mode.
Both the `run` and `chat` modes rely on the same logic: the language model powering the agent is conditioned on a long prompt
and asked to complete the prompt by generating the next tokens until the stop token is reached.
The only difference between the two modes is that during `chat` mode the prompt is extended with
previous user inputs and model generations, which seemingly gives the agent a memory and allows it to refer to
past interactions.
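
For instance, in `chat` mode a follow-up instruction can refer to the result of an earlier one, whereas each `run` call starts from a blank slate. A small illustration, with prompts borrowed from the quickstart examples:

```py
# In chat mode, the second instruction can refer to the picture produced by
# the first; with `run`, each call would start without that context.
agent.chat("Draw me a picture of rivers and lakes")
agent.chat("Transform the picture so that there is a rock in there")
```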

Let's take a closer look at how the prompt is structured to understand how it can best be customized.
The prompt is structured broadly into four parts:

1. Introduction: how the agent should behave, explanation of the concept of tools.
2. Description of all the tools. This is defined by a `<<all_tools>>` token that is dynamically replaced at runtime with the tools defined/chosen by the user.
3. A set of examples of tasks and their solutions.
4. The current example, and the request for a solution.

To better understand each part, let's look at a shortened version of what such a prompt can look like in practice.

````
I will ask you to perform a task, your job is to come up with a series of simple commands in Python that will perform the task.
[...]
You can print intermediate results if it makes sense to do so.

Tools:
- document_qa: This is a tool that answers a question about a document (pdf). It takes an input named `document` which should be the document containing the information, as well as a `question` that is the question about the document. It returns a text that contains the answer to the question.
- image_captioner: This is a tool that generates a description of an image. It takes an input named `image` which should be the image to caption, and returns a text that contains the description in English.
[...]

Task: "Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French."

I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.

Answer:
```py
translated_question = translator(question=question, src_lang="French", tgt_lang="English")
print(f"The translated question is {translated_question}.")
answer = image_qa(image=image, question=translated_question)
print(f"The answer is {answer}")
```

Task: "Identify the oldest person in the `document` and create an image showcasing the result as a banner."

I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.

Answer:
```py
answer = document_qa(document, question="What is the oldest person?")
print(f"The answer is {answer}.")
image = image_generator("A banner showing " + answer)
```

[...]

Task: "Draw me a picture of rivers and lakes"

I will use the following
````

The first part explains precisely how the model should behave and what it should do. This part
most likely does not need to be customized.

### Customizing the tool descriptions

The agent has access to each tool's `name` and `description`: the `<<all_tools>>` token in the prompt is replaced at
runtime with one line per tool, composed from these two attributes. A tool's name and description therefore directly
influence whether and how the agent decides to use it, so both should be precise.
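
As a rough illustration (a sketch of the idea, not the exact library code), the substitution could look like this:

```py
# Illustrative sketch: build the tool list from each tool's name and
# description, then fill in the <<all_tools>> token of the template.
tool_descriptions = "\n".join(
    f"- {name}: {tool.description}" for name, tool in agent.toolbox.items()
)
prompt = run_prompt_template.replace("<<all_tools>>", tool_descriptions)
```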

The performance of the agent is directly linked to the prompt itself. We structure the prompt so that it works well
with what we intend for the agent to do, but for maximum customization we also offer the ability to specify a
different prompt when instantiating the agent.

### Customizing the single-execution prompt

In order to specify a custom single-execution prompt, one would do the following:

```py
from transformers import HfAgent

template = """ [...] """

agent = HfAgent(your_endpoint, run_prompt_template=template)
```

<Tip>

Please make sure to have the `<<all_tools>>` string defined somewhere in the `template` so that the agent can be aware
of the tools it has available to it.

</Tip>

### Customizing the chat-execution prompt

In order to specify a custom chat-execution prompt, one would do the following:

```py
from transformers import HfAgent

template = """ [...] """

agent = HfAgent(
    url_endpoint=your_endpoint,
    token=your_hf_token,
    chat_prompt_template=template,
)
```

<Tip>

Please make sure to have the `<<all_tools>>` string defined somewhere in the `template` so that the agent can be
aware of the tools it has available to it.

</Tip>

## Using custom tools

In this section, we'll be leveraging two existing custom tools that are specific to image generation:

- We replace [huggingface-tools/image-transformation](https://huggingface.co/spaces/huggingface-tools/image-transformation)
  with [diffusers/controlnet-canny-tool](https://huggingface.co/spaces/diffusers/controlnet-canny-tool)
  to allow for more image modifications.
- We add a new tool for image upscaling to the default toolbox:
  [diffusers/latent-upscaler-tool](https://huggingface.co/spaces/diffusers/latent-upscaler-tool).

We'll start by loading the custom tools with the convenient [`load_tool`] function:

```py
from transformers import load_tool

controlnet_transformer = load_tool("diffusers/controlnet-canny-tool")
upscaler = load_tool("diffusers/latent-upscaler-tool")
```

Upon adding custom tools to an agent, the tools' descriptions and names are automatically
included in the agent's prompt. Thus, it is imperative that custom tools have
a well-written description and name in order for the agent to understand how to use them.
Let's take a look at the description and name of `controlnet_transformer`:

```py
print(f"Description: '{controlnet_transformer.description}'")
print(f"Name: '{controlnet_transformer.name}'")
```

gives

```
Description: 'This is a tool that transforms an image with ControlNet according to a prompt.
It takes two inputs: `image`, which should be the image to transform, and `prompt`, which should be the prompt to use to change it. It returns the modified image.'
Name: 'image_transformer'
```

The name and description are accurate and fit the style of the [curated set of tools](./transformers_agents#a-curated-set-of-tools).
Next, let's instantiate an agent with `controlnet_transformer` and `upscaler`:

```py
from transformers import HfAgent

tools = [controlnet_transformer, upscaler]
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", additional_tools=tools)
```

This command should give you the following info:

```
image_transformer has been replaced by <transformers_modules.diffusers.controlnet-canny-tool.bd76182c7777eba9612fc03c0
8718a60c0aa6312.image_transformation.ControlNetTransformationTool object at 0x7f1d3bfa3a00> as provided in `additional_tools`
```

The set of curated tools already has an `image_transformer` tool, which is hereby replaced with our custom tool.

<Tip>

Overwriting existing tools can be beneficial if we want to use a custom tool for exactly the same task as an existing tool,
because the agent is well-versed in using that specific task. Beware that the custom tool should follow the exact same API
as the overwritten tool in this case.

</Tip>

The upscaler tool was given the name `image_upscaler`, which is not yet present in the default toolbox and is therefore simply added to the list of tools.

You can always have a look at the toolbox that is currently available to the agent via the `agent.toolbox` attribute:

```py
print("\n".join([f"- {a}" for a in agent.toolbox.keys()]))
```

```
- document_qa
- image_captioner
- image_qa
- image_segmenter
- transcriber
- summarizer
- text_classifier
- text_qa
- text_reader
- translator
- image_transformer
- text_downloader
- image_generator
- video_generator
- image_upscaler
```

Note how `image_upscaler` is now part of the agent's toolbox.

Let's now try out the new tools! We will re-use the image we generated in the [Transformers Agents Quickstart](./transformers_agents#single-execution-run).

```py
from diffusers.utils import load_image

image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png"
)
```

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png" width=200>

Let's transform the image into a beautiful winter landscape:

```py
image = agent.run("Transform the image: 'A frozen lake and snowy forest'", image=image)
```

```
==Explanation from the agent==
I will use the following tool: `image_transformer` to transform the image.


==Code generated by the agent==
image = image_transformer(image, prompt="A frozen lake and snowy forest")
```

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes_winter.png" width=200>

The new image processing tool is based on ControlNet, which can make very strong modifications to the image.
By default the image processing tool returns an image of size 512x512 pixels. Let's see if we can upscale it.

```py
image = agent.run("Upscale the image", image)
```

```
==Explanation from the agent==
I will use the following tool: `image_upscaler` to upscale the image.


==Code generated by the agent==
upscaled_image = image_upscaler(image)
```

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes_winter_upscale.png" width=400>

The agent automatically mapped our prompt "Upscale the image" to the newly added upscaler tool, purely based on the
description and name of the upscaler tool, and was able to run it correctly.

Next, let's have a look at how you can create a new custom tool.

### Adding new tools

In this section we show how to create a new tool that can be added to the agent.

#### Creating a new tool

We'll first start by creating a tool. We'll add the not-so-useful yet fun task of fetching the model on the Hugging Face
Hub with the most downloads for a given task.

We can do that with the following code:

```python
from huggingface_hub import list_models

task = "text-classification"

model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
print(model.id)
```

For the task `text-classification`, this returns `'facebook/bart-large-mnli'`; for `translation`, it returns `'t5-base'`.

How do we convert this to a tool that the agent can leverage? All tools depend on the superclass `Tool` that holds the
main attributes necessary. We'll create a class that inherits from it:

```python
from transformers import Tool


class HFModelDownloadsTool(Tool):
    pass
```

This class has a few needs:

- An attribute `name`, which corresponds to the name of the tool itself. To be in tune with other tools which have a
  performative name, we'll name it `model_download_counter`.
- An attribute `description`, which will be used to populate the prompt of the agent.
- `inputs` and `outputs` attributes. Defining these will help the Python interpreter make educated choices about types,
  and will allow for a Gradio demo to be spawned when we push our tool to the Hub. They're both a list of expected
  values, which can be `text`, `image`, or `audio`.
- A `__call__` method which contains the inference code. This is the code we've played with above!

Here's what our class looks like now:

```python
from transformers import Tool
from huggingface_hub import list_models


class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = (
        "This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. "
        "It takes the name of the category (such as text-classification, depth-estimation, etc), and "
        "returns the name of the checkpoint."
    )

    inputs = ["text"]
    outputs = ["text"]

    def __call__(self, task: str):
        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id
```

We now have our tool handy. Save it in a file and import it from your main script. Let's name this file
`model_downloads.py`, so the resulting import code looks like this:

```python
from model_downloads import HFModelDownloadsTool

tool = HFModelDownloadsTool()
```
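
Since the inference code lives in `__call__`, you can sanity-check the tool by calling it directly before involving an
agent (the commented output is just an example; actual results depend on current download counts):

```python
# Calling the instance invokes the __call__ method defined above.
print(tool("text-classification"))  # e.g. 'facebook/bart-large-mnli'
```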

In order to let others benefit from it and for simpler initialization, we recommend pushing it to the Hub under your
namespace. To do so, just call `push_to_hub` on the `tool` variable:

```python
tool.push_to_hub("lysandre/hf-model-downloads")
```

You now have your code on the Hub! Let's take a look at the final step, which is to have the agent use it.

#### Having the agent use the tool

We now have our tool that lives on the Hub, which can be instantiated as such:

```python
from transformers import load_tool

tool = load_tool("lysandre/hf-model-downloads")
```

In order to use it in the agent, simply pass it in the `additional_tools` parameter of the agent initialization method:

```python
from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", additional_tools=[tool])

agent.run(
    "Can you read out loud the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?"
)
```

which outputs the following:

```
==Code generated by the agent==
model = model_download_counter(task="text-to-video")
print(f"The model with the most downloads is {model}.")
audio_model = text_reader(model)


==Result==
The model with the most downloads is damo-vilab/text-to-video-ms-1.7b.
```

and generates the following audio.

| **Audio**                                                                                                                                                         |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <audio controls><source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/damo.wav" type="audio/wav"/></audio> |

<Tip>

Some LLMs are quite brittle and require very exact prompts in order to work well. Having a well-defined name and
description of the tool is paramount to having it be leveraged by the agent.

</Tip>

### Replacing existing tools

Replacing existing tools can be done simply by assigning a new item to the agent's toolbox. Here's how one would do so:

```python
from transformers import HfAgent, load_tool

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")
agent.toolbox["image_transformer"] = load_tool("diffusers/controlnet-canny-tool")
```
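
You can verify that the swap took effect by inspecting the toolbox entry, reusing the attributes we printed earlier:

```python
# The `image_transformer` entry now points at our custom ControlNet tool.
print(agent.toolbox["image_transformer"].description)
```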

<Tip>

Beware when replacing tools with others! This will also adjust the agent's prompt. This can be good if you have a better
prompt suited for the task, but it can also result in your tool being selected way more often than others, or in other
tools being selected instead of the one you have defined.

</Tip>

## Leveraging gradio-tools

[gradio-tools](https://github.com/freddyaboulton/gradio-tools) is a powerful library that allows using Hugging
Face Spaces as tools. It supports many existing Spaces as well as custom Spaces to be designed with it.

We offer support for `gradio_tools` by using the `Tool.from_gradio` method. For example, we want to take
advantage of the `StableDiffusionPromptGeneratorTool` tool offered in the `gradio-tools` toolkit so as to
improve our prompts and generate better images.

We first import the tool from `gradio_tools` and instantiate it:

```python
from gradio_tools import StableDiffusionPromptGeneratorTool

gradio_tool = StableDiffusionPromptGeneratorTool()
```

We pass that instance to the `Tool.from_gradio` method:

```python
from transformers import Tool

tool = Tool.from_gradio(gradio_tool)
```

Now we can manage it exactly as we would a usual custom tool. We leverage it to improve our prompt
`a rabbit wearing a space suit`:

```python
from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", additional_tools=[tool])

agent.run("Generate an image of the `prompt` after improving it.", prompt="A rabbit wearing a space suit")
```

The model adequately leverages the tool:

```
==Explanation from the agent==
I will use the following tools: `StableDiffusionPromptGenerator` to improve the prompt, then `image_generator` to generate an image according to the improved prompt.


==Code generated by the agent==
improved_prompt = StableDiffusionPromptGenerator(prompt)
print(f"The improved prompt is {improved_prompt}.")
image = image_generator(improved_prompt)
```

Before finally generating the image:

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png">

<Tip warning={true}>

gradio-tools requires *textual* inputs and outputs, even when working with different modalities. This implementation
works with image and audio objects. The two are currently incompatible, but will rapidly become compatible as we
work to improve the support.

</Tip>

## Future compatibility with Langchain

We love Langchain and think it has a very compelling suite of tools. In order to handle these tools,
Langchain requires *textual* inputs and outputs, even when working with different modalities.
This is often the serialized version (i.e., saved to disk) of the objects.

This difference means that multi-modality isn't handled between transformers-agents and langchain.
We aim for this limitation to be resolved in future versions, and welcome any help from avid langchain
users to help us achieve this compatibility.

We would love to have better support. If you would like to help, please
[open an issue](https://github.com/huggingface/transformers/issues/new) and share what you have in mind.