transformers/docs/source/en/tools.md

<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

> [!WARNING]
> Agents and tools are being spun out into the standalone [smolagents](https://huggingface.co/docs/smolagents/index) library. These docs will be deprecated in the future!

# Tools

A tool is a function an agent can use to complete a task. Depending on your task, a tool can perform a web search, answer questions about a document, transcribe speech to text, and much more.

Transformers provides a default set of tools for agents. These include the tools mentioned above as well as image question answering, text-to-speech, translation, and a Python code interpreter that executes the Python code generated by a LLM in a secure environment.

Set `add_base_tools=True` to enable this default set of tools. The `tools` parameter is for adding additional tools. Leave the list empty if you aren't planning on adding any other tools to the toolbox.

```py
from transformers import ReactCodeAgent

agent = ReactCodeAgent(tools=[], add_base_tools=True)
```

You could also manually load a tool with [`load_tool`].

```py
from transformers import load_tool, ReactCodeAgent

tool = load_tool("text-to-speech")
audio = tool("This is a text-to-speech tool")
agent = ReactCodeAgent(tools=[audio])
```

This guide will help you learn how to create your own tools and manage an agents toolbox.

## Create a new tool

You can create any tool you can dream of to empower an agent. The example in this section creates a tool that returns the most downloaded model for a task from the Hub, and the code for it is shown below.

```py
from huggingface_hub import list_models

task = "text-classification"
model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
print(model.id)
```

There are two ways you can create a tool, using a decorator or a superclass.

### Tool decorator

A fast and simple way to create a tool is to add the `@tool` decorator.

Convert the code above into a tool by wrapping it in a function and adding the `@tool` decorator. The function needs:

- A clear name that describes what the tool does, `model_download_counter`.
- Type hints for the input and output (`str`).
- A description that describes the tool in more detail and its arguments. This description is incorporated in the agents system prompt. It tells the agent *how* to use the tool, so try to make it as clear as possible!

```py
from transformers import tool

@tool
def model_download_counter(task: str) -> str:
    """
    This is a tool that returns the checkpoint name of the most downloaded model for a task from the Hugging Face Hub.

    Args:
        task: The task to retrieve the most downloaded model from.
    """
    model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
    return model.id
```

Pass the `model_download_counter` tool to the agents `tools` parameter to use it.

```py
from transformers import CodeAgent

agent = CodeAgent(tools=[model_download_counter], add_base_tools=True)
agent.run(
    "Can you give me the name of the model that has the most downloads on the 'text-to-video' task on the Hugging Face Hub?"
)
```

### Tool superclass

Inheritance allows you to customize the [`Tool`] superclass or build a tool much more flexibly and comprehensively. This example will show you how to build the same `model_download_counter` tool as a [`Tool`] class.

The [`Tool`] class needs:

- A clear name that describes what the tool does, `model_download_counter`.
- A description that describes the tool in more detail and its arguments. This description is incorporated in the agents system prompt. It tells the agent *how* to use the tool, so try to make it as clear as possible!
- An `inputs` attribute that describes the input type. This is a dictionary with the keys, `type` and `description`.
- An `outputs` attribute that describes the output type.
- A `forward` method containing the code to be executed when the tool is called.

Write the following code below to a file named `model_download.py`.

```py
from transformers import Tool
from huggingface_hub import list_models

class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = """
    This is a tool that returns the checkpoint name of the most downloaded model for a task from the Hugging Face Hub."""

    inputs = {
        "task": {
            "type": "string",
            "description": "the task category (such as text-classification, depth-estimation, etc)",
        }
    }
    output_type = "string"

    def forward(self, task: str):
        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id
```

Import the tool from `model_download.py` and use [`load_tool`] to load it into the agent.

```py
from model_download import HFModelDownloadsTool
from transformers import load_tool, CodeAgent

tool = HFModelDownloadsTool()
model_counter = load_tool(tool)
agent = CodeAgent(tools=[model_counter], add_base_tools=True)
```

Also consider sharing your tool to the Hub with [`~Tool.push_to_hub`] so that everyone can use it!

```py
from model_download import HFModelDownloadsTool
from transformers import load_tool, CodeAgent

tool = HFModelDownloadsTool()
tool.push_to_hub("{your_username}/hf-model-downloads")
model_counter = load_tool("m-ric/hf-model-downloads")
agent = CodeAgent(tools=[model_counter], add_base_tools=True)
```

## Add and replace tools

Once an agent is initialized, add or replace its available tools without reinitializing the agent from scratch.

Use [`add_tool`] to add a tool to an existing agent.

```py
from transformers import CodeAgent

agent = CodeAgent(tools=[], add_base_tools=True)
agent.toolbox.add_tool(model_download_counter)
```

Now you can use the default text-to-speech tool to read aloud the most downloaded model for the text-to-video task.

```py
agent.run(
    "Can you read out loud the name of the model that has the most downloads on the 'text-to-video' task on the Hugging Face Hub and return the audio?"
)
```

> [!WARNING]
> When adding tools to an agent that already works well, it can bias the agent towards your tool or a tool other than the one currently defined.

Use [`update_tool`] to replace an agents existing tool. This is useful if the new tool is a one-to-one replacement of the existing tool because the agent already knows how to perform the task. The new tool should follow the same API as the tool it replaced or the system prompt template should be adapted to ensure all examples using the replaced tool are updated.

```py
agent.toolbox.update_tool(new_model_download_counter)
```

## ToolCollection

A [`ToolCollection`] is a collection of Hugging Face [Spaces](https://hf.co/spaces) that can be quickly loaded and used by an agent.

> [!TIP]
> Learn more about creating collections on the Hub.

Create a [`ToolCollection`] object and specify the `collection_slug` of the collection you want to use, and then pass it to the agent. To speed up the starting process, tools are only loaded if they're called by the agent.

The example loads a collection of image generation tools.

```py
from transformers import ToolCollection, ReactCodeAgent

image_tool_collection = ToolCollection(collection_slug="")
agent = ReactCodeAgent(tools=[*image_tool_collection], add_base_tools=True)
agent.run(
    "Please draw me a picture of rivers and lakes."
)
```

## Tool integrations

Transformers supports tools from several other libraries, such as [gradio-tools](https://github.com/freddyaboulton/gradio-tools) and [LangChain](https://python.langchain.com/docs/introduction/).

### gradio-tools

gradio-tools is a library that enables [Gradio](https://www.gradio.app/) apps to be used as tools. With the wide variety of Gradio apps available, you can enhance your agent with a range of tools like generating images and videos or transcribing audio and summarizing it.

Import and instantiate a tool from gradio-tools, for example, the [StableDiffusionPromptGeneratorTool](https://github.com/freddyaboulton/gradio-tools/blob/main/gradio_tools/tools/prompt_generator.py). This tool can help improve prompts to generate better images.

> [!WARNING]
> gradio-tools require text inputs and outputs even when working with different modalities like images and audio, which are currently incompatible.

Use [`~Tool.from_gradio`] to load the prompt generator tool.

```py
from gradio_tools import StableDiffusionPromptGeneratorTool
from transformers import Tool, load_tool, CodeAgent

gradio_prompt_generator_tool = StableDiffusionPromptGeneratorTool()
prompt_generator_tool = Tool.from_gradio(gradio_prompt_generator_tool)
```

Now pass it to the agent along with a text-to-image tool.

```py
image_generation_tool = load_tool("huggingface-tools/text-to-image")
agent = CodeAgent(tools=[prompt_generator_tool, image_generation_tool], llm_engine=llm_engine)
agent.run(
    "Improve this prompt, then generate an image of it.", prompt="A rabbit wearing a space suit"
)
```

### LangChain

LangChain is a library for working with LLMs which includes agents and tools. Use the [`~Tool.from_langchain`] method to load any LangChain tool into an agent.

The example below demonstrates how to use LangChains web search tool.

```py
from langchain.agents import load_tools
from transformers import Tool, ReactCodeAgent

search_tool = Tool.from_langchain(load_tools(["serpapi"])[0])
agent = ReactCodeAgent(tools=[search_tool])
agent.run("How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?")
```