- Document question answering: given a document (such as a PDF) in image format, answer a question about the document (Donut)
- Image question answering: given an image, answer a question about the image (VILT)
- Speech to text: given an audio recording of a person speaking, transcribe the speech into text (Whisper)
- Text to speech: convert text to speech (SpeechT5)
- Translation: translate a given sentence from a source language to a target language
- Python code interpreter: runs the LLM-generated Python code in a secure environment. This tool is only added to [ReactJsonAgent] if you use add_base_tools=True, since a code-based agent can already execute Python code
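Conceptually, `add_base_tools=True` merges this default toolbox into whatever tools you pass yourself. The sketch below is a hypothetical simplification (not the transformers implementation, and the tool values are placeholder strings) that shows the merge by tool name:

```python
# Hypothetical sketch: how an agent could combine a default toolbox
# with user-supplied tools when add_base_tools=True.

DEFAULT_TOOLBOX = {
    "document_question_answering": "<default tool>",
    "image_question_answering": "<default tool>",
    "speech_to_text": "<default tool>",
    "text_to_speech": "<default tool>",
    "translation": "<default tool>",
}

def build_toolbox(user_tools, add_base_tools=False):
    """Return a name -> tool mapping for the agent."""
    toolbox = dict(DEFAULT_TOOLBOX) if add_base_tools else {}
    toolbox.update(user_tools)  # user-supplied tools are always included
    return toolbox

toolbox = build_toolbox({"model_download_counter": "<custom tool>"}, add_base_tools=True)
print(sorted(toolbox))
```

With `add_base_tools=False`, the agent would only see the tools you passed explicitly.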
You can manually use a tool by calling the [load_tool] function and passing it a task to perform.
```python
from transformers import load_tool

tool = load_tool("text-to-speech")
audio = tool("This is a text to speech tool")
```
## Create a new tool
You can create your own tool for use cases not covered by the default tools from Hugging Face.
For example, let's create a tool that returns the most downloaded model for a given task from the Hub.
You'll start with the code below.
```python
from huggingface_hub import list_models

task = "text-classification"
model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
print(model.id)
```
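Here, `list_models` returns an iterator over models matching the filter, and `sort="downloads"` with `direction=-1` orders them by download count in descending order, so the first item is the most-downloaded model. To make that logic concrete without calling the Hub, here is a hypothetical offline sketch with mock model entries:

```python
# Hypothetical offline sketch: mimic list_models(sort="downloads", direction=-1)
# with mock entries instead of a Hub call.
from dataclasses import dataclass

@dataclass
class MockModel:
    id: str
    downloads: int

HUB = [
    MockModel("org/small-classifier", downloads=1_200),
    MockModel("org/popular-classifier", downloads=98_000),
    MockModel("org/mid-classifier", downloads=45_500),
]

def mock_list_models(sort="downloads", direction=-1):
    reverse = direction == -1  # -1 means descending order
    return iter(sorted(HUB, key=lambda m: getattr(m, sort), reverse=reverse))

model = next(iter(mock_list_models(sort="downloads", direction=-1)))
print(model.id)  # → org/popular-classifier
```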
This code can be converted into a class that inherits from the [Tool] superclass.
The custom tool needs:
- A `name` attribute, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
- A `description` attribute, which is used to populate the agent's system prompt.
- An `inputs` attribute, which is a dictionary with keys `"type"` and `"description"`. It contains information that helps the Python interpreter make educated choices about the input.
- An `output_type` attribute, which specifies the output type.
- A `forward` method which contains the inference code to be executed.
```python
from transformers import Tool
from huggingface_hub import list_models

class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = (
        "This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. "
        "It returns the name of the checkpoint."
    )
    inputs = {
        "task": {
            "type": "text",
            "description": "the task category (such as text-classification, depth-estimation, etc)",
        }
    }
    output_type = "text"

    def forward(self, task: str):
        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id
```
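To illustrate how the attributes above fit together, here is a minimal, self-contained sketch of a Tool-like base class. This is a hypothetical simplification, not the actual transformers implementation: it validates a call against the declared `inputs` schema and then dispatches to `forward`.

```python
# Hypothetical simplification of the Tool pattern: the base class checks the
# call against the declared inputs schema, then dispatches to forward().
class MiniTool:
    name = ""
    description = ""
    inputs = {}
    output_type = "text"

    def __call__(self, **kwargs):
        for arg in kwargs:
            if arg not in self.inputs:
                raise ValueError(f"{self.name} got unexpected argument: {arg}")
        return self.forward(**kwargs)

    def forward(self, **kwargs):
        raise NotImplementedError

class EchoTool(MiniTool):
    name = "echo"
    description = "Returns the task name it was given."
    inputs = {"task": {"type": "text", "description": "a task name"}}

    def forward(self, task: str):
        return task

tool = EchoTool()
print(tool(task="text-classification"))  # → text-classification
```

The `name`, `description`, and `inputs` metadata exist so the agent's system prompt can describe the tool to the LLM; only `forward` runs at inference time.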
Now that the custom `HFModelDownloadsTool` class is ready, you can save it to a file named `model_downloads.py` and import it for use.
```python
from model_downloads import HFModelDownloadsTool

tool = HFModelDownloadsTool()
```
You can also share your custom tool on the Hub by calling [~Tool.push_to_hub] on the tool. Make sure you've created a repository for it on the Hub and are using a token with write access.
```python
tool.push_to_hub("{your_username}/hf-model-downloads")
```
Load the tool with the [load_tool] function and pass it to the tools parameter of your agent.
```python
from transformers import load_tool, CodeAgent

model_download_tool = load_tool("m-ric/hf-model-downloads")
agent = CodeAgent(tools=[model_download_tool], llm_engine=llm_engine)
agent.run(
    "Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?"
)
```