- **Document question answering**: given a document (such as a PDF) in image format, answer a question about the document (Donut)
- **Image question answering**: given an image, answer a question about the image (VILT)
- **Speech to text**: given an audio recording of a person talking, transcribe the speech into text (Whisper)
- **Text to speech**: convert text to speech (SpeechT5)
- **Translation**: translate a given sentence from a source language to a target language
- **Python code interpreter**: run the LLM-generated Python code in a secure environment. This tool is only added to [ReactJsonAgent] if you use `add_base_tools=True`, since code-based agents can already execute Python code natively
You can manually use a tool by calling the [load_tool] function and passing a task to perform.
```python
from transformers import load_tool

tool = load_tool("text-to-speech")
audio = tool("This is a text to speech tool")
```
Create a new tool | |
You can create your own tool for use cases not covered by the default tools from Hugging Face. For example, let's create a tool that returns the most downloaded model for a given task from the Hub. You'll start with the code below.
```python
from huggingface_hub import list_models

task = "text-classification"
model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
print(model.id)
```
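To make the `sort="downloads", direction=-1` semantics concrete, here is a small standalone sketch that reproduces the same "most downloaded first, take the first result" logic locally. The model records below are made up for illustration; the real `list_models` call queries the Hub:

```python
# Hypothetical stand-in for the Hub listing: a few fake model records.
fake_models = [
    {"id": "org/model-a", "downloads": 1_200},
    {"id": "org/model-b", "downloads": 98_000},
    {"id": "org/model-c", "downloads": 4_500},
]

# direction=-1 means descending order, so sort with reverse=True and take
# the first entry, mirroring next(iter(list_models(...))).
most_downloaded = next(
    iter(sorted(fake_models, key=lambda m: m["downloads"], reverse=True))
)
print(most_downloaded["id"])  # org/model-b
```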
This code can be converted into a class that inherits from the [Tool] superclass. The custom tool needs:

- A name attribute, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it model_download_counter.
- A description attribute, which is used to populate the agent's system prompt.
- An inputs attribute, which is a dictionary with keys "type" and "description". It contains information that helps the Python interpreter make educated choices about the input.
- An output_type attribute, which specifies the output type.
- A forward method which contains the inference code to be executed.
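As a standalone illustration of how an agent can use the inputs metadata, here is a minimal sketch that validates keyword arguments against such a schema before dispatching to forward. This is not the actual [Tool] implementation; the MiniTool class and its placeholder forward body are hypothetical:

```python
class MiniTool:
    """Toy stand-in for a tool base class; validates kwargs against `inputs`."""

    name = "model_download_counter"
    inputs = {
        "task": {"type": "text", "description": "the task category"},
    }

    def forward(self, task):
        # Placeholder for real inference code.
        return f"best-model-for-{task}"

    def __call__(self, **kwargs):
        # Reject arguments the schema does not declare, so errors surface
        # before any inference work happens.
        unknown = set(kwargs) - set(self.inputs)
        if unknown:
            raise TypeError(f"{self.name} got unexpected inputs: {sorted(unknown)}")
        return self.forward(**kwargs)


tool = MiniTool()
print(tool(task="text-classification"))  # best-model-for-text-classification
```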
```python
from transformers import Tool
from huggingface_hub import list_models


class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = (
        "This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. "
        "It returns the name of the checkpoint."
    )
    inputs = {
        "task": {
            "type": "text",
            "description": "the task category (such as text-classification, depth-estimation, etc)",
        }
    }
    output_type = "text"

    def forward(self, task: str):
        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id
```
Now that the custom HFModelDownloadsTool class is ready, you can save it to a file named model_downloads.py and import it for use.
```python
from model_downloads import HFModelDownloadsTool

tool = HFModelDownloadsTool()
```
You can also share your custom tool to the Hub by calling [~Tool.push_to_hub] on the tool. Make sure you've created a repository for it on the Hub and are using a token with write access.
```python
tool.push_to_hub("{your_username}/hf-model-downloads")
```
Load the tool with the [load_tool] function and pass it to the tools parameter in your agent.
```python
from transformers import load_tool, CodeAgent

model_download_tool = load_tool("m-ric/hf-model-downloads")
agent = CodeAgent(tools=[model_download_tool], llm_engine=llm_engine)
agent.run(
    "Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?"
)
```