---
title: GAIA Agent for Hugging Face Agents Course
emoji: 🕵🏻‍♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
hf_oauth: true
# optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
hf_oauth_expiration_minutes: 480
---

Check out the configuration reference at https://huggingface.co./docs/hub/spaces-config-reference

# GAIA Agent for Hugging Face Agents Course

This project implements an agent built on the SmolAgents framework that answers GAIA benchmark questions for the final assessment of the Hugging Face Agents course.

## Project Overview

The GAIA benchmark consists of challenging questions that require an agent to combine reasoning with tools such as web search and file processing. This agent is designed to:

1. Receive questions from the GAIA API
2. Process and understand the questions
3. Use appropriate tools to find answers
4. Format and return precise answers
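
The four steps above can be sketched as a single loop. This is a minimal illustration, not the code in `app.py`: the function name `run_pipeline`, the injected `solve` and `format_answer` callables, and the `task_id`, `question`, and `submitted_answer` field names are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List


def run_pipeline(
    questions: Iterable[Dict[str, str]],
    solve: Callable[[str], str],
    format_answer: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Answer each question and collect the results in submission order.

    `solve` stands in for the agent (steps 2 and 3) and `format_answer`
    for the final cleanup (step 4). Both are hypothetical callables.
    """
    answers = []
    for item in questions:  # step 1: questions already fetched from the API
        try:
            raw = solve(item["question"])
            final = format_answer(raw)
        except Exception as exc:  # keep going if one question fails
            final = f"AGENT ERROR: {exc}"
        # Field names are assumed, not taken from the GAIA API docs.
        answers.append({"task_id": item["task_id"], "submitted_answer": final})
    return answers
```

Passing the agent in as a callable keeps the loop easy to test with a stub before spending API credits on a full run.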

## Features

- **SmolAgents Integration**: Uses CodeAgent for flexible problem-solving with Python code execution
- **Multi-Model Support**: 
  - Compatible with Hugging Face models
  - OpenAI models (GPT-4o and others)
  - X.AI's Grok models
  - Anthropic, Cohere, and Mistral models via LiteLLM
- **Enhanced Tool Suite**: 
  - Web search via DuckDuckGo
  - Python interpreter for code execution
  - File handling (reading, saving, downloading)
  - Data analysis for CSV and Excel files
  - Image processing with OCR capabilities (when available)
- **Flexible Environment Configuration**: 
  - Easy setup via environment variables or .env file
  - Fallback mechanisms for missing dependencies
  - Support for both local and secure E2B code execution
- **Answer Processing**: 
  - Special handling for reversed text questions
  - Precise answer formatting for benchmark submission
  - Automatic cleanup of model responses for exact matching
- **Interactive UI**: Gradio interface for running the agent and submitting answers
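
To make the answer-processing features concrete, here is a hedged sketch of how reversed-text detection and response cleanup could work. The helper names (`maybe_unreverse`, `clean_answer`), the word list, and the cleanup rules are assumptions for illustration. The project's actual logic in `core_agent.py` may differ.

```python
import re

# Small, hypothetical vocabulary used only to guess the text direction.
COMMON_WORDS = {"the", "of", "and", "to", "is", "in", "you", "this", "what", "as", "if", "a"}


def _common_word_hits(text: str) -> int:
    """Count how many words in `text` are common English words."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(word in COMMON_WORDS for word in words)


def maybe_unreverse(question: str) -> str:
    """Return the flipped string if it looks more like English than the original."""
    flipped = question[::-1]
    if _common_word_hits(flipped) > _common_word_hits(question):
        return flipped
    return question


def clean_answer(raw: str) -> str:
    """Strip common wrappers so the answer can be compared by exact match."""
    answer = raw.strip()
    # Drop a leading label such as "Final answer:" or "Answer -".
    answer = re.sub(r"^(final answer|answer)\s*[:\-]\s*", "", answer, flags=re.IGNORECASE)
    # Remove surrounding quotes or backticks.
    answer = answer.strip().strip("\"'`").strip()
    # Remove one trailing period unless the answer is purely numeric.
    if answer.endswith(".") and not re.fullmatch(r"[\d.,]+", answer):
        answer = answer[:-1]
    return answer.strip()
```

For example, `clean_answer("Final Answer: Paris.")` yields `Paris`, and `maybe_unreverse(".rewsna eht si tahw")` yields `what is the answer.`.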

## Setup

### Prerequisites

- Python 3.10+ (required by smolagents and Gradio 5)
- Hugging Face account
- An API key for at least one model provider (Hugging Face, OpenAI, X.AI, etc.)

### Installation

1. Clone this repository
2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

3. Copy the example environment file and add your API keys:

```bash
cp env.example .env
# Edit .env with your API keys and configuration
```

### Configuration

Configure the agent by setting these environment variables or editing the `.env` file:

#### API Keys
```
HUGGINGFACEHUB_API_TOKEN=your_huggingface_token_here
OPENAI_API_KEY=your_openai_key_here
XAI_API_KEY=your_xai_api_key_here  # For X.AI/Grok models
```

#### Agent Configuration
```
AGENT_MODEL_TYPE=OpenAIServerModel  # HfApiModel, InferenceClientModel, LiteLLMModel, OpenAIServerModel
AGENT_MODEL_ID=gpt-4o  # Model ID depends on the model type
AGENT_TEMPERATURE=0.2
AGENT_EXECUTOR_TYPE=local  # local or e2b for secure execution
AGENT_VERBOSE=true  # Set to true for detailed logging
```

#### Advanced Configuration
```
AGENT_PROVIDER=hf-inference  # Provider for InferenceClientModel
AGENT_TIMEOUT=120  # Timeout in seconds for API calls
AGENT_API_BASE=https://api.groq.com/openai/v1  # OpenAI-compatible endpoint for OpenAIServerModel (this URL is Groq's; X.AI uses https://api.x.ai/v1)
```
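
The variables above could be read into a typed configuration object along these lines. This is a sketch under stated assumptions: the `AgentConfig` dataclass, the `load_agent_config` function, and the default values (copied from the examples above) are illustrative and not taken from the project's source.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical container for the AGENT_* settings."""
    model_type: str
    model_id: str
    temperature: float
    executor_type: str
    verbose: bool
    timeout: int
    api_base: Optional[str]


def load_agent_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Build an AgentConfig from environment variables, with assumed defaults."""
    executor = env.get("AGENT_EXECUTOR_TYPE", "local").strip().lower()
    if executor not in {"local", "e2b"}:
        raise ValueError(f"AGENT_EXECUTOR_TYPE must be 'local' or 'e2b', got {executor!r}")
    return AgentConfig(
        model_type=env.get("AGENT_MODEL_TYPE", "OpenAIServerModel"),
        model_id=env.get("AGENT_MODEL_ID", "gpt-4o"),
        temperature=float(env.get("AGENT_TEMPERATURE", "0.2")),
        executor_type=executor,
        # Accept a few common spellings of "true".
        verbose=env.get("AGENT_VERBOSE", "false").strip().lower() in {"1", "true", "yes"},
        timeout=int(env.get("AGENT_TIMEOUT", "120")),
        # An unset or empty AGENT_API_BASE becomes None.
        api_base=env.get("AGENT_API_BASE") or None,
    )
```

Taking the environment as a parameter makes the loader testable with a plain dictionary. Validating `AGENT_EXECUTOR_TYPE` early turns a typo into an immediate, readable error.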

### Hugging Face Spaces Setup

When deploying to Hugging Face Spaces, you need to add your API keys as secrets:

1. Go to your Space's Settings → Repository Secrets
2. Add the following secrets (add at least one of these API keys):
   - `HUGGINGFACEHUB_API_TOKEN` - Your Hugging Face API token
   - `OPENAI_API_KEY` - Your OpenAI API key
   - `XAI_API_KEY` - Your X.AI/Grok API key

3. Add additional configuration secrets as needed:
   - `AGENT_MODEL_TYPE` - Model type (e.g., "OpenAIServerModel")
   - `AGENT_MODEL_ID` - Model ID to use (e.g., "gpt-4o")
   - `AGENT_TEMPERATURE` - Temperature setting (e.g., "0.2")
   - `AGENT_VERBOSE` - Set to "true" for detailed logging

4. For X.AI's API, also set:
   - `XAI_API_BASE` - The API base URL

5. **Important**: If you're using OpenAIServerModel, ensure the requirements.txt includes:
   ```
   smolagents[openai]
   openai
   ```
   
   If the space gives an error about OpenAI modules, rebuild the space after updating requirements.txt.

6. After adding all secrets, open the Space settings and use "Factory reboot" to rebuild the Space so the changes take effect.
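
Because a Space with no API key fails only when the first model call is made, a startup check can surface the problem earlier. The sketch below assumes the three secret names listed above. The function names `available_api_keys` and `require_api_key` are hypothetical and not part of the project.

```python
import os
from typing import List, Mapping

# Secret names as listed in this README.
API_KEY_VARS = ("HUGGINGFACEHUB_API_TOKEN", "OPENAI_API_KEY", "XAI_API_KEY")


def available_api_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of the API key variables that are set and non-empty."""
    return [name for name in API_KEY_VARS if env.get(name, "").strip()]


def require_api_key(env: Mapping[str, str] = os.environ) -> List[str]:
    """Raise a clear error if none of the expected API keys is configured."""
    found = available_api_keys(env)
    if not found:
        raise RuntimeError(
            "No API key found. Set at least one of: " + ", ".join(API_KEY_VARS)
        )
    return found
```

Calling `require_api_key()` near the top of `app.py` would be one place to use it. Only variable names are reported, never the secret values.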

![Hugging Face Secrets Setup](https://huggingface.co./datasets/huggingface/documentation-images/resolve/main/spaces/secrets.png)

## Usage

### Running the Agent

Launch the Gradio interface with:

```bash
python app.py
```

Then:
1. Log in to your Hugging Face account using the button in the interface
2. Click "Run Evaluation & Submit All Answers"

### Testing

To test the agent with sample questions before running the full evaluation:

```bash
python test_agent.py
```

For more focused testing with specific APIs:

```bash
python test_groq_api.py  # Test X.AI/Groq API integration
python test_xai_api.py  # Test X.AI API integration
```

## Project Structure

- `app.py`: Main application with Gradio interface
- `core_agent.py`: Agent implementation with SmolAgents framework
- `api_integration.py`: Client for interacting with GAIA API
- `test_agent.py`: Testing script with sample questions
- `test_groq_api.py` & `test_xai_api.py`: API-specific test scripts
- `update_groq_key.py`: Utility for updating API keys
- `project_planning.md`: Development roadmap and progress tracking
- `requirements.txt`: Project dependencies

## Tools Implementation

The agent includes several custom tools:

1. **save_and_read_file**: Save content to a temporary file and return the path
2. **download_file_from_url**: Download a file from a URL and save it locally
3. **extract_text_from_image**: OCR for extracting text from images (requires pytesseract)
4. **analyze_csv_file**: Load and analyze CSV files using pandas
5. **analyze_excel_file**: Load and analyze Excel files using pandas
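
As an illustration of the first tool, the following is a minimal sketch of what `save_and_read_file` might look like, using only the standard library. The signature and the optional `filename` parameter are assumptions. In the project the function would presumably also be registered as a SmolAgents tool, which is omitted here so the example runs on its own.

```python
import os
import tempfile
from typing import Optional


def save_and_read_file(content: str, filename: Optional[str] = None) -> str:
    """Save `content` to a file in the system temp directory and return its path.

    Assumed behavior: when no filename is given, a unique .txt file is created.
    """
    temp_dir = tempfile.gettempdir()
    if filename is None:
        # mkstemp returns an open file descriptor; close it and reopen by path.
        fd, path = tempfile.mkstemp(suffix=".txt", dir=temp_dir)
        os.close(fd)
    else:
        # basename() prevents the caller from writing outside the temp directory.
        path = os.path.join(temp_dir, os.path.basename(filename))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return path
```

Returning the path rather than the content lets the agent hand the file to other tools, such as the CSV or Excel analyzers.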

## Resources

- [GAIA Benchmark Information](https://huggingface.co./spaces/gaia-benchmark/leaderboard)
- [SmolAgents Documentation](https://huggingface.co./docs/smolagents/en/index)
- [Hugging Face Agents Course](https://huggingface.co./agents-course)

## License

This project is licensed under the MIT License.