title: Named Entity Recognition Tool
emoji: 🌍
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.27.0
app_file: app.py
pinned: false
tags:
- tool
Advanced Named Entity Recognition (NER) Tool for smolagents
This repository contains an enhanced Named Entity Recognition tool built for the smolagents
library from Hugging Face. This tool allows you to:
- Identify named entities (people, organizations, locations, dates, etc.) in text
- Choose from multiple NER models for different languages and use cases
- Configure different output formats and confidence thresholds
- Plug the tool into smolagents so that AI agents can extract and reason about entities in text
Installation
pip install smolagents transformers torch gradio
For faster inference on GPU:
pip install smolagents transformers torch gradio accelerate
Basic Usage
from ner_tool import NamedEntityRecognitionTool
# Initialize the NER tool
ner_tool = NamedEntityRecognitionTool()
# Analyze text with default settings
result = ner_tool("Apple Inc. is planning to open a new store in Paris, France next year.")
print(result)
# Analyze with custom settings
detailed_result = ner_tool(
    text="Apple Inc. is planning to open a new store in Paris, France next year.",
    model="Babelscape/wikineural-multilingual-ner",  # Different model
    aggregation="detailed",  # More detailed output format
    min_score=0.7,  # Lower confidence threshold than the 0.8 default
)
print(detailed_result)
Available Models
The tool includes several pre-configured models:
Model ID | Description |
---|---|
dslim/bert-base-NER | Standard NER (English) - Default |
jean-baptiste/camembert-ner | French NER |
Davlan/bert-base-multilingual-cased-ner-hrl | Multilingual NER |
Babelscape/wikineural-multilingual-ner | WikiNeural Multilingual NER |
flair/ner-english-ontonotes-large | OntoNotes English (fine-grained) |
elastic/distilbert-base-cased-finetuned-conll03-english | CoNLL (fast) |
Output Formats
The tool supports three output formats:
- Simple - A flat list of the entities found, with their types and confidence scores
- Grouped - Entities grouped by category (default)
- Detailed - A fuller analysis that includes the original text with entity markers
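To make the difference between the three formats concrete, here is a minimal sketch of how each could be rendered from the same list of entities. It is not the tool's actual implementation: the helper names (make_entity, format_simple, format_grouped, format_detailed), the `[text|LABEL]` marker style, and the scores are all assumptions chosen for illustration. The record keys mirror those returned by an aggregated transformers token-classification pipeline.

```python
from collections import defaultdict

text = "Apple Inc. is planning to open a new store in Paris, France next year."

def make_entity(word, label, score):
    """Build a hypothetical entity record with character offsets into `text`."""
    start = text.index(word)
    return {"word": word, "entity_group": label, "score": score,
            "start": start, "end": start + len(word)}

# Illustrative values only; real scores depend on the chosen model.
entities = [
    make_entity("Apple Inc.", "ORG", 0.98),
    make_entity("Paris", "LOC", 0.95),
    make_entity("France", "LOC", 0.93),
]

def format_simple(entities):
    """Flat list: one line per entity with its type and score."""
    return [f'{e["word"]} ({e["entity_group"]}, {e["score"]:.2f})' for e in entities]

def format_grouped(entities):
    """Entities collected under their category label."""
    groups = defaultdict(list)
    for e in entities:
        groups[e["entity_group"]].append(e["word"])
    return dict(groups)

def format_detailed(text, entities):
    """Original text with each entity wrapped in an (assumed) marker."""
    marked = text
    # Work right to left so earlier offsets stay valid after each insertion.
    for e in sorted(entities, key=lambda e: e["start"], reverse=True):
        marker = f'[{e["word"]}|{e["entity_group"]}]'
        marked = marked[:e["start"]] + marker + marked[e["end"]:]
    return marked

print(format_simple(entities))
print(format_grouped(entities))
print(format_detailed(text, entities))
```

The grouped form is convenient when an agent only needs to know "which organizations were mentioned", while the detailed form keeps each entity in its original context.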
Using with an Agent
from smolagents import CodeAgent, InferenceClientModel
from ner_tool import NamedEntityRecognitionTool
# Initialize the NER tool
ner_tool = NamedEntityRecognitionTool()
# Create an agent model
model = InferenceClientModel(
    model_id="mistralai/Mistral-7B-Instruct-v0.2",
    token="your_huggingface_token",
)
# Create the agent with our NER tool
agent = CodeAgent(tools=[ner_tool], model=model)
# Run the agent
result = agent.run(
    "Analyze this text and identify all entities: "
    "'The European Union and United Kingdom finalized a trade deal on Tuesday.'"
)
print(result)
Interactive Gradio Interface
For an interactive experience, run the Gradio app:
python gradio_app.py
This provides a web interface where you can:
- Enter custom text or select from samples
- Choose different NER models
- Configure display formats and confidence thresholds
- See immediate results
Customization Options
Entity Confidence Score
- Use the min_score parameter to filter entities by confidence
- Range: 0.0 (include all) to 1.0 (only highest confidence)
- Default: 0.8
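The effect of the threshold can be sketched in a few lines. This is an illustration rather than the tool's own code: filter_by_score is a hypothetical helper, the sample scores are made up, and treating the threshold as inclusive (>=) is an assumption.

```python
# Hypothetical entity records; the scores are illustrative only.
entities = [
    {"word": "Apple Inc.", "entity_group": "ORG", "score": 0.98},
    {"word": "Paris", "entity_group": "LOC", "score": 0.85},
    {"word": "next year", "entity_group": "DATE", "score": 0.62},
]

def filter_by_score(entities, min_score=0.8):
    """Keep entities whose confidence is at least `min_score` (assumed inclusive)."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    return [e for e in entities if e["score"] >= min_score]

# With the default of 0.8, the low-confidence DATE entity is dropped.
print([e["word"] for e in filter_by_score(entities)])
# Lowering the threshold to 0.0 keeps every entity.
print(len(filter_by_score(entities, min_score=0.0)))
```

Lowering the threshold trades precision for recall: more entities are returned, but more of them may be wrong.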
Entity Types
The tool can identify various entity types including:
- People (PER, PERSON)
- Organizations (ORG, ORGANIZATION)
- Locations (LOC, LOCATION, GPE)
- Dates and Times (DATE, TIME)
- Money and Percentages (MONEY, PERCENT)
- Products (PRODUCT)
- Events (EVENT)
- Works of Art (WORK_OF_ART)
- Laws (LAW)
- Languages (LANGUAGE)
- Facilities (FAC)
- Miscellaneous (MISC)
The exact entity types available depend on the chosen model.
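Because different models use different label names for the same concept (PER versus PERSON, for example), it can help to map raw labels onto the categories listed above. The sketch below shows one possible way to do that; CATEGORY_BY_LABEL and normalize_label are hypothetical names, and returning unknown labels unchanged is an assumption, not documented behavior of the tool.

```python
# Mapping taken from the entity type list above.
CATEGORY_BY_LABEL = {
    "PER": "People", "PERSON": "People",
    "ORG": "Organizations", "ORGANIZATION": "Organizations",
    "LOC": "Locations", "LOCATION": "Locations", "GPE": "Locations",
    "DATE": "Dates and Times", "TIME": "Dates and Times",
    "MONEY": "Money and Percentages", "PERCENT": "Money and Percentages",
    "PRODUCT": "Products",
    "EVENT": "Events",
    "WORK_OF_ART": "Works of Art",
    "LAW": "Laws",
    "LANGUAGE": "Languages",
    "FAC": "Facilities",
    "MISC": "Miscellaneous",
}

def normalize_label(label):
    """Map a raw model label to a category name.

    Strips the B-/I- prefixes that token-level taggers emit and
    returns unknown labels unchanged (an assumption of this sketch).
    """
    label = label.upper()
    if label[:2] in ("B-", "I-"):
        label = label[2:]
    return CATEGORY_BY_LABEL.get(label, label)

print(normalize_label("B-PER"))   # token-level tag from a BIO-style model
print(normalize_label("GPE"))     # OntoNotes-style label
print(normalize_label("NORP"))    # not in the list above, returned as-is
```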
Sharing Your Tool
You can share your tool on the Hugging Face Hub:
ner_tool.push_to_hub("your-username/advanced-ner-tool", token="your_huggingface_token")
Limitations
- First-time model loading may take some time
- Some models may require significant memory (especially larger ones)
- Entity recognition accuracy varies by model and language
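One common way to soften the loading cost is to load each model once and reuse it across calls. The sketch below illustrates the idea with functools.lru_cache and a stand-in loader. get_model and load_calls are hypothetical names, and whether the tool already caches models internally is not stated here. In real use, the function body would build the actual model or pipeline.

```python
from functools import lru_cache

load_calls = []  # records how often the (expensive) load actually runs

@lru_cache(maxsize=4)  # keep at most four models in memory at once
def get_model(model_id):
    """Stand-in for an expensive model load, cached per model ID."""
    load_calls.append(model_id)
    return {"model_id": model_id}  # placeholder for a real model object

first = get_model("dslim/bert-base-NER")
second = get_model("dslim/bert-base-NER")  # served from the cache

print(first is second)
print(len(load_calls))
```

A small maxsize keeps memory use bounded, which matters for the larger models mentioned above.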
Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request.
License
MIT