metadata
license: apache-2.0
datasets:
- prithivMLmods/IndoorOutdoorNet-20K
library_name: transformers
language:
- en
base_model:
- google/siglip2-base-patch16-224
pipeline_tag: image-classification
tags:
- Indoor
- Outdoor
- Classification
- SigLIP2
IndoorOutdoorNet
IndoorOutdoorNet is an image classification model fine-tuned from the vision-language encoder google/siglip2-base-patch16-224 for a single-label classification task. It classifies each image as either Indoor or Outdoor using the SiglipForImageClassification architecture.
Classification Report:
              precision    recall  f1-score   support
      Indoor     0.9661    0.9554    0.9607      9999
     Outdoor     0.9559    0.9665    0.9612      9999
    accuracy                         0.9609     19998
   macro avg     0.9610    0.9609    0.9609     19998
weighted avg     0.9610    0.9609    0.9609     19998
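As a quick sanity check on the report, the F1 and accuracy figures can be recomputed from the per-class precision, recall, and support. The sketch below uses only the rounded values printed in the report, so its results can differ from the reported ones in the last decimal place; the `report` dictionary and `f1_score` helper are names introduced here for illustration.

```python
# Per-class precision, recall, and support copied from the classification report.
report = {
    "Indoor": {"precision": 0.9661, "recall": 0.9554, "support": 9999},
    "Outdoor": {"precision": 0.9559, "recall": 0.9665, "support": 9999},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# F1 per class, recomputed from the rounded precision and recall.
f1 = {name: f1_score(row["precision"], row["recall"]) for name, row in report.items()}

# Accuracy equals the support-weighted mean of per-class recall.
total = sum(row["support"] for row in report.values())
accuracy = sum(row["recall"] * row["support"] for row in report.values()) / total

for name, value in f1.items():
    print(f"{name}: F1 = {value:.4f}")
print(f"accuracy = {accuracy:.4f}")
```

Because both classes have the same support, the accuracy here is simply the average of the two recall values.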
The model categorizes images into 2 environment-related classes:
Class 0: "Indoor"
Class 1: "Outdoor"
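The mapping from class index to label can be sketched in plain Python as follows. This is a minimal illustration that assumes the model emits two logits in the order listed above; `ID2LABEL`, `softmax`, and `decode` are hypothetical helper names, and the logits in the usage line are made up.

```python
import math

# Index-to-label mapping as listed in the model card.
ID2LABEL = {0: "Indoor", 1: "Outdoor"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits, for illustration only.
label, prob = decode([-1.2, 2.3])
print(label, round(prob, 3))
```

With a loaded checkpoint, the same mapping is normally available from the model configuration, so a hard-coded dictionary like this is mainly useful for reading raw outputs.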
Install dependencies
!pip install -q transformers torch pillow gradio
Inference Code
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/IndoorOutdoorNet"  # Updated model name
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def classify_environment_image(image):
    """Predicts whether an image is Indoor or Outdoor."""
    # Gradio passes the upload as a NumPy array; convert it to an RGB PIL image
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    labels = {
        "0": "Indoor", "1": "Outdoor"
    }
    # Map each class label to its rounded probability
    predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}

    return predictions

# Create Gradio interface
iface = gr.Interface(
    fn=classify_environment_image,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="Prediction Scores"),
    title="IndoorOutdoorNet",
    description="Upload an image to classify it as Indoor or Outdoor."
)

if __name__ == "__main__":
    iface.launch()
Intended Use:
The IndoorOutdoorNet model is designed to classify images into indoor or outdoor environments. Potential use cases include:
- Smart Cameras: Detect indoor/outdoor context to adjust settings.
- Dataset Curation: Automatically filter image datasets by setting.
- Robotics & Drones: Environment-aware navigation logic.
- Content Filtering: Moderate or tag environment context in image platforms.
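For the dataset curation use case, one possible approach is to route each image by its top score and set aside low-confidence predictions for manual review. The sketch below assumes score dictionaries shaped like the output of `classify_environment_image`; the `route_by_setting` helper, the 0.9 threshold, and the file names are hypothetical and would need tuning on real data.

```python
def route_by_setting(predictions, threshold=0.9):
    """Split file names into Indoor, Outdoor, and uncertain buckets.

    predictions maps a file name to a score dict such as
    {"Indoor": 0.982, "Outdoor": 0.018}. threshold is a hypothetical
    confidence cutoff, not a value recommended by the model authors.
    """
    buckets = {"Indoor": [], "Outdoor": [], "uncertain": []}
    for name, scores in predictions.items():
        # Pick the label with the highest score.
        label = max(scores, key=scores.get)
        # Keep only confident predictions; defer the rest to review.
        target = label if scores[label] >= threshold else "uncertain"
        buckets[target].append(name)
    return buckets


# Made-up scores, for illustration only.
example = {
    "kitchen.jpg": {"Indoor": 0.982, "Outdoor": 0.018},
    "beach.jpg": {"Indoor": 0.004, "Outdoor": 0.996},
    "balcony.jpg": {"Indoor": 0.55, "Outdoor": 0.45},
}
buckets = route_by_setting(example)
print(buckets)
```

Raising the threshold sends more images to the uncertain bucket, trading coverage for fewer mislabeled files in the curated sets.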