🍽️ Model Card for Food Vision Model
This model is an image classification model trained to identify different types of food from images. It was developed as part of a Food Vision project, utilizing transfer learning on a pre-trained convolutional neural network.
Model Details
Model Description
This model is a deep learning model for classifying food images into one of 101 categories from the Food101 dataset. It was trained using TensorFlow and employs a transfer learning approach, leveraging the features learned by a model pre-trained on a large dataset like ImageNet. The training process included the use of mixed precision for potentially faster training and reduced memory usage.
⚛️ HuggingFace Space for Food Vision Model
- Developed by: Recompense
- Model type: Image Classification (Transfer Learning with a CNN backbone)
- Language(s) (NLP): N/A (Image Classification)
- License: MIT
- Finetuned from model: EfficientNetB0
Uses
This model is intended for classifying images of food into 101 distinct categories. Potential use cases include:
- Food recognition in mobile applications.
- Organizing food images in databases.
- Assisting in dietary tracking or recipe suggestions based on images.
Limitations
- Dataset Bias: The model is trained on the Food101 dataset. Its performance may degrade on food images that are significantly different in style, presentation, or origin from those in the training data.
- Image Quality: Performance can be affected by image quality, lighting conditions, occlusions, and variations in food presentation.
- Specificity: While it classifies into 101 categories, it may not distinguish between very similar dishes or variations within a category.
Evaluation
The model's performance was evaluated using standard classification metrics on a validation set from the Food101 dataset.
Testing Data
The model was evaluated on the validation split of the Food101 dataset.
- Food101 Dataset: 101,000 images across 101 food categories, with 750 training images and 250 testing images per class.
- Source: TensorFlow Datasets
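The split sizes above can be checked with a little arithmetic, and the same data can be pulled from TensorFlow Datasets. The following is a minimal sketch, not the project's original loading code: `load_food101` is a hypothetical helper name, and calling it triggers a multi-gigabyte download on first use.

```python
NUM_CLASSES = 101
TRAIN_PER_CLASS = 750
TEST_PER_CLASS = 250

train_total = NUM_CLASSES * TRAIN_PER_CLASS  # training images
test_total = NUM_CLASSES * TEST_PER_CLASS    # evaluation images
total = train_total + test_total             # all images in Food101


def load_food101(data_dir=None):
    """Hypothetical helper: load the TFDS "food101" train/validation splits."""
    # Imported lazily so the arithmetic above runs without TFDS installed.
    import tensorflow_datasets as tfds

    (train_ds, val_ds), ds_info = tfds.load(
        "food101",
        split=["train", "validation"],
        as_supervised=True,  # yield (image, label) tuples
        with_info=True,      # also return metadata such as class names
        data_dir=data_dir,
    )
    return train_ds, val_ds, ds_info


print(train_total, test_total, total)  # → 75750 25250 101000
```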
Factors
Evaluation was performed on the overall validation dataset. Further analysis could involve disaggregating performance by individual food categories to identify classes where the model performs better or worse.
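Disaggregating by category could look roughly like the sketch below, which computes one accuracy value per class from integer labels. The function name and the toy labels are illustrative assumptions, not taken from the project's notebook.

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, num_classes):
    """Return an array with the accuracy of each class (NaN for classes with no samples)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Number of samples per actual class.
    totals = np.bincount(y_true, minlength=num_classes)
    # Number of correctly predicted samples per actual class.
    correct = np.bincount(y_true[y_true == y_pred], minlength=num_classes)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(totals > 0, correct / totals, np.nan)


# Toy example with 3 classes (made-up labels, for illustration only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc = per_class_accuracy(y_true, y_pred, num_classes=3)
print(acc)  # class 0: 1 of 2 correct, class 1: 2 of 2, class 2: 1 of 2
```

Sorting the resulting array would surface the weakest categories first.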
Metrics
The primary evaluation metric used is Accuracy. A confusion matrix was also generated to visualize per-class performance.
- Accuracy: The proportion of correctly classified images out of the total number of images evaluated.
- Confusion Matrix: A table that visualizes the performance of a classification model. Each row represents the instances in an actual class, while each column represents the instances in a predicted class.
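To make the two definitions concrete, here is a small NumPy sketch that builds a confusion matrix with actual classes as rows and predicted classes as columns, then derives accuracy from its diagonal. The project lists Scikit-learn among its dependencies, and `sklearn.metrics.confusion_matrix` follows the same row and column convention; the hand-rolled version and the toy labels below are only illustrative.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """cm[i, j] counts samples whose actual class is i and predicted class is j."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when the same (i, j) pair repeats.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def accuracy(cm):
    """Correct predictions sit on the diagonal; divide by the total count."""
    return np.trace(cm) / cm.sum()


# Toy example with 3 classes (made-up labels, for illustration only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)
print(f"accuracy = {accuracy(cm):.3f}")  # 4 of 6 correct
```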
Results
Validation accuracy fluctuated between 70% and 80%.
Summary
Transfer learning helped the model reach higher accuracy, but the model still struggled with closely related foods, which suggests that more training data is needed. Food101 is a large dataset, yet additional examples appear necessary to tell apart dishes that look alike.
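One way to see which dishes get mixed up is to rank the off-diagonal cells of a confusion matrix. The sketch below assumes a matrix with actual classes as rows and predicted classes as columns; the function name and the counts in the example are hypothetical and are not results from this model.

```python
import numpy as np


def most_confused_pairs(cm, class_names, top_k=3):
    """Return (actual, predicted, count) tuples for the largest off-diagonal cells."""
    cm = np.asarray(cm).copy()
    np.fill_diagonal(cm, 0)  # ignore correct predictions
    # Flat indices of the largest cells, in descending order of count.
    flat_order = np.argsort(cm, axis=None)[::-1][:top_k]
    rows, cols = np.unravel_index(flat_order, cm.shape)
    return [
        (class_names[r], class_names[c], int(cm[r, c]))
        for r, c in zip(rows, cols)
        if cm[r, c] > 0
    ]


# Hypothetical counts for three Food101 classes (illustration only).
names = ["steak", "filet_mignon", "pizza"]
toy_cm = [
    [8, 2, 0],
    [3, 7, 0],
    [0, 1, 9],
]
pairs = most_confused_pairs(toy_cm, names, top_k=2)
print(pairs)  # → [('filet_mignon', 'steak', 3), ('steak', 'filet_mignon', 2)]
```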
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: Tesla T4
- Hours used: 1 hour (estimated maximum)
- Cloud Provider: Google Cloud
- Compute Region: us-central
- Carbon Emitted: 80 grams of CO₂eq (estimated)
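Estimates of this kind generally multiply the energy drawn by the hardware by the carbon intensity of the local grid. The sketch below shows that arithmetic only; it is not the calculator's implementation. The 70 W figure is the Tesla T4's rated board power, while the grid intensity is a placeholder value chosen for illustration, so the printed number is not comparable to the figure reported above.

```python
def estimate_emissions_g(power_watts, hours, grid_intensity_g_per_kwh, pue=1.0):
    """Rough CO2eq estimate in grams: energy (kWh) times grid carbon intensity.

    pue is the data-center power usage effectiveness (1.0 means no overhead).
    """
    energy_kwh = power_watts / 1000.0 * hours * pue
    return energy_kwh * grid_intensity_g_per_kwh


T4_TDP_WATTS = 70                 # rated board power of a Tesla T4
PLACEHOLDER_INTENSITY = 500.0     # assumed gCO2eq/kWh, NOT a measured value

estimate = estimate_emissions_g(
    T4_TDP_WATTS, hours=1.0, grid_intensity_g_per_kwh=PLACEHOLDER_INTENSITY
)
print(f"{estimate:.1f} g CO2eq under the placeholder assumptions")
```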
Technical Specifications
Model Architecture and Objective
The model is a fine-tuned convolutional neural network (CNN) classifier. Mixed precision training was used for faster training, which relies on a modern CNN architecture compatible with float16 data types. The objective is to minimize the classification loss (e.g., categorical cross-entropy) to accurately predict the food category given an image.
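A setup along these lines might be sketched as follows. This is an assumption-laden illustration rather than the original training code: the function name, the frozen backbone, the pooling and dense head, the Adam optimizer, and the sparse variant of cross-entropy (suited to integer labels) are all choices made here for the example. Only the EfficientNetB0 backbone, the 101 classes, and the use of mixed precision come from this card.

```python
import tensorflow as tf


def build_food_classifier(num_classes=101, img_size=224, weights="imagenet"):
    """Hypothetical sketch: EfficientNetB0 feature extractor with a new softmax head."""
    # Compute in float16 where safe, keep variables in float32.
    # Note: this changes the global Keras policy for the whole process.
    tf.keras.mixed_precision.set_global_policy("mixed_float16")

    base = tf.keras.applications.EfficientNetB0(
        include_top=False,
        weights=weights,  # pass None to skip downloading ImageNet weights
        input_shape=(img_size, img_size, 3),
    )
    base.trainable = False  # assumption: start with a frozen backbone

    inputs = tf.keras.Input(shape=(img_size, img_size, 3), name="input_image")
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D(name="pool")(x)
    x = tf.keras.layers.Dense(num_classes, name="logits")(x)
    # Keep the final softmax in float32 for numerical stability.
    outputs = tf.keras.layers.Activation("softmax", dtype="float32", name="probs")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer=tf.keras.optimizers.Adam(),
        metrics=["accuracy"],
    )
    return model
```

Calling `build_food_classifier()` downloads the ImageNet weights on first use; unfreezing some backbone layers afterwards would be the usual next step when fine-tuning.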
Compute Infrastructure
The model was trained using a Tesla T4 GPU on Google Cloud in the us-central region. The estimated carbon emissions for 1 hour of training time on this setup are 80 grams of CO₂eq. The environment was intended to support mixed precision training.
Software
- TensorFlow
- TensorFlow Datasets
- NumPy
- Matplotlib
- Scikit-learn
- Helper functions from helper_functions.py (for plotting, data handling)
Usage
Here's an example of how to use the model for inference on a new image.
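First, make sure you have TensorFlow installed. Loading the model from the hf:// path used below also requires the huggingface_hub package:
pip install tensorflow huggingface_hub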
Then, you can load the model and make a prediction:
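import os

# Select the Keras backend before importing Keras; the setting is read at import time.
# Available backend options are: "jax", "torch", "tensorflow".
# "tensorflow" matches the install step above; other backends need their own packages.
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# Load the saved model from the Hugging Face Hub
loaded_model = keras.saving.load_model("hf://Recompense/FoodVision")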
# Define the class names. They must match the label order used during training.
# The list below contains the 101 Food101 categories; verify the order against your
# training labels (e.g. ds_info.features["label"].names from TensorFlow Datasets).
class_names = ['apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare', 'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito', 'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake', 'ceviche', 'cheesecake', 'cheese_plate', 'chicken_curry', 'chicken_quesadilla', 'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros', 'clam_chowder', 'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame', 'cup_cakes', 'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict', 'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras', 'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari', 'fried_rice', 'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', 'grilled_cheese_sandwich', 'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', 'hot_and_sour_soup', 'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', 'lasagna', 'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup', 'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters', 'pad_thai', 'paella', 'pancakes', 'panna_cotta', 'peking_duck', 'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib', 'pulled_pork_sandwich', 'ramen', 'ravioli', 'red_velvet_cake', 'risotto', 'samosa', 'sashimi', 'scallops', 'seaweed_salad', 'shrimp_and_grits', 'spaghetti_bolognese', 'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake', 'sushi', 'tacos', 'takoyaki', 'tiramisu', 'tuna_tartare', 'waffles']
# Create a function to load and prepare images
def load_prep_image(filepath, img_shape=224, scale=True):
    """
    Reads in an image and preprocesses it for model prediction.

    Args:
        filepath (str): path to target image
        img_shape (int): shape to resize image to. Default = 224
        scale (bool): Condition to scale image. Default = True

    Returns:
        Image Tensor of shape (img_shape, img_shape, 3)
    """
    image = tf.io.read_file(filepath)
    image_tensor = tf.io.decode_image(image, channels=3)
    image_tensor = tf.image.resize(image_tensor, [img_shape, img_shape])
    if scale:
        # Scale image tensor to be between 0 and 1
        scaled_image_tensor = image_tensor / 255.
        return scaled_image_tensor
    else:
        return image_tensor
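# Load and preprocess a sample image
# Replace 'path/to/your/image.jpg' with the actual path to your image
# Note: Keras EfficientNet backbones rescale pixel values internally and expect inputs in [0, 255].
# If predictions look wrong with the default scaling, try load_prep_image(sample_image_path, scale=False).
sample_image_path = 'path/to/your/image.jpg'
prepared_image = load_prep_image(sample_image_path)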
# Add a batch dimension to the image
prepared_image = tf.expand_dims(prepared_image, axis=0)
# Make a prediction
predictions = loaded_model.predict(prepared_image)
# Get the predicted class index
predicted_class_index = np.argmax(predictions)
# Get the predicted class name
predicted_class_name = class_names[predicted_class_index]
# Print the prediction
print(f"The predicted food item is: {predicted_class_name}")
# Optional: Display the image
# img = plt.imread(sample_image_path)
# plt.imshow(img)
# plt.title(f"Prediction: {predicted_class_name}")
# plt.axis('off')
# plt.show()
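Because several Food101 dishes look alike, it can be more informative to inspect the few highest-scoring classes rather than only the top one. The following is a small sketch of that idea; `top_k_predictions` is a hypothetical helper, and the demo probabilities are made up rather than produced by the model.

```python
import numpy as np


def top_k_predictions(probs, class_names, k=5):
    """Return the k most probable (class_name, probability) pairs, best first."""
    probs = np.asarray(probs).reshape(-1)   # accept shape (1, n) or (n,)
    top_idx = np.argsort(probs)[::-1][:k]   # indices sorted by descending score
    return [(class_names[i], float(probs[i])) for i in top_idx]


# Made-up scores for three classes, shaped like model.predict output for one image.
demo_names = ["pizza", "steak", "sushi"]
demo_probs = [[0.2, 0.7, 0.1]]
print(top_k_predictions(demo_probs, demo_names, k=2))  # → [('steak', 0.7), ('pizza', 0.2)]
```

With the real model, the same call would take the `predictions` array and the full `class_names` list from the example above.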
Model tree for Recompense/FoodVision
Base model
google/efficientnet-b0