Amarthya7 committed
Commit ce4c046 · verified · 1 Parent(s): 5fde043

Update README.md

Files changed (1)
  1. README.md +56 -58

README.md CHANGED
@@ -1,58 +1,56 @@
- ---
- title: Multi-Modal AI Demo
- emoji: 🤖
- colorFrom: blue
- colorTo: purple
- sdk: gradio
- sdk_version: 3.50.2
- app_file: app.py
- pinned: false
- ---
-
- # Multi-Modal AI Demo
-
- This project demonstrates the use of multi-modal AI capabilities using Hugging Face pretrained models. The application provides the following features:
-
- 1. **Image Captioning**: Generate descriptive captions for images
- 2. **Visual Question Answering**: Answer questions about the content of images
- 3. **Sentiment Analysis**: Analyze the sentiment of text inputs
-
- ## Requirements
-
- - Python 3.8+
- - Dependencies listed in `requirements.txt`
-
- ## Installation
-
- 1. Clone this repository
- 2. Install dependencies and setup the application:
- ```
- python run.py
- ```
- Then select option 5 to perform full setup (install requirements, fix dependencies, and download sample images)
-
- ## Known Issues and Solutions
-
- If you encounter errors related to package compatibility (Pydantic, FastAPI, or Gradio errors), use:
- ```
- python fix_dependencies.py
- ```
- This will install compatible versions of all dependencies to ensure the application runs correctly.
-
- ## Usage
-
- Run the web interface:
- ```
- python app.py
- ```
-
- Then open your browser and navigate to the URL shown in the terminal (typically http://127.0.0.1:7860).
-
-
- ## Models Used
-
- This demo uses the following pretrained models from Hugging Face:
- - Image Captioning: `nlpconnect/vit-gpt2-image-captioning`
- - Visual Question Answering: `nlpconnect/vit-gpt2-image-captioning` (simplified)
- - Sentiment Analysis: `distilbert-base-uncased-finetuned-sst-2-english`
-
 
+ ---
+ title: Multi-Modal AI Demo
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 4.19.2
+ app_file: app.py
+ pinned: false
+ ---
+
+ # Multi-Modal AI Demo
+
+ This project demonstrates the use of multi-modal AI capabilities using Hugging Face pretrained models. The application provides the following features:
+
+ 1. **Image Captioning**: Generate descriptive captions for images
+ 2. **Visual Question Answering**: Answer questions about the content of images
+ 3. **Sentiment Analysis**: Analyze the sentiment of text inputs
+
+ ## Requirements
+
+ - Python 3.8+
+ - Dependencies listed in `requirements.txt`
+
+ ## Local Installation
+
+ To run this project locally:
+
+ 1. Clone this repository
+ 2. Install dependencies:
+ ```
+ pip install -r requirements.txt
+ ```
+ 3. Run the application:
+ ```
+ python app.py
+ ```
+
+ Then open your browser and navigate to the URL shown in the terminal (typically http://127.0.0.1:7860).
+
+ ## Deploying to Hugging Face Spaces
+
+ This project is configured for direct deployment to Hugging Face Spaces. The core files needed for deployment are:
+
+ - `app.py` - Main application file
+ - `model_utils.py` - Utility functions for model operations
+ - `requirements.txt` - Project dependencies
+ - `README.md` - This documentation file with Spaces configuration
+
+ ## Models Used
+
+ This demo uses the following pretrained models from Hugging Face:
+ - Image Captioning: `nlpconnect/vit-gpt2-image-captioning`
+ - Visual Question Answering: `nlpconnect/vit-gpt2-image-captioning` (simplified)
+ - Sentiment Analysis: `distilbert-base-uncased-finetuned-sst-2-english`
+
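
For orientation, the pieces the updated README names (the two Hugging Face checkpoints, a Gradio 4.x interface served by `python app.py`, and a local URL of http://127.0.0.1:7860) fit together roughly as in the sketch below. This is a minimal sketch under assumptions, not the contents of the repository's actual `app.py` or `model_utils.py`: the function names, tab layout, and the simplified VQA behaviour are guesses.

```python
# Illustrative sketch only: the repository's real app.py / model_utils.py are not
# shown in this diff, so every name and layout choice below is an assumption.
import gradio as gr
from transformers import pipeline

# Checkpoints listed under "Models Used" in the README.
captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
sentiment = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)


def caption_image(image):
    """Generate a descriptive caption for a PIL image."""
    return captioner(image)[0]["generated_text"]


def answer_question(image, question):
    """Simplified VQA, as the README notes: reuse the captioning model and
    pair its caption with the question instead of running a true VQA model."""
    return f"Q: {question}\nThe image shows: {caption_image(image)}"


def analyze_sentiment(text):
    """Return the predicted sentiment label and its confidence."""
    result = sentiment(text)[0]
    return f"{result['label']} ({result['score']:.2f})"


with gr.Blocks(title="Multi-Modal AI Demo") as demo:
    with gr.Tab("Image Captioning"):
        cap_img = gr.Image(type="pil", label="Image")
        cap_out = gr.Textbox(label="Caption")
        gr.Button("Generate caption").click(caption_image, cap_img, cap_out)

    with gr.Tab("Visual Question Answering"):
        vqa_img = gr.Image(type="pil", label="Image")
        vqa_q = gr.Textbox(label="Question")
        vqa_out = gr.Textbox(label="Answer")
        gr.Button("Answer").click(answer_question, [vqa_img, vqa_q], vqa_out)

    with gr.Tab("Sentiment Analysis"):
        sent_in = gr.Textbox(label="Text")
        sent_out = gr.Textbox(label="Sentiment")
        gr.Button("Analyze").click(analyze_sentiment, sent_in, sent_out)

if __name__ == "__main__":
    # Gradio serves on http://127.0.0.1:7860 by default, matching the README.
    demo.launch()
```

Running a sketch like this assumes `torch`, `transformers`, and `gradio` are installed, which the project's `requirements.txt` presumably provides; the Spaces front matter pins the Gradio SDK at 4.19.2.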