---
title: MyPod 9
emoji: 👀
colorFrom: green
colorTo: gray
sdk: streamlit
sdk_version: 1.41.1
app_file: app.py
pinned: false
---

# MyPod: AI-Powered Podcast Magic

Welcome to **MyPod**, an AI-powered podcast generator that turns a PDF, a webpage, a YouTube video, or a research topic into a conversational podcast, producing both the script and the audio in minutes.

---

## Features

- **Multi-Source Input**: Supports PDF uploads, URLs, YouTube videos (with captions), and research topics.
- **AI Script Generation**: Produces engaging podcast scripts tailored to your selected tone and duration.
- **Customizable Output**: Personalize host/guest names, descriptions, and sponsored content.
- **Background Music Integration**: Use the default background music or upload your custom track.
- **Post-Podcast Q&A**: Ask follow-up questions and receive audio/text answers.
- **Audio Regeneration**: Edit transcripts and regenerate audio seamlessly.

---

## Installation

### Prerequisites

- Python 3.8+
- pip (Python package manager)
- Environment variables for required API keys:
  - `GROQ_API_KEY`
  - `DEEPGRAM_API_KEY`
  - `RAPIDAPI_KEY`

### Dependencies

Install the dependencies using the provided `requirements.txt`:

```bash
pip install -r requirements.txt
```

---

## Usage

### Running the App

Run the application locally:

```bash
streamlit run app.py
```

By default, Streamlit serves the app at `http://localhost:8501`. Open that address in your web browser.

### How to Use

1. **Provide a Source**:
   - Upload a PDF, enter a website URL, provide a YouTube video link, or input a topic to research.
2. **Set Parameters**:
   - Choose the tone (Casual, Formal, Humorous, Youthful) and duration (1–60 minutes).
   - Add custom names/descriptions for the host and guest if desired.
   - Optionally include sponsored content and background music.
3. **Generate**:
   - Click `Generate Podcast` to start the process. Wait a few moments for your podcast to be ready.
4. **Edit and Regenerate**:
   - Edit the transcript if needed and regenerate the audio.
5. **Post-Podcast Q&A**:
   - Ask up to 5 follow-up questions via text or voice input.
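
The parameter limits in step 2 can be checked before generation starts. The following is a minimal sketch, not the code in `app.py`: the tone names and the 1 to 60 minute range come from the steps above, while `PodcastSettings`, its field names, and its default values are hypothetical.

```python
from dataclasses import dataclass

# Tone names and the duration range are taken from the README;
# the class, field names, and defaults are illustrative assumptions.
ALLOWED_TONES = ("Casual", "Formal", "Humorous", "Youthful")
MIN_MINUTES, MAX_MINUTES = 1, 60


@dataclass
class PodcastSettings:
    tone: str = "Casual"
    duration_minutes: int = 5
    host_name: str = "Host"
    guest_name: str = "Guest"
    sponsor_text: str = ""

    def __post_init__(self) -> None:
        # Reject values outside the documented options as early as possible.
        if self.tone not in ALLOWED_TONES:
            raise ValueError(f"tone must be one of {ALLOWED_TONES}, got {self.tone!r}")
        if not MIN_MINUTES <= self.duration_minutes <= MAX_MINUTES:
            raise ValueError(
                f"duration_minutes must be between {MIN_MINUTES} and {MAX_MINUTES}"
            )
```

Validating in `__post_init__` means an invalid combination fails when the settings object is built, before any API call is made.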

---

## Key Features Explained

### Input Options
- **PDF**: Upload a document, and MyPod will extract its text.
- **Website URL**: Scrape and summarize content from a webpage.
- **YouTube**: Extract the existing captions from a YouTube video (the video must have captions).
- **Research Topic**: Gather insights from RSS feeds, Wikipedia, and other sources.
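
One way to route a single text input to the four options above is to classify it first. This is a sketch under assumptions, not the logic in `app.py`: the function name `classify_source`, the returned labels, and the list of YouTube hostnames are all hypothetical.

```python
from urllib.parse import urlparse

# Hostnames treated as YouTube links; this set is an assumption.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_source(value: str) -> str:
    """Return 'pdf', 'youtube', 'url', or 'topic' for a user-supplied string."""
    text = value.strip()
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # hostname is lowercased and excludes any port number
        host = parsed.hostname or ""
        return "youtube" if host in YOUTUBE_HOSTS else "url"
    if text.lower().endswith(".pdf"):
        return "pdf"
    # Anything that is neither a web address nor a PDF path is a topic.
    return "topic"
```

For example, `classify_source("https://youtu.be/abc123")` returns `"youtube"`, while a plain phrase such as `"history of jazz"` falls through to `"topic"`.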

### Customization
- Personalize the host and guest details for a tailored podcast experience.
- Integrate sponsored content seamlessly into the podcast script.
- Use custom background music for added flair.

### Audio Generation
- Uses high-quality Text-to-Speech (TTS) to produce natural-sounding audio.
- Mixes background music with generated speech for professional-grade podcasts.
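
The mixing step can be pictured as adding an attenuated music sample to each speech sample. The sketch below is only an illustration of that idea on raw 16-bit PCM values; it is not how MyPod mixes audio, and `mix_with_music` and the default `music_gain` of 0.2 are hypothetical.

```python
from typing import List

INT16_MIN, INT16_MAX = -32768, 32767


def mix_with_music(
    speech: List[int], music: List[int], music_gain: float = 0.2
) -> List[int]:
    """Overlay attenuated music on speech, sample by sample.

    The music loops if it is shorter than the speech, and the result is
    clipped to the signed 16-bit range.
    """
    if not music:
        return list(speech)
    mixed = []
    for i, sample in enumerate(speech):
        bed = music[i % len(music)] * music_gain  # loop and attenuate the music
        value = int(round(sample + bed))
        mixed.append(max(INT16_MIN, min(INT16_MAX, value)))  # avoid overflow
    return mixed
```

Keeping the music gain well below 1.0 leaves the speech intelligible, and the clipping step prevents integer overflow when both signals peak at once.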

### Post-Podcast Q&A
- Allows users to interact with the AI for further clarification or discussion on the topic.
- Supports text and voice-based inputs.
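
The How to Use section caps the Q&A at 5 follow-up questions. A minimal sketch of enforcing such a cap is shown here; the `QASession` class and its methods are hypothetical and are not taken from `qa.py`.

```python
class QASession:
    """Track follow-up questions against a fixed cap."""

    def __init__(self, max_questions: int = 5) -> None:
        self.max_questions = max_questions
        self.questions = []

    @property
    def remaining(self) -> int:
        """Number of questions the user may still ask."""
        return self.max_questions - len(self.questions)

    def ask(self, question: str) -> bool:
        """Record the question and return True, or return False if rejected."""
        text = question.strip()
        if not text or self.remaining <= 0:
            return False  # blank input or cap reached
        self.questions.append(text)
        return True
```

In a Streamlit app, an object like this would need to live in `st.session_state` so the count survives reruns of the script.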

---

## Files and Structure

- **`app.py`**: Main application file for Streamlit.
- **`prompts.py`**: Contains system prompts for script generation.
- **`qa.py`**: Handles the Q&A functionality.
- **`utils.py`**: Utility functions for text processing, audio generation, and API calls.
- **`requirements.txt`**: List of dependencies.

---

## API Integration

### Groq API
Used for generating podcast scripts and answering follow-up questions.

### Deepgram API
Handles transcription and Text-to-Speech (TTS).

### RapidAPI (YouTube Transcription)
Extracts captions from YouTube videos for use in podcasts.

---

## Environment Variables
Ensure the following environment variables are set:

- `GROQ_API_KEY`: API key for Groq.
- `DEEPGRAM_API_KEY`: API key for Deepgram.
- `RAPIDAPI_KEY`: API key for RapidAPI (YouTube transcription).
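
A startup check can report missing keys before any API call fails. This is a small sketch rather than code from the project: the variable names come from the list above, and `missing_keys` is a hypothetical helper.

```python
import os
from typing import List, Mapping

# Variable names as documented in this README.
REQUIRED_KEYS = ("GROQ_API_KEY", "DEEPGRAM_API_KEY", "RAPIDAPI_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling `missing_keys()` with no arguments inspects the current process environment; an empty list means all three keys are set.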

---

## Future Enhancements

- Add support for additional languages.
- Enhance customization options for tone and audio styles.
- Expand Q&A functionality with more dynamic interactions.
- Integrate analytics for user engagement tracking.

---

## License

This project is licensed under the MIT License. See the LICENSE file for details.

---

## Acknowledgments

- Streamlit for the interactive interface.
- OpenAI Whisper for speech recognition.
- Deepgram and Groq for advanced AI capabilities.

For more information, contact Siddharth Arya at [email protected]