MyPod_10 / README.md
siddhartharyaai's picture
Update README.md
e705dda verified

A newer version of the Streamlit SDK is available: 1.44.1

Upgrade
metadata
title: MyPod 9
emoji: 👀
colorFrom: green
colorTo: gray
sdk: streamlit
sdk_version: 1.41.1
app_file: app.py
pinned: false

MyPod: AI-Powered Podcast Magic

Welcome to MyPod, an AI-powered podcast generator designed to transform various sources of content into engaging and conversational podcasts. Whether you have a PDF, a webpage, a YouTube video, or a topic you want to explore, MyPod creates high-quality scripts and audio in minutes.


Features

  • Multi-Source Input: Supports PDF uploads, URLs, YouTube videos (with captions), and research topics.
  • AI Script Generation: Produces engaging podcast scripts tailored to your selected tone and duration.
  • Customizable Output: Personalize host/guest names, descriptions, and sponsored content.
  • Background Music Integration: Use the default background music or upload your custom track.
  • Post-Podcast Q&A: Ask follow-up questions and receive audio/text answers.
  • Audio Regeneration: Edit transcripts and regenerate audio seamlessly.

Installation

Prerequisites

  • Python 3.8+
  • pip (Python package manager)
  • Environment variables for required API keys:
    • GROQ_API_KEY
    • DEEPGRAM_API_KEY
    • RAPIDAPI_KEY

Dependencies

Install the dependencies using the provided requirements.txt:

pip install -r requirements.txt

Usage

Running the App

Run the application locally:

streamlit run app.py

The app will be accessible at http://localhost:8501 in your web browser.

How to Use

  1. Provide a Source:
    • Upload a PDF, enter a website URL, provide a YouTube video link, or input a topic to research.
  2. Set Parameters:
    • Choose the tone (Casual, Formal, Humorous, Youthful) and duration (1–60 minutes).
    • Add custom names/descriptions for the host and guest if desired.
    • Optionally include sponsored content and background music.
  3. Generate:
    • Click Generate Podcast to start the process. Wait a few moments for your podcast to be ready.
  4. Edit and Regenerate:
    • Edit the transcript if needed and regenerate the audio.
  5. Post-Podcast Q&A:
    • Ask up to 5 follow-up questions via text or voice input.

Key Features Explained

Input Options

  • PDF: Upload a document, and MyPod will extract its text.
  • Website URL: Scrape and summarize content from a webpage.
  • YouTube: Transcribe captions from YouTube videos.
  • Research Topic: Gather insights from RSS feeds, Wikipedia, and other sources.

Customization

  • Personalize the host and guest details for a tailored podcast experience.
  • Integrate sponsored content seamlessly into the podcast script.
  • Use custom background music for added flair.

Audio Generation

  • Uses high-quality Text-to-Speech (TTS) to produce natural-sounding audio.
  • Mixes background music with generated speech for professional-grade podcasts.

Post-Podcast Q&A

  • Allows users to interact with the AI for further clarification or discussion on the topic.
  • Supports text and voice-based inputs.

Files and Structure

  • app.py: Main application file for Streamlit.
  • prompts.py: Contains system prompts for script generation.
  • qa.py: Handles the Q&A functionality.
  • utils.py: Utility functions for text processing, audio generation, and API calls.
  • requirements.txt: List of dependencies.

API Integration

Groq API

Used for generating podcast scripts and answering follow-up questions.

Deepgram API

Handles transcription and Text-to-Speech (TTS).

RapidAPI (YouTube Transcription)

Extracts captions from YouTube videos for use in podcasts.


Environment Variables

Ensure the following environment variables are set:

  • GROQ_API_KEY: API key for Groq.
  • DEEPGRAM_API_KEY: API key for Deepgram.
  • RAPIDAPI_KEY: API key for RapidAPI (YouTube transcription).

Future Enhancements

  • Add support for additional languages.
  • Enhance customization options for tone and audio styles.
  • Expand Q&A functionality with more dynamic interactions.
  • Integrate analytics for user engagement tracking.

License

This project is licensed under the MIT License. See the LICENSE file for details.


Acknowledgments

  • Streamlit for the interactive interface.
  • OpenAI Whisper for speech recognition.
  • Deepgram and Groq for advanced AI capabilities.

For more information contact Siddharth Arya on [email protected]