title: Solution_2
app_file: app.py
sdk: gradio
sdk_version: 4.44.1
Prepare Development Environment
We will be using Poetry, a tool for Python projects that simplifies dependency management and packaging. It automates declaring, installing, and updating the libraries your project relies on. By using a lockfile, Poetry guarantees consistent and reproducible builds, ensuring that your project's dependencies are always installed at the specified versions. Poetry can also build your project into distributable formats, making it easier to share and deploy your work.
- Install Python (both 3.10.12 and 3.12.6 were used in testing).
- Install Poetry. It is flexible and works well alongside both pip and conda.
- Install ffmpeg.
Windows
To simplify setup for Windows users, ffmpeg.exe and ffprobe.exe have been included in this repo under the libs directory. For other operating systems, see below.
Mac
For macOS Monterey (v12) and below: We will be using MacPorts. Homebrew is no longer supported for these macOS versions.
- Install Apple's CLI Developer Tools (If required):
xcode-select --install
- Download and install MacPorts for the version of your Mac operating system.
For macOS Ventura and above: We will be using brew, but first we must make sure that it's up to date.
First, update brew.
brew update
NOTE: If you get an error like
fatal: couldn't find remote ref refs/heads/master
when trying to run brew update, the possible culprit is dart-lang changing its default branch from master to main. To resolve this, run the following set of commands:
brew tap --repair && brew cleanup && brew update-reset
Now you should be able to run brew update.
Next, we can upgrade the outdated brew formulae.
brew upgrade
NOTE: You might need to run this a few times to get all outdated formulae successfully updated. Depending on how outdated the packages are, this may take some time to complete. Please be patient. This is a good time to grab a brew :)
We can now install the latest stable version of ffmpeg with Homebrew.
brew install ffmpeg
Linux
Install using the following command:
apt install libasound2-dev portaudio19-dev libportaudio2 libportaudiocpp0 ffmpeg
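Whichever operating system you are on, it can help to confirm that both executables are actually discoverable before moving on. The sketch below uses only the Python standard library. The helper name `find_tools` and its `extra_dirs` parameter are illustrative assumptions and are not part of this repo; passing the libs directory reflects the Windows note above.

```python
import os
import shutil


def find_tools(names=("ffmpeg", "ffprobe"), extra_dirs=()):
    """Return a dict mapping each tool name to its resolved path, or None if missing.

    `extra_dirs` is a hypothetical convenience for searching folders that are
    not on PATH (for example, the repo's libs directory on Windows).
    """
    # Search the extra directories first, then fall back to the normal PATH.
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    return {name: shutil.which(name, path=search_path) for name in names}


if __name__ == "__main__":
    for tool, location in find_tools(extra_dirs=["libs"]).items():
        print(f"{tool}: {location or 'NOT FOUND'}")
```

If either tool is reported as not found, revisit the install steps for your platform before continuing.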
Create a virtual environment and install the required libraries, using a Linux shell or Git Bash on Windows
Windows
python -m venv .venv
source .venv/Scripts/activate
Linux/Mac
python3 -m venv .venv
source ./.venv/bin/activate
Mac Only
Prerequisites
One of the requirements is PyAudio. According to its installation instructions, we need to use Homebrew to install the prerequisite portaudio library before we can install PyAudio.
brew install portaudio
Install the code using poetry, from the root directory that contains the pyproject.toml file.
poetry lock
poetry install
Install playwright, an open-source tool for automating web testing in Python. We'll use it to get some data for our LLM.
playwright install
Test Environment
Run the command below to make sure the virtual environment is activated.
python -V
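The version number alone does not show which interpreter is in use. As a complementary check, the minimal sketch below compares `sys.prefix` with `sys.base_prefix`, which differ when Python runs inside an environment created by `venv`. The function name `in_virtualenv` is illustrative and not part of this repo.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_virtualenv())
```

When the environment is active, the interpreter path printed should sit inside the .venv directory created earlier.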
Create an OpenAI Account and Obtain a Key
- Follow the instructions here to create your key.
- Make a copy of .env_template and rename it to .env. Then add your key in the .env file as shown:
OPENAI_API_KEY="<YOUR_KEY_GOES_HERE>"
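To confirm the key is readable before launching the chatbot, a small standard-library check can be run from the project root. This is only a sketch: `load_env_file` is a hypothetical helper written for illustration, it handles simple KEY=VALUE lines only, and the project itself may load the file differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Illustrative parser only: it skips blank lines and '#' comments and strips
    one pair of surrounding quotes. Multi-line values are not handled.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove matching single or double quotes around the value.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    key = load_env_file().get("OPENAI_API_KEY") or os.environ.get("OPENAI_API_KEY", "")
    if not key or key == "<YOUR_KEY_GOES_HERE>":
        print("OPENAI_API_KEY is missing or still set to the placeholder.")
    else:
        print("OPENAI_API_KEY found.")  # Never print the key itself.
```

The check reports only whether a key is present, so it is safe to run in a shared terminal.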
Project Directory Overview
The project is structured in a modular manner, focused on building and running chatbots with voice capabilities. It separates concerns into well-defined directories for different functionality, such as bots, data_utils, and models, which promotes code reusability and maintainability. It offers voice capabilities including text-to-speech, speech-to-text, and voice-based interactions, making it suitable for creating interactive chatbot applications.
Key Directories:
- app: Contains the main application files:
  - chatbot_gradio_runner.ipynb: Jupyter notebook for running the chatbot interactively.
  - chatbot_gradio_runner.py: Python script for running the chatbot with Gradio for a web interface.
- data: Stores various types of data for chatbots. This is primarily used for LLM context for different bots:
  - travel_bot_context.txt: This is the context we'll use for the chat assistant today.
  - Other files with specific chatbot contexts (financial, call center, etc.).
  - CSV and JSON file formats to allow flexibility.
- genai_voice: Core project code:
  - bots: Code specific to implementing chatbots (chatbot.py).
  - config: Configuration files (defaults.py).
  - data_utils: Utilities for data handling and gathering data from websites (e.g., extract_web_data.py).
  - defintions: Defines response formats and prompts for chatbots.
  - logger: Custom utility for logging information.
  - models: Code for managing and interacting with language models (open_ai.py).
  - moderation: Code for handling and filtering chatbot responses.
  - processing: Functions for processing audio data (audio.py).
- libs: External libraries used by the project (ffmpeg binaries for Windows only).
- poetry.lock and pyproject.toml: Poetry-related files for dependency management.
Launch Notebook
We need to point Jupyter Notebook to our virtual environment, which has the right packages and libraries.
ipython kernel install --user --name=venv
Run the command below from the virtual environment to launch the notebook in a browser. Once it is running, select the venv kernel and continue executing the cells.
jupyter notebook app/chatbot_gradio_runner.ipynb
You can also run the same chatbot directly from a Python script using poetry.
poetry run RunChatBotScript
Troubleshooting
- Try to use a headset microphone
- Record in a quiet room
- Make sure that you have granted microphone permissions
- Ensure the required audio libraries are installed
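For the last item, one quick way to see whether the key packages are importable from the active environment is sketched below. The module list is an assumption based on the tools named in this README (PyAudio imports as `pyaudio`); the full dependency set lives in pyproject.toml, and the helper name `missing_modules` is illustrative.

```python
import importlib.util


def missing_modules(names=("pyaudio", "gradio", "playwright", "openai")):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules are importable.")
```

If pyaudio is reported missing on a Mac, check that portaudio was installed before running poetry install.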