{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Generative AI for Audio Application" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Restart Kernel (If needed)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'status': 'ok', 'restart': True}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" }, { "ename": "", "evalue": "", "output_type": "error", "traceback": [ "\u001b[1;31mThe Kernel crashed while executing code in the current cell or a previous cell. \n", "\u001b[1;31mPlease review the code in the cell(s) to identify a possible cause of the failure. \n", "\u001b[1;31mClick here for more info. \n", "\u001b[1;31mView Jupyter log for further details." ] } ], "source": [ "import IPython\n", "\n", "app = IPython.Application.instance()\n", "app.kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Import Libraries\n", "Most of the complexity for the chatbot is in [customizable_chatbot.py](./customizable_chatbot.py) that uses [audio.py](./audio.py) internally for the audio capabilities." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import gradio as gr\n", "from genai_voice.bots.chatbot import ChatBot\n", "from genai_voice.config.defaults import Config\n", "from genai_voice.logger.log_utils import log, LogLevels" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "19:22:35 Configuration: \t\n", " Default(MODEL_GPT_TURBO_NAME='gpt-4-turbo', \n", " OPENAI_API_KEY='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', \n", " TEMPERATURE=0.0, \n", " TOP_P=0.97, \n", " TOP_K=40, \n", " MAX_OUTPUT_TOKENS=2048, \n", " WEB_SCRAP_OUTPUT_DIR=data/context.txt)\n" ] } ], "source": [ "log(f'Configuration: {Config()}', log_level=LogLevels.ON)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set Current Working Directory \n", "\n", "We want to simulate running this notebook from the project root just as it would work when using `Poetry` scripts." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "19:22:58 Before directory change: /Users/capturedbyace/sources/genai_voice/app\n", "19:22:58 Changing directory to root\n", "19:22:58 Before directory change: /Users/capturedbyace/sources/genai_voice\n" ] } ], "source": [ "curr_wrk_dir = os.getcwd()\n", "log(f'Before directory change: {curr_wrk_dir}')\n", "if curr_wrk_dir.endswith('app'):\n", " log(f'Changing directory to root')\n", " os.chdir('..')\n", " curr_wrk_dir = os.getcwd()\n", "log(f'Before directory change: {curr_wrk_dir}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Create Data for LLM Context \n", "\n", "`Poetry` scripts allow us to install our code as a package and run functions executables. \n", "\n", "We will use the `ExtractWebPagesAndSaveData` script, that is defined in `pyproject.toml` to scrape, extract and generate the file that the LLM will use as context data.\n", "\n", "`SAMPLE_URLS` have been defined under provided under `genai_voice.data_utils.urls.py`. Feel free to modify the links in that file to customize the source of data. \n", "\n", "> **DISCLAIMER:** Be responsible when scraping data that is not yours, complying with the EULA of the sites and conducting it in a legal fashion. Also remember that most sites will throttle scrapes, so do this with caution. \n", "\n", "> **NOTE:** This is an optional step. It just shows you have you can get custom data for your LLM context. We have provided the data for this project. " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "19:23:25 Starting the web scraper...\n", "19:23:25 Creating the AsyncChromiumLoader with #10 urls...\n", "19:23:25 Starting scraping...\n", "19:23:28 Content scraped\n", "19:23:29 Starting scraping...\n", "19:23:33 Content scraped\n", "19:23:33 Starting scraping...\n", "19:23:36 Content scraped\n", "19:23:36 Starting scraping...\n", "19:23:39 Content scraped\n", "19:23:39 Starting scraping...\n", "19:23:41 Content scraped\n", "19:23:41 Starting scraping...\n", "19:23:45 Content scraped\n", "19:23:45 Starting scraping...\n", "19:23:49 Content scraped\n", "19:23:49 Starting scraping...\n", "19:23:53 Content scraped\n", "19:23:53 Starting scraping...\n", "19:23:57 Content scraped\n", "19:23:57 Starting scraping...\n", "19:24:00 Content scraped\n", "19:24:00 Documents scraped.\n", "19:24:00 Using BeautifulSoutTransformer to extract ['h1', 'h2', 'h3', 'p'].\n", "19:24:02 Transformed #10 urls.\n", "19:24:02 Writing to output file.\n", "19:24:02 Successfully written data to data/context.txt\n" ] } ], "source": [ "!poetry run ExtractWebPagesAndSaveData" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gradio Interface\n", "This launches the UI, you will probably need to allow the browser to use the microphone to enable the audio functions." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "19:27:45 Context file: travel_bot_context.txt\n", "19:27:45 Creating the OpenAI Model Client.\n", "19:27:45 Initialized OpenAI model: gpt-4-turbo\n", "19:27:45 HTTP Request: GET http://127.0.0.1:7861/startup-events \"HTTP/1.1 200 OK\"\n", "19:27:45 HTTP Request: HEAD http://127.0.0.1:7861/ \"HTTP/1.1 200 OK\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Running on local URL: http://127.0.0.1:7861\n", "\n", "To create a public link, set `share=True` in `launch()`.\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "19:27:46 HTTP Request: GET https://api.gradio.app/pkg-version \"HTTP/1.1 200 OK\"\n" ] }, { "data": { "text/plain": [] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "Getting prompt from audio device: (44100, array([ 9, 12, 10, ..., 0, 0, 0], dtype=int16))\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/capturedbyace/sources/genai_voice/venv/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py:496: FutureWarning: The input name `inputs` is deprecated. Please make sure to use `input_features` instead.\n", " warnings.warn(\n", "19:28:23 Transcribed prompt: Hello, how are you?\n", "19:28:23 Empty history. Creating a state list to track histories.\n", "19:28:23 {'temperature': 0.0, 'top_p': 0.97, 'top_k': 40, 'max_output_tokens': 2048, 'seed': 0, 'response_format': {'type': 'text'}}\n", "19:28:25 HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "19:28:25 Chatbot response: Hello! I'm just a virtual assistant, but thank you for asking. How can I assist you with your travel plans today?\n" ] } ], "source": [ "\"\"\"Run Chatbot app\"\"\"\n", "chatbot = ChatBot(enable_speakers=True, threaded=True)\n", "history = []\n", "\n", "def get_response(audio):\n", " \"\"\"Get Audio Response From Chatbot\"\"\"\n", " if not audio:\n", " raise ValueError(\"No audio file provided.\")\n", " prompt = chatbot.get_prompt_from_gradio_audio(audio)\n", " log(f\"Transcribed prompt: {prompt}\", log_level=LogLevels.ON)\n", " response = chatbot.respond(prompt, history)\n", " log(f\"Chatbot response: {response}\", log_level=LogLevels.ON)\n", " history.append([prompt, response])\n", " return response\n", "\n", "demo = gr.Interface(\n", " get_response,\n", " gr.Audio(sources=\"microphone\"),\n", " None,\n", " title=\"Wanderwise Travel Assistant\"\n", ")\n", "demo.launch()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4 }