burtenshaw committed · Commit f140a78 · 1 Parent(s): d338da0

build gradio app around scripts
Browse files:
- .gitignore +4 -1
- app/.tmp/pdf/presentation.html +0 -0
- app/.url_cache/presentations_cache.db +0 -0
- app/app.py +1096 -0
- app/src/__init__.py +0 -0
- {scripts → app/src}/create_presentation.py +6 -3
- {scripts → app/src}/create_video.py +0 -0
- {scripts → app/src}/transcription_to_audio.py +0 -0
- {chapter1 → app}/template/index.html +0 -0
- {chapter1 → app}/template/remark.min.js +0 -0
- {chapter1 → app}/template/style.scss +0 -0
- chapter1/material/1_presentation.md +130 -0

.gitignore
CHANGED
@@ -189,4 +189,7 @@ cython_debug/
 *.gif
 *.bmp
 *.tiff
-*.pdf
+*.pdf
+
+.DS_Store
+.vscode

app/.tmp/pdf/presentation.html
ADDED
The diff for this file is too large to render. See raw diff.

app/.url_cache/presentations_cache.db
ADDED
Binary file (41 kB).

app/app.py
ADDED
@@ -0,0 +1,1096 @@
import gradio as gr
import requests
from bs4 import BeautifulSoup
import os
import re
import subprocess
import tempfile
import shutil
from pathlib import Path
import logging
from dotenv import load_dotenv
import shelve

# Import functions from your scripts (assuming they are structured appropriately)
# It's often better to refactor scripts into functions for easier import
try:
    from src.create_presentation import (
        generate_presentation_with_llm,
        DEFAULT_LLM_MODEL,
        DEFAULT_PRESENTATION_PROMPT_TEMPLATE,
    )
    from src.transcription_to_audio import text_to_speech, VOICE_ID
    from src.create_video import (
        find_audio_files,
        convert_pdf_to_images,
        create_video_clips,
        concatenate_clips,
        cleanup_temp_files,
    )
    from huggingface_hub import InferenceClient
except ImportError as e:
    print(f"Error importing script functions: {e}")
    print("Please ensure scripts are in the 'src' directory and structured correctly.")
    exit(1)

load_dotenv()

# --- Configuration & Setup ---
logging.basicConfig(
    level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
)
logger = logging.getLogger(__name__)

HF_API_KEY = os.getenv("HF_API_KEY")
LLM_MODEL = os.getenv("LLM_MODEL", DEFAULT_LLM_MODEL)
PRESENTATION_PROMPT = os.getenv(
    "PRESENTATION_PROMPT", DEFAULT_PRESENTATION_PROMPT_TEMPLATE
)
CACHE_DIR = ".cache"  # For TTS caching
URL_CACHE_DIR = ".url_cache"
URL_CACHE_FILE = os.path.join(URL_CACHE_DIR, "presentations_cache")

# Initialize clients (do this once if possible, or manage carefully in functions)
try:
    if HF_API_KEY:
        hf_client = InferenceClient(token=HF_API_KEY, provider="cohere")
    else:
        logger.warning("HF_API_KEY not found. LLM generation will fail.")
        hf_client = None
except Exception as e:
    logger.error(f"Failed to initialize Hugging Face client: {e}")
    hf_client = None

# --- Helper Functions ---


def fetch_webpage_content(url):
    """Fetches and extracts basic text content from a webpage."""
    logger.info(f"Fetching content from: {url}")
    try:
        response = requests.get(url, timeout=15)
        response.raise_for_status()  # Raise an exception for bad status codes
        soup = BeautifulSoup(response.text, "html.parser")

        # Basic text extraction (can be improved significantly)
        paragraphs = soup.find_all("p")
        headings = soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])
        list_items = soup.find_all("li")

        content = (
            "\n".join([h.get_text() for h in headings])
            + "\n\n"
            + "\n".join([p.get_text() for p in paragraphs])
            + "\n\n"
            + "\n".join(["- " + li.get_text() for li in list_items])
        )

        # Simple cleanup
        content = re.sub(r"\s\s+", " ", content).strip()
        logger.info(
            f"Successfully fetched and parsed content (length: {len(content)})."
        )
        return content
    except requests.exceptions.RequestException as e:
        logger.error(f"Error fetching URL {url}: {e}")
        return None
    except Exception as e:
        logger.error(f"Error parsing URL {url}: {e}")
        return None


def parse_presentation_markdown(markdown_content):
    """Splits presentation markdown into slides with content and notes."""
    slides = []
    slide_parts = re.split(r"\n\n---\n\n", markdown_content)
    for i, part in enumerate(slide_parts):
        if "???" in part:
            content, notes = part.split("???", 1)
            slides.append({"id": i, "content": content.strip(), "notes": notes.strip()})
        else:
            # Handle slides without notes (like title slide maybe)
            slides.append(
                {
                    "id": i,
                    "content": part.strip(),
                    "notes": "",  # Add empty notes field
                }
            )
    logger.info(f"Parsed {len(slides)} slides from markdown.")
    return slides


def reconstruct_presentation_markdown(slides_data):
    """Reconstructs the markdown string from slide data."""
    full_md = []
    for slide in slides_data:
        slide_md = slide["content"]
        if slide[
            "notes"
        ]:  # Only add notes separator if notes exist and are not just whitespace
            slide_md += f"\n\n???\n{slide['notes'].strip()}"
        full_md.append(slide_md.strip())  # Ensure each slide part is stripped
    return "\n\n---\n\n".join(full_md)


def generate_pdf_from_markdown(markdown_file_path, output_pdf_path):
    """Generates a PDF from a Markdown file using bs export + decktape."""
    logger.info(f"Attempting PDF gen: {markdown_file_path} -> {output_pdf_path}")
    working_dir = os.path.dirname(markdown_file_path)
    markdown_filename = os.path.basename(markdown_file_path)
    html_output_dir_name = "bs_html_output"
    html_output_dir_abs = os.path.join(working_dir, html_output_dir_name)
    expected_html_filename = os.path.splitext(markdown_filename)[0] + ".html"
    generated_html_path_abs = os.path.join(html_output_dir_abs, expected_html_filename)
    pdf_gen_success = False  # Flag to track success

    # ---- Step 1: Generate HTML using bs export ----
    try:
        Path(html_output_dir_abs).mkdir(parents=True, exist_ok=True)
        export_command = ["bs", "export", markdown_filename, "-o", html_output_dir_name]
        logger.info(f"Running: {' '.join(export_command)} in CWD: {working_dir}")
        export_result = subprocess.run(
            export_command,
            cwd=working_dir,
            capture_output=True,
            text=True,
            check=True,
            timeout=60,
        )
        logger.info("Backslide (bs export) OK.")
        logger.debug(f"bs export stdout:\n{export_result.stdout}")
        logger.debug(f"bs export stderr:\n{export_result.stderr}")

        if not os.path.exists(generated_html_path_abs):
            logger.error(f"Expected HTML not found: {generated_html_path_abs}")
            try:
                files_in_dir = os.listdir(html_output_dir_abs)
                logger.error(f"Files in {html_output_dir_abs}: {files_in_dir}")
            except FileNotFoundError:
                logger.error(
                    f"HTML output directory {html_output_dir_abs} not found after bs run."
                )
            raise FileNotFoundError(
                f"Generated HTML not found: {generated_html_path_abs}"
            )

    except FileNotFoundError:
        logger.error(
            "`bs` command not found. Install backslide (`npm install -g backslide`)."
        )
        raise gr.Error("HTML generation tool (backslide/bs) not found.")
    except subprocess.CalledProcessError as e:
        logger.error(f"Backslide (bs export) failed (code {e.returncode}).")
        logger.error(f"bs stderr:\n{e.stderr}")
        raise gr.Error(f"Backslide HTML failed: {e.stderr[:500]}...")
    except subprocess.TimeoutExpired:
        logger.error("Backslide (bs export) timed out.")
        raise gr.Error("HTML generation timed out (backslide).")
    except Exception as e:
        logger.error(f"Unexpected error during bs export: {e}", exc_info=True)
        raise gr.Error(f"Unexpected error during HTML generation: {e}")

    # ---- Step 2: Generate PDF from HTML using decktape ----
    try:
        Path(output_pdf_path).parent.mkdir(parents=True, exist_ok=True)
        html_file_url = Path(generated_html_path_abs).as_uri()
        decktape_command = ["decktape", html_file_url, str(output_pdf_path)]
        logger.info(f"Running PDF conversion: {' '.join(decktape_command)}")
        decktape_result = subprocess.run(
            decktape_command,
            capture_output=True,
            text=True,
            check=True,
            timeout=120,
        )
        logger.info("Decktape command executed successfully.")
        logger.debug(f"decktape stdout:\n{decktape_result.stdout}")
        logger.debug(f"decktape stderr:\n{decktape_result.stderr}")

        if os.path.exists(output_pdf_path):
            logger.info(f"PDF generated successfully: {output_pdf_path}")
            pdf_gen_success = True  # Mark as success
            return output_pdf_path
        else:
            logger.error("Decktape command finished but output PDF not found.")
            return None

    except FileNotFoundError:
        logger.error(
            "`decktape` command not found. Install decktape (`npm install -g decktape`)."
        )
        raise gr.Error("PDF generation tool (decktape) not found.")
    except subprocess.CalledProcessError as e:
        logger.error(f"Decktape command failed (code {e.returncode}).")
        logger.error(f"decktape stderr:\n{e.stderr}")
        raise gr.Error(f"Decktape PDF failed: {e.stderr[:500]}...")
    except subprocess.TimeoutExpired:
        logger.error("Decktape command timed out.")
        raise gr.Error("PDF generation timed out (decktape).")
    except Exception as e:
        logger.error(
            f"Unexpected error during decktape PDF generation: {e}", exc_info=True
        )
        raise gr.Error(f"Unexpected error during PDF generation: {e}")
    finally:
        # --- Cleanup HTML output directory ---
        if os.path.exists(html_output_dir_abs):
            try:
                shutil.rmtree(html_output_dir_abs)
                logger.info(f"Cleaned up HTML temp dir: {html_output_dir_abs}")
            except Exception as cleanup_e:
                logger.warning(
                    f"Could not cleanup HTML dir {html_output_dir_abs}: {cleanup_e}"
                )
        # Log final status
        if pdf_gen_success:
            logger.info(f"PDF generation process completed for {output_pdf_path}.")
        else:
            logger.error(f"PDF generation process failed for {output_pdf_path}.")


# --- Helper Function to Read CSS ---
def load_css(css_path="template/style.css"):
    """Loads CSS content from a file."""
    try:
        with open(css_path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        logger.warning(f"CSS file not found at {css_path}. No custom styles applied.")
        return ""  # Return empty string instead of None
    except Exception as e:
        logger.error(f"Error reading CSS file {css_path}: {e}")
        return ""  # Return empty string on error


# --- Gradio Workflow Functions ---


def step1_fetch_and_generate_presentation(url, progress=gr.Progress(track_tqdm=True)):
    """Fetches content, generates presentation markdown, prepares editor, and copies template. Uses caching based on URL."""
    if not url:
        raise gr.Error("Please enter a URL.")
    logger.info(f"Step 1: Fetching & Generating for {url}")
    gr.Info(f"Starting Step 1: Fetching content from {url}...")

    # --- Cache Check ---
    try:
        os.makedirs(URL_CACHE_DIR, exist_ok=True)  # Ensure cache dir exists
        with shelve.open(URL_CACHE_FILE) as cache:
            if url in cache:
                logger.info(f"Cache hit for URL: {url}")
                progress(0.5, desc="Loading cached presentation...")
                cached_data = cache[url]
                presentation_md = cached_data.get("presentation_md")
                slides_data = cached_data.get("slides_data")

                if presentation_md and slides_data:
                    temp_dir = tempfile.mkdtemp()
                    md_path = os.path.join(temp_dir, "presentation.md")
                    try:
                        with open(md_path, "w", encoding="utf-8") as f:
                            f.write(presentation_md)
                        logger.info(
                            f"Wrote cached presentation to temp file: {md_path}"
                        )

                        # --- Copy Template Directory for Cached Item ---
                        template_src_dir = "template"
                        template_dest_dir = os.path.join(temp_dir, "template")
                        if os.path.isdir(template_src_dir):
                            try:
                                shutil.copytree(template_src_dir, template_dest_dir)
                                logger.info(
                                    f"Copied template dir to {template_dest_dir} (cached)"
                                )
                            except Exception as copy_e:
                                logger.error(
                                    f"Failed to copy template dir for cache: {copy_e}"
                                )
                                shutil.rmtree(temp_dir)
                                raise gr.Error(f"Failed to prepare template: {copy_e}")
                        else:
                            logger.error(
                                f"Template source dir '{template_src_dir}' not found."
                            )
                            shutil.rmtree(temp_dir)
                            raise gr.Error(
                                f"Required template '{template_src_dir}' not found."
                            )

                        progress(0.9, desc="Preparing editor from cache...")
                        logger.info(f"Using cached data for {len(slides_data)} slides.")
                        # Return updates for the UI state and controls
                        return (
                            temp_dir,
                            md_path,
                            slides_data,
                            gr.update(visible=True),  # editor_column
                            gr.update(
                                visible=True
                            ),  # btn_generate_pdf (Enable PDF button next)
                            gr.update(
                                interactive=False
                            ),  # btn_fetch_generate (disable)
                        )
                    except Exception as e:
                        logger.error(f"Error writing cached markdown: {e}")
                        if os.path.exists(temp_dir):
                            shutil.rmtree(temp_dir)
                else:
                    logger.warning(f"Cache entry for {url} incomplete. Regenerating.")
        # --- Cache Miss or Failed Cache Load ---
        logger.info(f"Cache miss for URL: {url}. Proceeding with generation.")
        progress(0.1, desc="Fetching webpage content...")
        if not hf_client:
            raise gr.Error("LLM Client not initialized. Check API Key.")

        web_content = fetch_webpage_content(url)
        if not web_content:
            raise gr.Error("Failed to fetch or parse content from the URL.")

        progress(0.3, desc="Generating presentation with LLM...")
        try:
            presentation_md = generate_presentation_with_llm(
                hf_client, LLM_MODEL, PRESENTATION_PROMPT, web_content, url
            )
        except Exception as e:
            logger.error(f"Error during LLM call: {e}", exc_info=True)
            raise gr.Error(f"Failed to generate presentation from LLM: {e}")

        if not presentation_md:
            logger.error("LLM generation returned None.")
            raise gr.Error("LLM generation failed (received None).")

        # Check for basic structure early, but parsing handles final validation
        if "---" not in presentation_md:
            logger.warning(
                "LLM output missing slide separators ('---'). Parsing might fail."
            )
        if "???" not in presentation_md:
            logger.warning(
                "LLM output missing notes separators ('???'). Notes might be empty."
            )

        progress(0.7, desc="Parsing presentation slides...")
        slides_data = parse_presentation_markdown(presentation_md)
        if not slides_data:
            logger.error("Parsing markdown resulted in zero slides.")
            raise gr.Error("Failed to parse generated presentation markdown.")

        # Create a temporary directory for this session
        temp_dir = tempfile.mkdtemp()
        md_path = os.path.join(temp_dir, "presentation.md")
        with open(md_path, "w", encoding="utf-8") as f:
            f.write(presentation_md)
        logger.info(f"Presentation markdown saved to temp file: {md_path}")

        # --- Copy Template Directory for New Item ---
        template_src_dir = "template"
        template_dest_dir = os.path.join(temp_dir, "template")
        if os.path.isdir(template_src_dir):
            try:
                shutil.copytree(template_src_dir, template_dest_dir)
                logger.info(f"Copied template directory to {template_dest_dir}")
            except Exception as copy_e:
                logger.error(f"Failed to copy template directory: {copy_e}")
                shutil.rmtree(temp_dir)
                raise gr.Error(f"Failed to prepare template: {copy_e}")
        else:
            logger.error(f"Template source dir '{template_src_dir}' not found.")
            shutil.rmtree(temp_dir)
            raise gr.Error(f"Required template '{template_src_dir}' not found.")

        # --- Store in Cache ---
        try:
            with shelve.open(URL_CACHE_FILE) as cache_write:
                cache_write[url] = {
                    "presentation_md": presentation_md,
                    "slides_data": slides_data,
                }
                logger.info(
                    f"Stored generated presentation in cache for URL: {url}"
                )
        except Exception as e:
            logger.error(f"Failed to write to cache for URL {url}: {e}")

        progress(0.9, desc="Preparing editor...")
        logger.info(f"Prepared data for {len(slides_data)} slides.")

        # Return updates for the UI state and controls
        return (
            temp_dir,
            md_path,
            slides_data,
            gr.update(visible=True),  # editor_column
            gr.update(visible=True),  # btn_generate_pdf (Enable PDF button next)
            gr.update(interactive=False),  # btn_fetch_generate (disable)
        )

    except Exception as e:
        logger.error(f"Error in step 1 (fetch/generate): {e}", exc_info=True)
        raise gr.Error(f"Error during presentation setup: {e}")


def step2_build_slides(
    state_temp_dir,
    state_md_path,
    state_slides_data,
    *editors,
    progress=gr.Progress(track_tqdm=True),
):
    """Renamed from step2_generate_pdf"""
    if not all([state_temp_dir, state_md_path, state_slides_data]):
        raise gr.Error("Session state missing.")
    logger.info("Step 2: Building Slides (PDF + Images)")
    gr.Info("Starting Step 2: Building slides...")
    num_slides = len(state_slides_data)
    MAX_SLIDES = 20
    all_editors = list(editors)
    if len(all_editors) != MAX_SLIDES * 2:
        raise gr.Error(f"Incorrect editor inputs: {len(all_editors)}")
    edited_contents = all_editors[:MAX_SLIDES][:num_slides]
    edited_notes_list = all_editors[MAX_SLIDES:][:num_slides]
    if len(edited_contents) != num_slides or len(edited_notes_list) != num_slides:
        raise gr.Error("Editor input mismatch.")

    progress(0.1, desc="Saving edited markdown...")
    updated_slides = []
    for i in range(num_slides):
        updated_slides.append(
            {"id": i, "content": edited_contents[i], "notes": edited_notes_list[i]}
        )
    updated_md = reconstruct_presentation_markdown(updated_slides)
    try:
        with open(state_md_path, "w", encoding="utf-8") as f:
            f.write(updated_md)
        logger.info(f"Saved edited markdown: {state_md_path}")
    except IOError as e:
        raise gr.Error(f"Failed to save markdown: {e}")

    progress(0.3, desc="Generating PDF...")
    pdf_output_path = os.path.join(state_temp_dir, "presentation.pdf")
    generated_pdf_path = generate_pdf_from_markdown(state_md_path, pdf_output_path)
    if not generated_pdf_path:
        raise gr.Error("PDF generation failed (check logs).")

    progress(0.7, desc="Converting PDF to images...")
    pdf_images = []
    try:
        pdf_images = convert_pdf_to_images(
            generated_pdf_path, dpi=150
        )  # Use generated path
        if not pdf_images:
            raise gr.Error("PDF to image conversion failed.")
        logger.info(f"Converted PDF to {len(pdf_images)} images.")
        if len(pdf_images) != num_slides:
            gr.Warning(
                f"PDF page count ({len(pdf_images)}) != slide count ({num_slides}). Images might mismatch."
            )
            # Pad or truncate? For now, just return what we have, UI update logic handles MAX_SLIDES
    except Exception as e:
        logger.error(f"Error converting PDF to images: {e}", exc_info=True)
        # Proceed without images? Or raise error? Let's raise.
        raise gr.Error(f"Failed to convert PDF to images: {e}")

    info_msg = f"Built {len(pdf_images)} slide images. Ready for Step 3."
    logger.info(info_msg)
    gr.Info(info_msg)
    progress(1.0, desc="Slide build complete.")
    # Return tuple WITHOUT status textbox update
    return (
        generated_pdf_path,
        pdf_images,  # Return the list of image paths
        gr.update(visible=True),
        gr.update(visible=False),
        gr.update(value=generated_pdf_path, visible=True),
    )


def step3_generate_audio(*args, progress=gr.Progress(track_tqdm=True)):
    """Generates audio files for the speaker notes using edited content."""
    # Args structure:
    # args[0]: state_temp_dir
    # args[1]: state_md_path
    # args[2]: original_slides_data (list of dicts, used to get count)
    # args[3 : 3 + MAX_SLIDES]: values from all_code_editors
    # args[3 + MAX_SLIDES :]: values from all_notes_textboxes

    state_temp_dir = args[0]
    state_md_path = args[1]
    original_slides_data = args[2]
    editors = args[3:]
    num_slides = len(original_slides_data)
    if num_slides == 0:
        logger.error("Step 3 (Audio) called with zero slides data.")
        raise gr.Error("No slide data available. Please start over.")

    MAX_SLIDES = 20  # Ensure this matches UI definition
    code_editors_start_index = 3
    notes_textboxes_start_index = 3 + MAX_SLIDES

    # Slice the *actual* edited values based on num_slides
    edited_contents = args[
        code_editors_start_index : code_editors_start_index + num_slides
    ]
    edited_notes_list = args[
        notes_textboxes_start_index : notes_textboxes_start_index + num_slides
    ]

    if not state_temp_dir or not state_md_path:
        raise gr.Error("Session state lost (Audio step). Please start over.")

    # Check slicing
    if len(edited_contents) != num_slides or len(edited_notes_list) != num_slides:
        logger.error(
            f"Input slicing error (Audio step): Expected {num_slides}, got {len(edited_contents)} contents, {len(edited_notes_list)} notes."
        )
        raise gr.Error(
            f"Input processing error: Mismatch after slicing ({num_slides} slides)."
        )

    logger.info(f"Processing {num_slides} slides for audio generation.")
    audio_dir = os.path.join(state_temp_dir, "audio")
    os.makedirs(audio_dir, exist_ok=True)

    # --- Update the presentation.md file AGAIN in case notes changed after PDF ---
    # This might be redundant if users don't edit notes between PDF and Audio steps,
    # but ensures the audio matches the *latest* notes displayed.
    progress(0.1, desc="Saving latest notes...")
    updated_slides_data = []
    for i in range(num_slides):
        updated_slides_data.append(
            {
                "id": original_slides_data[i]["id"],  # Keep original ID
                "content": edited_contents[i],  # Use sliced edited content
                "notes": edited_notes_list[i],  # Use sliced edited notes
            }
        )

    updated_markdown = reconstruct_presentation_markdown(updated_slides_data)
    try:
        with open(state_md_path, "w", encoding="utf-8") as f:
            f.write(updated_markdown)
        logger.info(f"Updated presentation markdown before audio gen: {state_md_path}")
    except IOError as e:
        logger.error(f"Failed to save updated markdown before audio gen: {e}")
        # Continue with audio gen, but log warning
        gr.Warning(f"Could not save latest notes to markdown file: {e}")

    generated_audio_paths = ["" for _ in range(num_slides)]
    audio_generation_failed = False
    successful_audio_count = 0

    for i in range(num_slides):
        note_text = edited_notes_list[i]
        slide_num = i + 1
        progress(
            (i + 1) / num_slides * 0.8 + 0.1,
            desc=f"Audio slide {slide_num}/{num_slides}",
        )
        output_file_path = Path(audio_dir) / f"{slide_num}.wav"
        if not note_text or not note_text.strip():
            try:  # Generate silence
                subprocess.run(
                    [
                        "ffmpeg",
                        "-y",
                        "-f",
                        "lavfi",
                        "-i",
                        "anullsrc=r=44100:cl=mono",
                        "-t",
                        "0.1",
                        "-q:a",
                        "9",
                        str(output_file_path),
                    ],
                    check=True,
                    capture_output=True,
                    text=True,
                )
                generated_audio_paths[i] = str(output_file_path)
            except Exception as e:
                audio_generation_failed = True
                logger.error(f"Silence gen failed slide {i + 1}: {e}")
            continue
        try:  # Generate TTS
            success = text_to_speech(
                note_text, output_file_path, voice=VOICE_ID, cache_dir=CACHE_DIR
            )
            if success:
                generated_audio_paths[i] = str(output_file_path)
                successful_audio_count += 1
            else:
                audio_generation_failed = True
                logger.error(f"TTS failed slide {i + 1}")
        except Exception as e:
            audio_generation_failed = True
            logger.error(f"TTS exception slide {i + 1}: {e}", exc_info=True)

    # --- Prepare outputs for Gradio ---
    audio_player_updates = [
        gr.update(value=p if p else None, visible=bool(p and os.path.exists(p)))
        for p in generated_audio_paths
    ]
    regen_button_updates = [gr.update(visible=True)] * num_slides
    audio_player_updates.extend(
        [gr.update(value=None, visible=False)] * (MAX_SLIDES - num_slides)
    )
    regen_button_updates.extend([gr.update(visible=False)] * (MAX_SLIDES - num_slides))

    info_msg = f"Generated {successful_audio_count}/{num_slides} audio clips. "
    if audio_generation_failed:
        info_msg += "Some audio failed. Review/Regenerate before video."
        gr.Warning(info_msg)
    else:
        info_msg += "Ready for Step 4."
        gr.Info(info_msg)
    logger.info(info_msg)
    progress(1.0, desc="Audio generation complete.")

    # Return tuple WITHOUT status textbox update
    return (
        audio_dir,
        gr.update(visible=True),  # btn_generate_video
        gr.update(visible=False),  # btn_generate_audio
        *audio_player_updates,
        *regen_button_updates,
    )


def step4_generate_video(
    state_temp_dir,
    state_audio_dir,
    state_pdf_path,  # Use PDF path from state
    progress=gr.Progress(track_tqdm=True),
):
    """Generates the final video using PDF images and audio files."""
    if not state_temp_dir or not state_audio_dir or not state_pdf_path:
        raise gr.Error("Session state lost (Video step). Please start over.")
    if not os.path.exists(state_pdf_path):
        raise gr.Error(f"PDF file not found: {state_pdf_path}. Cannot generate video.")
    if not os.path.isdir(state_audio_dir):
        raise gr.Error(
            f"Audio directory not found: {state_audio_dir}. Cannot generate video."
        )

    video_output_path = os.path.join(state_temp_dir, "final_presentation.mp4")

    progress(0.1, desc="Preparing video components...")
    pdf_images = []  # Initialize to ensure cleanup happens
    try:
        # Find audio files (natsorted)
        audio_files = find_audio_files(state_audio_dir, "*.wav")
        if not audio_files:
            logger.warning(
                f"No WAV files found in {state_audio_dir}. Video might lack audio."
            )
            # Decide whether to proceed with silent video or error out
            # raise gr.Error(f"No audio files found in {state_audio_dir}")

        # Convert PDF to images
        progress(0.2, desc="Converting PDF to images...")
        pdf_images = convert_pdf_to_images(state_pdf_path, dpi=150)
        if not pdf_images:
            raise gr.Error(f"Failed to convert PDF ({state_pdf_path}) to images.")

        # Allow video generation even if audio is missing or count mismatch
        # The create_video_clips function should handle missing audio gracefully (e.g., use image duration)
        if len(pdf_images) != len(audio_files):
            logger.warning(
                f"Mismatch: {len(pdf_images)} PDF pages vs {len(audio_files)} audio files. Video clips might have incorrect durations or missing audio."
            )
            # Pad the shorter list? For now, let create_video_clips handle it.

        progress(0.5, desc="Creating individual video clips...")
        buffer_seconds = 1.0
        output_fps = 10
        video_clips = create_video_clips(
            pdf_images, audio_files, buffer_seconds, output_fps
        )

        if not video_clips:
            raise gr.Error("Failed to create any video clips.")

        progress(0.8, desc="Concatenating clips...")
        concatenate_clips(video_clips, video_output_path, output_fps)

        logger.info(f"Video concatenation complete: {video_output_path}")

        progress(0.95, desc="Cleaning up temp images...")
        cleanup_temp_files(pdf_images)  # Pass the list of image paths

    except Exception as e:
        if pdf_images:
            cleanup_temp_files(pdf_images)
        logger.error(f"Video generation failed: {e}", exc_info=True)
        raise gr.Error(f"Video generation failed: {e}")

    info_msg = f"Video generated: {os.path.basename(video_output_path)}"
    logger.info(info_msg)
    gr.Info(info_msg)
    progress(1.0, desc="Video Complete.")
    # Return tuple WITHOUT status textbox update
    return (
        gr.update(value=video_output_path, visible=True),  # video_output
        gr.update(visible=False),  # btn_generate_video
    )


def cleanup_session(temp_dir):
    """Removes the temporary directory."""
    if temp_dir and isinstance(temp_dir, str) and os.path.exists(temp_dir):
        try:
            shutil.rmtree(temp_dir)
            logger.info(f"Cleaned up temporary directory: {temp_dir}")
            return "Cleaned up session files."
        except Exception as e:
            logger.error(f"Error cleaning up temp directory {temp_dir}: {e}")
            return f"Error during cleanup: {e}"
    logger.warning(f"Cleanup called but temp_dir invalid or not found: {temp_dir}")
    return "No valid temporary directory found to clean."


# --- Gradio Interface ---

# Load custom CSS
custom_css = load_css()

with gr.Blocks(
    theme=gr.themes.Soft(), css=custom_css, title="Webpage to Video"
) as demo:
    gr.Markdown("# Webpage to Video Presentation Generator")

    # State variables
    state_temp_dir = gr.State(None)
    state_md_path = gr.State(None)
    state_audio_dir = gr.State(None)
    state_pdf_path = gr.State(None)
    state_slides_data = gr.State([])
    state_pdf_image_paths = gr.State([])

    MAX_SLIDES = 20

    # --- Tabbed Interface ---
    with gr.Tabs(elem_id="tabs") as tabs_widget:
        # Tab 1: Generate Presentation
        with gr.TabItem("1. Generate Presentation", id=0):
            with gr.Row():
                with gr.Column(scale=1):
                    gr.Markdown("**Step 1:** Enter URL")
                    input_url = gr.Textbox(
                        label="Webpage URL",
                        value="https://huggingface.co/blog/llm-course",
                    )
                    btn_fetch_generate = gr.Button(
                        value="1. Fetch & Generate", variant="primary"
                    )
                with gr.Column(scale=4):
                    gr.Markdown(
                        "### Instructions\n1. Enter URL & click 'Fetch & Generate'.\n2. Editor appears below tabs.\n3. Go to next tab."
                    )

        # Tab 2: Build Slides
        with gr.TabItem("2. Build Slides", id=1):
            with gr.Row():
                with gr.Column(scale=1):
                    gr.Markdown("**Step 2:** Review/Edit, then build slides.")
                    btn_build_slides = gr.Button(
                        value="2. Build Slides", variant="secondary", visible=False
                    )
                    pdf_download_link = gr.File(
                        label="Download PDF", visible=False, interactive=False
                    )
                with gr.Column(scale=4):
                    gr.Markdown(
                        "### Instructions\n1. Edit content/notes below.\n2. Click 'Build Slides'. Images appear.\n3. Download PDF from sidebar.\n4. Go to next tab."
                    )

        # Tab 3: Generate Audio
        with gr.TabItem("3. Generate Audio", id=2):
            with gr.Row():
                with gr.Column(scale=1):
                    gr.Markdown("**Step 3:** Review/Edit notes, then generate audio.")
                    btn_generate_audio = gr.Button(
                        value="3. Generate Audio", variant="primary", visible=False
                    )
                with gr.Column(scale=4):
                    gr.Markdown(
                        "### Instructions\n1. Finalize notes below.\n2. Click 'Generate Audio'.\n3. Regenerate if needed.\n4. Go to next tab."
                    )

        # Tab 4: Generate Video
        with gr.TabItem("4. Create Video", id=3):
            with gr.Row():
                with gr.Column(scale=1):
                    gr.Markdown("**Step 4:** Create the final video.")
                    btn_generate_video = gr.Button(
                        value="4. Create Video", variant="primary", visible=False
                    )
                with gr.Column(scale=4):
                    gr.Markdown(
                        "### Instructions\n1. Click 'Create Video'.\n2. Video appears below."
                    )
                    video_output = gr.Video(label="Final Video", visible=False)

    # Define the shared editor structure once, AFTER tabs
    slide_editors_group = []
    with gr.Column(visible=False) as editor_column:  # Initially hidden
        gr.Markdown("--- \n## Edit Slides & Notes")
        gr.Markdown("_(PDF uses content & notes, Audio uses notes only)_")
        for i in range(MAX_SLIDES):
            with gr.Accordion(f"Slide {i + 1}", open=(i == 0), visible=False) as acc:
                with gr.Row():  # Row for Content/Preview/Image
                    with gr.Column(scale=1):
                        code_editor = gr.Code(
                            label="Content (Markdown)",
                            language="markdown",
                            lines=15,
                            interactive=True,
                            visible=False,
                        )
                        notes_textbox = gr.Code(
                            label="Script/Notes (for Audio)",
                            lines=8,
                            language="markdown",
                            interactive=True,
                            visible=False,
                        )
                    with gr.Column(scale=1):
                        slide_image = gr.Image(
                            label="Slide Image",
                            visible=False,
                            interactive=False,
                            height=300,
                        )
                        md_preview = gr.Markdown(visible=False)
                with gr.Row():  # Row for audio controls
                    audio_player = gr.Audio(
                        label="Generated Audio",
                        visible=False,
                        interactive=False,
                        scale=3,
                    )
                    regen_button = gr.Button(
                        value="Regen Audio", visible=False, scale=1, size="sm"
                    )
                slide_editors_group.append(
                    (
                        acc,
                        code_editor,
                        md_preview,
                        notes_textbox,
                        audio_player,
                        regen_button,
                        slide_image,
                    )
                )
                code_editor.change(
                    fn=lambda x: x,
                    inputs=code_editor,
                    outputs=md_preview,
                    show_progress="hidden",
                )

    # --- Component Lists for Updates ---
    all_editor_components = [comp for group in slide_editors_group for comp in group]
    all_code_editors = [group[1] for group in slide_editors_group]
    all_notes_textboxes = [group[3] for group in slide_editors_group]
    all_audio_players = [group[4] for group in slide_editors_group]
    all_regen_buttons = [group[5] for group in slide_editors_group]
    all_slide_images = [group[6] for group in slide_editors_group]

    # --- Function to regenerate audio --- (Assumed correct)
    # ... (regenerate_single_audio implementation as fixed before)...
    def regenerate_single_audio(
        slide_idx, note_text, temp_dir, progress=gr.Progress(track_tqdm=True)
    ):
        # ...(Implementation as fixed before)...
        if (
            not temp_dir
            or not isinstance(temp_dir, str)
            or not os.path.exists(temp_dir)
        ):
            logger.error(f"Regen audio failed: Invalid temp_dir '{temp_dir}'")
            return gr.update(value=None, visible=False)
        slide_num = slide_idx + 1
        audio_dir = os.path.join(temp_dir, "audio")
        os.makedirs(audio_dir, exist_ok=True)
        output_file = Path(audio_dir) / f"{slide_num}.wav"
        logger.info(f"Regenerating audio for slide {slide_num} -> {output_file}")
        progress(0.1, desc=f"Regen audio slide {slide_num}...")
        if not note_text or not note_text.strip():
            logger.warning(f"Note for slide {slide_num} empty. Generating silence.")
            try:
                subprocess.run(
                    [
                        "ffmpeg",
                        "-y",
                        "-f",
                        "lavfi",
                        "-i",
                        "anullsrc=r=44100:cl=mono",
                        "-t",
                        "0.1",
                        "-q:a",
                        "9",
                        str(output_file),
                    ],
                    check=True,
                    capture_output=True,
                    text=True,
                )
                logger.info(f"Created silent placeholder: {output_file}")
                progress(1.0, desc=f"Generated silence slide {slide_num}.")
                return gr.update(value=str(output_file), visible=True)
            except Exception as e:
                logger.error(f"Failed silent gen slide {slide_num}: {e}")
                return gr.update(value=None, visible=False)
        else:
            try:
                success = text_to_speech(
                    note_text, output_file, voice=VOICE_ID, cache_dir=CACHE_DIR
                )
                if success:
                    logger.info(f"Regen OK slide {slide_num}")
                    progress(1.0, desc=f"Audio regen OK slide {slide_num}.")
                    return gr.update(value=str(output_file), visible=True)
                else:
                    logger.error(f"Regen TTS failed slide {slide_num}")
                    return gr.update(value=None, visible=False)
            except Exception as e:
                logger.error(
                    f"Regen TTS exception slide {slide_num}: {e}", exc_info=True
                )
                return gr.update(value=None, visible=False)

    # --- Connect the individual Re-generate buttons ---
    # Update unpacking to include slide_image (7 items)
    for i, (
        acc,
        code_edit,
        md_preview,
        notes_tb,
        audio_pl,
        regen_btn,
        slide_image,
    ) in enumerate(slide_editors_group):
        regen_btn.click(
            fn=regenerate_single_audio,
            inputs=[gr.State(i), notes_tb, state_temp_dir],
            outputs=[audio_pl],
            show_progress="minimal",
        )

    # --- Main Button Click Handlers --- (Outputs use locally defined component vars)

    # Step 1 Click Handler
    step1_outputs = [
        state_temp_dir,
        state_md_path,
        state_slides_data,
        editor_column,  # Show the editor column
        btn_build_slides,  # Enable the button in Tab 2
        btn_fetch_generate,  # Disable self
    ]
    btn_fetch_generate.click(
        fn=step1_fetch_and_generate_presentation,
        inputs=[input_url],
        outputs=step1_outputs,
    ).then(
        fn=lambda s_data: [
            upd
            for i, slide in enumerate(s_data)
            if i < MAX_SLIDES
            for upd in [
                gr.update(
                    label=f"Slide {i + 1}: {slide['content'][:25]}...",
                    visible=True,
                    open=(i == 0),
                ),  # Accordion
                gr.update(value=slide["content"], visible=True),  # Code Editor
                gr.update(value=slide["content"], visible=True),  # MD Preview
                gr.update(value=slide["notes"], visible=True),  # Notes Textbox
                gr.update(value=None, visible=False),  # Audio Player
                gr.update(visible=False),  # Regen Button
                gr.update(value=None, visible=False),  # Slide Image
            ]
        ]
        + [
            upd
            for i in range(len(s_data), MAX_SLIDES)
            for upd in [gr.update(visible=False)] * 7
        ],
        inputs=[state_slides_data],
        outputs=all_editor_components,
        show_progress="hidden",
    ).then(lambda: gr.update(selected=1), outputs=tabs_widget)  # Switch to Tab 2

    # Step 2 Click Handler
    step2_inputs = (
        [state_temp_dir, state_md_path, state_slides_data]
        + all_code_editors
        + all_notes_textboxes
    )
    step2_outputs = [
        state_pdf_path,
        state_pdf_image_paths,
        btn_generate_audio,  # Enable button in Tab 3
        btn_build_slides,  # Disable self
        pdf_download_link,  # Update download link in Tab 2
    ]
    btn_build_slides.click(
        fn=step2_build_slides,
        inputs=step2_inputs,
        outputs=step2_outputs,
    ).then(
        fn=lambda image_paths: [
            gr.update(
                value=image_paths[i] if i < len(image_paths) else None,
                visible=(i < len(image_paths)),
            )
            for i in range(MAX_SLIDES)
        ],
        inputs=[state_pdf_image_paths],
        outputs=all_slide_images,
        show_progress="hidden",
    ).then(lambda: gr.update(selected=2), outputs=tabs_widget)  # Switch to Tab 3

    # Step 3 Click Handler
    step3_inputs = (
        [state_temp_dir, state_md_path, state_slides_data]
        + all_code_editors
        + all_notes_textboxes
    )
    step3_outputs = (
        [
            state_audio_dir,
            btn_generate_video,  # Enable button in Tab 4
            btn_generate_audio,  # Disable self
        ]
        + all_audio_players
        + all_regen_buttons
    )
    btn_generate_audio.click(
        fn=step3_generate_audio,
        inputs=step3_inputs,
        outputs=step3_outputs,
    ).then(lambda: gr.update(selected=3), outputs=tabs_widget)  # Switch to Tab 4

    # Step 4 Click Handler
    step4_inputs = [state_temp_dir, state_audio_dir, state_pdf_path]
    step4_outputs = [
        video_output,  # Update video output in Tab 4
        btn_generate_video,  # Disable self
    ]
    btn_generate_video.click(
        fn=step4_generate_video,
        inputs=step4_inputs,
        outputs=step4_outputs,
    )

if __name__ == "__main__":
    os.makedirs(CACHE_DIR, exist_ok=True)
    os.makedirs(URL_CACHE_DIR, exist_ok=True)
    demo.queue().launch(debug=True)

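To make the slide round trip concrete: `parse_presentation_markdown` and `reconstruct_presentation_markdown` above split on `\n\n---\n\n` and `???` and then rejoin. A minimal sketch of that round trip, with made-up sample slide text, assuming the two helpers above are in scope (e.g. imported from app/app.py):

```python
# Illustrative sketch only -- the sample presentation text is hypothetical.
sample_md = (
    "# Title Slide\n\nIntro content\n\n???\nSpoken intro notes."
    "\n\n---\n\n"
    "# Second Slide\n\n- a bullet"
)

slides = parse_presentation_markdown(sample_md)
# -> [{'id': 0, 'content': '# Title Slide\n\nIntro content', 'notes': 'Spoken intro notes.'},
#     {'id': 1, 'content': '# Second Slide\n\n- a bullet', 'notes': ''}]

# Editing 'content'/'notes' in place and rejoining reproduces valid input
# for the next build step.
assert reconstruct_presentation_markdown(slides) == sample_md
```

This losslessness for well-formed input is what lets the app rewrite presentation.md from the editor values before both the PDF and the audio steps.
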
app/src/__init__.py
ADDED
File without changes

{scripts → app/src}/create_presentation.py
RENAMED
@@ -19,10 +19,12 @@ You are an expert technical writer and presentation creator. Your task is to con
 2. **Slide Format:** Each slide should start with `# Slide Title` derived from the corresponding `## Heading`.
 3. **Content:** Include the relevant text, code blocks (preserving language identifiers like ```python), and lists from the input markdown within each slide.
 4. **Images:** Convert Markdown images `` into Remark.js format: `.center[]`. Ensure the image URL is correct and accessible.
-5. **Presenter Notes (Transcription Style):** For each slide, generate a detailed
+5. **Presenter Notes (Transcription Style):** For each slide, generate a detailed **transcription** of what the presenter should say with the slide's content. This should be flowing text suitable for reading aloud. Place this transcription after the slide content, separated by `???`.
+6. **Speaker Style:** The speaker should flow smoothly from one slide to the next. No need to explicitly mention the slide number or introduce the content directly.
 6. **Separators:** Separate individual slides using `\n\n---\n\n`.
 7. **Cleanup:** Do NOT include any HTML/MDX specific tags like `<CourseFloatingBanner>`, `<Tip>`, `<Question>`, `<Youtube>`, or internal links like `[[...]]`. Remove frontmatter.
-8. **
+8. **References:** Do not include references to files like `2.mdx`. Instead, refer to the title of the section.
+9. **Start Slide:** Begin the presentation with a title slide:
 ```markdown
 class: impact
 
@@ -34,7 +36,8 @@ You are an expert technical writer and presentation creator. Your task is to con
 ???
 Welcome everyone. This presentation, automatically generated from the course material titled '{input_filename}', will walk you through the key topics discussed in the document. Let's begin.
 ```
-
+10. **Output:** Provide ONLY the complete Remark.js Markdown content, starting with the title slide and ending with the last content slide. Do not include any introductory text, explanations, or a final 'Thank You' slide.
+11. **Style:** Keep slide content concise and to the point with no paragraphs. Speaker notes can expand the content of the slide further.
 
 **Generate the Remark.js presentation now:**
 """

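Taken together, these prompt rules pin down the slide shape that `parse_presentation_markdown` in app/app.py consumes: slide content, a `???` separator, then the spoken transcription, with `---` between slides. An illustrative (made-up) slide pair:

```markdown
# Example Slide Title

- a concise bullet point

???
Flowing transcription of what the presenter says over this slide; this is the
text the TTS step turns into audio.

---

# Next Slide Title
```
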
{scripts → app/src}/create_video.py
RENAMED
File without changes

{scripts → app/src}/transcription_to_audio.py
RENAMED
File without changes

{chapter1 → app}/template/index.html
RENAMED
File without changes

{chapter1 → app}/template/remark.min.js
RENAMED
File without changes

{chapter1 → app}/template/style.scss
RENAMED
File without changes

chapter1/material/1_presentation.md
ADDED
@@ -0,0 +1,130 @@
class: impact

# Presentation based on 1.md
## Generated Presentation

.center[]

???
Welcome everyone. This presentation, automatically generated from the course material titled '1.md', will walk you through the key topics discussed in the document. Let's begin.

---

# Introduction

???
Welcome to the first chapter of our course. In this introductory section, we'll set the stage for what you can expect to learn and explore throughout this journey.

---

# Welcome to the 🤗 Course!

This course will teach you about large language models (LLMs) and natural language processing (NLP) using libraries from the [Hugging Face](https://huggingface.co/) ecosystem — [🤗 Transformers](https://github.com/huggingface/transformers), [🤗 Datasets](https://github.com/huggingface/datasets), [🤗 Tokenizers](https://github.com/huggingface/tokenizers), and [🤗 Accelerate](https://github.com/huggingface/accelerate) — as well as the [Hugging Face Hub](https://huggingface.co/models). It's completely free and without ads.

???
Welcome to the Hugging Face course! This course is designed to teach you about large language models and natural language processing using the Hugging Face ecosystem. We'll be exploring libraries like Transformers, Datasets, Tokenizers, and Accelerate, as well as the Hugging Face Hub. The best part? This course is completely free and ad-free.

---

# Understanding NLP and LLMs

While this course was originally focused on NLP (Natural Language Processing), it has evolved to emphasize Large Language Models (LLMs), which represent the latest advancement in the field.

**What's the difference?**
- **NLP (Natural Language Processing)** is the broader field focused on enabling computers to understand, interpret, and generate human language. NLP encompasses many techniques and tasks such as sentiment analysis, named entity recognition, and machine translation.
- **LLMs (Large Language Models)** are a powerful subset of NLP models characterized by their massive size, extensive training data, and ability to perform a wide range of language tasks with minimal task-specific training. Models like the Llama, GPT, or Claude series are examples of LLMs that have revolutionized what's possible in NLP.

Throughout this course, you'll learn about both traditional NLP concepts and cutting-edge LLM techniques, as understanding the foundations of NLP is crucial for working effectively with LLMs.

???
Let's start by understanding the difference between NLP and LLMs. NLP, or Natural Language Processing, is a broad field that focuses on enabling computers to understand, interpret, and generate human language. It encompasses various techniques and tasks like sentiment analysis and machine translation. On the other hand, LLMs, or Large Language Models, are a subset of NLP models known for their massive size and ability to perform a wide range of language tasks with minimal task-specific training. Throughout this course, we'll explore both traditional NLP concepts and cutting-edge LLM techniques.

---

# What to expect?

.center[]

- Chapters 1 to 4 provide an introduction to the main concepts of the 🤗 Transformers library. By the end of this part of the course, you will be familiar with how Transformer models work and will know how to use a model from the [Hugging Face Hub](https://huggingface.co/models), fine-tune it on a dataset, and share your results on the Hub!
- Chapters 5 to 8 teach the basics of 🤗 Datasets and 🤗 Tokenizers before diving into classic NLP tasks and LLM techniques. By the end of this part, you will be able to tackle the most common language processing challenges by yourself.
- Chapter 9 goes beyond NLP to cover how to build and share demos of your models on the 🤗 Hub. By the end of this part, you will be ready to showcase your 🤗 Transformers application to the world!
- Chapters 10 to 12 dive into advanced LLM topics like fine-tuning, curating high-quality datasets, and building reasoning models.

This course:
- Requires a good knowledge of Python
- Is better taken after an introductory deep learning course
- Does not expect prior PyTorch or TensorFlow knowledge, though some familiarity will help

???
Now, let's talk about what you can expect from this course. We'll start with an introduction to the Transformers library, followed by exploring Datasets and Tokenizers. We'll then dive into classic NLP tasks and LLM techniques. By the end of this course, you'll be equipped to tackle common language processing challenges and even build and share your own models. This course assumes a good knowledge of Python and is best taken after an introductory deep learning course.

---

# Who are we?

About the authors:

- **Abubakar Abid**: Completed his PhD at Stanford in applied machine learning. Founded Gradio, acquired by Hugging Face.
- **Ben Burtenshaw**: Machine Learning Engineer at Hugging Face, PhD in NLP from the University of Antwerp.
- **Matthew Carrigan**: Machine Learning Engineer at Hugging Face, previously at Parse.ly and Trinity College Dublin.
- **Lysandre Debut**: Machine Learning Engineer at Hugging Face, core maintainer of the 🤗 Transformers library.
- **Sylvain Gugger**: Research Engineer at Hugging Face, co-author of _Deep Learning for Coders with fastai and PyTorch_.
- **Dawood Khan**: Machine Learning Engineer at Hugging Face, co-founder of Gradio.
- **Merve Noyan**: Developer Advocate at Hugging Face, focused on democratizing machine learning.
- **Lucile Saulnier**: Machine Learning Engineer at Hugging Face, involved in NLP research projects.
- **Lewis Tunstall**: Machine Learning Engineer at Hugging Face, co-author of _Natural Language Processing with Transformers_.
- **Leandro von Werra**: Machine Learning Engineer at Hugging Face, co-author of _Natural Language Processing with Transformers_.

???
Let me introduce you to the authors of this course. We have a diverse team of experts, including Abubakar Abid, who founded Gradio; Ben Burtenshaw, with a PhD in NLP; and many others who bring a wealth of knowledge and experience to this course.

---

# FAQ

- **Does taking this course lead to a certification?**
Currently, no certification is available, but a program is in development.

- **How much time should I spend on this course?**
Each chapter is designed for 1 week, with 6-8 hours of work per week.

- **Where can I ask a question?**
Click the "Ask a question" banner to be redirected to the [Hugging Face forums](https://discuss.huggingface.co/).

.center[]

- **Where can I get the code for the course?**
Click the banner to run code in Google Colab or Amazon SageMaker Studio Lab.

.center[]

- **How can I contribute to the course?**
Open an issue on the [course repo](https://github.com/huggingface/course) or help translate the course.

- **Can I reuse this course?**
Yes, under the [Apache 2 license](https://www.apache.org/licenses/LICENSE-2.0.html).

???
Before we proceed, let's address some frequently asked questions. We're working on a certification program, but it's not available yet. Each chapter is designed for one week of study, with 6-8 hours of work per week. If you have questions, you can ask them on the Hugging Face forums. The code for the course is available on GitHub, and you can contribute by opening issues or helping with translations. Finally, this course is released under the Apache 2 license, so feel free to reuse it.

---

# Let's Go

In this chapter, you will learn:
- How to use the `pipeline()` function to solve NLP tasks such as text generation and classification
- About the Transformer architecture
- How to distinguish between encoder, decoder, and encoder-decoder architectures and use cases

???
Now that we've covered the basics, let's dive into what you'll learn in this chapter. We'll explore the `pipeline()` function for solving NLP tasks, understand the Transformer architecture, and learn to distinguish between different model architectures and their use cases.

---

class: center, middle

# Thank You!

???
That concludes the material covered in this presentation, generated from the provided course document. Thank you for your time and attention. Are there any questions?
```