import streamlit as st
# Custom CSS for better styling
st.markdown("""
<style>
.main-title {
font-size: 36px;
color: #4A90E2;
font-weight: bold;
text-align: center;
}
.sub-title {
font-size: 24px;
color: #4A90E2;
margin-top: 20px;
}
.section {
background-color: #f9f9f9;
padding: 15px;
border-radius: 10px;
margin-top: 20px;
}
.section p, .section ul {
color: #666666;
}
.link {
color: #4A90E2;
text-decoration: none;
}
.benchmark-table {
width: 100%;
border-collapse: collapse;
margin-top: 20px;
}
.benchmark-table th, .benchmark-table td {
border: 1px solid #ddd;
padding: 8px;
text-align: left;
}
.benchmark-table th {
background-color: #4A90E2;
color: white;
}
.benchmark-table td {
background-color: #f2f2f2;
}
</style>
""", unsafe_allow_html=True)
# Main Title
st.markdown('<div class="main-title">Whisper: Advanced Speech Recognition</div>', unsafe_allow_html=True)
# Overview Section
st.markdown("""
<div class="section">
<p><strong>Whisper</strong> is a speech recognition model developed by OpenAI and introduced in the paper <em>Robust Speech Recognition via Large-Scale Weak Supervision</em>. It was trained on 680,000 hours of multilingual and multitask supervised audio, which lets it handle a wide range of speech processing tasks.</p>
<p>Because of this large and diverse training set, Whisper transfers zero-shot to many datasets and tasks without fine-tuning, which makes it a versatile tool for developers and researchers alike.</p>
</div>
""", unsafe_allow_html=True)
# Use Cases Section
st.markdown('<div class="sub-title">Use Cases</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><strong>Transcription Services:</strong> Automate transcription of audio files in English for media, legal, and academic purposes.</li>
<li><strong>Voice-Activated Assistants:</strong> Enhance voice command recognition in smart devices and applications.</li>
<li><strong>Broadcast Media:</strong> Provide real-time transcription and subtitling for live broadcasts.</li>
<li><strong>Multilingual Translation:</strong> Use as a base for developing multilingual speech-to-text and translation services.</li>
</ul>
</div>
""", unsafe_allow_html=True)
# How to Use Section
st.markdown('<div class="sub-title">How to Use Whisper</div>', unsafe_allow_html=True)
st.code('''
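from sparknlp.base import AudioAssembler
from sparknlp.annotator import WhisperForCTC
from pyspark.ml import Pipeline

# `data` is a Spark DataFrame whose "audio_content" column holds the raw audio samples
audioAssembler = AudioAssembler() \\
    .setInputCol("audio_content") \\
    .setOutputCol("audio_assembler")

# Load the pretrained English Whisper model and read from the assembler's output
speechToText = WhisperForCTC \\
    .pretrained("asr_whisper_small_english") \\
    .setInputCols(["audio_assembler"]) \\
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)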
''', language='python')
st.markdown("""
<div class="section">
<p>This example shows how to use Whisper in a Spark NLP pipeline to convert raw audio into text. The model expects input audio sampled at 16 kHz and writes the transcription to the <code>text</code> column, which suits tasks such as transcription and voice command recognition.</p>
</div>
""", unsafe_allow_html=True)
# Model Information Section
st.markdown('<div class="sub-title">Model Information</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<table class="benchmark-table">
<tr>
<th>Attribute</th>
<th>Description</th>
</tr>
<tr>
<td><strong>Model Name</strong></td>
<td>asr_whisper_small_english</td>
</tr>
<tr>
<td><strong>Compatibility</strong></td>
<td>Spark NLP 5.1.4+, PySpark 3.4+</td>
</tr>
<tr>
<td><strong>License</strong></td>
<td>Open Source</td>
</tr>
<tr>
<td><strong>Edition</strong></td>
<td>Official</td>
</tr>
<tr>
<td><strong>Input Labels</strong></td>
<td>[audio_assembler]</td>
</tr>
<tr>
<td><strong>Output Labels</strong></td>
<td>[text]</td>
</tr>
<tr>
<td><strong>Language</strong></td>
<td>en</td>
</tr>
<tr>
<td><strong>Model Size</strong></td>
<td>1.1 GB</td>
</tr>
</table>
</div>
""", unsafe_allow_html=True)
# References Section
st.markdown('<div class="sub-title">References</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/2023/10/17/asr_whisper_small_english_en.html" target="_blank">Whisper Model on Spark NLP</a></li>
<li><a class="link" href="https://huggingface.co/openai/whisper-small.en" target="_blank">Whisper Model on Hugging Face</a></li>
<li><a class="link" href="https://arxiv.org/abs/2212.04356" target="_blank">Whisper Paper</a></li>
<li><a class="link" href="https://github.com/openai/whisper" target="_blank">Whisper GitHub Repository</a></li>
</ul>
</div>
""", unsafe_allow_html=True)
# Community & Support
st.markdown('<div class="sub-title">Community & Support</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/" target="_blank">Official Website</a>: Documentation and examples</li>
<li><a class="link" href="https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q" target="_blank">Slack</a>: Live discussion with the community and team</li>
<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp" target="_blank">GitHub</a>: Bug reports, feature requests, and contributions</li>
<li><a class="link" href="https://medium.com/spark-nlp" target="_blank">Medium</a>: Spark NLP articles</li>
<li><a class="link" href="https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos" target="_blank">YouTube</a>: Video tutorials</li>
</ul>
</div>
""", unsafe_allow_html=True)