import streamlit as st

# Custom CSS for better styling
st.markdown(""" """, unsafe_allow_html=True)

# Main Title
st.markdown('Whisper: Advanced Speech Recognition', unsafe_allow_html=True)

# Overview Section
st.markdown("""

Whisper is a speech recognition model developed by OpenAI and introduced in the paper *Robust Speech Recognition via Large-Scale Weak Supervision*. It is trained on 680,000 hours of multilingual and multitask audio paired with transcripts, which lets a single model handle a wide range of speech processing tasks.

Because it is trained on such a large and diverse dataset, Whisper performs well across speech processing tasks in a zero-shot setting, without task-specific fine-tuning. This makes it a practical choice for developers and researchers who need one model for varied audio sources.

""", unsafe_allow_html=True)

# Use Cases Section
st.markdown('Use Cases', unsafe_allow_html=True)
st.markdown("""
""", unsafe_allow_html=True)

# How to Use Section
st.markdown('How to Use Whisper', unsafe_allow_html=True)
st.code('''from sparknlp.base import AudioAssembler
from sparknlp.annotator import WhisperForCTC
from pyspark.ml import Pipeline

# `data` is assumed to be a Spark DataFrame whose "audio_content" column
# holds the raw audio samples as an array of floats.
audioAssembler = AudioAssembler() \\
    .setInputCol("audio_content") \\
    .setOutputCol("audio_assembler")

speechToText = WhisperForCTC \\
    .pretrained("asr_whisper_small_english") \\
    .setInputCols(["audio_assembler"]) \\
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)''', language='python')
st.markdown("""

This example uses Whisper in a Spark NLP pipeline to convert raw audio into text. `AudioAssembler` wraps the samples in the `audio_content` column as an audio annotation, and `WhisperForCTC` transcribes that annotation into the `text` column. The model expects input audio sampled at 16 kHz and suits tasks such as transcription and voice command recognition.
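
The pipeline above assumes that `data` already holds 16 kHz audio as an array of floats in the `audio_content` column. The following is a minimal sketch of one way to prepare that input, assuming a 16-bit mono PCM WAV file and an existing SparkSession. The helper names `wav_to_floats` and `build_input_dataframe` are hypothetical and not part of Spark NLP. Decoding uses only Python's standard `wave` and `array` modules.

```python
import array
import sys
import wave

TARGET_RATE = 16000  # the model expects audio sampled at 16 kHz


def wav_to_floats(path):
    # Hypothetical helper: read a 16-bit mono PCM WAV file and return its
    # samples as a list of floats scaled to the range [-1.0, 1.0).
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        if rate != TARGET_RATE:
            raise ValueError("expected 16 kHz audio, got %d Hz" % rate)
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h", frames)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return [s / 32768.0 for s in samples]


def build_input_dataframe(spark, samples):
    # Hypothetical helper: wrap one clip in a single-row DataFrame with the
    # "audio_content" column that AudioAssembler reads.
    return spark.createDataFrame([(samples,)], ["audio_content"])
```

For example, `data = build_input_dataframe(spark, wav_to_floats("clip.wav"))`, where `clip.wav` is a placeholder path, could then be passed to `pipeline.fit(data)`. Audio recorded at another sample rate would need to be resampled to 16 kHz first.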

""", unsafe_allow_html=True)

# Model Information Section
st.markdown('Model Information', unsafe_allow_html=True)
st.markdown("""
| Attribute | Description |
|---|---|
| Model Name | asr_whisper_small_english |
| Compatibility | Spark NLP 5.1.4+, PySpark 3.4+ |
| License | Open Source |
| Edition | Official |
| Input Labels | [audio_assembler] |
| Output Labels | [text] |
| Language | en |
| Model Size | 1.1 GB |
""", unsafe_allow_html=True)

# References Section
st.markdown('References', unsafe_allow_html=True)
st.markdown("""
""", unsafe_allow_html=True)

# Community & Support
st.markdown('Community & Support', unsafe_allow_html=True)
st.markdown("""
""", unsafe_allow_html=True)