import streamlit as st

# Custom CSS for better styling
st.markdown(""" """, unsafe_allow_html=True)

# Main Title
st.markdown('Whisper: Advanced Speech Recognition', unsafe_allow_html=True)

# Overview Section
st.markdown("""

The Whisper model, developed by OpenAI, was introduced in the paper Robust Speech Recognition via Large-Scale Weak Supervision. It is a speech recognition model trained on 680,000 hours of multilingual and multitask supervised audio, which lets it handle a wide range of speech processing tasks.

Because of this large-scale training, Whisper performs well across different speech processing tasks in a zero-shot setting, without fine-tuning, which makes it a versatile tool for developers and researchers.

""", unsafe_allow_html=True)

# Use Cases Section
st.markdown('Use Cases', unsafe_allow_html=True)

st.markdown("""

- Transcription Services: Automate transcription of English audio files for media, legal, and academic purposes.
- Voice-Activated Assistants: Enhance voice command recognition in smart devices and applications.
- Broadcast Media: Provide real-time transcription and subtitling for live broadcasts.
- Multilingual Translation: Use as a base for developing multilingual speech-to-text and translation services.

""", unsafe_allow_html=True)

# How to Use Section
st.markdown('How to Use Whisper', unsafe_allow_html=True)

st.code('''
from pyspark.ml import Pipeline
from sparknlp.base import AudioAssembler
from sparknlp.annotator import WhisperForCTC

# `data` is assumed to be a Spark DataFrame with an "audio_content" column of float arrays
audioAssembler = AudioAssembler() \\
    .setInputCol("audio_content") \\
    .setOutputCol("audio_assembler")

speechToText = WhisperForCTC \\
    .pretrained("asr_whisper_small_english") \\
    .setInputCols("audio_assembler") \\
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
''', language='python')

st.markdown("""

This example shows how to use Whisper in a Spark NLP pipeline to convert raw audio content into text. The model expects input audio sampled at 16 kHz and outputs the corresponding transcription, which suits tasks such as transcription and voice command recognition.
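
The 16 kHz requirement can be checked before audio reaches the pipeline. The following is a minimal sketch, not part of Spark NLP: it uses only the Python standard library, and the helper name `wav_to_float_samples` and the restriction to 16-bit mono PCM WAV input are assumptions made for illustration.

```python
import io
import struct
import wave

TARGET_SAMPLE_RATE = 16_000  # sample rate the example above expects, in Hz


def wav_to_float_samples(wav_bytes):
    # Hypothetical helper: decode 16-bit mono PCM WAV bytes into floats in [-1.0, 1.0).
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        rate = wav.getframerate()
        if rate != TARGET_SAMPLE_RATE:
            raise ValueError(f"expected {TARGET_SAMPLE_RATE} Hz audio, got {rate} Hz")
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        frames = wav.readframes(wav.getnframes())
    # WAV stores PCM samples as little-endian signed 16-bit integers.
    count = len(frames) // 2
    ints = struct.unpack(f"<{count}h", frames)
    # Scale the integers to floats so the result can be stored as a float array.
    return [value / 32768.0 for value in ints]
```

The returned list could then serve as one row of the `audio_content` column, for instance through `spark.createDataFrame([(samples,)], ["audio_content"])`, assuming an active `SparkSession` named `spark`. Audio recorded at another sample rate would need resampling first, which this sketch leaves out.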

""", unsafe_allow_html=True)

# Model Information Section
st.markdown('Model Information', unsafe_allow_html=True)

st.markdown("""

| Attribute | Description |
|---|---|
| Model Name | asr_whisper_small_english |
| Compatibility | Spark NLP 5.1.4+, PySpark 3.4+ |
| License | Open Source |
| Edition | Official |
| Input Labels | [audio_assembler] |
| Output Labels | [text] |
| Language | en |
| Model Size | 1.1 GB |

""", unsafe_allow_html=True)

# References Section
st.markdown('References', unsafe_allow_html=True)

st.markdown("""

""", unsafe_allow_html=True)

# Community & Support
st.markdown('Community & Support', unsafe_allow_html=True)

st.markdown("""

- Official Website: Documentation and examples
- Slack: Live discussion with the community and team
- GitHub: Bug reports, feature requests, and contributions
- Medium: Spark NLP articles
- YouTube: Video tutorials

""", unsafe_allow_html=True)