Replaced librosa with torchaudio for audio loading and resampling. Added speech detection (energy-based, or webrtcvad for better accuracy). Updated the /translate-audio endpoint to handle silent audio gracefully.
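
The energy-based check mentioned above could look roughly like the sketch below. It is an illustration only, not the code from this change: the function name `has_speech`, the 30 ms frame size, the RMS threshold of 0.01, and the minimum of three voiced frames are all assumptions, and the samples are assumed to be mono floats in the range [-1, 1].

```python
import math


def has_speech(samples, sample_rate=16000, frame_ms=30,
               energy_threshold=0.01, min_speech_frames=3):
    """Return True if enough frames exceed an RMS energy threshold.

    Hypothetical helper: all defaults here are illustrative assumptions.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0 or len(samples) < frame_len:
        # Too short to form a single frame: treat as no speech.
        return False

    speech_frames = 0
    # Walk over non-overlapping frames and count the "loud" ones.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        if rms >= energy_threshold:
            speech_frames += 1

    return speech_frames >= min_speech_frames


# Example inputs: one second of silence and one second of a 440 Hz tone.
silence = [0.0] * 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
```

An endpoint such as /translate-audio could call a check like this before transcription and return an empty or informative result when it reports no speech, instead of passing silent audio to the model.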
Modified the _initialize_tts_model method to pass the clean_up_tokenization_spaces parameter; added logging configuration in app.py to set the logging level for transformers
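
One way to set the transformers logging level with the standard library is sketched below. The chosen levels (INFO for the application, ERROR for transformers) are assumptions for illustration; the actual values used in app.py are not stated here.

```python
import logging

# Application-wide default level (assumed value for illustration).
logging.basicConfig(level=logging.INFO)

# Quiet the transformers library by raising the level of its logger.
# Child loggers such as "transformers.tokenization_utils" inherit this level.
logging.getLogger("transformers").setLevel(logging.ERROR)
```

The transformers library also ships its own helper, `transformers.logging.set_verbosity_error()`, which has a similar effect when the library is imported.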
Modified the tokenization step to pass clean_up_tokenization_spaces=True; added clean_up_tokenization_spaces=True in the text_to_speech method; added a print statement to confirm that the TTS model is loaded