How to create onnx model for Whisper (fine tuned) that can be used with Transformers JS?
#3 by antonvinny - opened
I am really impressed by how Transformers.js works with Whisper, but I have a requirement to fine-tune it on some proprietary information (some words it is failing to recognize).
I am following this document (https://huggingface.co./blog/fine-tune-whisper) to fine-tune Whisper-tiny, and the generated model works fine.
But when I try to create a quantized ONNX version using the Optimum CLI, it keeps producing a model without the separate encoder/decoder ONNX files.
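For reference, this is roughly the export step I am attempting: a sketch assuming the fine-tuned checkpoint was saved locally (the `./whisper-tiny-finetuned` path is just a placeholder) and that `optimum[exporters]` is installed. Passing the seq2seq-with-past task explicitly is my attempt to get Optimum to emit separate encoder and decoder graphs:

```shell
# Placeholder paths; substitute your own checkpoint and output directories.
MODEL_DIR=./whisper-tiny-finetuned
OUT_DIR=./whisper-tiny-onnx

# Export to ONNX. The task flag asks Optimum for the speech-seq2seq export,
# which should produce separate encoder/decoder graphs (with KV-cache
# variants) rather than a single monolithic graph.
optimum-cli export onnx \
  --model "$MODEL_DIR" \
  --task automatic-speech-recognition-with-past \
  "$OUT_DIR"
```

As an alternative, the Transformers.js repository ships its own conversion script (`python -m scripts.convert --quantize --model_id <model>`), which handles both the ONNX export and quantization in one step; that may sidestep the Optimum CLI issue entirely.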
I have been trying different versions for almost a month now without much luck.
Does anyone have a Colab notebook I can follow to generate a quantized ONNX model, please?
Apologies if this is not the correct place for this discussion.
Thank you!