Check it out!
Demo: onnx-community/model-explorer
Dataset: onnx-community/model-explorer
Source code: https://github.com/xenova/model-explorer
You can find two examples here:
The model itself has a maximum context length, so unfortunately you can't feed the entire text through it at once. To solve this, I implemented streaming in v1.2.0, which you can use as follows:
import { KokoroTTS } from "kokoro-js";

const model_id = "onnx-community/Kokoro-82M-v1.0-ONNX";
const tts = await KokoroTTS.from_pretrained(model_id, {
  dtype: "fp32", // Options: "fp32", "fp16", "q8", "q4", "q4f16"
  // device: "webgpu", // Options: "wasm", "webgpu" (web) or "cpu" (node)
});

const text = "Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects. It can even run 100% locally in your browser, powered by Transformers.js!";

// The input is split into sentence-sized chunks internally; each chunk
// is yielded as soon as its audio has been generated.
const stream = tts.stream(text);
let i = 0;
for await (const { text, phonemes, audio } of stream) {
  console.log({ text, phonemes });
  audio.save(`audio-${i++}.wav`);
}
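If you're feeding text in incrementally (e.g., piping tokens out of an LLM), v1.2.0 also exposes a TextSplitterStream you can push chunks into. Here's a rough sketch using the API as documented in the kokoro-js README (double-check the method names against your installed version):

import { KokoroTTS, TextSplitterStream } from "kokoro-js";

const tts = await KokoroTTS.from_pretrained("onnx-community/Kokoro-82M-v1.0-ONNX", {
  dtype: "fp32",
});

// Create a splitter and hand it to the TTS stream before pushing any text.
const splitter = new TextSplitterStream();
const stream = tts.stream(splitter);

// Consume audio chunks as they become available.
(async () => {
  let i = 0;
  for await (const { text, phonemes, audio } of stream) {
    console.log({ text, phonemes });
    audio.save(`audio-${i++}.wav`);
  }
})();

// Push text in arbitrary pieces (e.g., tokens from an LLM)...
splitter.push("Kokoro is an open-weight TTS model ");
splitter.push("with 82 million parameters.");

// ...then close the stream to signal that no more text is coming.
splitter.close();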
This is great! Does it work for nested cases too? For example,
Last week she said, "Hi there. How are you?"
should remain a single chunk.
Hi there - we recently fixed this issue and will release a new version for it soon!
Hey! Oh that's awesome - great work! Feel free to adapt any code/logic of mine as you'd like!
import { KokoroTTS } from "kokoro-js";

const tts = await KokoroTTS.from_pretrained(
  "onnx-community/Kokoro-82M-ONNX",
  { dtype: "q8" }, // Options: "fp32", "fp16", "q8", "q4", "q4f16"
);

const text = "Life is like a box of chocolates. You never know what you're gonna get.";
const audio = await tts.generate(text, {
  voice: "af_sky", // See `tts.list_voices()`
});
audio.save("audio.wav");
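To pick a different voice, you can enumerate the available voice ids first. A minimal sketch, assuming the `tts.list_voices()` helper referenced in the comment above (check the README for its exact output shape):

// Enumerate available voice ids (e.g., "af_sky") before generating.
tts.list_voices();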
For this demo, the download is ~150MB when using WebGPU and ~120MB when using WASM.
npm i @huggingface/transformers
We have Transformers.js, the JavaScript/WASM/WebGPU port of the Python library, which supports ~100 different architectures.
Docs: https://huggingface.co./docs/transformers.js
Repo: http://github.com/xenova/transformers.js
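To give a feel for the API, here's the minimal pipeline example from the docs (the model is downloaded and cached on first use):

import { pipeline } from "@huggingface/transformers";

// Create a text-classification pipeline (downloads + caches the default model).
const classifier = await pipeline("sentiment-analysis");

const output = await classifier("I love Transformers.js!");
console.log(output); // e.g., [{ label: 'POSITIVE', score: 0.99... }]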
Is that the kind of thing you're looking for? :)