Upload optimized ONNX model w/ GQA
#26 opened by Xenova (HF Staff)
No description provided.
Xenova changed pull request title from "Upload optimized model w/ GQA" to "Upload optimized ONNX model w/ GQA"
New demo! https://huggingface.co/spaces/HuggingFaceTB/SmolLM2-1.7B-Instruct-WebGPU
Much faster now...
Xenova changed pull request status to merged