julien-c's activity

For Inference Providers who have built support for our Billing API (currently Fal, Novita, and HF-Inference, with more coming soon), we've started enabling pay-as-you-go (PAYG).
This means you can use those Inference Providers beyond the free included credits, and the usage is charged to your HF account.
You can see it on this view: any provider that does not have a "Billing disabled" badge is PAYG-compatible.
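As an illustration (my sketch, not from the post; the provider name and model ID are just examples), calling a PAYG-enabled provider through the huggingface_hub Python client looks roughly like this:

    # Sketch: route an inference call through a PAYG-enabled provider.
    # Usage beyond your free included credits is billed to your HF account.
    from huggingface_hub import InferenceClient

    client = InferenceClient(provider="fal-ai")  # e.g. "fal-ai" or "novita"

    image = client.text_to_image(
        "An astronaut riding a llama on the moon",
        model="black-forest-labs/FLUX.1-dev",  # example model served by this provider
    )
    image.save("astronaut.png")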

Our v2.0 quants set new benchmarks on 5-shot MMLU and KL Divergence, meaning you can now run & fine-tune quantized LLMs while preserving as much accuracy as possible.
Llama 4: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
DeepSeek-R1: unsloth/DeepSeek-R1-GGUF-UD
Gemma 3: unsloth/gemma-3-27b-it-GGUF
We made selective layer quantization much smarter. Instead of modifying only a subset of layers, we now dynamically quantize all layers, so each layer can use a different bit width. Our dynamic method can now be applied to all LLM architectures, not just MoEs.
Blog with Details: https://docs.unsloth.ai/basics/dynamic-v2.0
All our future GGUF uploads will leverage Dynamic 2.0 and our hand-curated 300K–1.5M token calibration dataset to improve conversational chat performance.
For accurate benchmarking, we built an evaluation framework to match the reported 5-shot MMLU scores of Llama 4 and Gemma 3. This allowed apples-to-apples comparisons of full-precision vs. Dynamic v2.0, QAT, and standard iMatrix quants.
Dynamic v2.0 aims to minimize the performance gap between full-precision models and their quantized counterparts.
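To make the KL Divergence benchmark concrete, here is a minimal sketch (my illustration, not Unsloth's evaluation code) comparing a full-precision model's next-token distributions against a quantized model's in PyTorch:

    # Sketch: mean KL(P_full || Q_quant) over positions, where both logit
    # tensors are assumed to have shape [seq_len, vocab_size] and come from
    # running the two models on the same input tokens.
    import torch
    import torch.nn.functional as F

    def mean_kl(full_logits: torch.Tensor, quant_logits: torch.Tensor) -> float:
        p = F.softmax(full_logits, dim=-1)           # reference distribution
        log_q = F.log_softmax(quant_logits, dim=-1)  # quantized model, log-probs
        # kl_div sums p * (log p - log q) over the vocab; "batchmean" then
        # averages over sequence positions. Lower = closer to full precision.
        return F.kl_div(log_q, p, reduction="batchmean").item()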
the 50 lines of code Agent in JavaScript 🔥
I spent the last few weeks working on this, so I hope you will like it.
I've been diving into MCP (Model Context Protocol) to understand what the hype was all about.
It is fairly simple, but still quite powerful: MCP is a standard API to expose sets of Tools that can be hooked to LLMs.
But while doing that, I had a second realization:
Once you have an MCP Client, an Agent is literally just a while loop on top of it. 🤯
➡️ Read it exclusively on the official HF blog: https://huggingface.co./blog/tiny-agents
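Here's a minimal Python sketch of that idea (the blog's version is in JavaScript; McpClient below is a hypothetical stand-in for any MCP client exposing get_tools() and call_tool(), while InferenceClient is huggingface_hub's real API):

    # A sketch, not the blog's actual code: the Agent as a while loop over MCP.
    from huggingface_hub import InferenceClient

    llm = InferenceClient()        # any chat model with tool-calling support
    mcp = McpClient("...")         # hypothetical: connect to your MCP servers
    messages = [{"role": "user", "content": "What's the weather in Paris?"}]

    while True:  # <- the Agent
        response = llm.chat_completion(
            messages,
            model="meta-llama/Llama-3.3-70B-Instruct",  # example model
            tools=mcp.get_tools(),  # MCP Tools, exposed to the LLM
        )
        msg = response.choices[0].message
        if not msg.tool_calls:     # no tool calls left: answer the user, done
            print(msg.content)
            break
        # Schematic: real code would serialize the assistant turn into a dict
        messages.append({"role": "assistant", "tool_calls": msg.tool_calls})
        for call in msg.tool_calls:
            # Hand each tool call to the MCP client and feed the result back
            result = mcp.call_tool(call.function.name, call.function.arguments)
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": str(result)}
            )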

which provider do you use?
We'll ship a provider="auto" in the coming days BTW, cc @sbrandeis @Wauplin @celinah
In the meantime, the model is served by those providers, and you can use any one of them. For instance, add provider="novita" to your code:
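For example, with the huggingface_hub Python client (a sketch; the model ID is just an example of a model Novita serves):

    # Sketch: pin a specific provider while provider="auto" isn't shipped yet.
    from huggingface_hub import InferenceClient

    client = InferenceClient(provider="novita")
    out = client.chat_completion(
        [{"role": "user", "content": "Hello!"}],
        model="deepseek-ai/DeepSeek-V3-0324",  # example model
    )
    print(out.choices[0].message.content)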
Hey, things have been in flux somewhat, but they should stabilize now. Sorry about the moving parts!
More details from @michellehbn:
In February, Inference usage was billed at a fixed rate while we added pay-as-you-go support. From March on, usage takes into account compute time × the price of the hardware. We're really sorry for any confusion or scare! We have more information about Inference Providers here: https://huggingface.co./docs/inference-providers/en/index
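To make the formula concrete (the numbers here are illustrative, not actual HF pricing): a request that runs for 20 seconds on hardware priced at $0.00012 per second would be billed 20 × $0.00012 = $0.0024.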
it's definitely the future :)

Build faster than ever with lightning-fast upload and download speeds, starting today on the Hub ⚡
Access to Xet storage is rolling out across the Hub - join the waitlist here: https://huggingface.co./join/xet
You can apply for yourself or your entire organization. Head over to your account settings for more information, or join anywhere you see the Xet logo on a repository you know.
Have questions? Join the conversation below, or open a discussion on the Xet team page: xet-team/README

Six months after joining Hugging Face, the Xet team is kicking off the first migrations from LFS to our storage for a number of repositories on the Hub.
More on the nitty gritty details behind the migration soon, but here are the big takeaways:
🤗 We've successfully completed the first migrations from LFS -> Xet to test the infrastructure and prepare for a wider release
✅ No action on your part is needed - you can work with a Xet-backed repo like any other repo on the Hub, as the sketch after this list shows (for now - major improvements are on the way!)
👀 Keep an eye out for the Xet logo to see if a repo you know is on our infra! See the screenshots below to spot the difference 👇
⏩ ⏩ ⏩ Blazing uploads and downloads are coming soon. We're gearing up for a full integration with the Hub's Python library that will make building on the Hub faster than ever - special thanks to @celinah and @Wauplin for their assistance.
🚀 Want early access? If you're curious and want to test out the bleeding edge that will power the development experience on the Hub, we'd love to partner with you. Let me know!
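"No action needed" really does mean the client code is unchanged. A sketch (the repo ID and filename are placeholders): downloading from a Xet-backed repo uses the same huggingface_hub call as any other repo:

    # Sketch: a Xet-backed repo behaves like any other Hub repo.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id="some-org/some-model", filename="model.safetensors")
    print(path)  # local cache path; Xet accelerates the transfer behind the scenes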
This is the culmination of a lot of effort from the entire team. Big round of applause to @sirahd @brianronan @jgodlewski @hoytak @seanses @assafvayner @znation @saba9 @rajatarya @port8080 @yuchenglow