The IQ2_BN and IQ2_BN_R4 versions of microsoft/bitnet-b1.58-2B-4T-gguf for use with ik_llama.cpp.
I recommend the IQ2_BN_R4 version, but you can instead pass -rtr when loading the IQ2_BN version to convert (repack) it at runtime; see the example command below.
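A minimal launch sketch, assuming the llama-server binary built from ik_llama.cpp; the GGUF filenames and context size here are placeholders, so adjust them to your setup:

```sh
# Serve the IQ2_BN quant with runtime repacking: -rtr repacks the tensors
# into the interleaved layout while loading, matching the IQ2_BN_R4 file.
./llama-server -m ./bitnet-b1.58-2B-4T-IQ2_BN.gguf -rtr -c 4096

# Or load the pre-repacked file directly; no -rtr needed.
./llama-server -m ./bitnet-b1.58-2B-4T-IQ2_BN_R4.gguf -c 4096
```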
The chat template embedded in the model looks incorrect (I did not change it; it is carried over from the original Microsoft GGUF).
An example of correct usage from their transformers PR:
<|begin_of_text|>User: Hey, are you conscious? Can you talk to me?<|eot_id|>Assistant:
I followed the example above and it also worked for multi-turn conversations.
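A sketch of what a multi-turn prompt might look like if the same pattern is simply continued; the assistant reply text and the placement of the trailing <|eot_id|> after it are my assumption, not taken from the PR:

<|begin_of_text|>User: Hey, are you conscious? Can you talk to me?<|eot_id|>Assistant: I'm a language model, so I'm not conscious, but I'm happy to talk.<|eot_id|>User: What can you help me with?<|eot_id|>Assistant: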