Add Hugging Face paper link and clarify repos
This PR adds a link to the Hugging Face paper page for better discoverability and clarifies which repo is used for training/finetuning and which contains the GGUF weights.
README.md (CHANGED)

````diff
@@ -1,15 +1,15 @@
 ---
-license: mit
-license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 language:
 - en
+library_name: transformers
+license: mit
+license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 pipeline_tag: text-generation
 tags:
 - chat
 - bitnet
 - text-generation
 - large-language-model
-library_name: transformers
 ---
 
 # BitNet b1.58 2B4T - Scaling Native 1-bit LLM
````
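The first hunk only reorders the frontmatter keys. A quick way for a reviewer to confirm the reorder is purely cosmetic is to parse both versions and compare the resulting mappings. A minimal sketch using PyYAML (not part of this PR; the two strings are copied from the hunk above):

```python
# Reviewer sketch: confirm the frontmatter change is a pure key reorder.
# Both strings mirror the old and new YAML blocks from the first hunk.
import yaml  # pip install pyyaml

old_front = """\
license: mit
license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- chat
- bitnet
- text-generation
- large-language-model
library_name: transformers
"""

new_front = """\
language:
- en
library_name: transformers
license: mit
license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- chat
- bitnet
- text-generation
- large-language-model
"""

# Mappings compare equal regardless of key order, so this prints True
# iff the reorder changed no values.
print(yaml.safe_load(old_front) == yaml.safe_load(new_front))
```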
````diff
@@ -18,7 +18,7 @@ This repository contains the weights for **BitNet b1.58 2B4T**, the first open-s
 
 Trained on a corpus of 4 trillion tokens, this model demonstrates that native 1-bit LLMs can achieve performance comparable to leading open-weight, full-precision models of similar size, while offering substantial advantages in computational efficiency (memory, energy, latency).
 
-➡️ **Technical Report:** [BitNet b1.58 2B4T Technical Report](https://arxiv.org/abs/2504.12285)
+➡️ **Technical Report:** [BitNet b1.58 2B4T Technical Report](https://arxiv.org/abs/2504.12285) ➡️ **Hugging Face Paper:** [Hugging Face Paper](https://huggingface.co/papers/2504.12285)
 
 ➡️ **Official Inference Code:** [microsoft/BitNet (bitnet.cpp)](https://github.com/microsoft/BitNet)
 
@@ -98,7 +98,8 @@ chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)
 # Generate response
 chat_outputs = model.generate(**chat_input, max_new_tokens=50)
 response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True) # Decode only the response part
-print("\nAssistant Response:", response)
+print(
+    "\nAssistant Response:", response)
 ```
 
 ## How to Use (with `bitnet.cpp`)
@@ -141,4 +142,4 @@ BitNet b1.58 2B4T was evaluated against leading open-weight full-precision LLMs
 The model weights and code are released under the [MIT License](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE).
 
 ## Disclaimer
-This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
+This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
````
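For context on the third hunk: the touched lines sit at the end of the model card's `transformers` chat example. A minimal end-to-end sketch follows; the loading and chat-templating lines are reconstructed from the hunk's context line and the model card rather than from this diff, and the example messages are placeholders:

```python
# Minimal sketch of the model card's transformers chat example.
# Assumes a transformers version with BitNet support; the loading and
# templating lines are reconstructions, not part of this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Build a chat prompt (placeholder messages).
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate, then decode only the newly generated tokens by slicing off
# the prompt length (chat_input['input_ids'].shape[-1]).
chat_outputs = model.generate(**chat_input, max_new_tokens=50)
response = tokenizer.decode(
    chat_outputs[0][chat_input["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print("\nAssistant Response:", response)
```

The decode slice is what restricts the printed text to the assistant's reply; without it, the prompt tokens would be echoed back as well.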
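On the PR description's second point, the model card distinguishes companion repos: this one holds the packed 1.58-bit weights for `transformers` inference, a BF16 repo holds the master weights for training/finetuning, and a GGUF repo holds the weights for `bitnet.cpp`. A minimal sketch of fetching the GGUF weights; the repo ids and filename follow the model card's variant list but are assumptions here and should be verified on the Hub:

```python
# Sketch: download the GGUF weights used by bitnet.cpp from the companion
# repo. Repo id and filename are assumptions based on the model card's
# variant list; verify them on the Hub before relying on this.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="microsoft/bitnet-b1.58-2B-4T-gguf",
    filename="ggml-model-i2_s.gguf",  # assumed i2_s quantization filename
)
print("GGUF weights at:", gguf_path)

# For training or finetuning, use the BF16 master weights instead
# (assumed companion repo: microsoft/bitnet-b1.58-2B-4T-bf16).
```

The downloaded file is then passed to the runner in the microsoft/BitNet repo; see that repo's README for the exact commands, which this PR does not touch.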