Add Hugging Face paper link and clarify repos
This PR adds a link to the Hugging Face paper page for better discoverability and clarifies which repo is used for training/finetuning and which contains the GGUF weights.
README.md (CHANGED)

````diff
@@ -1,15 +1,15 @@
 ---
-license: mit
-license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 language:
 - en
+library_name: transformers
+license: mit
+license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 pipeline_tag: text-generation
 tags:
 - chat
 - bitnet
 - text-generation
 - large-language-model
-library_name: transformers
 ---
 
 # BitNet b1.58 2B4T - Scaling Native 1-bit LLM
````
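The first hunk only reorders the frontmatter keys. A quick way for a reviewer to confirm the reorder is purely cosmetic is to parse both versions and compare the resulting mappings. A minimal sketch using PyYAML (not part of this PR; the two strings are copied from the hunk above):

```python
# Reviewer sketch: confirm the frontmatter change is a pure key reorder.
# Both strings mirror the old and new YAML blocks from the first hunk.
import yaml  # pip install pyyaml

old_front = """\
license: mit
license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- chat
- bitnet
- text-generation
- large-language-model
library_name: transformers
"""

new_front = """\
language:
- en
library_name: transformers
license: mit
license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- chat
- bitnet
- text-generation
- large-language-model
"""

# Mappings compare equal regardless of key order, so this prints True
# iff the reorder changed no values.
print(yaml.safe_load(old_front) == yaml.safe_load(new_front))
```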
````diff
@@ -18,7 +18,7 @@ This repository contains the weights for **BitNet b1.58 2B4T**, the first open-s
 
 Trained on a corpus of 4 trillion tokens, this model demonstrates that native 1-bit LLMs can achieve performance comparable to leading open-weight, full-precision models of similar size, while offering substantial advantages in computational efficiency (memory, energy, latency).
 
-➡️ **Technical Report:** [BitNet b1.58 2B4T Technical Report](https://arxiv.org/abs/2504.12285)
+➡️ **Technical Report:** [BitNet b1.58 2B4T Technical Report](https://arxiv.org/abs/2504.12285) ➡️ **Hugging Face Paper:** [Hugging Face Paper](https://huggingface.co/papers/2504.12285)
 
 ➡️ **Official Inference Code:** [microsoft/BitNet (bitnet.cpp)](https://github.com/microsoft/BitNet)
 
@@ -98,7 +98,8 @@ chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)
 # Generate response
 chat_outputs = model.generate(**chat_input, max_new_tokens=50)
 response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True) # Decode only the response part
-print("\nAssistant Response:", response)
+print(
+    "\nAssistant Response:", response)
 ```
 
 ## How to Use (with `bitnet.cpp`)
@@ -141,4 +142,4 @@ BitNet b1.58 2B4T was evaluated against leading open-weight full-precision LLMs
 The model weights and code are released under the [MIT License](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE).
 
 ## Disclaimer
-This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
+This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
````
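For context on the third hunk: the touched lines sit at the end of the model card's `transformers` chat example. A minimal end-to-end sketch follows; the loading and chat-templating lines are reconstructed from the hunk's context line and the model card rather than from this diff, and the example messages are placeholders:

```python
# Minimal sketch of the model card's transformers chat example.
# Assumes a transformers version with BitNet support; the loading and
# templating lines are reconstructions, not part of this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Build a chat prompt (placeholder messages).
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate, then decode only the newly generated tokens by slicing off
# the prompt length (chat_input['input_ids'].shape[-1]).
chat_outputs = model.generate(**chat_input, max_new_tokens=50)
response = tokenizer.decode(
    chat_outputs[0][chat_input["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print("\nAssistant Response:", response)
```

The decode slice is what restricts the printed text to the assistant's reply; without it, the prompt tokens would be echoed back as well.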
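On the PR description's second point, the model card distinguishes companion repos: this one holds the packed 1.58-bit weights for `transformers` inference, a BF16 repo holds the master weights for training/finetuning, and a GGUF repo holds the weights for `bitnet.cpp`. A minimal sketch of fetching the GGUF weights; the repo ids and filename follow the model card's variant list but are assumptions here and should be verified on the Hub:

```python
# Sketch: download the GGUF weights used by bitnet.cpp from the companion
# repo. Repo id and filename are assumptions based on the model card's
# variant list; verify them on the Hub before relying on this.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="microsoft/bitnet-b1.58-2B-4T-gguf",
    filename="ggml-model-i2_s.gguf",  # assumed i2_s quantization filename
)
print("GGUF weights at:", gguf_path)

# For training or finetuning, use the BF16 master weights instead
# (assumed companion repo: microsoft/bitnet-b1.58-2B-4T-bf16).
```

The downloaded file is then passed to the runner in the microsoft/BitNet repo; see that repo's README for the exact commands, which this PR does not touch.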