GitHub - microsoft/BitNet: Official inference framework for 1-bit LLMs


microsoft
2025.04.20
· GitHub · by Anonymous
#LLM #1-bit LLM #Inference Framework #BitNet #cpp

Key Points

  • bitnet.cpp is Microsoft's official inference framework designed to enable fast and lossless inference of 1-bit Large Language Models, such as BitNet b1.58, on CPU and upcoming NPU platforms.
  • The framework achieves substantial performance improvements, including speedups of up to 6.17x and energy reductions of up to 82.2% on x86 CPUs, making it possible to run a 100B model on a single CPU at human reading speed.
  • Built upon elements of llama.cpp and T-MAC, bitnet.cpp aims to significantly enhance the potential for efficient local deployment of LLMs and to inspire further development of large-scale 1-bit models.

bitnet.cpp is an official inference framework developed by Microsoft for highly quantized Large Language Models (LLMs), specifically focusing on 1-bit LLMs such as BitNet b1.58 models. The primary objective of bitnet.cpp is to enable fast, lossless, and energy-efficient inference of these ultra-low-precision models on edge devices, starting with CPUs (ARM and x86) and with future support planned for NPUs.

The core methodology of bitnet.cpp revolves around a suite of highly optimized kernels tailored for the bit-wise operations inherent in 1-bit and 1.58-bit quantized LLMs. These optimizations leverage the specific properties of low-bit quantization, enabling significant performance gains and energy reductions compared to traditional inference methods. A crucial technical foundation for bitnet.cpp's kernels is the use of Lookup Table (LUT) methodologies, as pioneered in projects like T-MAC. This approach transforms computationally intensive matrix multiplications involving extremely low-bit (e.g., binary or ternary) weights and activations into efficient table lookups or specialized bitwise operations. For instance, in a 1-bit neural network, a multiplication of a binary weight w ∈ {−1, +1} and a binary activation a ∈ {−1, +1} can be replaced by XOR operations and popcounts (counting set bits), or mapped directly to a pre-computed sum based on the input bit patterns, which is considerably faster than floating-point arithmetic. This design principle allows bitnet.cpp to achieve "lossless inference," meaning that the computational speedup does not come at the cost of accuracy degradation for the already quantized models.
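As a minimal illustration of the XOR-and-popcount trick described above (a sketch under assumed conventions, not bitnet.cpp's actual kernel): if two 64-element {−1, +1} vectors are each packed into a 64-bit word with bit 1 encoding +1 and bit 0 encoding −1, then their dot product is 64 − 2 · popcount(a XOR w), and the popcount itself can be served from a small precomputed table, which is the LUT idea in miniature.

```cpp
#include <array>
#include <cstdint>

// Precomputed 256-entry popcount table: one lookup per byte of the XOR result.
static const std::array<uint8_t, 256> kPopcount = [] {
    std::array<uint8_t, 256> t{};
    for (int i = 0; i < 256; ++i)
        t[i] = static_cast<uint8_t>(__builtin_popcount(i));
    return t;
}();

// Dot product of two 64-element {-1,+1} vectors, each packed into a 64-bit
// word (bit = 1 encodes +1, bit = 0 encodes -1). Matching bits contribute +1,
// mismatching bits -1, so dot = 64 - 2 * (#mismatches).
int binary_dot(uint64_t a_bits, uint64_t w_bits) {
    uint64_t x = a_bits ^ w_bits;  // set bits mark sign mismatches
    int mismatches = 0;
    for (int b = 0; b < 8; ++b)    // one table lookup per byte
        mismatches += kPopcount[(x >> (8 * b)) & 0xFF];
    return 64 - 2 * mismatches;
}
```

Real kernels batch these operations over whole weight matrices with SIMD, but the principle is the same: no floating-point multiplies are needed at all.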

The framework reports substantial performance improvements and energy efficiency:

  • On ARM CPUs, bitnet.cpp achieves speedups ranging from 1.37x to 5.07x, with larger models demonstrating greater performance gains. Concurrently, it reduces energy consumption by 55.4% to 70.0%.
  • On x86 CPUs, speedups range from 2.37x to 6.17x, accompanied by energy reductions between 71.9% and 82.2%.
  • Notably, bitnet.cpp can run a 100B BitNet b1.58 model on a single CPU, achieving inference speeds comparable to human reading (5-7 tokens per second), which significantly broadens the scope for deploying LLMs on local devices.

bitnet.cpp supports various 1-bit LLM configurations available on Hugging Face, including BitNet-b1.58-2B-4T, bitnet_b1_58-large (0.7B), bitnet_b1_58-3B (3.3B), Llama3-8B-1.58-100B-tokens (8.0B), and Falcon3/Falcon-E families (1B-10B models), across both x86 and ARM architectures. It provides specific quantization types such as i2_s (likely integer 2-bit signed) and tl1 (likely ternary lookup 1-bit), indicating flexibility in handling different low-bit precision schemes. The project acknowledges its foundational reliance on the llama.cpp framework and the T-MAC methodology for its optimized kernels.
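To make the low-bit storage concrete, here is a hypothetical sketch of how ternary weights in {−1, 0, +1} could be packed as signed 2-bit codes, four per byte; the actual i2_s on-disk layout in bitnet.cpp may differ, so treat this as an illustration of 2-bit signed packing only.

```cpp
#include <cstdint>

// Pack four ternary weights in {-1, 0, +1} into one byte, two bits each
// (two's-complement 2-bit codes). NOTE: illustrative layout, not necessarily
// the real i2_s format.
uint8_t pack4(const int8_t w[4]) {
    uint8_t byte = 0;
    for (int i = 0; i < 4; ++i)
        byte |= static_cast<uint8_t>(w[i] & 0x3) << (2 * i);
    return byte;
}

// Recover the four ternary weights from the packed byte.
void unpack4(uint8_t byte, int8_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        int8_t code = (byte >> (2 * i)) & 0x3;
        // Sign-extend the 2-bit code: values 2..3 map to -2..-1.
        out[i] = (code & 0x2) ? static_cast<int8_t>(code - 4) : code;
    }
}
```

A 2-bit code wastes one of its four states on ternary data; finer packings (e.g., five ternary values per byte, since 3^5 = 243 ≤ 256) trade decode simplicity for density, which is one reason multiple kernel/quantization variants like tl1 exist.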