artalis-io/bitnet.c
C · MIT · active
Health
Minimal, zero-dependency LLM inference in pure C11. CPU-first with NEON/AVX2 SIMD. Flash MoE (pread + LRU expert cache). TurboQuant 3-bit KV compression (8.9x less memory per session). 20+ GGUF quant formats. Compiles to WASM.
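The "pread + LRU expert cache" idea named above can be illustrated with a minimal sketch. It is not the project's actual code: the names `ExpertCache`, `expert_cache_init`, `expert_cache_get`, and `expert_cache_free` are hypothetical, and the file layout (fixed-size expert blobs stored back to back) is an assumption made purely for illustration. Only the POSIX `pread` call is a real API.

```c
#define _POSIX_C_SOURCE 200809L /* exposes pread() and mkstemp() under strict C11 */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* ASSUMPTION: expert i occupies bytes [i*expert_bytes, (i+1)*expert_bytes) of the file. */
typedef struct {
    int      expert_id; /* -1 while the slot is empty */
    uint64_t last_used; /* logical clock value of the latest access; 0 = never used */
    uint8_t *data;      /* expert_bytes of weights */
} ExpertSlot;

typedef struct {
    int         fd;
    size_t      expert_bytes;
    size_t      n_slots;
    uint64_t    clock;
    uint64_t    hits, misses;
    ExpertSlot *slots;
} ExpertCache;

/* Release every slot buffer and the slot array itself. */
void expert_cache_free(ExpertCache *c) {
    if (!c->slots) return;
    for (size_t i = 0; i < c->n_slots; i++) free(c->slots[i].data);
    free(c->slots);
    c->slots = NULL;
}

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
int expert_cache_init(ExpertCache *c, int fd, size_t expert_bytes, size_t n_slots) {
    c->fd = fd; c->expert_bytes = expert_bytes; c->n_slots = n_slots;
    c->clock = 0; c->hits = 0; c->misses = 0; c->slots = NULL;
    if (n_slots == 0 || expert_bytes == 0) return -1;
    c->slots = malloc(n_slots * sizeof *c->slots);
    if (!c->slots) return -1;
    for (size_t i = 0; i < n_slots; i++) {
        c->slots[i].expert_id = -1;
        c->slots[i].last_used = 0;
        c->slots[i].data = NULL;
    }
    for (size_t i = 0; i < n_slots; i++) {
        c->slots[i].data = malloc(expert_bytes);
        if (!c->slots[i].data) { expert_cache_free(c); return -1; }
    }
    return 0;
}

/* Return a pointer to the expert's weights, loading them on a miss.
 * The pointer is valid only until the slot is evicted by a later call. */
const uint8_t *expert_cache_get(ExpertCache *c, int expert_id) {
    assert(c->slots != NULL && c->n_slots > 0);
    if (expert_id < 0) return NULL;

    ExpertSlot *victim = &c->slots[0];
    for (size_t i = 0; i < c->n_slots; i++) {
        ExpertSlot *s = &c->slots[i];
        if (s->expert_id == expert_id) {      /* hit: refresh recency */
            s->last_used = ++c->clock;
            c->hits++;
            return s->data;
        }
        if (s->last_used < victim->last_used) /* track least recently used slot */
            victim = s;
    }

    /* Miss: overwrite the LRU slot. pread() takes an explicit offset, so the
     * shared file position is never touched. */
    c->misses++;
    off_t base = (off_t)expert_id * (off_t)c->expert_bytes;
    size_t done = 0;
    while (done < c->expert_bytes) {
        ssize_t n = pread(c->fd, victim->data + done,
                          c->expert_bytes - done, base + (off_t)done);
        if (n <= 0) {                         /* error or short file: drop the slot */
            victim->expert_id = -1;
            victim->last_used = 0;
            return NULL;
        }
        done += (size_t)n;
    }
    victim->expert_id = expert_id;
    victim->last_used = ++c->clock;
    return victim->data;
}
```

With a cache of N slots, only the N most recently routed experts stay resident; everything else remains on disk until it is requested. The linear scan is a deliberate simplification that is reasonable for a small slot count. A real implementation would likely also need to handle `EINTR` and concurrent access, which this sketch does not.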
Health Breakdown
Activity: 25
Community: 25
Maintenance: 15
Popularity: 15
#avx2 #c #cpu-inference #gguf #inference #kv-cache #llm #moe #neon #quantization #simd #turboquant #wasm
Community
★ 195 contributors · 3d ago