Open-source repositories tagged with #paged-attention, ranked by health score.
Pure Rust + CUDA LLM inference engine: no PyTorch, OpenAI-compatible, serves models from Qwen3 to Kimi-K2