What is BLIS?¶

BLIS (Blackbox Inference Simulator) is a discrete-event simulator for LLM inference serving systems. It models multi-instance clusters with configurable admission control, request routing, KV-cache dynamics, scheduling policies, and token generation — all without requiring real GPUs.

Why Simulate Inference Serving?¶

Deploying LLM inference at scale requires answering capacity planning questions that are expensive to answer with real hardware:

How many instances do I need to serve 1,000 requests/second at p99 TTFT < 200ms?
Which routing policy minimizes tail latency for my workload mix?
How much KV cache memory do I need before preemptions degrade throughput?
What happens at 2x traffic — does latency degrade gracefully or catastrophically?

Running these experiments on real GPUs costs thousands of dollars and takes days. BLIS answers them in seconds on a laptop.

Who Should Use BLIS¶

Audience	Use Case
Capacity planners	Determine instance counts, GPU memory, and TP configurations before procurement
Platform engineers	Compare routing policies, tune scorer weights, evaluate admission control strategies
Researchers	Run controlled experiments on scheduling, batching, and caching algorithms
Developers	Validate new policies against existing ones before deploying to production

What BLIS Is Not¶

Setting expectations

Not a benchmark — BLIS simulates serving behavior, it does not generate real GPU load
Not primarily a load generator — BLIS focuses on simulation. Real-mode traffic generation against OpenAI-compatible endpoints is available but experimental. For production load testing, use tools like inference-perf or genai-perf

Key Features¶

See the Home page feature list for the full capabilities catalog, including the workload specification DSL, metrics pipeline, latency model backends, and policy framework.

Next Steps¶

Installation — Build BLIS from source
Quick Start — Run your first simulation in 30 seconds
Tutorial: Capacity Planning — End-to-end walkthrough
User Guide — Task-oriented how-to guides