Mathematical Foundation

The M/M/c/K model extends the standard M/M/c queue by imposing a hard capacity limit K on the total number of customers in the system (in service plus waiting). When K customers are present, new arrivals are blocked and diverted elsewhere.

Blocking Probability

The critical performance metric is P_B, the probability that an arriving customer finds the system full:

$$P_B = P_K = P_0 \frac{(c\rho)^K}{c! \cdot c^{K-c}}$$

where ρ = λ/(cμ) is the per-server utilization and P₀ is modified for the finite system:

$$P_0 = \left[\sum_{n=0}^{c-1} \frac{(c\rho)^n}{n!} + \frac{(c\rho)^c}{c!}\sum_{n=c}^{K}\rho^{n-c}\right]^{-1}$$

Key Difference from M/M/c

Unlike the infinite-capacity model, M/M/c/K is stable even when ρ ≥ 1, because the finite buffer prevents unbounded growth. This stability, however, comes at the cost of high blocking rates.

Effective Arrival Rate

Due to blocking, the effective throughput is reduced:

$$\lambda_{eff} = \lambda(1 - P_B)$$
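These formulas translate directly into code. A minimal sketch of the blocking probability and effective throughput (function and variable names are illustrative, not from any particular library):

```python
from math import factorial

def mmck_metrics(lam, mu, c, K):
    """Blocking probability and effective throughput for an M/M/c/K queue.

    lam: arrival rate, mu: per-server service rate,
    c: number of servers, K: total capacity (in service + waiting), K >= c.
    """
    rho = lam / (c * mu)               # per-server utilization
    a = c * rho                        # offered load lam/mu, in erlangs
    # Normalization constant P0 from the finite-sum formula
    p0_inv = sum(a**n / factorial(n) for n in range(c))
    p0_inv += (a**c / factorial(c)) * sum(rho**(n - c) for n in range(c, K + 1))
    p0 = 1.0 / p0_inv
    # P_B = P_K: an arrival finds all K slots occupied
    p_blocked = p0 * a**K / (factorial(c) * c**(K - c))
    lam_eff = lam * (1 - p_blocked)    # effective (accepted) arrival rate
    return p_blocked, lam_eff
```

For K = c the queue term collapses and p_blocked reduces to the Erlang B loss formula. Note that the raw factorials overflow floating point once c reaches the hundreds; a log-space or recursive evaluation is preferable at that scale.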

Real-World Applications Across Industries

Healthcare: ICU Bed Management

Operational Context:

  • ICU bed capacity: 20 beds (K = 20)
  • Average arrival rate: 8 patients/day (λ = 8/24 ≈ 0.33/hr)
  • Average ICU length of stay: 72 hours (μ = 1/72 ≈ 0.0139/hr)
  • Beds (servers): Each bed is a "server" → c = 20
  • Policy goal: < 5% diversion rate

Analysis: A pure-loss (Erlang B) system, since K = c: the offered load λ/μ = 24 erlangs against 20 beds gives ρ = 1.2. Any arrival when all beds are full triggers ambulance diversion.

Manufacturing: Buffer Storage Between Work Centers

Operational Context:

  • Buffer capacity: 50 units (K = 50)
  • Parts arriving from upstream: 120 units/hour (λ = 120)
  • Downstream processing rate: 5 units/hour/station (μ = 5)
  • Processing stations: 30 stations (c = 30)
  • Target: < 2% blocking (parts rejected/scrapped)

Analysis: Utilization ρ = 120/(30×5) = 0.80. Buffer prevents upstream blockage, but finite K creates blocking when buffer fills.

Cloud Computing: Connection Pool Management

Operational Context:

  • Maximum connections: 1000 (K = 1000)
  • Request rate: 800 requests/second (λ = 800)
  • Average request duration: 2 seconds (μ = 0.5 requests/sec/connection)
  • Connection pool size: 1000 (c = 1000, K = c)
  • Target: < 0.1% request rejection rate

Analysis: Operating at ρ = 800/(1000×0.5) = 1.6 → normally unstable, but K limit creates self-regulation through blocking.
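The three scenarios above can be checked numerically. Because the cloud case has c = 1000 (raw factorials overflow floating point), this sketch evaluates the same M/M/c/K formulas in log space; the function name and printout format are illustrative:

```python
from math import lgamma, log, exp

def mmck_blocking(lam, mu, c, K):
    """P_B and effective throughput for an M/M/c/K queue, in log space."""
    rho = lam / (c * mu)
    a = lam / mu                      # offered load in erlangs
    # log of the unnormalized state probabilities p_n
    logs = [n * log(a) - lgamma(n + 1) if n < c
            else c * log(a) - lgamma(c + 1) + (n - c) * log(rho)
            for n in range(K + 1)]
    m = max(logs)                     # log-sum-exp for numerical stability
    z = sum(exp(l - m) for l in logs)
    p_blocked = exp(logs[K] - m) / z
    return p_blocked, lam * (1 - p_blocked)

scenarios = {
    "ICU (c=K=20)":               (8 / 24, 1 / 72, 20, 20),
    "Manufacturing (c=30, K=50)": (120, 5, 30, 50),
    "Cloud (c=K=1000)":           (800, 0.5, 1000, 1000),
}
for name, (lam, mu, c, K) in scenarios.items():
    pb, lam_eff = mmck_blocking(lam, mu, c, K)
    print(f"{name}: P_B = {pb:.4f}, effective rate = {lam_eff:.3f}")
```

Running this shows the ICU scenario blocking well above its 5% target (24 erlangs offered to 20 beds), the buffered manufacturing line comfortably under 2%, and the overloaded cloud pool shedding roughly the excess fraction 1 − 1/ρ = 37.5% of requests.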

💡 Try this: Test these scenarios below. Enable shock simulation to model surge events: mass casualty (healthcare), machine breakdown (manufacturing), or traffic spike (cloud services).

Interactive Simulation Laboratory

Simulate finite-capacity systems with blocking dynamics. Test surge scenarios to observe blocking cascades, system saturation, and recovery patterns. Visualizations include occupancy tracking and blocking event analysis over time.

🚨 System Shock Simulation

Test system resilience with surge scenarios (elevated arrivals from hour 10-12)

Interpreting Your Results

Blocking vs. Waiting

Loss Rate (Blocking Probability)

  • < 2%: Excellent; diversions are rare
  • 2-5%: Acceptable for non-emergency services
  • 5-10%: High; impacts the regional system
  • > 10%: Critical; system failure

The K vs. c Relationship

  • K = c: no queue; a pure blocking (loss) model (ICU, hotel rooms)
  • K > c: queue allowed; waiting room/hallway capacity
  • Optimal K: balance blocking costs against waiting-space costs
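The K vs. c trade-off can be made concrete with a small sweep. A sketch using the manufacturing parameters from this section (the direct finite-sum formula is numerically fine at this scale; names are illustrative):

```python
from math import factorial

def mmck_blocking(lam, mu, c, K):
    """Blocking probability of an M/M/c/K queue via the finite-sum formula."""
    rho = lam / (c * mu)
    a = lam / mu
    p0_inv = sum(a**n / factorial(n) for n in range(c))
    p0_inv += (a**c / factorial(c)) * sum(rho**(n - c) for n in range(c, K + 1))
    return (a**K / (factorial(c) * c**(K - c))) / p0_inv

# Manufacturing line: lam = 120/hr, mu = 5/hr/station, c = 30 stations.
# Sweep total capacity K from "no buffer" (K = c) upward.
for K in range(30, 61, 5):
    pb = mmck_blocking(120, 5, 30, K)
    print(f"K = {K:2d}  (buffer {K - 30:2d})  P_B = {pb:.5f}")
```

Each added buffer slot cuts P_B by roughly a factor of ρ = 0.8, so the returns diminish geometrically: the optimal K is where the marginal saving in blocking cost no longer pays for the extra waiting space.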

Cascade Effects

When your facility blocks patients, they're diverted to neighbors. If all regional facilities operate near capacity, blocking creates a cascade failure where diversions compound across the network.

Industry best practice: Keep blocking < 5% to maintain regional resilience.