The 240fps Paradox
Native terminals boast impressive rendering speeds. But when AI streams thousands of tokens per second, speed alone cannot save you.
┌─────────────────────────────────────────────────────────────────┐
│ THE 240FPS PARADOX │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Native Terminals: "We render at 240fps!" │
│ │
│ Alacritty: GPU-accelerated, ~240fps │
│ Ghostty: Metal/OpenGL, ~240fps │
│ WezTerm: wgpu, ~240fps │
│ │
│ But the question is: 240fps of WHAT? │
│ │
└─────────────────────────────────────────────────────────────────┘
Architecture Beats Speed
┌─────────────────────────────────────────────────────────────────┐
│ QUEUE-BASED @ 240fps │
├─────────────────────────────────────────────────────────────────┤
│ │
│ AI Output: 1000 states/sec (streaming tokens) │
│ Render Speed: 240 states/sec (240fps) │
│ Queue Growth: 760 states/sec BACKLOG │
│ │
│ After 10 seconds: │
│ • 10,000 states generated │
│ • 2,400 states rendered │
│ • 7,600 states in queue │
│ • User is 7.6 SECONDS behind! │
│ │
│ Even 1000fps would not help if AI outputs 2000 states/sec! │
│ The problem is ARCHITECTURAL, not SPEED. │
│ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ OVERWRITE-BASED @ 60fps (MonoTerm) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ AI Output: 1000 states/sec (streaming tokens) │
│ Render Speed: 60 states/sec (60fps) │
│ Queue Growth: 0 (overwrite, not queue) │
│ │
│ After 10 seconds: │
│ • 10,000 states generated (all processed by Alacritty) │
│ • 600 frames rendered │
│ • 0 states in queue │
│ • User is 0 SECONDS behind (always current) │
│ │
│ Lower FPS, but ALWAYS showing CURRENT state. │
│ │
└─────────────────────────────────────────────────────────────────┘
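The arithmetic in the two boxes above can be checked with a short sketch. This is a simplified model, not MonoTerm's implementation: `lag_after` and its parameters are hypothetical names, both rates are assumed constant, and lag is measured at the moment a frame is drawn.

```python
def lag_after(gen_rate, render_rate, seconds, overwrite):
    """Backlog and lag after `seconds` of streaming (hypothetical model).

    gen_rate    -- states the PTY produces per second
    render_rate -- frames the renderer draws per second
    overwrite   -- True: each frame draws the latest state; False: FIFO queue
    """
    generated = gen_rate * seconds
    frames = render_rate * seconds
    if overwrite:
        # Intermediate states are replaced, so nothing accumulates.
        backlog = 0
    else:
        # Each frame consumes exactly one queued state, oldest first.
        backlog = max(0, generated - frames)
    # The state on screen is `backlog` states old; express that in seconds of output.
    seconds_behind = backlog / gen_rate
    return {"generated": generated, "frames": frames,
            "backlog": backlog, "seconds_behind": seconds_behind}


queue = lag_after(1000, 240, 10, overwrite=False)
mono = lag_after(1000, 60, 10, overwrite=True)
print(queue["backlog"], queue["seconds_behind"])  # 7600 7.6
print(mono["backlog"], mono["seconds_behind"])    # 0 0.0
```

Under the same assumptions, `lag_after(2000, 1000, 10, overwrite=False)` still reports a backlog of 10,000 states, which is the point of the 1000fps remark: a faster queue only changes the slope.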
Traditional Terminal Design
┌─────────────────────────────────────────────────────────────────┐
│ TRADITIONAL APPROACH │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Design Philosophy: "Every byte must be rendered in order" │
│ │
│ PTY Output: [S1] [S2] [S3] [S4] ... [S1000] │
│ | | | | | │
│ +------------------------------------+ │
│ Queue: | S1 -> S2 -> S3 -> ... -> S1000 | FIFO │
│ +------------------------------------+ │
│ | │
│ Renderer: S1 --> S2 --> S3 --> ... (16ms each) │
│ │
└─────────────────────────────────────────────────────────────────┘
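The FIFO pipeline in the diagram can be sketched in a few lines. `FifoRenderer` is a hypothetical illustration of the queue design described here, not code taken from any of the terminals named above: every state is appended to a queue, and each frame draws the oldest one.

```python
from collections import deque


class FifoRenderer:
    """Hypothetical queue-based pipeline: states are kept and drawn in arrival order."""

    def __init__(self):
        self.queue = deque()
        self.on_screen = None

    def on_pty_output(self, state):
        # Nothing is dropped: each new state waits behind all earlier ones.
        self.queue.append(state)

    def render_frame(self):
        # One frame shows exactly one state, oldest first.
        if self.queue:
            self.on_screen = self.queue.popleft()
        return self.on_screen


renderer = FifoRenderer()
for i in range(1, 1001):
    renderer.on_pty_output(f"S{i}")

for _ in range(3):
    renderer.render_frame()

print(renderer.on_screen, len(renderer.queue))  # S3 997
```

After three frames the screen shows `S3` while 997 newer states are still waiting, which is the backlog the rest of this page is about.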
What Happens Under High Output
┌─────────────────────────────────────────────────────────────────┐
│ TIMELINE: HIGH OUTPUT SCENARIO │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Time     PTY State   Rendering   User Sees │
│ ----     ---------   ---------   --------- │
│ T=0      S1          S1          S1 (current) │
│ T=50ms   S50         S2          S2 (48 behind) │
│ T=100ms  S100        S3          S3 (97 behind) │
│ T=500ms  S500        S10         S10 (490 behind!) │
│ T=1s     S1000       S20         S20 (980 behind!) │
│ ... │
│ T=5s     DONE        S100        Still rendering old states │
│ T=50s    DONE        S1000       Finally current │
│ │
└─────────────────────────────────────────────────────────────────┘
Symptoms During Catch-Up
┌─────────────────────────────────────────────────────────────────┐
│ CATCH-UP PHASE SYMPTOMS │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. REFLOW │
│ • Layout shifts as old states render │
│ • Scrollbar jumps │
│ • Repeats for EVERY queued state │
│ │
│ 2. FLICKERING │
│ • Cursor position changes rapidly │
│ • Text appears/disappears │
│ • Selection breaks │
│ │
│ 3. INPUT LAG │
│ • User types, but echo is queued behind backlog │
│ • Feels unresponsive │
│ │
│ 4. MEMORY GROWTH │
│ • Queue holds all intermediate states │
│ • Significant memory consumption │
│ │
└─────────────────────────────────────────────────────────────────┘
Why Terminals Use Queue Design
┌─────────────────────────────────────────────────────────────────┐
│ HISTORICAL CONTEXT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1970s-2000s: │
│ • Human typing speed much slower than render speed │
│ • Queue rarely grows, stays small │
│ • Design works fine │
│ │
│ 2020s+: │
│ • AI output speed exceeds terminal render speed │
│ • Queue grows continuously during AI streaming │
│ • Design breaks down │
│ │
│ ----------------------------------------------------------- │
│ │
│ Human Input: ~10 chars/sec (bursty) │
│ AI Output: ~1000+ tokens/sec (continuous) │
│ │
│ 240fps handles human input with ease. │
│ 240fps CANNOT handle AI output without queue growth. │
│ │
└─────────────────────────────────────────────────────────────────┘
The Correctness Assumption
┌─────────────────────────────────────────────────────────────────┐
│ TRADITIONAL BELIEF │
├─────────────────────────────────────────────────────────────────┤
│ │
│ "Every state must be shown to be correct" │
│ │
│ ----------------------------------------------------------- │
│ │
│ But this assumption is WRONG for terminal emulators: │
│ │
│ • User only cares about CURRENT state │
│ • Intermediate states are transient │
│ • Showing old states is WORSE than skipping them │
│ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MONOTERM INSIGHT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ "Correctness = showing CURRENT state, not showing ALL states" │
│ │
│ -> Overwrite instead of queue │
│ -> Memory stays fixed │
│ -> Time stays synchronized │
│ │
└─────────────────────────────────────────────────────────────────┘
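"Overwrite instead of queue" can be pictured as a single-slot mailbox. The sketch below is an assumption about how such a slot might look, not MonoTerm's actual code; `LatestSlot`, `publish`, and `take` are hypothetical names.

```python
import threading


class LatestSlot:
    """Hypothetical single-slot mailbox: a write replaces whatever was there."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
        self._dirty = False

    def publish(self, state):
        # Producer side: overwrite the slot, so memory use stays fixed.
        with self._lock:
            self._value = state
            self._dirty = True

    def take(self):
        # Renderer side: return the newest state if it changed, else None.
        with self._lock:
            if not self._dirty:
                return None
            self._dirty = False
            return self._value


slot = LatestSlot()
for i in range(1, 1001):
    slot.publish(f"S{i}")   # 1000 updates arrive between two frames

print(slot.take())  # S1000
print(slot.take())  # None
```

The slot holds one value no matter how many updates arrive, and the renderer always picks up the most recent one. Those are the two properties the box lists: fixed memory and synchronized time.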
The Real Metric: Time Synchronization
┌─────────────────────────────────────────────────────────────────┐
│ TIME DELTA: THE TRUE MEASURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Traditional Metric: Frames Per Second (FPS) │
│ AI-Native Metric: Time Delta (current vs displayed) │
│ │
│ +---------------+---------+-------------------------------+ │
│ | Terminal      | FPS     | Time Delta (AI streaming)     | │
│ +---------------+---------+-------------------------------+ │
│ | Alacritty     | 240     | Grows (seconds behind)        | │
│ | Ghostty       | 240     | Grows (same problem)          | │
│ | WezTerm       | 240     | Grows (same problem)          | │
│ | MonoTerm      | 60      | Always 0 (current state)      | │
│ +---------------+---------+-------------------------------+ │
│ │
└─────────────────────────────────────────────────────────────────┘
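Time delta is straightforward to compute once every state carries the time it was produced. The sketch below is one hypothetical way to express the metric; `State` and `time_delta` are illustrative names, and the sample values are taken from the 10-second example earlier on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    seq: int            # position in the output stream
    produced_at: float  # seconds since streaming began


def time_delta(latest: State, displayed: State) -> float:
    """Seconds between the newest state and the one currently on screen."""
    return latest.produced_at - displayed.produced_at


# After 10 seconds of AI output at 1000 states/sec:
latest = State(seq=10_000, produced_at=10.0)
queued_view = State(seq=2_400, produced_at=2.4)  # FIFO renderer at 240fps
overwrite_view = latest                           # overwrite renderer

print(round(time_delta(latest, queued_view), 3))  # 7.6
print(time_delta(latest, overwrite_view))         # 0.0
```

FPS does not appear anywhere in the calculation. The metric depends only on which state is on screen, not on how often the screen is redrawn.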
AI-Human Concurrency
┌─────────────────────────────────────────────────────────────────┐
│ THE NEW CHALLENGE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Traditional: Human types -> Computer responds -> Human reads│
│ AI-Assisted: Human + AI typing concurrently │
│ │
│ +------------------------------------------------------------+ │
│ | Human-Only Mode:                                           | │
│ |   Human Input:   # # # # #         (slow, bursty)          | │
│ |   Terminal Load: _____#____#___    (low, manageable)       | │
│ +------------------------------------------------------------+ │
│ │
│ +------------------------------------------------------------+ │
│ | AI-Human Concurrent Mode:                                  | │
│ |   Human Input:   # # # # #         (same as before)        | │
│ |   AI Output:     ################  (continuous, high)      | │
│ |   Terminal Load: ################  (constantly high)       | │
│ +------------------------------------------------------------+ │
│ │
└─────────────────────────────────────────────────────────────────┘
Queue vs Overwrite Under Concurrency
┌─────────────────────────────────────────────────────────────────┐
│ QUEUE-BASED (Problem) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Human types "ls -la" while AI generates 1000 lines/sec │
│ │
│ T=0: "ls -la" echo │
│ T=1s: AI line 50 (queue: 950 pending) │
│ T=2s: AI line 100 (queue: 1900 pending) │
│ T=5s: AI line 250 (queue: 4750 pending) │
│ │
│ Human CANNOT see their input amid AI output queue! │
│ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ OVERWRITE-BASED (MonoTerm) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Human types "ls -la" while AI generates 1000 lines/sec │
│ │
│ T=0: Current state (includes all updates so far) │
│ T=1s: Current state (all coalesced) │
│ T=2s: Current state (always up-to-date) │
│ T=5s: Current state (human input visible immediately) │
│ │
│ Human CAN see their input immediately! │
│ │
└─────────────────────────────────────────────────────────────────┘
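The concurrency case can be sketched with a model that updates immediately and is only sampled at frame time. `Screen` is a hypothetical stand-in for the terminal state, not MonoTerm's data structure, and it assumes human echo and AI output both write into the same model.

```python
class Screen:
    """Hypothetical terminal model: updates mutate it at once; frames snapshot it."""

    def __init__(self):
        self.lines = []    # AI output lines
        self.prompt = ""   # text the human has typed

    def apply(self, source, text):
        if source == "human":
            self.prompt += text      # echo lands in the model immediately
        else:
            self.lines.append(text)  # AI output is coalesced into the same model

    def snapshot(self, rows=3):
        # A frame shows the current state only: last visible rows plus the prompt.
        return self.lines[-rows:] + ["$ " + self.prompt]


screen = Screen()
for i in range(1, 501):
    screen.apply("ai", f"AI line {i}")
screen.apply("human", "ls -la")          # typed in the middle of the stream
for i in range(501, 1001):
    screen.apply("ai", f"AI line {i}")

for row in screen.snapshot():
    print(row)
```

The next frame contains both the newest AI lines and the typed `ls -la`, because the echo never waits behind pending AI output. It is simply part of the current state.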
Summary
┌─────────────────────────────────────────────────────────────────┐
│ QUEUE vs OVERWRITE COMPARISON │
├─────────────────────────────────────────────────────────────────┤
│ │
│ +------------------+------------------+---------------------+ │
│ | Aspect           | Queue (240fps)   | Overwrite (60fps)   | │
│ +------------------+------------------+---------------------+ │
│ | Rendering speed  | Faster           | Slower              | │
│ | State displayed  | Old (backlog)    | Current (always)    | │
│ | Time delta       | Grows under load | Always 0            | │
│ | Memory           | Grows with queue | Fixed               | │
│ | Reflow           | Every state      | None                | │
│ | AI-native ready  | No               | Yes                 | │
│ | Human input echo | Delayed          | Immediate           | │
│ +------------------+------------------+---------------------+ │
│ │
│ Core Principle: Architecture beats speed for AI workloads. │
│ │
└─────────────────────────────────────────────────────────────────┘
SMPC/OFAC Applied
| Principle | Application |
|---|---|
| SMPC | One mechanism (overwrite) solves two problems (memory + time). Simple is better than fast. 60fps current > 240fps old. |
| OFAC | Accept that intermediate states can be discarded. Order emerges from accepting chaos of high output. Time synchronization as emergent property. |