Two Scenarios: Vim vs LLM Streaming
Terminal emulators face two fundamentally different usage patterns.
The Two Usage Patterns
┌───────────────────────────────────────────────────────────────────────┐
│ SCENARIO 1: VIM / INTERACTIVE EDITING │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Input: Frequent, continuous keystrokes │
│ Output: Small, incremental (one char, one line) │
│ Critical Metric: INPUT LATENCY │
│ User Expectation: Instant response to keystrokes │
│ │
│ ┌─────────────────────┐ │
│ │ Game Analogy: │ │
│ │ │ │
│ │ COMPETITIVE FPS │ (CS2, Valorant) │
│ │ │ │
│ │ Every millisecond │ │
│ │ of input lag │ │
│ │ matters │ │
│ └─────────────────────┘ │
│ │
└───────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────┐
│ SCENARIO 2: LLM STREAMING / HIGH OUTPUT │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Input: Rare (user is watching, not typing) │
│ Output: Massive, continuous (thousands of lines/sec) │
│ Critical Metric: VISUAL STABILITY │
│ User Expectation: Smooth, readable streaming text │
│ │
│ ┌─────────────────────┐ │
│ │ Game Analogy: │ │
│ │ │ │
│ │ 4K VIDEO STREAMING │ (Netflix, YouTube) │
│ │ │ │
│ │ Input latency │ │
│ │ is irrelevant │ │
│ │ Buffering prevents │ │
│ │ stuttering │ │
│ └─────────────────────┘ │
│ │
└───────────────────────────────────────────────────────────────────────┘
Traditional Terminals: One Optimization Only
┌───────────────────────────────────────────────────────────────────────┐
│ TRADITIONAL TERMINAL IN VIM SCENARIO │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Keystroke --> Parse --> Render --> Display │
│ │ │
│ v │
│ ~13-31ms latency (optimized!) │
│ │
│ Output volume: Low (matches render capacity) │
│ Reflow frequency: Low (no accumulation) │
│ │
│ Result: PERFECT │
│ │
│ Like: FPS-optimized hardware playing FPS game │
│ │
└───────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────┐
│ TRADITIONAL TERMINAL IN LLM SCENARIO │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ LLM chunk --> Parse --> Render (immediately!) │
│ LLM chunk --> Parse --> Render (immediately!) │
│ LLM chunk --> Parse --> Render (immediately!) │
│ ...4000 times per second... │
│ │
│ Output volume: Massive (exceeds render capacity) │
│ Reflow frequency: 4000-6700/sec (unsustainable) │
│ │
│ Result: CATASTROPHIC FAILURE │
│ │
│ Like: Playing 4K video without buffering │
│ Using FPS reflexes to watch Netflix │
│ │
└───────────────────────────────────────────────────────────────────────┘
MonoTerm: Adaptive to Both
┌───────────────────────────────────────────────────────────────────────┐
│ MONOTERM IN VIM SCENARIO │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Keystroke --> Rust Parse --> Grid Update --> ACK --> Render │
│ │ │
│ v │
│ ~50ms latency (acceptable for editing) │
│ │
│ Output volume: Low │
│ ACK cycle: Fast (low output = quick turnaround) │
│ │
│ Result: GOOD ENOUGH │
│ │
│ Like: Adaptive sync monitor │
│ Not the absolute fastest, but smooth │
│ │
└───────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────┐
│ MONOTERM IN LLM SCENARIO │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ [Rust Backend - No Reflow Zone] │
│ │
│ LLM chunk --> Grid Update (memory only) │
│ LLM chunk --> Grid Update (memory only) │
│ LLM chunk --> Grid Update (memory only) │
│ ...100 times... (DOM never touched) │
│ │
│ [Frontend - ACK Gated] │
│ │
│ "Ready?" <── ACK <── Snapshot received --> Single render │
│ │
│ Output volume: Massive (absorbed by backend) │
│ Reflow frequency: 5-10/sec (sustainable) │
│ │
│ Result: OPTIMAL │
│ │
│ Like: Buffered 4K streaming │
│ Smooth playback regardless of source bitrate │
│ │
└───────────────────────────────────────────────────────────────────────┘
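The ACK gate in the diagram above can be sketched as a small state machine. This is a simplified Python model, not MonoTerm's actual code: the `Backend` class, its method names, and the rule that the frontend ACKs once per `ack_every` chunks are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Backend:
    """Hypothetical stand-in for the backend: the grid lives in memory only."""
    lines: List[str] = field(default_factory=list)
    dirty: bool = False           # grid changed since the last snapshot
    frontend_ready: bool = True   # frontend has ACKed the previous snapshot

    def on_chunk(self, chunk: str) -> Optional[List[str]]:
        """Apply a chunk to the grid; emit a snapshot only if the gate is open."""
        self.lines.append(chunk)
        self.dirty = True
        return self._maybe_snapshot()

    def on_ack(self) -> Optional[List[str]]:
        """Frontend finished rendering: reopen the gate, flush pending changes."""
        self.frontend_ready = True
        return self._maybe_snapshot()

    def _maybe_snapshot(self) -> Optional[List[str]]:
        if self.frontend_ready and self.dirty:
            self.frontend_ready = False   # close the gate until the next ACK
            self.dirty = False
            return list(self.lines)       # copy handed to the frontend
        return None


def simulate(chunks: int, ack_every: int) -> Tuple[int, List[str]]:
    """Feed `chunks` chunks; the frontend ACKs once per `ack_every` chunks.

    Returns (number of renders, last snapshot rendered).
    """
    backend = Backend()
    renders = 0
    last: List[str] = []
    for i in range(chunks):
        snap = backend.on_chunk(f"chunk {i}")
        if snap is not None:
            renders, last = renders + 1, snap
        if (i + 1) % ack_every == 0:
            snap = backend.on_ack()
            if snap is not None:
                renders, last = renders + 1, snap
    return renders, last
```

Under these assumptions, `simulate(300, 100)` renders 4 times for 300 chunks, and the last snapshot still contains all 300 lines. Intermediate states are skipped, but no output is lost.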
Comparison Matrix
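| Scenario | Analogy | Traditional terminal | MonoTerm |
|---|---|---|---|
| Vim (interactive editing) | Competitive FPS | Optimal (~13-31 ms input latency) | Good (~50 ms input latency) |
| LLM streaming (high output) | Buffered 4K video | Fails (renders every chunk, 4000-6700 reflows/sec) | Optimal (ACK-gated, 5-10 reflows/sec) |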
Why Reflow is Different from Frame Drop
┌───────────────────────────────────────────────────────────────────────┐
│ FRAME DROP (Game) - Cheap │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Frame 1 --> Render --> Display │
│ Frame 2 --> [GPU busy] --> SKIP (free) │
│ Frame 3 --> Render --> Display │
│ │
│ Cost of skip: ~0 (just don't render) │
│ Result: Momentary stutter, recovers instantly │
│ │
└───────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────┐
│ REFLOW (Terminal) - Expensive │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Chunk 1 --> DOM update --> REFLOW (16ms, blocking) │
│ Chunk 2 --> DOM update --> queued (waiting) │
│ Chunk 3 --> DOM update --> queued (waiting) │
│ Chunk 4 --> DOM update --> queued (waiting) │
│ │ │
│ v │
│ CANNOT SKIP! │
│ │ │
│ v │
│ REFLOW 1 (16ms) --> blocking │
│ REFLOW 2 (16ms) --> blocking │
│ REFLOW 3 (16ms) --> blocking │
│ REFLOW 4 (16ms) --> blocking │
│ │
│ Cost: Cumulative debt that must be paid │
│ Result: Progressive slowdown --> freeze --> crash │
│ │
└───────────────────────────────────────────────────────────────────────┘
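The "cumulative debt" can be put in numbers. The following is a rough back-of-the-envelope model in Python, using the diagram's 16 ms per reflow and the rates quoted earlier (4000 chunks/sec unthrottled, 10 renders/sec when ACK-gated). It assumes one blocking reflow per chunk and a main thread with nothing else to do; the function names are illustrative.

```python
def main_thread_load(reflows_per_sec: float, reflow_ms: float) -> float:
    """Seconds of reflow work requested per second of wall-clock time.

    Anything above 1.0 cannot be sustained: the queue only grows.
    """
    return reflows_per_sec * reflow_ms / 1000.0


def reflow_backlog_ms(reflows_per_sec: float, reflow_ms: float, seconds: float) -> float:
    """Milliseconds of reflow work still owed after `seconds` of streaming."""
    requested_ms = reflows_per_sec * reflow_ms * seconds
    completed_ms = min(requested_ms, 1000.0 * seconds)   # at most 1000 ms of work per second
    return requested_ms - completed_ms
```

With these figures, `main_thread_load(4000, 16)` is 64.0: each second of streaming requests 64 seconds of reflow work, so after one second the backlog is already 63 seconds. At 10 renders/sec the load is 0.16 and the backlog stays at zero.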
Better Analogy: Shader Compilation
┌───────────────────────────────────────────────────────────────────────┐
│ NORMAL GAME │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Shaders compiled at load time │
│ Gameplay: Just render with pre-compiled shaders │
│ │
│ Result: Smooth 60fps │
│ │
└───────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────┐
│ REFLOW STORM = COMPILING SHADERS EVERY FRAME │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Frame 1 --> Compile shader --> 50ms freeze --> Render │
│ Frame 2 --> Compile shader --> 50ms freeze --> Render │
│ Frame 3 --> Compile shader --> 50ms freeze --> Render │
│ │
│ Effective FPS: ~15 (unplayable) │
│ Input during compile: Ignored │
│ │
│ This is what LLM streaming does to traditional terminals. │
│ │
└───────────────────────────────────────────────────────────────────────┘
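The ~15 FPS figure follows from simple arithmetic if the frame itself takes about 16.7 ms to render. That render time is an assumption here (a 60 fps budget); the diagram states only the 50 ms stall. A short Python check:

```python
def effective_fps(render_ms: float, stall_ms: float) -> float:
    """Frames per second when every frame pays `stall_ms` before rendering."""
    return 1000.0 / (render_ms + stall_ms)


FRAME_BUDGET_MS = 1000.0 / 60.0   # assumed render time per frame at 60 fps

smooth = effective_fps(FRAME_BUDGET_MS, 0.0)     # no per-frame stall
stalled = effective_fps(FRAME_BUDGET_MS, 50.0)   # 50 ms stall on every frame
```

Here `smooth` comes out at about 60 and `stalled` at about 15, which matches the diagram.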
MonoTerm Solution: Off-Thread Processing
┌───────────────────────────────────────────────────────────────────────┐
│ GAME PATTERN MONOTERM PATTERN │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ Shader compilation stutter Parsing (expensive) │
│ --> Pre-compile at load --> Rust backend (no DOM) │
│ │
│ GC spikes Grid updates │
│ --> Object pooling --> Memory only (no reflow) │
│ │
│ Asset loading hitches DOM updates │
│ --> Stream in background --> Only on ACK (controlled) │
│ │
│ Pattern: Do expensive work Pattern: Do expensive work │
│ OUTSIDE gameplay OUTSIDE the UI thread │
│ │
└───────────────────────────────────────────────────────────────────────┘
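The "expensive work outside the UI thread" pattern can be illustrated with a worker thread that owns parsing and grid updates while the UI loop only copies snapshots at its own pace. This is a Python sketch of the hand-off only, not the Rust backend itself: `SharedGrid`, `stream`, the 5 ms frame interval, and the use of `strip()` as a placeholder for parsing are all hypothetical.

```python
import threading
import time
from typing import List, Tuple


class SharedGrid:
    """Hypothetical in-memory grid shared by the backend and UI threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._lines: List[str] = []
        self._version = 0

    def apply(self, chunk: str) -> None:
        """Backend side: 'parse' a chunk (placeholder: strip it) and store it."""
        parsed = chunk.strip()
        with self._lock:
            self._lines.append(parsed)
            self._version += 1

    def snapshot(self) -> Tuple[int, List[str]]:
        """UI side: copy the current state under the lock."""
        with self._lock:
            return self._version, list(self._lines)


def stream(chunks: List[str], frame_s: float = 0.005) -> Tuple[int, List[str]]:
    """Run the backend on a worker thread; the UI loop renders at its own pace.

    Returns (number of renders, final rendered lines).
    """
    grid = SharedGrid()

    def backend() -> None:
        for chunk in chunks:
            grid.apply(chunk)

    worker = threading.Thread(target=backend)
    worker.start()

    renders = 0
    seen_version = 0
    rendered: List[str] = []
    while True:
        finished = not worker.is_alive()   # check first so the final state is never missed
        version, lines = grid.snapshot()
        if version != seen_version:        # render only when something changed
            renders, seen_version, rendered = renders + 1, version, lines
        if finished:
            break
        time.sleep(frame_s)                # stand-in for waiting on the next frame or ACK
    worker.join()
    return renders, rendered
```

The number of renders depends on timing and will vary between runs, but it is bounded by the UI loop's cadence rather than by the number of chunks, and the final render always contains every line the backend processed.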
Key Insight
┌───────────────────────────────────────────────────────────────────────┐
│ THE AI ERA SHIFT │
├───────────────────────────────────────────────────────────────────────┤
│ │
│ The AI era introduced a new terminal usage pattern: │
│ │
│ HIGH OUTPUT, LOW INPUT │
│ │
│ Traditional terminals only optimize for: │
│ │
│ LOW OUTPUT, HIGH INPUT (Vim, shell) │
│ │
│ MonoTerm handles both: │
│ │
│ Interactive editing --> ACK cycles quickly │
│ LLM streaming --> ACK absorbs the flood │
│ │
│ One architecture. Two usage patterns. Zero crashes. │
│ │
└───────────────────────────────────────────────────────────────────────┘
Related Research
- Terminal Analysis - Architecture comparison
- Architecture Diagrams - Flow visualizations
- Success Factors - Why the architecture works