Established 2026 · Fort Ann, New York

Information theory
across substrates.

The Windstorm Institute studies the mathematical constraints governing information processing in biological, neural, and artificial systems. Seven systems from six domains. One throughput band.

Seven information processing systems from six domains — Ribosome, English Consonants, Chromatic Scale, Neural Working Memory, AI Transformers, Morse Code, and TCP/IP — all converge on a throughput band of 3 to 6 bits per event, centered at 4.16 bits.
1,749
Models Evaluated
4.39
Bits — Ribosome Floor
6
Papers Published
29
Organisms Verified
10⁹×
Landauer Gap

The Research Arc

Seven papers. One question. From observation to law to propagation — and now, to falsification.

WHAT exists  →  WHERE it sits  →  WHY it must  →  HOW it propagates

Forma Animae Organon

The instrument of the soul's form

The Windstorm Institute's research is guided by a simple philosophical premise: information is not a metaphor for life — it is the substrate of life. The ribosome is not "like" a decoder. It IS a decoder. The brain is not "like" a computer. It IS a serial information processor. When we discovered that these systems all converge on the same throughput band, we were not finding an analogy. We were uncovering the mathematical skeleton that all serial decoders share.

The Forma Animae Organon is our name for this lens. It is not a theory — it is a way of looking. It asks: if you strip away the chemistry, the biology, the engineering, what mathematical structure remains?

The answer, across seven papers and thousands of experiments, is the rate-distortion surface and the thermodynamic cost landscape. These are the bones. Everything else is flesh.

Where biology meets information theory meets AI.

We investigate why serial decoding systems — from ribosomes to transformers — converge on similar throughput constraints despite operating on radically different substrates.

Rate-Distortion Theory

Deriving mechanistic bounds on serial decoding throughput using Shannon's M-ary rate-distortion framework. Zero-free-parameter predictions for biological receivers.
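One way to see where the floor comes from is the textbook rate-distortion function for a uniform M-ary source under Hamming distortion (a sketch of the general framework, not necessarily the papers' exact derivation): error-free decoding of M = 21 symbols pins the rate at log2(21) ≈ 4.39 bits per event.

```python
import math

def rate_distortion_mary(M: int, D: float) -> float:
    """R(D) for a uniform M-ary source under Hamming distortion
    (standard result; the papers' framework may differ in detail)."""
    if D <= 0:
        return math.log2(M)          # lossless: full log2(M) bits required
    if D >= (M - 1) / M:
        return 0.0                   # guessing already achieves this distortion
    # binary entropy of the tolerated error rate
    h = -D * math.log2(D) - (1 - D) * math.log2(1 - D)
    return math.log2(M) - h - D * math.log2(M - 1)

# Error-free decoding of M = 21 symbols: log2(21) ≈ 4.39 bits/event.
print(rate_distortion_mary(21, 0.0))
# Tolerating a 1% symbol error rate buys back only a fraction of a bit.
print(rate_distortion_mary(21, 0.01))
```

Allowing small distortion lowers the required rate only slightly, which is why the lossless value log2(M) dominates the floor.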

Molecular Information Processing

The ribosome as an information channel. Thermodynamic anchoring of throughput to kT via Hopfield kinetic proofreading. Why 21 amino acids — not 10, not 100.
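The Hopfield proofreading mechanism mentioned above can be sketched in a few lines: each added discrimination stage multiplies in another Boltzmann factor, so two stages square the single-step selectivity (the free-energy value here is a toy number, not a measured ribosome parameter).

```python
import math

def error_rate(dG_kT: float, stages: int = 1) -> float:
    """Misincorporation probability with `stages` independent
    discrimination steps, each contributing exp(-dG/kT).
    Kinetic proofreading: n stages raise selectivity to the n-th power."""
    return math.exp(-dG_kT) ** stages

# Toy values: a single ~4.6 kT recognition step gives error ~1e-2;
# adding one proofreading stage squares it to ~1e-4.
print(error_rate(4.6, stages=1))
print(error_rate(4.6, stages=2))
```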

AI Throughput Constraints

Large-scale empirical studies of tokenizer vocabulary independence. 1,749-model sweeps demonstrating that vocabulary size is a redundancy parameter, not an information parameter.
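The comparison metric behind these sweeps can be sketched simply: bits per byte normalizes away the tokenizer, so models with different vocabularies land on a common axis (the loss and token counts below are hypothetical, chosen to illustrate two tokenizers arriving at the same value).

```python
import math

def bits_per_byte(loss_nats_per_token: float, n_tokens: int, n_bytes: int) -> float:
    """Convert a model's mean cross-entropy loss (nats/token) into
    bits per byte of raw text, a tokenizer-independent axis."""
    bits_per_token = loss_nats_per_token / math.log(2)
    return bits_per_token * n_tokens / n_bytes

# Hypothetical numbers: a coarse tokenizer (fewer, costlier tokens) and a
# fine one (more, cheaper tokens) can land at the same bits per byte.
print(bits_per_byte(2.20, n_tokens=250, n_bytes=1000))
print(bits_per_byte(0.55, n_tokens=1000, n_bytes=1000))
```

A larger vocabulary trades more bits per token against fewer tokens per byte; vocabulary independence means the product stays put.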

What the throughput basin means for the real world.

The throughput basin isn't just a theoretical curiosity. It has concrete implications for AI hardware, synthetic biology, and the search for extraterrestrial life.

AI Hardware

The throughput basin predicts that AI models gain nothing from larger vocabularies and waste most of their energy on precision they don't need. Quantization research, efficient architectures, and cooling innovation are the paths to the thermodynamic limit. Optimize joules per decision, not FLOPS.
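A back-of-envelope Landauer comparison makes the "joules per decision" framing concrete (the energy-per-decision figure below is illustrative, not a measurement):

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_joules_per_bit(T: float = 300.0) -> float:
    """Minimum dissipation to erase one bit at temperature T (Landauer)."""
    return K_B * T * math.log(2)

def phi(joules_per_event: float, bits_per_event: float, T: float = 300.0) -> float:
    """Dimensionless gap: how far above the Landauer floor an event sits."""
    return joules_per_event / (bits_per_event * landauer_joules_per_bit(T))

print(f"Landauer floor at 300 K: {landauer_joules_per_bit():.2e} J/bit")
# Illustrative: ~1e-11 J attributed per ~4.4-bit decision lands
# on the order of 1e9 above the floor.
print(f"phi ~ {phi(1e-11, 4.4):.1e}")
```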

Synthetic Biology

Expanding the genetic code beyond 21 amino acids incurs super-linear energy costs: each new amino acid must be physically discriminated from every existing one, demanding ever more recognition infrastructure per addition. The throughput basin constrains what synthetic biology can achieve affordably.

Astrobiology

Any alien biochemistry that processes serial information under noise faces the same rate-distortion geometry. The effective throughput per step would land in the same 3–6 bit neighborhood. The basin is universal — it doesn't depend on Earth chemistry.

Two regimes. One mathematics. A billion-fold efficiency gap.

Paper 5 revealed that the throughput basin is not universal in the way we first expected. There are two regimes — and the difference explains everything.

Regime A — Biology

Alphabet-Bound (α > 1)

Biology builds alphabets through pairwise molecular recognition. Each new symbol must be physically distinguished from every existing one. Cost scales super-linearly. Result: a throughput basin at 3–6 bits — the ribosome's M = 21 amino acids sits at the computed optimum.

~2%
above thermodynamic minimum
Regime B — Silicon

Capacity-Bound (α < 1)

Silicon builds vocabularies through learned parameters. Each new weight is independent. Cost scales sub-linearly. Result: no basin — but AI still converges on ~4.4 bits/token because it learned from language produced by biological brains that ARE constrained by the basin.

~10⁹×
above Landauer floor

Evolution is a better optimizer — for this particular problem. The ribosome has had 3.8 billion years to close the gap between its performance and the thermodynamic limit. Silicon has had decades. The mathematics is the same. The engineering maturity is not.
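The two regimes can be illustrated with a toy cost model (an assumption for illustration, not the papers' exact cost function): information per symbol grows as log2(M), while discrimination cost grows as M^α. Super-linear cost carves out an interior optimum; sub-linear cost does not.

```python
import math

def net_bits(M: int, alpha: float, c: float) -> float:
    """Toy utility: information per symbol, log2(M), minus a
    discrimination cost growing as c * M**alpha."""
    return math.log2(M) - c * M ** alpha

def best_M(alpha: float, c: float, M_max: int = 100) -> int:
    """Alphabet size maximizing the toy utility over 2..M_max."""
    return max(range(2, M_max + 1), key=lambda M: net_bits(M, alpha, c))

# Regime A (alpha > 1): super-linear cost -> interior optimum near M ~ 20.
print(best_M(alpha=1.5, c=0.0107))
# Regime B (alpha < 1): sub-linear cost -> no interior optimum; the best
# M is simply the largest one allowed, i.e. the basin disappears.
print(best_M(alpha=0.5, c=0.0107))
```

The cost coefficient here is tuned so the Regime A optimum lands near M ≈ 20; the qualitative split between the regimes holds for any small positive c.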

A note on the φ numbers. The "~10⁹× above Landauer" figure is the useful-dissipation fraction per discrimination event — the thermodynamically relevant energy attributed to the irreversible logical step itself. Paper 7's RTX 5090 measurements report φ ≈ 10¹⁵–10¹⁸ for total GPU wall power, which additionally pays for memory access, cooling, power-supply conversion, and idle circuitry. Both numbers are correct; they measure different physical boundaries. See Paper 7 §3.4 for the full reconciliation.

Published research.

All papers include reproducible Python code, full experiment protocols, and honest limitations. We lead with falsified predictions because that's how science works.

01

The Fons Constraint

The foundational observation: AI tokenizer vocabularies do not cluster near 64 — but effective information per processing event does converge across substrates. The falsified prediction that started everything.

2026 information theory falsification zenodo doi: 10.5281/zenodo.19274048
02

The Receiver-Limited Floor: Rate-Distortion Bounds on Serial Decoding Throughput

M-ary rate-distortion derivation applied to ribosomes, phonology, and music. Empirical tokenizer sweep across 1,749 models confirms vocabulary independence of bits-per-byte (p = 0.643).

2026 information theory empirical zenodo doi: 10.5281/zenodo.19322973
03

The Throughput Basin: Cross-Substrate Convergence and Decomposition of Serial Decoding Throughput

Basin decomposition I_eff = R_M(ε) + Δ_s + ξ across 31 systems. Three independent evolutionary simulations converge to K ≈ 19–30. Co-evolutionary discovery of the genetic code's parameters from pure optimization.

2026 cross-substrate evolutionary zenodo doi: 10.5281/zenodo.19323194
04

The Serial Decoding Basin τ: Five Experiments on Convergence, Thermodynamic Anchoring, and the Geometry of Receiver-Limited Throughput

Five reproducible experiments forming a convergent evidence chain. Thermodynamic prediction of ribosome throughput to Δ = 0.003 bits. Falsifiable wet-lab prediction included.

2026 experimental reproducible zenodo doi: 10.5281/zenodo.19323423
05

The Dissipative Decoder: Thermodynamic Cost Bounds on the Serial Decoding Throughput Basin — and Why Silicon Escapes Them

Derives WHY the throughput basin exists from thermodynamic cost minimization. Two-regime framework: Regime A (biology, α > 1) produces a basin; Regime B (silicon, α < 1) escapes it. Kazusa-verified thermophilic validation (partial r = −0.451, p = 0.014, n = 29). Silicon benchmark: 27 models on RTX 5090. The ribosome operates within 2% of its thermodynamic minimum; silicon operates ~10⁹× above its Landauer floor.

2026 thermodynamics empirical falsification zenodo doi: 10.5281/zenodo.19433048
06

The Inherited Constraint: Biological Throughput Limits Shape the Information Structure of Human Language and, Through It, AI

Explains WHY AI converges on ~4.2 bits/token despite having no thermodynamic basin: it inherits the fingerprint from biological training data. Natural language BPT ≈ 4.4 bits matches the ribosome (4.39) and basin centroid (4.16 ± 0.19). Destroying syntax doubles surprise to 10.8 bits. Shannon (1951) independently estimated ~5 bits/word 75 years ago.

2026 linguistics AI cognition empirical zenodo doi: 10.5281/zenodo.19432911
07

The Throughput Basin Origin: Four Orthogonal Experiments on Whether Serial Decoding Convergence Is Architectural, Thermodynamic, or Data-Driven

Nine experiments testing whether the throughput basin is architectural, thermodynamic, or data-driven. Models extract bits per source byte equal to source entropy at both 92M and 1.2B parameters, with no attractor near 4 bits across entropy levels 5–8. PCFG-8 (structured 8-bit data) achieves 6.59 BPT. The refined equation: BPT ≈ source_entropy − f(structural_depth). Published with full internal adversarial review; all blocking items resolved.

2026 experimental adversarial review zenodo doi: 10.5281/zenodo.19498582

Long-form writing for a general audience.

Research explained in plain language. No jargon walls, no dumbing down — just honest exposition of what the data says and why it matters.

New here? Read in this order
  1. The Speed Limit of Thought — the overview (~18 min)
  2. Why 64 Codons — where it started (~8 min)
  3. 1,749 Models and a Flat Line — the AI evidence (~10 min)
  4. 31 Systems — the cross-substrate convergence (~12 min)
  5. Predicting the Ribosome from Pure Physics — the physics anchor (~12 min)
  6. Why the Basin Exists — the thermodynamic argument (~14 min)
  7. The Inherited Constraint — why AI lands nearby (~15 min)
  8. The Mirror, Not the Wall — Paper 7, the test (~12 min)
Seven systems converging on the throughput basin
April 2026 ~18 min read

The Speed Limit of Thought: How Biology, Brains, and AI All Hit the Same Wall — and Why Physics Says They Must

The overview. From ribosomes to transformers, every system that decodes serial information under noise lands in the same narrow throughput band. Six papers, one universal constraint, and what it means for the future of AI, synthetic biology, and the search for alien life.

Read article →
April 2026 ~8 min read

Why the Genetic Code Uses 64 Codons

Two independent proofs — Shannon and Eigen — both derive triplet encoding as mathematical necessity. The falsified prediction that launched the research program.

Read article →
p=.643
April 2026 ~10 min read

1,749 Models and a Flat Line

Why bigger vocabularies don't help AI. A 750× vocabulary difference produces a 5% throughput difference. The receiver sets the limit.

Read article →
31
April 2026 ~12 min read

When a Computer Reinvents the Genetic Code

31 systems across six domains cluster in a 3–6 bit band. An evolutionary simulation rediscovers the genetic code from pure math.

Read article →
Δ.003
April 2026 ~12 min read

Predicting the Ribosome from Pure Physics

Four measured parameters. Zero fitting. Three decimal places of accuracy. The ribosome operates within 2% of its thermodynamic minimum.

Read article →
η
April 2026 ~14 min read

Why the Basin Exists — and Why Silicon Escapes It

Two cost regimes, one mathematics. Biology: alphabet-bound, α > 1, throughput basin at M ≈ 20. Silicon: capacity-bound, α < 1, no basin. The ribosome at 2% of its thermodynamic minimum; silicon at 10⁹× above Landauer.

Read article →
4.4
April 2026 ~15 min read

The Inherited Constraint: How Language Carries the Fingerprint of Biological Throughput Limits

AI has no thermodynamic basin — so why does it converge on ~4.2 bits/token? Because it learned from language shaped by brains that do. The shuffling cascade: syntax carries 3.3 bits. Shannon predicted this 75 years ago.

Read article →
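The shuffling effect can be demonstrated in miniature with a bigram model on toy text (illustrative, not the paper's corpus): shuffling preserves symbol frequencies but destroys sequential structure, so each next symbol becomes harder to predict.

```python
import math
import random
from collections import Counter

def conditional_entropy(text: str) -> float:
    """H(next char | current char) in bits, estimated from bigram counts."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    n = len(text) - 1
    h = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n                  # joint probability of the bigram
        p_b_given_a = count / firsts[a]   # transition probability
        h -= p_ab * math.log2(p_b_given_a)
    return h

structured = "the cat sat on the mat and the rat sat on the cat " * 50
chars = list(structured)
random.seed(0)
random.shuffle(chars)
shuffled = "".join(chars)

# Same character frequencies, different sequential structure:
print(conditional_entropy(structured))  # low: strong local predictability
print(conditional_entropy(shuffled))    # higher: structure destroyed
```

The gap between the two numbers is the information carried by local order alone — a miniature of the paper's syntax-shuffling cascade.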
8.92
April 2026 ~12 min read

The Mirror, Not the Wall: Why AI's 4-Bit Limit Is About the Data, Not the Machine

Train the same model on a synthetic 8-bit-entropy corpus and it climbs to 8.92 bits per token, not four. The basin moved with the data. Published with the institute's full internal adversarial review attached — read the article and the review as a unit.

Read article →

Compute infrastructure.

Windstorm Labs is the experimental arm of the Institute — GPU clusters, autonomous AI research agents, and large-scale empirical science.

Primary Compute

RTX 5090

32GB VRAM. Runs 1,749-model evaluation sweeps, evolutionary simulations, and model training.

Agent Fleet

8 Nodes

Autonomous AI research agents coordinated across distributed infrastructure. Parallel experiment execution.

Models Evaluated

1,749

Largest known tokenizer-information survey. Vocabulary sizes spanning 256 to 256K tokens on a shared corpus.

Open Science

100%

All code, data, and experiment protocols published. Every result reproducible on commodity hardware.

Institute: Fort Ann, NY  |  Labs: Mount Pleasant, SC

Who we are.

Grant Lavell Whitmer III

Founder & Principal Investigator

U.S. Naval Academy graduate. Cross-disciplinary researcher working at the intersection of information theory, molecular biology, and artificial intelligence. Creator of the Throughput Constraint framework and the Forma Animae Organon — the philosophical lens through which the Institute approaches its research. Author of the forthcoming popular book The Pattern, which brings the throughput basin story to a general audience.

Windstorm Labs

Experimental Division

A fleet of autonomous AI research agents executing large-scale empirical experiments, adversarial review, and computational simulations. Headquartered on an NVIDIA RTX 5090 in Mount Pleasant, South Carolina.

Advisory & Collaboration

Open Positions

We are seeking advisory board members with expertise in information theory, computational biology, and rate-distortion theory. If our work interests you, we want to hear from you.

Why does the ribosome process 4.39 bits per codon — and why does a transformer process roughly the same?

Two systems separated by 3.8 billion years of evolution, built on entirely different substrates, solving the same mathematical problem: decode one symbol per time step from a noisy serial stream while minimizing discrimination cost. The rate-distortion surface doesn't care whether the receiver is RNA, neurons, or silicon. We're mapping that surface.