Subquadratic SubQ: Sparse Attention for Faster, Cheaper LLMs • Meteora Web Agency

A Miami-based AI startup, Subquadratic, emerged from stealth mode last month with a bold claim: that it had solved a mathematical bottleneck that has constrained large language models for nearly a decade. Skepticism was widespread, but independent evaluation now suggests the company's breakthrough is real.

The Dense Attention Problem

At the core of every major LLM lies dense attention, in which each token is compared against every other token, so computation grows quadratically with input length. A 10,000-word text, treating each word as one token, has nearly 50 million token pairs to score. Subquadratic replaces dense attention with dynamic sparse attention, which scores only the relevant token pairs and drastically cuts computation.
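To make the scaling concrete, here is a minimal sketch that counts token pairs for dense attention and for a sparse scheme. It assumes one token per word, and the budget of 64 keys per token is a hypothetical number chosen for illustration, not a figure from Subquadratic.

```python
def dense_pair_count(n_tokens: int) -> int:
    """Distinct token pairs when every token is compared with every other."""
    return n_tokens * (n_tokens - 1) // 2


def sparse_pair_count(n_tokens: int, keys_per_token: int) -> int:
    """Pairs scored when each token attends to at most `keys_per_token` others.

    `keys_per_token` is a hypothetical budget used only for illustration.
    """
    return n_tokens * min(keys_per_token, n_tokens - 1)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        dense = dense_pair_count(n)
        sparse = sparse_pair_count(n, keys_per_token=64)
        print(f"{n:>7} tokens: dense={dense:,}  sparse={sparse:,}")
```

For 10,000 tokens the dense count is 49,995,000, which matches the "nearly 50 million" figure. With a fixed budget per token, the sparse count grows linearly with length instead of quadratically.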

SubQ's Sparse Solution

SubQ dynamically chooses which tokens to attend to, adapting the selection to each input. According to co-founder and CTO Alex Whedon, this selection is the "secret sauce." The model supports up to 12 million tokens in a single context window, compared with the one million typical of competitors, enabling analysis of entire codebases or document libraries.
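Subquadratic has not published how its selection works, so the following is only a generic illustration of input-dependent sparse attention, not SubQ's method. It uses a simple top-k rule: each query keeps the `k` keys with the highest scores and ignores the rest. The function name and the top-k rule are assumptions made for this sketch. Note that this naive version still scores every pair before selecting, so it is not subquadratic itself; a real system would need a cheaper way to pick candidates.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def topk_sparse_attention(queries, keys, values, k):
    """For each query, attend only to the k highest-scoring keys.

    Illustrative only: the top-k rule is an assumption, not SubQ's selection.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [dot(q, key) * scale for key in keys]
        # Input-dependent selection: indices of the k largest scores.
        keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
        # Softmax over the selected scores only, shifted by the max for stability.
        top = max(scores[i] for i in keep)
        weights = [math.exp(scores[i] - top) for i in keep]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for w, i in zip(weights, keep):
            for d, v in enumerate(values[i]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(topk_sparse_attention([[2.0, 0.0]], keys, values, k=1))
```

With `k` equal to the number of keys, the function reduces to ordinary dense softmax attention, which makes it easy to compare the two behaviors on the same input.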

Independent Validation

Subquadratic hired Appen, an independent evaluator, to test SubQ. Results were striking: on pure speed, SubQ was 56 times faster than models using FlashAttention, a widely used, highly optimized implementation of dense attention. On LiveCodeBench, which tests coding problem-solving, SubQ scored 89.7%, matching top models from Google DeepMind, OpenAI, and Anthropic. Appen's Jeanine Sinanan-Singh called the results "shocking" and potentially game-changing.

Cost and Performance

While SubQ won't replace all models, for specific tasks it offers enormous speed gains at a fraction of the cost. CEO Justin Dangel claims running Nvidia's RULER 128 test on Anthropic Opus 4.6 costs about $2,600, while SubQ does it for $8. That's a 325x cost reduction.
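The arithmetic behind that ratio can be checked in a few lines. The two dollar amounts are the figures Dangel cites; the variable names are just labels for this sketch.

```python
opus_cost_usd = 2600  # cost cited for Anthropic Opus 4.6
subq_cost_usd = 8     # cost cited for SubQ

ratio = opus_cost_usd / subq_cost_usd
savings_pct = (1 - subq_cost_usd / opus_cost_usd) * 100

print(f"{ratio:.0f}x cheaper, {savings_pct:.2f}% saved")
```

Put another way, the cited SubQ price is about 0.3% of the cited Opus price for the same test.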

Future Implications

Subquadratic hopes to usher in a new age of efficiency. "Nobody will be building on transformers in a few years," says Dangel. The startup has won over skeptics like AI engineer Dan McAteer, who initially compared SubQ to Theranos. The democratization of AI may well start in Miami.

Source: https://www.technologyreview.com/2026/06/19/1139313/a-startup-claims-it-broke-through-a-bottleneck-thats-holding-back-llms