AI | May 18, 2026

SubQ Architecture Cracks 12 Million Token Window to Beat Major LLMs

A startup's new architecture, Subquadratic Selective Attention, supports context windows of up to 12 million tokens. In recent MRCR v2 benchmarks, the model reached 92.1% accuracy, well ahead of GPT-5.5 at 74% and Claude Opus 4.7 at 32.2%. A window of this size lets the system process massive datasets or entire codebases without losing track of critical information.

Rather than simply expanding capacity, the technology relies on selective attention to stay efficient and keep retrieval accuracy high at scale. The model is currently being deployed through three primary channels:

A dedicated Developer API for custom integrations.
SubQ Code, an AI agent specifically designed for navigating and writing complex codebases.
SubQ Search, a tool optimized for deep research across vast information pools.

By moving beyond the quadratic scaling limitations of traditional Transformers, this architecture provides a more reliable foundation for deep-context tasks that previously paralyzed even the most advanced commercial models.
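
To give a sense of why quadratic scaling matters at this window size, the sketch below counts the query-key scores that dense self-attention would compute and compares them with a scheme in which each query attends to a fixed budget of keys. This is illustrative arithmetic only: the article does not describe how Subquadratic Selective Attention chooses what to attend to, so the fixed budget, the value 4096, and the function names are all assumptions, and the cost of selecting the keys is not counted.

```python
# Illustrative arithmetic only. The fixed per-query key budget is an
# assumption for this sketch, not SubQ's documented mechanism.


def full_attention_pairs(n_tokens: int) -> int:
    """Scores computed by dense self-attention: one per (query, key) pair."""
    return n_tokens * n_tokens


def selective_attention_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Scores computed if every query attends to at most `keys_per_query` keys."""
    return n_tokens * min(keys_per_query, n_tokens)


if __name__ == "__main__":
    budget = 4096  # hypothetical per-query key budget
    for n in (128_000, 1_000_000, 12_000_000):
        dense = full_attention_pairs(n)
        sparse = selective_attention_pairs(n, budget)
        print(
            f"{n:>12,} tokens: dense {dense:.2e} scores, "
            f"selective {sparse:.2e} scores, ratio {dense / sparse:,.0f}x"
        )
```

Under these assumptions, the dense count grows with the square of the window while the selective count grows linearly, so the gap between the two widens as the window gets longer.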
