Probably Secures $9M to Tackle AI Hallucinations via Deterministic Validation

Source: TechCrunch

The startup Probably has secured $9 million in seed funding led by Andreessen Horowitz to address the persistent problem of AI hallucinations. The company aims to bridge the gap between the probabilistic nature of Large Language Models (LLMs) and the high-reliability requirements of enterprise applications. By implementing a system of deterministic validation, Probably seeks to ensure that AI outputs are factually accurate and verifiable. It is targeting a 99.99% accuracy rate, a level that standard generative models do not currently reach.

The core of Probably’s technology is a "data science mech suit," a specialized harness that forces LLMs to adhere to strict data constraints. Instead of relying solely on the model's internal knowledge, the system cross-references every output against a deterministic validator. This process reduces ambiguity and keeps the model operating within a controlled environment. Founder Peter Elias says that by refining the context and constraints, the system can achieve high performance using significantly smaller, less computationally expensive models.
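To make the idea of deterministic validation concrete, here is a minimal Python sketch. It is not Probably's implementation, which the article does not describe in detail. The ledger data, the function names (validate_total, harness, flaky_model, always_wrong_model), and the retry policy are all hypothetical, chosen only to illustrate the general pattern of checking a model's answer against a value recomputed from trusted data.

```python
from typing import Callable, List, Optional

# Hypothetical source-of-truth data that the validator checks against.
LEDGER = {"Q1": 1200, "Q2": 950, "Q3": 1430}


def validate_total(answer: int, quarters: List[str]) -> bool:
    """Deterministic check: recompute the total directly from the ledger."""
    return answer == sum(LEDGER[q] for q in quarters)


def harness(
    model: Callable[[str, int], int],
    quarters: List[str],
    max_attempts: int = 3,
) -> Optional[int]:
    """Return the first model answer that passes validation, else None."""
    prompt = f"Total revenue for {', '.join(quarters)}?"
    for attempt in range(max_attempts):
        answer = model(prompt, attempt)
        if validate_total(answer, quarters):
            return answer
    # No attempt passed, so refuse to answer rather than guess.
    return None


def flaky_model(prompt: str, attempt: int) -> int:
    """Stand-in for an LLM: wrong on the first attempt, correct afterwards."""
    return 2000 if attempt == 0 else 2150


def always_wrong_model(prompt: str, attempt: int) -> int:
    """Stand-in for an LLM that never produces a valid answer."""
    return -1


if __name__ == "__main__":
    print(harness(flaky_model, ["Q1", "Q2"]))
    print(harness(always_wrong_model, ["Q1", "Q2"]))
```

In this sketch the harness never passes along an unverified answer: an output is either confirmed against the ledger or withheld. A production system would presumably need validators for far richer outputs than a single number.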

The approach could change how AI is deployed, particularly with respect to cost and efficiency. Because the system relies on smaller models, it can run on local hardware rather than massive data centers, drastically lowering token costs. That matters for industries like accounting, medicine, and data science, where precision is non-negotiable and ballooning AI budgets are a major concern. By shifting the focus from building larger, more complex models to building better "harnesses," Probably is challenging the industry trend of scaling models at any cost.

Ultimately, Probably’s strategy highlights a growing tension in the AI sector: the misalignment between the business models of major AI labs, which often profit from the iterative, error-prone nature of their models, and the needs of enterprise users who require reliability. By setting out to show that smaller, constrained models can outperform frontier models in precision-sensitive tasks, Probably is positioning itself as an infrastructure layer for the next generation of reliable, enterprise-grade AI.
