Perplexity's Deep Research Reaches State-of-the-Art Performance with Opus 4.5 Integration
Perplexity has upgraded its Deep Research technology to achieve state-of-the-art performance, integrating Anthropic's Claude Opus 4.5 and releasing the open-source DRACO benchmark to validate its research capabilities.

The AI Research Arms Race Just Shifted
The competitive landscape for AI-powered research tools is heating up. While ChatGPT and other generalist AI platforms continue to dominate headlines, Perplexity has quietly upgraded its Deep Research technology to achieve state-of-the-art (SOTA) performance—a move that signals a fundamental shift in how enterprises and researchers approach complex information synthesis.
The upgrade represents more than incremental improvement. According to Perplexity's research team, the enhanced Deep Research now delivers measurably superior performance on real-world research tasks, backed by rigorous benchmarking rather than marketing claims alone.
What's Changed: Technical Deep Dive
Opus 4.5 Integration
The core upgrade centers on integrating Anthropic's Claude Opus 4.5 model, which brings stronger reasoning and multi-step research execution. This enables Deep Research to (see the sketch after this list):
- Conduct deeper iterative searches across multiple sources
- Synthesize contradictory information with greater nuance
- Generate more comprehensive research reports with better source attribution
- Handle complex, multi-faceted research queries that previously required manual intervention
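To make the pattern concrete, here is a minimal sketch of an iterative research loop in Python. Everything in it is illustrative: the function names (`fetch_sources`, `propose_followups`), the stopping rule, and the report format are assumptions for exposition, not Perplexity's actual implementation or API.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    source_url: str
    claim: str


@dataclass
class Report:
    query: str
    findings: list[Finding] = field(default_factory=list)

    def cited_summary(self) -> str:
        # Number every claim and keep its source URL attached, so
        # attribution survives into the final report.
        return "\n".join(f"{i + 1}. {f.claim} [{f.source_url}]"
                         for i, f in enumerate(self.findings))


def fetch_sources(query: str) -> list[Finding]:
    """Stand-in for a real search/retrieval call."""
    return [Finding(f"https://example.com/{abs(hash(query)) % 100}",
                    f"Evidence related to '{query}'")]


def propose_followups(findings: list[Finding]) -> list[str]:
    """Stand-in for the model step that spots gaps or contradictions
    and turns them into new sub-queries."""
    return [f"verify: {f.claim}" for f in findings[:1]]


def deep_research(query: str, max_rounds: int = 3) -> Report:
    report = Report(query=query)
    frontier = [query]                # sub-queries still to investigate
    seen: set[str] = set()
    for _ in range(max_rounds):
        next_frontier: list[str] = []
        for q in frontier:
            if q in seen:
                continue              # don't re-search the same question
            seen.add(q)
            findings = fetch_sources(q)
            report.findings.extend(findings)
            next_frontier.extend(propose_followups(findings))
        if not next_frontier:
            break                     # no open questions left; stop early
        frontier = next_frontier
    return report


if __name__ == "__main__":
    print(deep_research("AI research tooling in 2026").cited_summary())
```

The design point the sketch captures is the stopping behavior: the loop ends either when the follow-up queue is empty or when the round budget runs out, and deciding what still needs verification is exactly where a stronger reasoning model earns its keep.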
Developers are already noticing the difference: early reports point to improved accuracy and more reliable citation practices than competing tools.
The DRACO Benchmark: Validation Through Transparency
Rather than relying solely on proprietary metrics, Perplexity released DRACO, an open-source benchmark designed to evaluate Deep Research performance in real-world conditions. This move addresses a critical pain point in the AI industry: the lack of standardized, reproducible evaluation frameworks.
DRACO allows independent researchers to (see the toy harness after this list):
- Test Deep Research capabilities against standardized research tasks
- Compare performance across different AI research tools
- Identify specific strengths and weaknesses in research synthesis
- Validate claims before adopting the technology
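Since the article doesn't spell out DRACO's task format, the following is a hypothetical sketch of what a reproducible research-benchmark harness could look like: standardized tasks with ground-truth facts, a scoring function, and a pluggable tool under test. The `Task` schema and the naive `coverage_score` rubric are assumptions for illustration; consult the actual DRACO repository for its real format.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    prompt: str
    required_facts: list[str]   # ground-truth facts a good report must cover


def coverage_score(report: str, task: Task) -> float:
    """Fraction of required facts the report mentions. Naive substring
    matching; a real harness would use a rubric or judge model."""
    hits = sum(fact.lower() in report.lower() for fact in task.required_facts)
    return hits / len(task.required_facts)


def evaluate(tool: Callable[[str], str], tasks: list[Task]) -> dict:
    """Run every task through the tool under test and aggregate scores,
    so different research tools can be compared on identical inputs."""
    scores = [coverage_score(tool(t.prompt), t) for t in tasks]
    return {"mean_coverage": sum(scores) / len(scores), "n_tasks": len(tasks)}


if __name__ == "__main__":
    tasks = [Task("Summarize recent advances in battery chemistry.",
                  required_facts=["solid-state", "lithium"])]
    # Any tool that maps a research prompt to a report string can be plugged in.
    dummy = lambda prompt: "Solid-state and lithium approaches dominate."
    print(json.dumps(evaluate(dummy, tasks), indent=2))
```

Keeping the tool under test behind a plain prompt-to-report interface is what makes cross-tool comparison possible: every system sees identical inputs and is graded by the same rubric.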
Market Context: Why This Matters Now
The broader AI research tool market is fragmenting. Organizations increasingly recognize that general-purpose chatbots fall short for specialized research workflows. Deep Research's SOTA performance addresses this gap directly, positioning Perplexity as a serious alternative to ChatGPT for knowledge workers, analysts, and researchers.
The timing is strategic. As enterprises evaluate AI tooling for 2026, they're demanding:
- Accuracy and verifiability – not just speed
- Transparent evaluation – benchmarks that can be independently validated
- Specialized capabilities – tools built for specific workflows rather than one-size-fits-all solutions
The Skepticism Factor
It's worth noting that some users have reported accuracy issues with Perplexity's broader platform, a reminder that even SOTA performance claims require scrutiny. The release of DRACO is partly a response to this criticism: a commitment to measurable, verifiable performance rather than company assertions alone.
What's Next
The upgrade is available to Perplexity Max users, with broader rollout expected in the coming months. The open-source DRACO benchmark signals Perplexity's confidence in its technology while inviting external validation, a rare move in an industry typically dominated by proprietary claims.
For organizations evaluating AI research tools, the combination of SOTA performance and transparent benchmarking represents a meaningful inflection point. Whether Deep Research can sustain this advantage depends on consistent execution and continued refinement based on real-world usage patterns.