Research System Design
How to build a research agent that actually finds signal in noise. Source prioritisation, query design, bias detection, and the synthesis patterns ZELDA uses to produce intelligence reports overnight.
Signal vs noise
The biggest problem with AI research agents isn't that they can't find information — it's that they find too much and can't tell the difference between what matters and what doesn't. A research agent that returns 40 sources of varying quality is less useful than one that returns 5 high-signal findings with clear sourcing.
ZELDA's research system is built around a single principle: filter aggressively before synthesising. She identifies and discards noise first, then works with what remains.
Consider the SSBAA market intelligence ZELDA produced on Day 1: AI agent market at $7.6B (2025) → $183B (2033), a 49.6% CAGR. Average business response time to leads: 47 hours. AI agent response time: 60 seconds. Top creator MRR on comparable platforms: $73K/month. Zapier ARR: $400M. n8n: $40M ARR at a $2.3B valuation. All sourced, all verifiable.
Source prioritisation
Not all sources are equal. ZELDA uses a three-tier source hierarchy:
Primary sources — always preferred
Company earnings reports, SEC filings, government data, peer-reviewed research, official press releases, direct survey data. These are facts with provenance. ZELDA cites these by default when available.
Secondary sources — use with attribution
Reputable journalism (WSJ, FT, Reuters, TechCrunch), established analyst reports (Gartner, IDC), well-sourced blog posts from known experts. Used when primary data isn't available. Always attributed.
Tertiary sources — verify before using
Forum posts, social media, SEO content, aggregator sites. These can surface trends and conversations but contain high rates of misinformation. ZELDA flags these as "unverified" and doesn't include them in reports without corroboration.
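The three-tier hierarchy above reduces to a simple lookup at retrieval time. A minimal sketch, assuming a domain-based allowlist (the domains listed here are illustrative, not ZELDA's actual lists):

```python
# Three-tier source classification. Domain lists are illustrative
# assumptions, not ZELDA's real allowlists.
PRIMARY = {"sec.gov", "data.gov", "arxiv.org"}
SECONDARY = {"reuters.com", "ft.com", "wsj.com", "techcrunch.com"}

def classify_source(domain: str) -> str:
    """Return the source tier for a domain; unknown domains default to tier 3."""
    if domain in PRIMARY:
        return "tier1"
    if domain in SECONDARY:
        return "tier2"
    return "tier3"  # forums, social media, SEO content: verify before use
```

Defaulting unknown domains to tier 3 matches the rule above: anything unrecognised is treated as unverified until corroborated.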
Query design
How you ask ZELDA to research something determines the quality of what you get back. Vague queries produce vague outputs; a well-scoped query pins down the subject, the time window, and the form the answer should take.
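A scoped research query can be sketched as a small dataclass. The field names here are illustrative assumptions, not ZELDA's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ResearchQuery:
    """A scoped research request. Field names are illustrative assumptions."""
    topic: str                      # what to research
    time_window_days: int           # how far back sources may date
    output: str                     # e.g. "table", "summary", "report"
    must_answer: list[str] = field(default_factory=list)  # concrete questions

q = ResearchQuery(
    topic="AI agent market sizing",
    time_window_days=365,
    output="table",
    must_answer=["current market size", "projected CAGR"],
)
```

The point of the `must_answer` list is that a query with concrete questions attached is falsifiable: the report either answers them or flags that it couldn't.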
Information extraction
Once ZELDA has identified relevant sources, the extraction phase pulls structured data from unstructured content. The patterns she uses:
- Numerical extraction — prices, growth rates, market sizes, user counts. Always extracted with the date the figure was published.
- Sentiment extraction — from reviews, forums, social media. Categorised as positive/negative/neutral with example quotes.
- Event extraction — product launches, funding rounds, leadership changes, price changes. Timestamped.
- Gap extraction — what are competitors NOT doing? What complaints appear repeatedly in reviews that nobody is addressing?
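The numerical-extraction pattern above can be sketched with a regular expression that tags every figure with its publication date. A minimal illustration, not ZELDA's actual extraction pipeline:

```python
import re
from datetime import date

def extract_figures(text: str, published: date) -> list[dict]:
    """Pull dollar figures and percentages from text, tagged with the
    publication date so every number carries its provenance."""
    pattern = r"\$\d+(?:\.\d+)?\s?(?:billion|million|B|M)|\d+(?:\.\d+)?%"
    return [{"value": m, "published": published.isoformat()}
            for m in re.findall(pattern, text)]

figs = extract_figures(
    "The market hit $7.6 billion in 2025, growing at 49.6%.",
    date(2025, 3, 1),
)
# figs -> [{'value': '$7.6 billion', 'published': '2025-03-01'},
#          {'value': '49.6%', 'published': '2025-03-01'}]
```

Attaching the date at extraction time is what makes the recency-bias check further down possible at all.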
Bias detection
Research agents will reproduce whatever bias exists in their sources unless you explicitly instruct them to check for it. The three bias patterns ZELDA is instructed to flag:
⚠️ Vendor bias: "Independent" comparisons written by companies with a stake in the outcome. ZELDA checks author attribution and flags any piece where the author works for or has investment in one of the companies being compared.
⚠️ Recency bias: Over-indexing on the most recent information at the expense of context. ZELDA includes publication dates on all sources and flags when a claim is based solely on articles under 30 days old.
⚠️ Confirmation bias: Only surfacing sources that support a pre-existing view. ZELDA's research prompt explicitly asks her to include at least one source that contradicts the expected finding.
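The recency-bias rule above is mechanical enough to sketch in code: flag any claim whose supporting sources are all under 30 days old. The 30-day window comes from the text; the function shape is an illustration:

```python
from datetime import date, timedelta

def flag_recency_bias(source_dates: list[date], today: date,
                      window_days: int = 30) -> bool:
    """Flag a claim when every supporting source is under `window_days` old."""
    cutoff = today - timedelta(days=window_days)
    return bool(source_dates) and all(d > cutoff for d in source_dates)
```

Vendor and confirmation bias are harder to automate and stay prompt-level checks; recency is the one that falls out of the metadata for free.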
Synthesis patterns
Raw research isn't useful. Synthesis is what turns 12 sources into a finding you can act on. ZELDA uses three synthesis patterns:
Convergence synthesis
Multiple independent sources pointing at the same conclusion. The more independent sources agree, the higher the confidence in the finding. ZELDA reports confidence as High / Medium / Low based on source count and tier.
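One way to map source count and tier onto a confidence label. The thresholds here are illustrative assumptions, not ZELDA's exact rubric:

```python
def confidence(source_tiers: list[str]) -> str:
    """Map agreeing sources to High/Medium/Low by count and tier.
    Thresholds are illustrative: 3+ sources with tier-1 corroboration
    -> High, 2+ sources -> Medium, anything less -> Low."""
    if len(source_tiers) >= 3 and "tier1" in source_tiers:
        return "High"
    if len(source_tiers) >= 2:
        return "Medium"
    return "Low"
```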
Gap synthesis
What does the market want that nobody is providing? ZELDA compares complaint patterns in competitor reviews against available products. Repeated complaints with no existing solution = opportunity.
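Gap synthesis is essentially a frequency count over complaint themes, filtered by whether anything on the market addresses them. A sketch under that assumption (the `min_count` threshold is illustrative):

```python
from collections import Counter

def find_gaps(complaints: list[str], addressed: set[str],
              min_count: int = 2) -> list[str]:
    """Repeated complaints with no existing solution = opportunity.
    Returns complaint themes seen at least `min_count` times that no
    competitor product addresses, most frequent first."""
    counts = Counter(complaints)
    return [theme for theme, n in counts.most_common()
            if n >= min_count and theme not in addressed]
```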
Trend synthesis
Directional change over time. Not just "the market is $7.6B" but "the market was $2.1B two years ago and is projected at $183B in seven years." The trajectory matters more than the snapshot.
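The trajectory framing reduces to a compound annual growth rate. Worth noting: recomputing from this page's own endpoints ($7.6B in 2025 to $183B in 2033) gives roughly 48.8%, close to but not exactly the cited 49.6%, which presumably uses slightly different endpoint figures:

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two snapshots."""
    return (end_value / start_value) ** (1 / years) - 1

# $7.6B (2025) -> $183B (2033): eight years of growth
growth = cagr(7.6, 183.0, 8)   # ~0.488, near the cited 49.6% CAGR
```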
Research reports
ZELDA outputs research as structured markdown reports, not free-form prose, so findings, confidence ratings, and sources always land in the same place in every report.
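A report skeleton in that spirit can be sketched as follows. The section names and ordering are assumptions for illustration, not ZELDA's actual template:

```python
def render_report(title: str, findings: list[dict]) -> str:
    """Assemble a structured markdown report. Section names and ordering
    are illustrative assumptions, not ZELDA's actual template."""
    lines = [f"# {title}", "", "## Key findings", ""]
    for f in findings:
        lines.append(f"- {f['claim']} (confidence: {f['confidence']}, "
                     f"source: {f['source']})")
    lines += ["", "## Open questions", "", "## Sources"]
    return "\n".join(lines)

report = render_report("AI agent market", [
    {"claim": "Market at $7.6B in 2025", "confidence": "High",
     "source": "Tier 1"},
])
```

The useful property is that every finding line carries its confidence and source inline, so a reader never has to trust a bare claim.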
Market intelligence we use at SSBAA
Here's the competitive landscape data ZELDA surfaced on Day 1 that shaped our positioning:
| Metric | Data point | Source tier |
|---|---|---|
| AI agent market size (2025) | $7.6 billion | Tier 1 |
| AI agent market size (2033 projection) | $183 billion | Tier 1 |
| CAGR | 49.6% | Tier 1 |
| Avg business response time to leads | 47 hours | Tier 2 |
| AI agent response time | 60 seconds | Tier 2 |
| Buyers going with first responder | 78% | Tier 2 |
| Zapier ARR (2025 est.) | $400M | Tier 2 |
| n8n ARR | $40M | Tier 1 |
| Cheapest done-for-you competitor | $297/month | Tier 2 |
| SSBAA positioning gap | Done-for-you at $29/month | Internal |
Scheduling research
ZELDA runs a research cycle every night at 22:30. The nightly scope is focused — competitor moves, relevant news, anything that might affect the content brief for the next day. The weekly deep-dive runs Sunday nights and covers the full market landscape.