How We Evaluate AI Tools

Every tool and agent on Intelloro goes through a structured, source-cited evaluation pipeline. No star ratings guessed from marketing pages. No paid placements inflating scores. Every field traces back to a URL, a review platform, or a verified data source.

Quick Answer

Intelloro evaluates AI tools across 70+ fields using a 3-layer pipeline: automated web scraping (up to 10 pages per tool), Google Search Grounding for gap-filling, and human-verified cross-referencing against 15 external sources including G2, Capterra, Trustpilot, and compliance databases. Dimension scores combine algorithmic calculations and LLM assessment. All sources are logged and visible on each tool's verification record.

The 3-Layer Evaluation Pipeline

Layer 1 — Web Scraping

We scrape up to 10 pages per tool (homepage, pricing, features, security, integrations, API docs, customers, changelog). A Gemini LLM extracts 44 structured fields from the content. Fields require direct evidence — no guessing.
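To make the evidence rule concrete, here is a minimal sketch of what an extracted field record could look like. The class and helper names are illustrative assumptions, not our production schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractedField:
    """One structured field pulled from a scraped page (names are illustrative)."""
    name: str                 # e.g. "pricing_model"
    value: Optional[str]      # stays None when no direct evidence exists
    source_url: str           # the scraped page the value came from
    evidence: str             # the exact snippet that supports the value

def extract(name: str, snippet: Optional[str], url: str) -> ExtractedField:
    # "No guessing": without an evidence snippet the field is left null,
    # so Layer 2 (grounding) can try to fill it later.
    if not snippet:
        return ExtractedField(name, None, url, "")
    return ExtractedField(name, snippet.strip(), url, snippet)
```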

Layer 2 — Google Search Grounding

For any fields still null after scraping, we run a Google Search Grounding pass. Gemini searches the web and fills gaps using live search results. All grounded values are tagged with source attribution.
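In rough pseudocode, the gap-filling pass behaves like the sketch below. The `ground` callable is a hypothetical stand-in for the Gemini Search Grounding request, not a real API.

```python
from typing import Callable, Optional, Tuple

def fill_gaps(fields: dict, ground: Callable[[str], Tuple[Optional[str], Optional[str]]]) -> dict:
    """Layer 2 sketch: only fields still null after scraping are grounded.

    `ground` returns (value, source_url) from a live search, or (None, None).
    """
    for name, record in fields.items():
        if record.get("value") is not None:
            continue                        # scraped evidence is never overwritten
        value, source_url = ground(name)
        if value is not None:
            record["value"] = value
            record["source"] = source_url   # every grounded value carries attribution
            record["origin"] = "grounding"
    return fields
```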

Layer 3 — Human-Verified Cross-Referencing (Approach 3)

A trained reviewer runs 15 mandatory external source checks: G2, Capterra, TrustRadius, Product Hunt, Trustpilot, GetApp, Make.com, Zapier, compliance searches (SOC 2, GDPR, HIPAA), status page verification, enterprise customer lookup, MCP compatibility, and uptime SLA verification. Every source is logged, and each session is saved to our database with a full audit trail.
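A verification session could be modelled along these lines; the field and method names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

MANDATORY_SOURCES = [
    "G2", "Capterra", "TrustRadius", "Product Hunt", "Trustpilot", "GetApp",
    "SOC 2 search", "GDPR search", "HIPAA search", "Status page verification",
    "Enterprise customer search", "Make.com listing", "Zapier listing",
    "MCP compatibility", "Uptime SLA from status page",
]

@dataclass
class VerificationSession:
    """Approach 3 session sketch (illustrative): every source check is logged."""
    tool: str
    checks: dict = field(default_factory=dict)   # source -> reviewer result
    audit_log: list = field(default_factory=list)

    def record(self, source: str, result: str) -> None:
        if source not in MANDATORY_SOURCES:
            raise ValueError(f"unknown source: {source}")
        self.checks[source] = result
        self.audit_log.append((datetime.now(timezone.utc).isoformat(), source, result))

    def is_complete(self) -> bool:
        # No skipping: a session only counts once all 15 sources were checked.
        return all(s in self.checks for s in MANDATORY_SOURCES)
```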

How Dimension Scores Work

Each tool is scored across 6 dimensions (1–10). We use two methods depending on what can be measured objectively:

Calculated (2 of 6 dimensions)
  • Integration Power — counted from the verified integrations list (0=1, 1–3=4–6, 10–29=8, 30+=10)
  • Customizability — additive: API (+3), open-source (+2), webhooks (+1), MCP (+1), SSO (+1)
AI-Assessed (4 of 6 dimensions)
  • Ease of Use — LLM assessment from onboarding, docs, UI complexity signals. Upgraded with G2/Capterra user ratings when available.
  • Output Quality — LLM assessment from user reviews and benchmarks
  • Value for Money — LLM assessment from pricing vs. features. Upgraded with Capterra user ratings when available.
  • Support & Ecosystem — LLM assessment from docs, community size, support channels. Upgraded with G2 user ratings when available.

When Approach 3 verification finds G2 or Capterra per-dimension ratings (e.g. "Ease of Use 8.9 · Quality of Support 8.7"), these real user data points override the AI estimates for the matching dimensions.
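Expressed as code, the calculated dimensions and the user-rating override look roughly like this sketch. The function names are illustrative, and where the published bands leave a gap (e.g. the exact step inside the 4–6 band), the sketch fills it with an explicitly noted assumption.

```python
from typing import Optional

def integration_power(count: int) -> int:
    """Integration Power from the verified integration count, using the bands above."""
    if count == 0:
        return 1
    if count >= 30:
        return 10
    if count >= 10:
        return 8
    # 1-3 integrations fall in the 4-6 band; the in-band step (and the 4-9 case)
    # is a simplifying assumption here.
    return min(4 + (count - 1), 6)

def customizability(api: bool, open_source: bool, webhooks: bool, mcp: bool, sso: bool) -> int:
    """Additive Customizability; the base score of 1 is an assumption, not published."""
    return min(1 + 3 * api + 2 * open_source + webhooks + mcp + sso, 10)

def final_score(ai_estimate: float, user_rating: Optional[float]) -> float:
    """Verified G2/Capterra per-dimension ratings override the AI estimate."""
    return user_rating if user_rating is not None else ai_estimate
```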

15 Mandatory External Sources

Every Approach 3 verification runs all 15 sources. No skipping. Results are logged in the session record.

Ratings

  • G2
  • Capterra
  • TrustRadius
  • Product Hunt
  • Trustpilot
  • GetApp

Compliance

  • SOC 2 search
  • GDPR search
  • HIPAA search
  • Status page verification

Adoption

  • Enterprise customer search

Integrations

  • Make.com listing
  • Zapier listing
  • MCP compatibility (dev tools)
  • Uptime SLA from status page

What We Don't Do

  • No paid placements — sponsored tools are clearly labeled and never get inflated scores
  • No inference for boolean fields — HIPAA, SOC 2, MCP require explicit statements, not marketing language
  • No cross-tool contamination — competitor comparison articles are never used as evidence for the tool being evaluated
  • No thin content — pages with fewer than 300 characters of relevant content are marked UNVERIFIABLE, not used as evidence (see the sketch after this list)
  • No crowdsourced ratings without verification — user review counts and ratings come only from verified platforms (G2, Capterra, Trustpilot)
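Two of these rules reduce to simple gates, sketched below. The explicit-phrase lists are hypothetical stand-ins added for illustration.

```python
MIN_EVIDENCE_CHARS = 300   # thinner pages are marked UNVERIFIABLE, never used as evidence

def usable_as_evidence(page_text: str) -> bool:
    """Thin-content gate; the real check counts relevant content, simplified here."""
    return len(page_text.strip()) >= MIN_EVIDENCE_CHARS

def boolean_claim_supported(claim: str, page_text: str) -> bool:
    """Boolean fields need an explicit statement, not marketing language.

    The phrase lists are assumptions; wording like "enterprise-grade security"
    would not satisfy the SOC 2 field.
    """
    explicit_phrases = {
        "HIPAA": ["hipaa compliant", "hipaa compliance"],
        "SOC 2": ["soc 2 type i", "soc 2 type ii", "soc 2 compliant"],
    }
    text = page_text.lower()
    return usable_as_evidence(page_text) and any(
        phrase in text for phrase in explicit_phrases.get(claim, [])
    )
```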

Data Quality Score

Every tool has a Data Quality Score (0–100%) and grade (A–F) computed from field coverage, source reliability, and verification status. Fields verified by Approach 3 are marked VERIFIED and weighted higher in the score. Fields filled only by the Smart Generator are marked LLM and weighted lower. Empty fields that are confirmed correct (e.g., no integrations for a standalone tool) are marked VERIFIED rather than penalized.

  • A: 90–100%
  • B: 75–89%
  • C: 60–74%
  • D: 40–59%
  • F: <40%
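Put together, the field-level weighting and the grade bands might combine like the sketch below. Only the grade cutoffs come from the table above; the weights are assumptions, and source reliability is omitted for brevity.

```python
STATUS_WEIGHTS = {"VERIFIED": 1.0, "LLM": 0.5}   # illustrative weights, not published

def data_quality_score(fields: list[dict]) -> float:
    """0-100% score from field coverage and verification status (simplified sketch).

    VERIFIED fields, including confirmed-empty ones, earn full weight; LLM-only
    fields earn less; unfilled or unknown fields earn nothing.
    """
    if not fields:
        return 0.0
    earned = sum(STATUS_WEIGHTS.get(f.get("status"), 0.0) for f in fields)
    return 100.0 * earned / len(fields)

def grade(score: float) -> str:
    """Letter grade from the published bands."""
    if score >= 90:
        return "A"
    if score >= 75:
        return "B"
    if score >= 60:
        return "C"
    if score >= 40:
        return "D"
    return "F"
```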

FAQ

How often are tool profiles updated?

Smart Generator profiles are generated once and updated when significant website changes are detected. Approach 3 verified profiles are re-verified on a rolling schedule. Sponsored tools are prioritized for frequent updates.

Can tool vendors update their own profiles?

Vendors can claim their listing and submit corrections. All vendor-submitted changes are reviewed by our team before going live — they do not auto-publish.

How are integrations verified?

Integrations are extracted from the tool's official integrations page, then cross-checked against Make.com and Zapier listings. Each integration is verified individually — bulk arrays are never auto-passed.
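As a rough sketch of that cross-check, with hypothetical status labels:

```python
def verify_integrations(official: set[str], make_com: set[str], zapier: set[str]) -> dict[str, str]:
    """Cross-check each integration from the official page individually.

    The status labels are illustrative; the real pipeline records richer results
    and never auto-passes a bulk list.
    """
    results = {}
    for name in sorted(official):
        if name in make_com or name in zapier:
            results[name] = "VERIFIED"       # corroborated by an external directory
        else:
            results[name] = "CLAIMED_ONLY"   # vendor claim without corroboration yet
    return results
```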

What does UNVERIFIABLE mean?

A field is UNVERIFIABLE when no source in the 15-source checklist could confirm or deny the value. The original DB value is preserved unchanged. UNVERIFIABLE is not the same as false — it means we couldn't confirm.