Self-Reported Attribution Is Unreliable — and It's Still the Best GEO Signal You Have

Self-reported attribution beats analytics 10:1 for GEO. Design form fields, sales scripts, and CRM taxonomies that minimize bias and surface AI pipeline. $0 to implement — start capturing what your dashboard misses.
Self-reported attribution has a credibility problem. Buyers misremember. They pick the first option in a dropdown. They write "Google" when they mean "I asked ChatGPT, then Googled the brand it recommended." Every research methods textbook warns about recall bias in self-reported data. And yet — for GEO measurement, self-reported attribution is the best signal available.
The reason is not that self-reported data is good. It's that the alternative is worse. Analytics-tracked AI referral traffic captures roughly 2 percent of actual AI influence on B2B pipeline. Self-reported attribution, with all its flaws, captures roughly 20 to 30 percent. A flawed signal that captures 10× more of the truth beats a precise signal that captures almost none of it.
This article is about making the flawed signal less flawed — designing form fields, sales scripts, and CRM taxonomies that minimize the known biases in self-reported attribution and surface the AI-influenced pipeline your analytics stack is blind to.
Why Self-Reported Attribution Wins, Despite Everything
The case against self-reported data is well documented. Buyers exhibit recall bias — they remember the last touchpoint, not the first. They exhibit social desirability bias — they report what sounds reasonable, not what's accurate. They exhibit option-order bias — in dropdowns, they pick the first or last option disproportionately. They exhibit platform confusion — "I found you on Google" when the search was for your brand name, which they learned from an AI recommendation 48 hours earlier.
The case for self-reported data is simpler: analytics-tracked attribution sees only a sliver of true AI influence. The 54x gap documented at Gumlet, where 27 percent of signups self-reported AI as their discovery source while GA4 attributed 0.5 percent of sessions to AI platforms, is not an anomaly. SegmentStream found an 8x recovery when switching from last-click to first-click plus self-reported re-attribution. GenerateMore.ai found 37 percent of SEO-driven demos hidden under "Direct Traffic" in the CRM: demos the buyers said came from AI or Google, not direct navigation.
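The 54x figure is simply the ratio of the two shares quoted for Gumlet. A minimal sketch of that arithmetic follows; the function name is illustrative, and because the two percentages use different denominators (signups versus sessions), the ratio is best read as indicative rather than exact.

```python
def attribution_gap(self_reported_pct: float, analytics_pct: float) -> float:
    """Ratio of the self-reported AI share to the analytics-attributed AI share."""
    if analytics_pct <= 0:
        raise ValueError("analytics_pct must be positive")
    return self_reported_pct / analytics_pct


# Gumlet figures quoted above: 27% self-reported vs 0.5% attributed by GA4.
gumlet_gap = attribution_gap(27.0, 0.5)
print(f"{gumlet_gap:.0f}x")  # prints "54x"
```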
Self-reported data captures influence. Analytics-tracked data captures clicks. For a channel where 83 percent of influenced buyers never click an AI link — arriving instead via branded search or direct URL — the metric that captures influence will always outperform the metric that captures clicks, even with measurement error.
Designing the Form Field
The single highest-ROI change in GEO measurement is the form field. Get it right, and you surface attribution data no analytics tool can produce. Get it wrong, and you reinforce the biases that make self-reported data unreliable.
Free text, not dropdown
Dropdowns are the most common format for "How did you hear about us?" — and the worst for GEO attribution. A dropdown with options like "Google Search," "Social Media," "Friend/Colleague," and "Other" will never capture "ChatGPT recommended you." The buyer scans the list, doesn't see their answer, and picks "Google Search" — because they did eventually Google you, even though the discovery happened in an AI conversation.
A free-text field surfaces the actual answer. Buyers write things like: "I asked ChatGPT for the best video optimization API and it recommended you alongside Cloudinary." This response contains three pieces of intelligence: the AI platform (ChatGPT), the query context (video optimization API), and the competitive set (Cloudinary). No analytics tool produces this. No dropdown captures it.
Implementation: Make the field optional, because required fields increase form abandonment. Place it last, after the primary conversion fields. Use microcopy: "This helps us understand how buyers like you find solutions. We read every response." Completion rates for well-designed free-text fields range from 30 to 50 percent. The detail in the responses you do get outweighs the cost of the blanks.
The checkbox compromise
If your form volume exceeds roughly 100 submissions per week, manual review of free-text responses becomes a bottleneck. At that scale, add a multi-select checkbox field alongside the free-text field. Options: "AI tool (ChatGPT, Perplexity, Claude, etc.)," "Google search," "Social media," "Colleague or friend," "Review site (G2, Capterra, etc.)," "Other." Keep the free-text field. The checkbox captures structured data for reporting. The free-text captures the detail the checkbox misses. Tag leads where both the checkbox AND the free-text field indicate AI discovery — these are your highest-confidence AI-attributed leads.
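The "both signals agree" rule can be sketched as a small function. This is an illustration under assumptions, not a prescribed implementation: the checkbox label is copied from the option list above, while the keyword pattern, the function name, and the three confidence labels are hypothetical.

```python
import re

# Checkbox label taken from the option list above.
AI_CHECKBOX = "AI tool (ChatGPT, Perplexity, Claude, etc.)"

# Assumed keyword pattern; word boundaries keep "ai" from matching "email".
AI_TEXT = re.compile(r"\b(chatgpt|perplexity|claude|gemini|copilot|ai|llm)\b",
                     re.IGNORECASE)


def ai_confidence(checkbox_values: list[str], free_text: str) -> str:
    """Classify a lead by how many of the two form signals point to AI."""
    checkbox_ai = AI_CHECKBOX in checkbox_values
    text_ai = bool(AI_TEXT.search(free_text or ""))
    if checkbox_ai and text_ai:
        return "high"           # both fields agree
    if checkbox_ai or text_ai:
        return "single-signal"  # worth a manual look
    return "none"


print(ai_confidence([AI_CHECKBOX], "I asked ChatGPT for video APIs"))
```

Leads in the single-signal bucket are the natural candidates for the SDR follow-up question.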
Designing the Sales Script
Self-reported attribution on forms captures the buyers who remember to mention AI. It misses the buyers who discovered you through AI but don't think to write it — because the AI conversation happened days or weeks ago, and by the time they fill out your form, the discovery moment has faded from active memory.
Your SDR team is the backstop. Their discovery calls surface AI influence that forms miss — not because SDRs have better data collection tools, but because a conversation prompts memory in a way a form field doesn't.
The script
One question, asked in the first five minutes of the discovery call, while the buyer is describing their research process:
"Before you reached out to us, were you using any AI tools — ChatGPT, Perplexity, Claude, any of those — to research solutions? And if so, do you remember what you were searching for?"
This question has four design properties that reduce reporting error:
It names specific platforms. "ChatGPT, Perplexity, Claude" triggers recognition memory — the buyer hears the name and remembers "yes, I did use ChatGPT for this." A generic question ("did you use any AI tools?") requires recall memory, which is weaker.
It's positioned during natural narrative flow. The question lands when the buyer is already describing their research process. It feels like a follow-up, not an interrogation.
It asks for the query, not just the platform. "What were you searching for?" produces the commercial-intent query language that feeds your keyword cluster strategy. Even if the buyer doesn't remember the exact platform, they'll remember the question they were trying to answer.
It doesn't lead the witness. "Were you using any AI tools?" is neutral: it doesn't assume AI use. Leading questions ("You probably found us through ChatGPT, right?") invite the buyer to agree with the premise rather than recall what actually happened. Neutral questions produce better data.
Making it stick
The script only works if it's used. In the Siteup case study, we found that SDR compliance with the AI discovery question was inconsistent — some SDRs asked it every call, some forgot entirely — until we made it a required field in the CRM. The opportunity can't advance stages without an AI Discovery Source value. Compliance went to 100 percent. The solution is inelegant but effective: if the data matters, make it required.
Designing the CRM Taxonomy
Self-reported data lands in two places: the free-text form field (unstructured, rich, hard to report on) and the SDR's call notes (also unstructured). Without a CRM taxonomy that converts unstructured attribution into queryable fields, the data sits in a text box forever.
The minimum viable taxonomy
Three CRM fields, all custom:
AI Discovery Source (picklist): AI Search, ChatGPT, Perplexity, Claude, Gemini, Copilot, AI Recommendation (other). Populated by form field review or SDR data entry.
AI Discovery Query (text): The specific question or search the buyer described. "Best video optimization API for developers, comparable to Cloudinary." Populated by SDR call notes.
AI Influenced Deal (checkbox, on Opportunity): Checked when the deal's AI Discovery Source is non-null OR when the sales team confirms AI influence during the deal cycle. This is the field you filter pipeline reports on.
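As a rough model of the three fields and the rule that sets the checkbox, here is a sketch in plain Python. The class and attribute names are hypothetical stand-ins for whatever custom fields your CRM exposes; only the picklist values and the "non-null OR sales-confirmed" rule come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Picklist values as listed above.
AI_DISCOVERY_SOURCES = (
    "AI Search", "ChatGPT", "Perplexity", "Claude", "Gemini", "Copilot",
    "AI Recommendation (other)",
)


@dataclass
class Opportunity:
    name: str
    ai_discovery_source: Optional[str] = None  # picklist
    ai_discovery_query: Optional[str] = None   # free text from call notes
    sales_confirmed_ai: bool = False           # set during the deal cycle

    def __post_init__(self) -> None:
        # Reject values outside the picklist so reports stay queryable.
        if (self.ai_discovery_source is not None
                and self.ai_discovery_source not in AI_DISCOVERY_SOURCES):
            raise ValueError(f"unknown source: {self.ai_discovery_source!r}")

    @property
    def ai_influenced_deal(self) -> bool:
        # Checked when a source is recorded OR sales confirms AI influence.
        return self.ai_discovery_source is not None or self.sales_confirmed_ai


deal = Opportunity("Acme", ai_discovery_source="ChatGPT",
                   ai_discovery_query="Best video optimization API")
print(deal.ai_influenced_deal)  # prints "True"
```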
The tagging workflow
Weekly: export all leads where the "How did you hear about us?" free-text field contains a response. Scan for AI platform mentions — "ChatGPT," "Perplexity," "Claude," "Gemini," "Copilot," "AI," "LLM," " asked " (as in "I asked an AI"). Tag matching leads with the appropriate AI Discovery Source value. If the buyer named a specific query, copy it to the AI Discovery Query field. This takes five minutes per week at typical B2B SaaS lead volumes.
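The weekly scan can be approximated with a few regular expressions. The sketch below assumes responses arrive as plain strings; the function name and the choice to map generic "AI" or "LLM" mentions to "AI Recommendation (other)" are assumptions. The word "asked" is deliberately left out of the automatic rule because it is ambiguous on its own ("I asked a colleague"), so those responses stay in manual review.

```python
import re
from typing import Optional

# Named platforms checked first, in order; values match the picklist.
PLATFORM_PATTERNS = [
    ("ChatGPT", re.compile(r"\bchat\s?gpt\b", re.IGNORECASE)),
    ("Perplexity", re.compile(r"\bperplexity\b", re.IGNORECASE)),
    ("Claude", re.compile(r"\bclaude\b", re.IGNORECASE)),
    ("Gemini", re.compile(r"\bgemini\b", re.IGNORECASE)),
    ("Copilot", re.compile(r"\bcopilot\b", re.IGNORECASE)),
]

# Generic mentions; word boundaries keep "ai" from matching "email".
GENERIC_AI = re.compile(r"\b(ai|llm)\b", re.IGNORECASE)


def tag_ai_source(response: str) -> Optional[str]:
    """Return an AI Discovery Source value, or None if nothing matches."""
    for source, pattern in PLATFORM_PATTERNS:
        if pattern.search(response):
            return source
    if GENERIC_AI.search(response):
        return "AI Recommendation (other)"  # assumed fallback bucket
    return None


print(tag_ai_source("I asked ChatGPT for the best video optimization API"))
```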
After roughly 200 tagged responses, you can automate the keyword-matching rule — a CRM workflow or Zapier automation that tags leads automatically when the free-text field matches an AI platform keyword. The automation captures about 90 percent of AI-attributed responses. Keep the weekly manual review as a quality check.
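Before trusting the automation, it helps to compare its output with your manual tags on the same responses. The sketch below is one way to do that, assuming each response has a manual yes/no AI tag and an automated yes/no AI tag; the function name and sample data are made up for illustration.

```python
def rule_quality(manual: list[bool], automated: list[bool]) -> dict[str, float]:
    """Compare automated AI tags with manual tags on the same responses."""
    if len(manual) != len(automated):
        raise ValueError("both lists must cover the same responses")
    pairs = list(zip(manual, automated))
    true_pos = sum(m and a for m, a in pairs)
    false_pos = sum(a and not m for m, a in pairs)
    false_neg = sum(m and not a for m, a in pairs)
    # Recall: share of manually tagged AI responses the rule also caught.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # Precision: share of automated tags that the manual review agrees with.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    return {"recall": recall, "precision": precision}


# Made-up sample: 10 manual AI tags; the rule catches 9 and adds 1 false hit.
manual_tags = [True] * 10 + [False] * 10
auto_tags = [True] * 9 + [False] + [True] + [False] * 9
print(rule_quality(manual_tags, auto_tags))
```

Recall corresponds to the capture rate discussed above; precision tells you how often the automation tags a lead that manual review would not.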
The Honest Limitations
Self-reported attribution will always have error. Some buyers will write "Google" when AI was the real discovery mechanism. Some SDRs will forget to ask the question even when it's required. Some free-text responses will be ambiguous ("I found you online" — through what?).
The goal is not perfect attribution. The goal is attribution that's good enough to make investment decisions — and 10× better than the analytics-only alternative. A CRM report showing 20 percent of qualified opportunities with confirmed AI touchpoints, trended quarterly, is evidence. A GEO dashboard showing 342 AI mentions is a data point. The CFO accepts the first.
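The quarterly trend report reduces to a share per quarter. A minimal sketch, assuming you can export qualified opportunities as (quarter, AI-influenced) pairs; the quarter labels and sample rows are invented for illustration.

```python
from collections import defaultdict


def ai_share_by_quarter(opps: list[tuple[str, bool]]) -> dict[str, float]:
    """Share of qualified opportunities flagged AI-influenced, per quarter."""
    totals: dict[str, int] = defaultdict(int)
    influenced: dict[str, int] = defaultdict(int)
    for quarter, is_ai in opps:
        totals[quarter] += 1
        influenced[quarter] += int(is_ai)
    return {q: influenced[q] / totals[q] for q in sorted(totals)}


# Invented sample export: 1 of 5 flagged in Q1, 2 of 5 in Q2.
sample = (
    [("2025-Q1", True)] + [("2025-Q1", False)] * 4
    + [("2025-Q2", True)] * 2 + [("2025-Q2", False)] * 3
)
print(ai_share_by_quarter(sample))
```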
Ready to instrument your attribution? Sign up for Siteup to build an attribution-native GEO program, or see our pricing.
FAQ
What completion rate should I expect on the free-text form field?
Thirty to 50 percent, depending on form length, field placement, and whether the field is required. Free-text fields have lower completion rates than dropdowns — but the responses are qualitatively better. One response that says "ChatGPT recommended you for GEO content automation" is worth 100 dropdown selections of "Search Engine."
How do I handle buyers who say "Google" when AI was clearly involved?
You can't catch every case. The SDR script is your best backstop: asking "were you using any AI tools?" during the discovery call surfaces the AI influence that the form missed. If both the form and the SDR report "Google," accept it. The goal is improved measurement, not perfect measurement. A 10× improvement in attribution capture is worth pursuing even if a large share of AI influence remains unmeasured.
At what lead volume should I automate the tagging?
Around 200 total responses — that's enough data to validate that your keyword-matching rules (flagging "ChatGPT," "Perplexity," etc.) are capturing the right responses with acceptably low false positives. Below 200, manual review is fast enough that automation isn't worth the setup time.