The State of AI Citations 2026

How Large Language Models Source Information About Brands — And What It Means for Communications

Executive Summary

Generative AI has become a primary research surface for American consumers, professionals, and decision-makers. ChatGPT processes more than three billion prompts a month. Zero-click search behavior rose from 56% of queries in 2024 to 69% by May 2025. When Google’s AI Overviews appear, click-through rate for the top organic result drops 34.5%.

The implication for brand communications is direct: the question is no longer whether a brand ranks in search. It is whether the brand is named, cited, and described accurately when an AI answers the question a buyer is actually asking.

This report synthesizes findings from the largest publicly available citation datasets — including more than 680 million tracked AI citations across ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, and Google AI Mode — and translates the patterns into priorities for communications, public relations, and reputation management.

Five findings that should reshape how brands invest in earned media and reputation work:

Citation patterns differ sharply by platform. Only an estimated 11% of domains are cited by both ChatGPT and Perplexity. A single content strategy will not succeed across the AI surface.
Reddit and Wikipedia exert outsized influence. Wikipedia accounts for nearly half of ChatGPT’s top-10 source share. Reddit accounts for roughly 46.7% of Perplexity’s top-10 share. Most communications budgets do not reflect this concentration of authority.
Brand search volume is the strongest known predictor of AI citation likelihood — a 0.334 correlation, materially stronger than backlinks. Brand-building activity now compounds directly into AI visibility.
Citation patterns are volatile. The share of ChatGPT responses citing Reddit fell from roughly 60% in early August 2025 to about 10% by mid-September before stabilizing. Strategies built on a single source or platform are fragile by design.
Vertical concentration is severe. In B2B SaaS CRM, TechRadar alone accounts for 8.86% of category citations. In healthcare, NIH, Healthline, Mayo Clinic, Cleveland Clinic, and ScienceDirect lead Google’s AI Overviews. Brands absent from those gatekeeper sources are largely invisible to buyers who research through AI.

The communications discipline most equipped to address these patterns is public relations. Earned media, third-party validation, Wikipedia governance, and community engagement are the levers that move AI citations. Generative Engine Optimization (GEO) is, in operational terms, an extension of PR — not a replacement for it.

1. The Citation Landscape

The data behind AI citation behavior has matured rapidly. Where 2024 reporting relied on small samples and platform-specific snapshots, 2025 produced multiple multi-million-citation datasets that allow cross-platform comparison.

Datasets referenced in this analysis

Profound: 680 million citations across ChatGPT, Google AI Overviews, and Perplexity, August 2024–June 2025.
Goodie: 5.7 million citations across ChatGPT (GPT-4o), Gemini 1.5, Claude 3.7 Sonnet, and Perplexity Pro, February–June 2025; expanded to 58.6 million citations October 2025–March 2026.
Surfer AI Tracker: 46 million citations across 36 million AI Overviews, March–August 2025.
Semrush: 230,000 prompts across ChatGPT, Google AI Mode, and Perplexity over thirteen weeks, August–October 2025.
Peec AI: 30 million citations spanning ChatGPT, Google AI Mode, Gemini, Perplexity, and Google AI Overviews, with March 2026 snapshot.
Ahrefs: 15,000 queries comparing AI citations to Google top-10 results.
BrightEdge and WebFX: dedicated healthcare-vertical analyses covering 130,000+ U.S. healthcare queries and three-year tracking of AI Overview deployment in healthcare.

The traffic context

Zero-click searches rose from 56% of queries in 2024 to 69% by May 2025. Aggregate organic traffic to news websites declined from 2.3 billion monthly visits in mid-2024 to under 1.7 billion by May 2025 — a loss exceeding 600 million monthly visits in less than a year. Healthline, the most-cited consumer health domain in Google AI Mode at 113,728 citations tracked, has warned that publishers in its category may see 20–35% session losses as AI Overviews provide medical information directly.
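
As a quick check on the figures above, the arithmetic can be sketched in Python. The helper name `relative_change` is illustrative, and the inputs are the rounded figures quoted in this section; 1.7 billion is treated as an upper bound because the text says "under 1.7 billion".

```python
def relative_change(before: float, after: float) -> float:
    """Return the fractional change from before to after (negative means decline)."""
    return (after - before) / before

# Zero-click share: 56% of queries in 2024, 69% by May 2025.
zero_click_points = 69 - 56                    # change in percentage points
zero_click_relative = relative_change(56, 69)  # change relative to the 2024 level

# News-site organic traffic: 2.3B monthly visits down to (at most) 1.7B.
visits_lost = 2.3e9 - 1.7e9
visits_relative = relative_change(2.3e9, 1.7e9)

print(f"Zero-click: +{zero_click_points} points ({zero_click_relative:.1%} relative)")
print(f"News traffic: {visits_lost / 1e6:.0f}M fewer visits ({visits_relative:.1%})")
```

The distinction matters when comparing studies: a 13-point rise in zero-click share and a roughly 23% relative rise describe the same change.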

Wikipedia presents a paradox illustrative of the broader shift: it is among the most-cited sources in AI answers, and the dominant source for ChatGPT, yet its human pageviews declined 8% from 2024 to 2025. Wikipedia shapes AI answers at scale even as its direct readership falls.

2. Platform-by-Platform Source Behavior

The five major AI surfaces serve overlapping audiences but differ materially in citation behavior. Understanding those differences is the foundation of any GEO strategy.

ChatGPT

ChatGPT, with browsing enabled, queries the Bing index and selects three to ten sources per response. Wikipedia is its dominant source — 47.9% of ChatGPT’s top-10 source share according to Profound and Discovered Labs analyses, and 7.8% of total citations across Profound’s full dataset. Reddit was historically near-equivalent in volume before a sharp adjustment in mid-September 2025. Seer Interactive analysis of more than 500 citations found that 87% of SearchGPT citations match Bing’s top-10 organic results, while only 56% match Google’s. ChatGPT also mentions brands roughly 3.2 times more often than it cites them with links.

Perplexity

Perplexity operates a proprietary index of more than 200 billion URLs, backed by 400+ petabytes of storage. Its citation behavior is the most community-skewed of any major platform: Reddit accounts for approximately 46.7% of Perplexity’s top-10 source share. Beyond Reddit, Profound’s analysis places G2, Gartner, NerdWallet, PCMag, TripAdvisor, and Yelp among Perplexity’s most-cited domains, a pattern suggesting that Perplexity favors review aggregators and structured comparison sources for commercial-intent queries.

Google AI Overviews and AI Mode

Google’s AI surfaces favor a more diversified mix and demonstrate strong self-referential bias. Approximately 43% of AI Overview citations link back to Google-owned properties, including YouTube. Google AI Mode consistently cites LinkedIn (approximately 15% of responses, per Semrush), with Reddit, YouTube, and Google.com clustered as top sources. AI Overviews maintain the strongest correlation with traditional search rankings: 93.67% of AI Overviews cite at least one top-10 organic result. Yet only 4.5% of AI Overview URLs directly match a Page-1 organic URL, which suggests Google draws from deeper pages on authoritative domains.

Gemini

Gemini’s citation behavior closely mirrors that of Google’s AI Overviews and AI Mode, with heavy reliance on LinkedIn, Medium, Quora, Reddit, and Wikipedia. Together, Gemini and Google’s search-embedded AI surfaces represent the strongest distribution channel for content already performing well in Google’s organic ecosystem.

Claude

Claude takes the most conservative citation approach. By default it relies on parametric knowledge up to its training cutoff and does not browse unless tools are enabled. Anthropic introduced a Citations API in January 2025; early testing reported by Endex showed source hallucinations dropping from 10% to 0% with the API and a 20% increase in references per response. For brands, Claude’s caution rewards a formal, authoritative tone, technical precision, and explicit source citation in the underlying content.

Cross-platform overlap

Only an estimated 11% of domains are cited by both ChatGPT and Perplexity. Ahrefs analysis of 15,000 queries found that only 12% of URLs cited by AI tools overlap with Google’s top-10 organic results; the remaining 88% come from pages that do not rank on page one. The implication: ranking well in Google does not by itself produce AI visibility, and a single-platform optimization strategy leaves most of the surface uncovered.
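
How an overlap figure like the 11% above might be computed is sketched below. The source studies do not state their exact formula here, so this assumes one plausible definition: domains cited by both platforms divided by domains cited by either. The function name and the sample domain sets are illustrative, not measured data.

```python
def domain_overlap(cited_a: set[str], cited_b: set[str]) -> float:
    """Share of all cited domains that appear in both sets (Jaccard index)."""
    union = cited_a | cited_b
    if not union:
        return 0.0
    return len(cited_a & cited_b) / len(union)

# Illustrative sample only, not measured data.
chatgpt_domains = {"wikipedia.org", "forbes.com", "reddit.com", "techradar.com"}
perplexity_domains = {"reddit.com", "g2.com", "yelp.com", "tripadvisor.com", "nerdwallet.com"}

overlap = domain_overlap(chatgpt_domains, perplexity_domains)
print(f"Shared domains: {overlap:.1%}")
```

Other definitions are possible (for example, dividing by the smaller set), and they can yield noticeably different percentages from the same data, so the definition should be checked before comparing overlap figures across studies.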

3. Vertical Patterns

Citation behavior also varies sharply by vertical. The patterns below combine Surfer’s industry analysis (March–August 2025), Goodie’s vertical breakdowns (February–June 2025 and October 2025–March 2026), and Profound’s category data.

Healthcare

Healthcare exhibits the most institutional citation pattern of any consumer vertical. In Google’s AI Overviews, the top cited health domains are NIH (~39%), Healthline (~15%), Mayo Clinic (~14.8%), Cleveland Clinic (~13.8%), and ScienceDirect (~11.5%). YouTube also plays a meaningful role at approximately 28% of health citations, primarily for patient-friendly explanations.

BrightEdge tracking shows AI Overview presence in healthcare grew from 59% to 89% of queries over two years. Treatment and procedure queries now show 100% AI Overview presence, up from 45% in 2023. Pain-related queries show 98% presence; symptom queries 93%; medical coding 90%. WebFX analysis of 130,070 U.S. healthcare queries found that queries of seven or more words trigger AI Overviews 73.9% of the time — the highest rate of any query length.

Financial Services

In finance, Surfer reports YouTube leading at approximately 23% of citations, followed by Wikipedia (~7.3%), LinkedIn (~6.8%), and Investopedia (~5.7%). Goodie’s analysis of 109,000 finance-specific citations identified NerdWallet as the consistent leader across ChatGPT, Gemini, Claude, and Perplexity. CNBC, Forbes, Bloomberg, and Business Insider hold strong reference positions, particularly for breaking news and market commentary. Reddit’s r/personalfinance and r/investing surface during volatility, when users seek peer-validated insight. Notably, Wise’s appearance at #8 in retail-banking citations demonstrates that fintech brands with strong editorial coverage can earn citation authority alongside legacy financial media.

Beauty and Personal Care

Beauty is the most community-driven category in the dataset. Goodie’s analysis of 135,419 beauty citations (February–June 2025) found Reddit ranked #1 across all four major LLMs. In Goodie’s October 2025–March 2026 skincare analysis, Wikipedia anchored the top at 4.39% but Kaja Beauty, an indie brand, captured 4.20% — one of the strongest brand-level citation shares observed in any vertical. Vogue, Business of Fashion, and Who What Wear round out the top editorial sources. Indie outperforms legacy more often than traditional category authority would predict.

B2B SaaS

B2B technology categories show the highest concentration of any vertical. Goodie’s October 2025–March 2026 CRM and Sales Software analysis found TechRadar at 8.86% citation share, the highest single-category figure recorded across the study. G2, Capterra, and TrustRadius dominate review-driven evaluation queries. Forbes, TechCrunch, and Gartner appear consistently in Claude and Perplexity citations. Vendor brand sites are largely absent from category top-10s, which suggests that AI answers about a category are shaped by third-party sources before a buyer ever reaches a vendor’s website.

Consumer and Retail

In consumer categories, Profound and Goodie data converge on a small set of category-defining sources: Wirecutter, Amazon, Reddit (particularly r/BuyItForLife and category-specific subreddits), Tom’s Guide, and RTINGS. The Strategist (New York Magazine), Apartment Therapy, and Good Housekeeping appear with category-specific concentration. Private-label brands (Costco’s Kirkland, Trader Joe’s, Amazon Basics) show meaningful citation share in categories where shoppers ask AI for value-driven recommendations.

Travel

Profound’s analysis of online-travel and travel-booking categories shows Reddit at the top of citation share, with Wikipedia consistently appearing for destination context. TripAdvisor remains a Perplexity favorite. The Points Guy, One Mile at a Time, and Reddit’s r/awardtravel exert disproportionate influence on premium-cabin and loyalty-program recommendations.

4. The Reddit and Wikipedia Questions

Two community sources — Reddit and Wikipedia — account for a disproportionate share of LLM citations across nearly every vertical. Most communications budgets do not reflect their gravitational pull.

Reddit

Reddit is the most-cited domain across all major AI platforms combined, according to Peec AI’s March 2026 analysis of 30 million citations. The Digital Bloom reported Reddit citation growth of approximately 450% on some surfaces during 2025. Per Surfer’s research, Perplexity and Google AI Mode tend to extract citations from content with 50 or more upvotes in active subreddit threads.

The implication: Reddit cannot be optimized through traditional content marketing. It rewards genuine subject-matter participation in relevant subreddits, and it punishes promotional behavior. For brands, this raises the bar on community engagement — and it makes ethical, transparent participation a meaningful PR discipline rather than an afterthought.

Wikipedia

Wikipedia accounts for approximately 47.9% of ChatGPT’s top-10 source share and 7.8% of total citations across Profound’s 680-million-citation dataset. Surfer’s industry analysis places Wikipedia at 18.4% of all AI citations across verticals.
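
The Wikipedia figures above differ so widely largely because they use different denominators: a share of the top-10 sources versus a share of all citations. A minimal sketch with made-up counts (the numbers, domain names, and function name are illustrative, not Profound’s data):

```python
from collections import Counter

def citation_shares(counts: Counter, domain: str, top_n: int = 10) -> tuple[float, float]:
    """Return (share of top-N citations, share of all citations) for a domain.

    Assumes the domain is itself among the top N by citation count.
    """
    total = sum(counts.values())
    top_total = sum(n for _, n in counts.most_common(top_n))
    return counts[domain] / top_total, counts[domain] / total

# Hypothetical counts: three large domains plus a long tail of 1,000 small ones.
counts = Counter({"wikipedia.org": 480, "reddit.com": 320, "forbes.com": 200})
counts.update({f"site{i}.example": 5 for i in range(1000)})

top_share, total_share = citation_shares(counts, "wikipedia.org", top_n=3)
print(f"Top-3 share: {top_share:.1%}, share of all citations: {total_share:.1%}")
```

In this toy example the same domain holds 48% of the top-3 share but only 8% of all citations, because the long tail enlarges the second denominator.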

Wikipedia is hard to edit and intentionally hostile to brand-led modifications, which is precisely what makes it a high-trust signal for LLMs. For brands, the work is two-part: (1) ensure entity presence on Wikidata and Wikipedia for notable people, products, and organizations; (2) maintain ongoing content accuracy through proper editorial channels — not direct edits. Wikipedia hygiene is now reputation infrastructure.

5. The September 2025 Volatility Event

Citation patterns in 2025 were volatile. The best-documented example is the September 2025 ChatGPT shift, which Semrush captured across 230,000 prompts.

In early August 2025, ChatGPT cited Reddit in nearly 60% of prompt responses. By mid-September, that share had collapsed to roughly 10%. Wikipedia’s share on ChatGPT dropped from approximately 55% of responses to less than 20% in the same window. The drop was isolated to ChatGPT — Perplexity and Google AI Mode did not see equivalent shifts.

Several SEOs initially attributed the shift to Google’s removal of its num=100 search parameter in mid-September. Sergei Rogulin, Semrush’s Head of Organic and AI Visibility, offered a different interpretation: an internal model adjustment intended to reduce over-citation of dominant sources and increase resistance to manipulation. The winners from the shift were PR Newswire, Forbes, and Medium — all of which gained citation share as Reddit and Wikipedia lost it.

Two operational lessons follow. First, AI citation patterns are not static and should not be optimized as if they were. Second, the platforms appear to be actively rebalancing their source distributions, which means a single-source dependency is a strategic vulnerability.

6. Implications for Brand Communications

Read together, the data above produces a coherent set of priorities for any organization invested in brand reputation, demand generation, or crisis preparedness.

Earned media is GEO infrastructure

Tier-one publications and trade press do not produce equivalent AI visibility. Presence in Forbes, PR Newswire, Medium, and category-leading outlets (Wirecutter for consumer, NerdWallet for finance, TechRadar for SaaS, Healthline for health) translates into measurable citation share. Brands should evaluate their PR pipeline not only by reach and sentiment but also by whether each placement appears in a publication that LLMs actually cite.

Brand-building compounds into AI visibility

With brand-search volume showing the strongest known correlation to AI citation likelihood (0.334) — materially stronger than backlinks — investments in awareness, distinctive positioning, and category leadership produce direct AI returns. The discipline that builds branded search demand is, again, public relations.

Wikipedia and entity presence are reputation infrastructure

Establishing entity presence on Wikidata, Wikipedia (where notability supports it), and four or more authoritative third-party platforms is associated with a 2.8x increase in citation likelihood. Wikipedia governance — accuracy monitoring, dispute resolution, and proper editorial process — is a discipline that PR teams are uniquely positioned to manage.

Community is unowned but unignorable

Reddit cannot be controlled and should not be astroturfed. It can, however, be engaged authentically. Subject-matter experts, executives, and brand specialists who participate in relevant subreddits with substance — and who disclose affiliation transparently — produce the kind of content that LLMs treat as authoritative community insight.

Sentiment travels through sources

LLMs do not generate sentiment about a brand independently. They reflect the sentiment of the sources they cite. A brand whose citations skew toward negative Reddit threads, defensive press coverage, or critical reviews will be described in those terms. Sentiment management is, in part, source-portfolio management.
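
One way to operationalize source-portfolio management is a citation-weighted sentiment score. This is a sketch under assumptions: the sentiment scale (-1 to 1), the per-source scores, and the function name are all hypothetical, and real scores would come from whatever sentiment tooling a team already uses.

```python
def portfolio_sentiment(sources: list[tuple[str, int, float]]) -> float:
    """Citation-weighted mean sentiment.

    Each item is (domain, citation count, sentiment score in [-1, 1]).
    """
    total = sum(citations for _, citations, _ in sources)
    if total == 0:
        return 0.0
    return sum(citations * sentiment for _, citations, sentiment in sources) / total

# Hypothetical brand: heavily cited negative threads outweigh positive press.
sources = [
    ("reddit.com", 60, -0.5),
    ("forbes.com", 30, 0.5),
    ("g2.com", 10, 0.5),
]
score = portfolio_sentiment(sources)
print(f"Portfolio sentiment: {score:+.2f}")
```

Weighting by citation count reflects the point made above: a few negative sources that are cited often can outweigh a larger number of positive sources that are cited rarely.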

Crisis memory is durable

LLMs preserve documented crises in their training data and continue to cite them in present-tense answers years after resolution. Crisis communications now extends beyond the news cycle into long-tail AI memory. Recovery requires sustained, accurate, well-sourced content production — not a one-time response.

Volatility requires continuous monitoring

The September 2025 shift demonstrated that platforms will rebalance source weighting unilaterally, sometimes dramatically. Brands that monitor only quarterly will discover problems too late. Continuous AI visibility tracking is now part of communications operations infrastructure.
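
A minimal sketch of the kind of check continuous monitoring implies: flag any domain whose share of responses moves sharply between consecutive observations. The 15-point threshold, the function name, and the weekly series are assumptions for illustration; the series loosely echoes the Reddit shift described in Section 5 but is not Semrush’s data.

```python
def flag_shifts(series: list[float], threshold_points: float = 15.0) -> list[tuple[int, float]]:
    """Return (index, change) for each period-over-period move of at least threshold_points."""
    flags = []
    for i in range(1, len(series)):
        change = series[i] - series[i - 1]
        if abs(change) >= threshold_points:
            flags.append((i, change))
    return flags

# Hypothetical weekly share of responses citing one domain, in percent.
weekly_share = [58.0, 59.0, 57.0, 40.0, 12.0, 10.0, 11.0]

alerts = flag_shifts(weekly_share)
for week, change in alerts:
    print(f"Week {week}: {change:+.1f} points")
```

With weekly observations, this toy series raises its first alert at week 3. A quarterly check would have sampled the series only after the decline was complete.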

7. The 5W AI Communications Framework

5W’s approach to building brand authority across AI platforms integrates the disciplines above into a single operating model:

Earned Media for AI Citation Pickup — Targeted placement in publications and outlets that LLMs actually cite, with content structured for machine retrieval: clear factual claims, original data, and explicit attribution that LLMs can extract and re-use.
Wikipedia and Entity Infrastructure — Notability assessment, entity establishment, ongoing accuracy governance, and dispute management — handled through the platform’s editorial channels rather than against them.
Community Engagement — Authentic, transparent participation in the subreddits and forums that drive citation share within a brand’s category. Executive thought leadership and subject-matter expertise published in the formats LLMs surface — including substantive long-form Reddit replies, LinkedIn essays, and YouTube explainers.
Generative Engine Optimization — Site-level structure (schema, citation-friendly formatting, statistic-rich content), entity reinforcement, and content adapted for the retrieval and synthesis behaviors specific to each major platform.
Crisis Resilience in AI Memory — Proactive content programming for organizations with material historical incidents, structured to provide LLMs with accurate, current, well-sourced reference material that competes with legacy crisis coverage in the answer-generation process.
Continuous Measurement — Ongoing visibility tracking across the major AI surfaces, supplemented by quarterly audits, sentiment tracking on cited sources, and competitor benchmarking, all surfaced to clients in formats that translate directly into communications priorities.
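
As one illustration of the site-level schema work listed under Generative Engine Optimization, the sketch below builds a schema.org Organization block in JSON-LD, using `sameAs` to tie the site to its Wikidata and Wikipedia entities. The organization name and all URLs are placeholders; which properties a given site should publish is a judgment call outside this sketch.

```python
import json

# Placeholder entity details; replace with the organization's real identifiers.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Brand",
    "url": "https://www.example.com",
    "sameAs": [
        "https://www.wikidata.org/wiki/Q0000000",
        "https://en.wikipedia.org/wiki/Example_Brand",
    ],
}

# Wrap the JSON-LD in the script tag that pages use to embed structured data.
json_ld = json.dumps(organization, indent=2)
snippet = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(snippet)
```

Generating the block from a single dictionary keeps the entity identifiers in one place, which makes it easier to keep them consistent with the Wikidata and Wikipedia records they point to.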

Methodology Note

This report is a synthesis of publicly available citation research published between August 2024 and April 2026. Where percentages are cited, they reflect the source studies’ published findings as of the dates indicated; AI citation patterns are volatile and current values may differ. 5W is conducting parallel proprietary research designed to extend these findings into communications-specific outcomes — vertical brand visibility leaderboards, hallucination tracking, crisis-memory analysis, and earned-media-to-AI-citation pipeline measurement. Findings from those studies will be released on a quarterly basis beginning in 2026.

Sources Cited

Profound, AI Platform Citation Patterns: How ChatGPT, Google AI Overviews, and Perplexity Source Information (August 2025 update; 680 million citations, August 2024–June 2025).
Goodie, What Are the Most Cited Domains in LLMs? (September 2025; 5.7 million citations, February–June 2025); Most Cited Domains in AI Search: Industry Breakdown (April 2026; 58.6 million citations, October 2025–March 2026).
Surfer, AI Citation Report 2025: Which Sources AI Overviews Trust Most Across Industries (October 2025; 46 million citations across 36 million AI Overviews, March–August 2025).
Semrush, The Most-Cited Domains in AI: A 3-Month Study (November 2025; 230,000 prompts across 13 weeks).
Peec AI, citation analysis (March 2026; 30 million citations).
BrightEdge, Healthcare and AI Overviews: How Google Sharpened Its Approach Over Three Years (December 2025).
WebFX, AI Overviews in Healthcare: What Our Study of 130K+ Health Queries Reveals (September 2025).
The Digital Bloom, 2025 AI Visibility Report: How LLMs Choose What Sources to Mention (December 2025).
Discovered Labs, AI Citation Patterns: How ChatGPT, Claude, and Perplexity Choose Sources (December 2025).
Ahrefs, AI citation overlap analysis (August 2025; 15,000 queries).
Seer Interactive, SearchGPT citation analysis (500+ citations).
Endex, evaluation of Anthropic Citations API.
Amsive, The Leading Brands & Domains in AI Search Across 10 Business Categories (February 2026; in partnership with Profound).

Talk to 5W About AI Visibility

5W’s research powers client work in AI visibility, crisis communications, and category leadership. To see how your brand surfaces inside ChatGPT, Claude, Perplexity, and Gemini — and to build the infrastructure that moves citations in your favor — talk to 5W.
