AI tools are now answering the questions that used to begin with a search. The answers are not random. They come from a specific set of sources. The sources are not the ones the PR industry has spent two decades cultivating. This volume maps them.
A founder asks ChatGPT which publications would best cover her product launch. The answer she gets shapes a $50,000 PR investment. A homeowner asks Claude which roofers are reputable in his area. The answer shapes who gets called. A general counsel asks Perplexity about a regulatory matter. The answer shapes which outside counsel gets retained. A hospital procurement officer asks Gemini about a medical device supplier. The answer shapes a seven-figure contract. A board chair asks Google AI Overviews who the leading voice in a sector is. The answer shapes who gets the speaking slot, the advisory seat, the term sheet.
This volume is about that specific set of sources.
For two decades, public relations operated inside a system where coverage in a known set of publications produced predictable results. Get into The Wall Street Journal, and your customers, investors, and policymakers would see it. The system had a map. The 5W Retrieval Index is the first reference work for that map in the AI era. Volume I covers 38 sectors — from AI itself to pharma, fintech, beauty, cybersecurity, luxury, capital markets, biotech, entertainment, sports, and beyond. Each sector is a chapter. Each chapter names the sources the engines actually cite, grades them on a fixed composite, identifies the structural finding that defines retrieval behavior in that sector, and tells operators what moves the needle.
The central finding across all 38 sectors is the same: the publications people read are not always the publications the AI engines cite. The most-read journalism is not always the most-cited journalism. The training-data economy and the paywall economy are running in opposite directions. The gap between them is the new retrieval map. This volume is that map.
Paywalled prestige publications consistently rank below what their authority would predict. Open-access archives, even on lower-prestige domains, consistently rank above what theirs would predict.
Clean HTML, named-entity schema, stable taxonomies, and consistent metadata raise extractability. Engines retrieve from sources they can parse cleanly.
Sources with stable URLs accumulate authority through co-citation over time. Refresh-and-replace platforms forfeit the compounding.
Reddit, Stack Exchange, and sector-specific forums carry retrieval weight on opinion, experience, and consensus queries that editorial publishers cannot match through declaration alone.
Government databases (CISA, FDA, SEC EDGAR, NAEP), trade-body publications (IAB, OWASP, NAR), and commercial measurement firms (Nielsen, Circana, A.M. Best, STR) function as primary citation tiers across nearly every sector.
Sources that name brands, people, products, and locations with consistent taxonomy are retrieved more reliably than sources that describe them in prose without entity anchors.
Authority is cumulative. Long-tenured publications on stable domains gain citation share that newer entrants cannot match through quality alone in short time horizons.
Subreddits, Discord exports, and Stack Exchange communities operate as the consensus layer for sectors where editorial publishing has not caught up to the industry’s pace.
Engines retrieve from what they can reach. Access controls — paywalls, registration walls, geographic gates — translate directly into retrieval forfeiture.
The 5W Retrieval Index is the first reference work mapping how AI engines select their sources. It scores media properties on a fixed five-component composite — citation frequency, cross-engine breadth, query-type breadth, extractability, and crawl access — normalized to 0–100, and groups them into four retrieval tiers.
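A minimal sketch of how a five-component composite of this kind could be computed. The component names come from the description above. Everything else is an assumption made for illustration: the equal weights, the 0-to-1 input scale, the tier cutoffs, and the function names `composite_score` and `retrieval_tier` are hypothetical and are not the Index's published method.

```python
# Component names are taken from the text; all numbers below are assumed.
COMPONENTS = (
    "citation_frequency",
    "cross_engine_breadth",
    "query_type_breadth",
    "extractability",
    "crawl_access",
)

# Hypothetical equal weights (the actual weighting is not stated here).
WEIGHTS = {name: 1 / len(COMPONENTS) for name in COMPONENTS}

# Hypothetical cutoffs mapping a 0-100 score to one of four tiers.
TIER_CUTOFFS = ((75.0, 1), (50.0, 2), (25.0, 3))


def composite_score(components: dict[str, float]) -> float:
    """Weighted sum of five components, each assumed to be pre-scaled
    to the range 0-1, expressed on a 0-100 scale."""
    total = 0.0
    for name in COMPONENTS:
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        total += WEIGHTS[name] * value
    return 100.0 * total


def retrieval_tier(score: float) -> int:
    """Return tier 1 (highest) through 4 (lowest) using the assumed cutoffs."""
    for cutoff, tier in TIER_CUTOFFS:
        if score >= cutoff:
            return tier
    return 4


# Illustrative, made-up inputs for a single media property.
example = {
    "citation_frequency": 0.9,
    "cross_engine_breadth": 0.8,
    "query_type_breadth": 0.7,
    "extractability": 0.6,
    "crawl_access": 0.5,
}
score = composite_score(example)
print(round(score, 1), retrieval_tier(score))
```

Under these assumed weights, a source with strong citation frequency but weak crawl access lands in the middle of the scale, which is consistent with the pattern described for paywalled publications.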
Volume II publishes in Q4 2026, completing the 60-sector slate. The annual flagship report, The State of AI Sources, publishes in December 2026 and will track year-over-year shifts in retrieval behavior across the full sector set.
Ronn Torossian is the founder and chairman of 5W AI Communications, the AI Communications Firm. He is the publisher of Everything-PR and the author of two best-selling editions of For Immediate Release. The 5W AI Research Team conducted the cross-engine retrieval analysis underlying all 38 sector editions.
220 pages. 38 sectors. The first reference work for the AI retrieval economy.