<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The SEO Pub</title>
	<atom:link href="http://theseopub.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://theseopub.com</link>
	<description>SEO Tips, Tricks, and More</description>
	<lastBuildDate>Tue, 09 Jun 2026 15:09:06 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://theseopub.com/wp-content/uploads/2026/01/cropped-theseopub-green-logo-transparent-32x32.png</url>
	<title>The SEO Pub</title>
	<link>https://theseopub.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Build Validated, Entity-Linked Schema for Any Page (Free Claude Skill)</title>
		<link>https://theseopub.com/build-validated-entity-linked-schema/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 09 Jun 2026 12:31:32 +0000</pubDate>
				<category><![CDATA[SEO]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3528</guid>

					<description><![CDATA[A few weeks ago I wrote about adding custom schema to WordPress and why the templated output from Yoast and Rank Math leaves value on the table. The follow-up question I got most often was some version of: &#8220;Okay, but how do I actually generate the schema?&#8221; This is the answer. I built a Claude [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">A few weeks ago I wrote about <a href="https://theseopub.com/how-to-add-custom-schema-to-wordpress-pages/">adding custom schema to WordPress</a> and why the templated output from Yoast and Rank Math leaves value on the table. The follow-up question I got most often was some version of: &#8220;Okay, but how do I actually generate the schema?&#8221;</p>



<p class="wp-block-paragraph">This is the answer. I built a Claude skill that takes a page, figures out every schema type that applies, builds them into a single connected graph, and links the page&#8217;s main entities to their verified Wikipedia and Wikidata entries. It&#8217;s free.</p>



<p class="wp-block-paragraph">The part I&#8217;m most happy with is the verification step, which I&#8217;ll get to below. Most schema generators will happily invent a Wikidata link that doesn&#8217;t exist. This one checks first.</p>



<h2 class="wp-block-heading">What the Skill Does</h2>



<p class="wp-block-paragraph">You give it a published URL or a draft of the page content. It does the rest.</p>



<p class="wp-block-paragraph"><strong>It reads the page and classifies it.</strong> Article, product, local business, service page, event, recipe, whatever it is. A page is often several of these at once, so it captures all of them.</p>



<p class="wp-block-paragraph"><strong>It selects every schema type that applies.</strong> Not just the obvious one. A product page also has a publishing Organization, sits in a BreadcrumbList, and may have Review and Offer data. The skill draws from the full Schema.org library, including the specialized types (Event, JobPosting, Course, Recipe, VideoObject, SoftwareApplication, Dataset, and more), and it picks the most specific type available: Plumber rather than LocalBusiness, and LocalBusiness rather than Organization, because specificity helps both Google and AI systems.</p>



<p class="wp-block-paragraph"><strong>It builds one connected graph.</strong> Instead of scattering separate schema blocks across the page, it outputs a single <code>@graph</code> with every entity cross-referenced by <code>@id</code>. The Organization is defined once and referenced everywhere it&#8217;s needed. This is the current standard, and it&#8217;s the format AI systems parse most cleanly, since connected entities are easier to interpret than scattered, disconnected blocks.</p>
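
<p class="wp-block-paragraph">As a minimal sketch of that structure (not the skill&#8217;s actual code), here is one way a single <code>@graph</code> with <code>@id</code> cross-references could be assembled in Python. The <code>example.com</code> URLs, the &#8220;Example Co&#8221; name, and the <code>ref</code> helper are all assumptions made up for illustration.</p>

```python
import json

SITE = "https://example.com"  # hypothetical site, for illustration only


def ref(node_id):
    """Return a reference to a node that is defined elsewhere in the graph."""
    return {"@id": node_id}


org_id = SITE + "/#organization"
site_id = SITE + "/#website"
page_id = SITE + "/sample-page/#webpage"

graph = {
    "@context": "https://schema.org",
    "@graph": [
        # The Organization is defined exactly once...
        {"@type": "Organization", "@id": org_id, "name": "Example Co", "url": SITE + "/"},
        # ...and every other node points back to it by @id instead of repeating it.
        {"@type": "WebSite", "@id": site_id, "url": SITE + "/", "publisher": ref(org_id)},
        {"@type": "WebPage", "@id": page_id, "url": SITE + "/sample-page/", "isPartOf": ref(site_id)},
    ],
}

print(json.dumps(graph, indent=2))
```

<p class="wp-block-paragraph">Keeping every <code>@id</code> in a variable means a node can be referenced from several places without its properties ever being copied.</p>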



<p class="wp-block-paragraph"><strong>It links your main entities to authoritative sources.</strong> This is the entity work from the <a href="https://theseopub.com/entity-disambiguation/">disambiguation note</a> put into practice. The skill identifies the real-world entities your page is about and adds <code>sameAs</code> links to their Wikipedia, Wikidata, and Google Knowledge Graph entries. That&#8217;s how you tell Google and the AI systems exactly which &#8220;Apple&#8221; or &#8220;Mercury&#8221; or &#8220;Springfield&#8221; you mean.</p>



<p class="wp-block-paragraph"><strong>It verifies those links before including them.</strong> Here&#8217;s the part that matters. The skill searches for each entity, confirms the page actually exists, and confirms it refers to the correct entity using the page&#8217;s context. It will not output a link it hasn&#8217;t verified. If it can&#8217;t find an authoritative entry for an important entity (your own brand, for example, which may not be in Wikidata yet), it tells you and asks whether you have a page to use instead.</p>



<p class="wp-block-paragraph"><strong>It flags deprecated rich results.</strong> If your page warrants FAQPage or HowTo schema, the skill includes it but notes that Google retired the visible rich result. The schema still helps with understanding and AI citation, so it&#8217;s worth keeping, but you won&#8217;t see the SERP enhancement you might expect.</p>



<h2 class="wp-block-heading">Why the Verification Step Matters</h2>



<p class="wp-block-paragraph">Ask most AI tools to generate schema with entity links and they&#8217;ll produce something that looks right. The Wikidata Q-numbers will be formatted correctly. The Wikipedia URLs will look plausible. And some meaningful percentage of them will point to pages that don&#8217;t exist, or worse, point to the wrong entity entirely.</p>



<p class="wp-block-paragraph">A <code>sameAs</code> link to the wrong entity is actively harmful. If your page is about your software company and the schema links it to a Wikidata entry for an unrelated company with a similar name, you&#8217;ve just told Google and every AI system that your business is something it isn&#8217;t. A hallucinated link that points nowhere is wasted. A confident link that points to the wrong thing is damage.</p>



<p class="wp-block-paragraph">The skill handles this by treating verification as mandatory. It uses Wikidata&#8217;s actual entity search, confirms the entry resolves, and checks that the entity&#8217;s type and description match your page&#8217;s context before it includes anything. The output includes a verification report showing exactly what was confirmed, what wasn&#8217;t found, and where it used something you provided. You can see the work.</p>
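
<p class="wp-block-paragraph">To make the idea concrete, here is a rough Python sketch of a search-then-check step. It is not how the skill is implemented. It assumes Wikidata&#8217;s public <code>wbsearchentities</code> API action, and the matching rule (a candidate counts only if its description contains one of your context terms) is a deliberate simplification. The function names and the User-Agent string are hypothetical.</p>

```python
import json
import urllib.parse
import urllib.request

API = "https://www.wikidata.org/w/api.php"


def search_wikidata(name, limit=5):
    """Query Wikidata's wbsearchentities action and return the candidate entries."""
    params = urllib.parse.urlencode({
        "action": "wbsearchentities",
        "search": name,
        "language": "en",
        "format": "json",
        "limit": limit,
    })
    request = urllib.request.Request(
        API + "?" + params,
        headers={"User-Agent": "schema-check-sketch/0.1"},  # hypothetical identifier
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("search", [])


def pick_verified(candidates, context_terms):
    """Return the entity URL of the first candidate whose description matches the context, else None."""
    wanted = [term.lower() for term in context_terms]
    for candidate in candidates:
        description = candidate.get("description", "").lower()
        if any(term in description for term in wanted):
            return "https://www.wikidata.org/entity/" + candidate["id"]
    return None  # nothing matched: report it rather than guess
```

<p class="wp-block-paragraph">A call such as <code>pick_verified(search_wikidata("Mercury"), ["planet"])</code> would only return a link when a candidate&#8217;s description mentions a planet. Returning <code>None</code> is the point: an unverified entity is left unlinked and reported.</p>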



<h2 class="wp-block-heading">What Are Claude Skills?</h2>



<p class="wp-block-paragraph">If you haven&#8217;t used them, skills are reusable instruction sets for Claude. You install one, and Claude follows that workflow whenever you ask it to do that kind of task, without you having to write a detailed prompt each time. I covered them in more detail in <a href="https://theseopub.com/entity-analysis-systematized/">the Entity Analysis note</a>.</p>



<p class="wp-block-paragraph">Skills work in Claude.ai (web and app), Claude Code, and the API. You need a paid Claude plan.</p>



<p class="wp-block-paragraph">One important note for this skill: it needs web search enabled, because the entity verification step depends on it. If web search is off, the skill will tell you it can&#8217;t verify the entity links rather than guessing.</p>



<h2 class="wp-block-heading">How to Install It</h2>



<ol class="wp-block-list">
<li>Download the skill file: <strong>[DOWNLOAD LINK]</strong></li>



<li>Open Claude.ai</li>



<li>Go to <strong>Settings > Capabilities</strong></li>



<li>Scroll to the Skills section and upload the file</li>



<li>Toggle the skill <strong>ON</strong></li>



<li>Make sure web search is also enabled</li>



<li>Start a new chat</li>
</ol>



<p class="wp-block-paragraph">To use it, give it a URL:</p>



<p class="wp-block-paragraph">&#8220;Using the schema builder skill, build schema for <a href="https://example.com/your-page">https://example.com/your-page</a>&#8220;</p>



<p class="wp-block-paragraph">Or paste a draft, or attach a document, if the page isn&#8217;t published yet. The skill will run in Draft Mode and remind you to update the URLs and re-verify before publishing.</p>



<p class="wp-block-paragraph">The output is raw JSON-LD, ready to drop into the <code>custom_schema</code> field from the WordPress note. If you&#8217;d rather place it directly in a template, the skill also gives you the <code>&lt;script&gt;</code>-wrapped version.</p>



<h2 class="wp-block-heading">Sample Output</h2>



<p class="wp-block-paragraph">Here&#8217;s the full output the skill produced for a real page: Wrike&#8217;s guide article &#8220;21 Project Management Tools I Tested in 2026,&#8221; a long listicle that reviews and compares 21 named tools.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p class="wp-block-paragraph"><strong>Schema Markup: 21 Project Management Tools I Tested in 2026</strong></p>



<p class="wp-block-paragraph">Mode: Live</p>



<p class="wp-block-paragraph"><strong>Page Classification</strong> Review listicle published by an organization, part of a larger guide. Selected types: Organization (Wrike, as publisher), WebSite, WebPage, Article, Person (author), ImageObject, ItemList (the ranked tools), BreadcrumbList, FAQPage.</p>



<p class="wp-block-paragraph"><strong>Entity Verification Report</strong></p>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Entity</th><th>Role</th><th>Verified Reference</th><th>Status</th></tr></thead><tbody><tr><td>Project management software</td><td>Page topic (<code>about</code>); publisher&#8217;s field (<code>knowsAbout</code>)</td><td>wikidata.org/entity/Q167035</td><td>Verified</td></tr><tr><td>Project management</td><td>Publisher&#8217;s field (<code>knowsAbout</code>)</td><td>wikidata.org/entity/Q179012</td><td>Verified</td></tr><tr><td>Wrike</td><td>Publisher (<code>sameAs</code>)</td><td>wikidata.org/entity/Q7337792</td><td>Verified</td></tr><tr><td>Asana, monday.com, Adobe Workfront, Smartsheet, Jira, ClickUp, Microsoft Project, Basecamp, Trello</td><td>Reviewed tools (<code>mentions</code>)</td><td>Each linked to its own verified Wikidata entry</td><td>Verified</td></tr><tr><td>Gantt chart, Kanban, Scrum, Agile software development</td><td>Methodologies discussed (<code>mentions</code>)</td><td>Each linked to its own verified Wikidata entry</td><td>Verified</td></tr><tr><td>Artem Gurnov</td><td>Author</td><td>none</td><td>Not linked (see note)</td></tr></tbody></table></figure>



<p class="wp-block-paragraph">A few things worth pointing out, because they show the skill making real decisions rather than just emitting tags.</p>



<p class="wp-block-paragraph">The publisher&#8217;s field is grounded, not just its brand. Wrike&#8217;s <code>knowsAbout</code> points at project management software and the project management discipline, the field it actually operates in. That is the difference between schema that says &#8220;this is a company called Wrike&#8221; and schema that says &#8220;this is a company whose expertise is project management software.&#8221; The second is what helps a search engine or AI system place the page.</p>



<p class="wp-block-paragraph">The reviewed tools are treated as what the page is actually about. On a best-tools listicle, the products are the subject, so the skill grounds every one of them with a verified <code>sameAs</code> and captures the ranking in an <code>ItemList</code>. It does not leave them as bare text, and it does not skip the ones that were inconvenient to find.</p>



<p class="wp-block-paragraph">It typed those tool references carefully, and this is the subtle part. Each one is a plain <code>Thing</code> carrying a <code>sameAs</code>, not a <code>SoftwareApplication</code>. That matters because <code>SoftwareApplication</code> is a rich-result type, and Google&#8217;s validator requires it to carry at least two of <code>offers</code>, <code>aggregateRating</code>, <code>applicationCategory</code>, or <code>operatingSystem</code>. The page doesn&#8217;t state real per-product prices or first-party ratings, so typing each competitor as a <code>SoftwareApplication</code> would either throw validation errors or pressure you into inventing data. A plain <code>Thing</code> with a verified <code>sameAs</code> grounds the entity just as well (the Wikidata entry for Asana already says it&#8217;s project management software) and keeps the markup clean. Reserve the rich-result types for the page&#8217;s actual primary subject.</p>



<p class="wp-block-paragraph">The author was deliberately not linked. &#8220;Artem Gurnov&#8221; has no Wikidata entry that is confirmably this person, so the skill left his <code>sameAs</code> empty rather than link a stranger who happens to share the name.</p>



<p class="wp-block-paragraph">The restraint held where it mattered. The page cites Capterra ratings for each tool, but those are third-party ratings, and marking them up on Wrike&#8217;s own page would be self-serving, so no <code>aggregateRating</code> was added. The page embeds a video with no visible upload date, and <code>VideoObject</code> requires one, so the skill left the type out rather than fabricate a date. The four real FAQ questions on the page, by contrast, were captured as a genuine <code>FAQPage</code>.</p>



<p class="wp-block-paragraph"><strong>Rich Result Notes</strong> The FAQPage is included because the page has genuine question-and-answer content, but note that Google retired the FAQ rich result in May 2026. The schema still helps with page understanding and AI citation; it just won&#8217;t render an accordion in Google anymore. No <code>VideoObject</code> or <code>aggregateRating</code> was fabricated to chase a rich result the page can&#8217;t honestly support.</p>



<p class="wp-block-paragraph"><strong>JSON-LD</strong> (the complete output, nothing trimmed)</p>






<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@graph": &#91;
    {
      "@type": "Organization",
      "@id": "https://www.wrike.com/#organization",
      "name": "Wrike",
      "url": "https://www.wrike.com/",
      "logo": {
        "@type": "ImageObject",
        "@id": "https://www.wrike.com/#logo",
        "url": "https://web-static.wrike.com/tp/storage/uploads/ea10c069-f31a-4c6f-ad2e-8e5998be5a95/wrike-website-logo-light.svg",
        "caption": "Wrike"
      },
      "image": { "@id": "https://www.wrike.com/#logo" },
      "description": "Wrike is a collaborative work management platform used by more than 20,000 companies and 2.4 million people in over 140 countries.",
      "sameAs": &#91;
        "https://en.wikipedia.org/wiki/Wrike",
        "https://www.wikidata.org/entity/Q7337792",
        "https://www.google.com/search?kgmid=/m/027r1c3",
        "https://twitter.com/wrike",
        "https://www.youtube.com/channel/UCiXd-ioJzcXqPWB8osZy6vA"
      ],
      "knowsAbout": &#91;
        { "@type": "Thing", "name": "Project management software", "sameAs": "https://www.wikidata.org/entity/Q167035" },
        { "@type": "Thing", "name": "Project management", "sameAs": "https://www.wikidata.org/entity/Q179012" },
        { "@type": "Thing", "name": "Work management" },
        { "@type": "Thing", "name": "Task management" },
        { "@type": "Thing", "name": "Team collaboration" }
      ]
    },
    {
      "@type": "WebSite",
      "@id": "https://www.wrike.com/#website",
      "name": "Wrike",
      "url": "https://www.wrike.com/",
      "publisher": { "@id": "https://www.wrike.com/#organization" },
      "potentialAction": {
        "@type": "SearchAction",
        "target": { "@type": "EntryPoint", "urlTemplate": "https://www.wrike.com/search/?q={search_term_string}" },
        "query-input": "required name=search_term_string"
      }
    },
    {
      "@type": "Person",
      "@id": "https://www.wrike.com/#/schema/person/artem-gurnov",
      "name": "Artem Gurnov",
      "jobTitle": "Director of Account Development",
      "worksFor": { "@id": "https://www.wrike.com/#organization" },
      "image": "https://web-static.wrike.com/cdn-cgi/image/format=auto/tp/storage/uploads/00bbddf6-a1bd-44ed-af14-113ce02c50c8/artem-gurnov.png"
    },
    {
      "@type": "WebPage",
      "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#webpage",
      "url": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/",
      "name": "21 Project Management Tools I Tested in 2026",
      "description": "I tested 21 project management tools and reviewed pricing, features, and team fit across different use cases.",
      "isPartOf": { "@id": "https://www.wrike.com/#website" },
      "about": { "@type": "Thing", "name": "Project management software", "sameAs": "https://www.wikidata.org/entity/Q167035" },
      "primaryImageOfPage": { "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#primaryimage" },
      "dateModified": "2026-04-27",
      "breadcrumb": { "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#breadcrumb" },
      "mainEntity": { "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#article" },
      "hasPart": { "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#faq" }
    },
    {
      "@type": "ImageObject",
      "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#primaryimage",
      "url": "https://www.wrike.com/tp/storage/uploads/58488fef-e405-4971-8e5e-2cc1566fdf92/meta-banner-wrike.png"
    },
    {
      "@type": "Article",
      "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#article",
      "isPartOf": { "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#webpage" },
      "mainEntityOfPage": { "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#webpage" },
      "headline": "21 Project Management Tools I Tested in 2026",
      "description": "I tested 21 project management tools and reviewed pricing, features, and team fit across different use cases.",
      "image": { "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#primaryimage" },
      "author": { "@id": "https://www.wrike.com/#/schema/person/artem-gurnov" },
      "publisher": { "@id": "https://www.wrike.com/#organization" },
      "dateModified": "2026-04-27",
      "about": { "@type": "Thing", "name": "Project management software", "sameAs": "https://www.wikidata.org/entity/Q167035" },
      "mentions": &#91;
        { "@type": "Thing", "name": "Gantt chart", "sameAs": "https://www.wikidata.org/entity/Q192847" },
        { "@type": "Thing", "name": "Kanban", "sameAs": "https://www.wikidata.org/entity/Q180591" },
        { "@type": "Thing", "name": "Scrum", "sameAs": "https://www.wikidata.org/entity/Q460387" },
        { "@type": "Thing", "name": "Agile software development", "sameAs": "https://www.wikidata.org/entity/Q30232" },
        { "@type": "Thing", "name": "Asana", "sameAs": "https://www.wikidata.org/entity/Q4468073" },
        { "@type": "Thing", "name": "monday.com", "sameAs": "https://www.wikidata.org/entity/Q65073369" },
        { "@type": "Thing", "name": "Adobe Workfront", "sameAs": "https://www.wikidata.org/entity/Q4812278" },
        { "@type": "Thing", "name": "Smartsheet", "sameAs": "https://www.wikidata.org/entity/Q7544153" },
        { "@type": "Thing", "name": "Jira", "sameAs": "https://www.wikidata.org/entity/Q1359246" },
        { "@type": "Thing", "name": "ClickUp", "sameAs": "https://www.wikidata.org/entity/Q123524232" },
        { "@type": "Thing", "name": "Microsoft Project", "sameAs": "https://www.wikidata.org/entity/Q80336" },
        { "@type": "Thing", "name": "Basecamp", "sameAs": "https://www.wikidata.org/entity/Q2364173" },
        { "@type": "Thing", "name": "Trello", "sameAs": "https://www.wikidata.org/entity/Q7838011" }
      ]
    },
    {
      "@type": "ItemList",
      "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#toolslist",
      "name": "The 11 best project management tools",
      "itemListOrder": "https://schema.org/ItemListOrderAscending",
      "numberOfItems": 11,
      "itemListElement": &#91;
        { "@type": "ListItem", "position": 1, "name": "Wrike" },
        { "@type": "ListItem", "position": 2, "name": "Asana" },
        { "@type": "ListItem", "position": 3, "name": "monday.com" },
        { "@type": "ListItem", "position": 4, "name": "Adobe Workfront" },
        { "@type": "ListItem", "position": 5, "name": "Smartsheet" },
        { "@type": "ListItem", "position": 6, "name": "Jira" },
        { "@type": "ListItem", "position": 7, "name": "ClickUp" },
        { "@type": "ListItem", "position": 8, "name": "Microsoft Project" },
        { "@type": "ListItem", "position": 9, "name": "Basecamp" },
        { "@type": "ListItem", "position": 10, "name": "Trello" },
        { "@type": "ListItem", "position": 11, "name": "Zoho Projects" }
      ]
    },
    {
      "@type": "BreadcrumbList",
      "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#breadcrumb",
      "itemListElement": &#91;
        { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://www.wrike.com/" },
        { "@type": "ListItem", "position": 2, "name": "Project Management Guide", "item": "https://www.wrike.com/project-management-guide/" },
        { "@type": "ListItem", "position": 3, "name": "FAQ", "item": "https://www.wrike.com/project-management-guide/faq/" },
        { "@type": "ListItem", "position": 4, "name": "Tools", "item": "https://www.wrike.com/project-management-guide/faq/category/tools-2/" },
        { "@type": "ListItem", "position": 5, "name": "21 Project Management Tools I Tested in 2026" }
      ]
    },
    {
      "@type": "FAQPage",
      "@id": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/#faq",
      "url": "https://www.wrike.com/project-management-guide/faq/what-are-project-management-tools/",
      "isPartOf": { "@id": "https://www.wrike.com/#website" },
      "mainEntity": &#91;
        { "@type": "Question", "name": "What is a project management tool?", "acceptedAnswer": { "@type": "Answer", "text": "A project management tool is software designed to help an individual or team manage and organize their projects and tasks. It is usually available free or for a fee, often as a platform or in-browser application." } },
        { "@type": "Question", "name": "What are some project planning tool features?", "acceptedAnswer": { "@type": "Answer", "text": "Common features include planning and scheduling (tasks, subtasks, folders, templates, workflows, and calendars); collaboration (assigning tasks, comments, dashboards, and proofing or approvals); documentation (file editing, versioning, and storage); and evaluation (tracking productivity through resource management and reporting)." } },
        { "@type": "Question", "name": "What are program management tools?", "acceptedAnswer": { "@type": "Answer", "text": "Program management tools are similar to project management tools but operate at a higher level. Projects have clear start and end dates and short-term deliverables, while programs combine several interconnected projects to achieve a long-term business objective. Program management tools include features such as flexible work views, cross-functional resource management, dashboards, reporting, Gantt charts, and timesheets." } },
        { "@type": "Question", "name": "What are the benefits of project management tools?", "acceptedAnswer": { "@type": "Answer", "text": "Project management tools can improve productivity, profit margins, and overall team efficiency. They typically fall into categories such as real-time instant messaging tools, knowledge-based tools, and file-saving tools, and offer features like custom request forms, Gantt charts, and automated real-time updates across departments and locations." } }
      ]
    }
  ]
}</code></pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p class="wp-block-paragraph">That is the entire output. Nine nodes in one connected graph: Wrike as publisher, the website, the page, the article, the author, the primary image, the ranked tool list, the breadcrumb trail, and the FAQ, all cross-referenced by <code>@id</code> so nothing is defined twice. Sixteen distinct entities are grounded to verified Wikidata entries, and every one was confirmed against a real entry before it went in. Paste it into the <code>custom_schema</code> field and the page is done.</p>
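
<p class="wp-block-paragraph">If you want to sanity-check a graph like this before pasting it anywhere, one option is to confirm that every <code>@id</code> reference points at a node that is actually defined. The Python below is a small sketch of that check and is not part of the skill. It assumes a bare <code>{"@id": ...}</code> object is a reference and anything with more keys is a definition. The function names are hypothetical.</p>

```python
def walk(value, defined, referenced):
    """Record which @id values are defined (node has other keys) and which are only referenced."""
    if isinstance(value, dict):
        node_id = value.get("@id")
        if node_id is not None:
            if len(value) == 1:
                referenced.add(node_id)  # a bare {"@id": ...} is a pointer
            else:
                defined.add(node_id)  # a node with properties is a definition
        for child in value.values():
            walk(child, defined, referenced)
    elif isinstance(value, list):
        for child in value:
            walk(child, defined, referenced)


def dangling_references(document):
    """Return the sorted @id values that are referenced but never defined."""
    defined, referenced = set(), set()
    walk(document, defined, referenced)
    return sorted(referenced - defined)
```

<p class="wp-block-paragraph">Load the JSON-LD with <code>json.loads</code> and pass the result to <code>dangling_references</code>. Run against the sample above, it should come back empty, since every referenced node is defined somewhere in the graph.</p>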



<h2 class="wp-block-heading">The Takeaway</h2>



<p class="wp-block-paragraph">Schema is one of the most underused signals in both traditional and AI search. The <a href="https://theseopub.com/ai-search-is-still-seo/">AirOps study</a> showed a measurable citation advantage for pages with schema, and connecting that schema into a single coherent graph is the current best practice for how it&#8217;s structured.</p>



<p class="wp-block-paragraph">The barrier has always been the work. Generating accurate, connected, entity-linked JSON-LD by hand is tedious, and the tools that automate it tend to either produce flat, disconnected blocks or invent entity links that don&#8217;t hold up. This skill does the work, connects the graph, and verifies the links before it hands them to you.</p>



<p class="wp-block-paragraph">Install it, point it at a page, and check the verification report. The schema you get back is yours to keep, in a custom field, independent of any plugin.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Entity Disambiguation: How Google Figures Out Which &#8220;Apple&#8221; You Mean</title>
		<link>https://theseopub.com/entity-disambiguation/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 02 Jun 2026 12:45:00 +0000</pubDate>
				<category><![CDATA[SEO]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3524</guid>

					<description><![CDATA[Search for &#8220;John Williams&#8221; and Google has a problem. There&#8217;s the composer who scored Star Wars. There&#8217;s a professional wrestler. There&#8217;s a venture capitalist. Same name, three different people, plus an unknown number of less famous John Williamses. Search for &#8220;Python&#8221; and the problem repeats. The programming language, the snake, the Monty Python comedy troupe. [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Search for &#8220;John Williams&#8221; and Google has a problem. There&#8217;s the composer who scored Star Wars. There&#8217;s a professional wrestler. There&#8217;s a venture capitalist. Same name, three different people, plus an unknown number of less famous John Williamses.</p>



<p class="wp-block-paragraph">Search for &#8220;Python&#8221; and the problem repeats. The programming language, the snake, the Monty Python comedy troupe. Search for &#8220;Mercury&#8221; and you could mean the planet, the element, the car brand, or the musician. Search for &#8220;Apple&#8221; and you could mean the company or the fruit.</p>



<p class="wp-block-paragraph">This is the entity disambiguation problem, and solving it is core to how modern search works. Google has to figure out which specific entity you mean, and which specific entity each page is about, before it can match the two. This isn&#8217;t new, and it isn&#8217;t speculative. Google has held patents on it for years, and a major change to the Knowledge Graph in 2025 shows they&#8217;re investing in it more heavily than ever.</p>



<h2 class="wp-block-heading">Named Entities Are Most of Search</h2>



<p class="wp-block-paragraph">A named entity is a thing with a proper name. A person, a place, a company, a product, an organization. Not &#8220;coffee shop&#8221; but &#8220;Starbucks.&#8221; Not &#8220;president&#8221; but &#8220;George Washington.&#8221;</p>



<p class="wp-block-paragraph">These aren&#8217;t an edge case. According to Microsoft research cited in Google&#8217;s own patent work, 20 to 30% of queries submitted to search engines are themselves named entities, and around 71% of queries contain a named entity. Most of what people search for involves a specific person, place, or thing, not a generic concept.</p>



<p class="wp-block-paragraph">That makes disambiguation essential. If most queries involve named entities, and many of those names are shared by multiple entities, then a search engine that can&#8217;t tell the entities apart is matching strings, not meaning. Google moved past string matching a long time ago.</p>



<h2 class="wp-block-heading">What the Patent Actually Describes</h2>



<p class="wp-block-paragraph">The foundational patent is US9135238B2, &#8220;Disambiguation of named entities,&#8221; filed by Google in June 2006 and granted in September 2015. The inventors, Razvan Bunescu and Marius Pasca, also published an academic paper on the same approach in 2006, so we have a clear picture of the thinking.</p>



<p class="wp-block-paragraph">The patent describes disambiguating named entities using a knowledge base of articles about those entities. At the time, the knowledge base was Wikipedia, referred to in the patent as an &#8220;exemplary knowledge base.&#8221; Today the equivalent is Google&#8217;s Knowledge Graph, supplemented by Wikidata and Wikipedia.</p>



<p class="wp-block-paragraph">The system builds a disambiguation scoring model from several features of that knowledge base:</p>



<ul class="wp-block-list">
<li>Article titles that identify specific entities</li>



<li>Redirect pages that map aliases to a canonical entity (Mark Twain redirects to Samuel Clemens)</li>



<li>Disambiguation pages that list the different senses of an ambiguous name</li>



<li>Hyperlinks between articles, which establish context and relationships</li>



<li>Categories assigned to each entity</li>
</ul>



<p class="wp-block-paragraph">When a query contains an entity name, the system uses the scoring model to identify which article, and therefore which specific entity, the name most likely refers to, based on the other context present. It can then group or organize results by the correct sense of the name.</p>



<p class="wp-block-paragraph">The key insight is that an entity&#8217;s identity is established by its context: how it&#8217;s linked, what it&#8217;s linked to, what categories it belongs to, and what other entities appear alongside it.</p>



<h2 class="wp-block-heading">Two Directions of the Same Problem</h2>



<p class="wp-block-paragraph">A later Google patent on entity metrics draws a useful distinction between two related problems.</p>



<p class="wp-block-paragraph"><strong>Differentiation</strong>&nbsp;is the many-to-one case. Multiple names refer to a single entity. &#8220;George Washington,&#8221; &#8220;Geo. Washington,&#8221; &#8220;the first U.S. president,&#8221; and &#8220;General Washington&#8221; all point to the same person. Google needs to recognize that these different strings resolve to one entity.</p>



<p class="wp-block-paragraph"><strong>Disambiguation</strong>&nbsp;is the one-to-many case. A single name refers to multiple possible entities. &#8220;Georgia&#8221; could be the U.S. state or the country. &#8220;New York&#8221; could be the city or the state. &#8220;John Williams&#8221; could be any of several people. Google needs to recognize which specific entity is meant in a given context.</p>



<p class="wp-block-paragraph">Both problems are solved the same way: through context and through connection to a knowledge base where each entity has a unique identity. In the Knowledge Graph, that unique identity is an entity ID. In Wikidata, it&#8217;s a Q-number (the entity for Portugal is Q45, machine learning is Q2539). These identifiers let Google reference a specific entity unambiguously, even when the human-readable name is shared by many things.</p>



<h2 class="wp-block-heading">How Google Tells Entities Apart</h2>



<p class="wp-block-paragraph">A 2017 Google patent on knowledge-based entity detection describes tagging entities in web pages with unique identifiers that unambiguously identify them, and attaching a confidence score to each disambiguation. The signals that drive that confidence come down to context, and the strongest contextual signal is the other entities present.</p>



<p class="wp-block-paragraph">A page that mentions Apple alongside iPhone, Tim Cook, and Cupertino is clearly about the company. A page that mentions apple alongside orchard, harvest, and pie is clearly about the fruit. Neither page has to state which one it means. The co-occurring entities resolve the ambiguity.</p>



<p class="wp-block-paragraph">This is the practical core of disambiguation. Google identifies the candidate entities a name could refer to, then uses the surrounding context to score which candidate is most likely, then assigns the page to that entity. The more clearly your context points to one specific entity, the higher the confidence and the less room for Google to guess wrong.</p>



<h2 class="wp-block-heading">Why This Matters More Now</h2>



<p class="wp-block-paragraph">Entity disambiguation has long been part of how Google works, but the company has recently made it a visible priority.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://searchengineland.com/google-great-clarity-cleanup-knowledge-graph-ai-future-460836" target="_blank">According to Knowledge Graph tracking from Kalicube, published by its founder Jason Barnard in Search Engine Land</a>, Google ran what&#8217;s been called a &#8220;clarity cleanup&#8221; in June 2025. Over two updates in a single week, the Knowledge Graph contracted by 6.26%, removing more than 3 billion entities. It was the largest contraction in a decade. Ambiguous &#8220;Thing&#8221; entities dropped by around 15%, and temporary event entities (many added during the pandemic) were hit hardest, with close to 77% removed.</p>



<p class="wp-block-paragraph">The interpretation, which Google hasn&#8217;t officially confirmed, is that this was a deliberate move to trade volume for clarity: a leaner, higher-confidence set of entities to underpin AI features like AI Overviews and AI Mode. The logic tracks. If AI-generated answers are going to cite and rely on entities, those entities need to be unambiguous and well-defined. Ambiguous entities are a liability when a system is generating direct answers rather than just ranking links.</p>



<p class="wp-block-paragraph">The practical takeaway: if Google is prioritizing high-confidence, clearly disambiguated entities, then content tied to vague or poorly defined entities is at a disadvantage. Clear entity identity is no longer just a ranking nicety. It&#8217;s increasingly a requirement for visibility in both traditional and AI search. This connects to the&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/ai-search-is-still-seo/" target="_blank">AirOps study</a>&nbsp;covered a few weeks ago, where entity recognition correlated with citation likelihood in AI search.</p>



<h2 class="wp-block-heading">What You Can Actually Do</h2>



<p class="wp-block-paragraph">Disambiguation is something you can influence. The goal is to make Google&#8217;s job easy by removing ambiguity about which entity your content is about. A few concrete things help.</p>



<p class="wp-block-paragraph"><strong>Surround your entity with related entities.</strong>&nbsp;This is the strongest signal and the easiest to apply. If your page is about your software company, make sure the page also mentions the entities that establish that context: your product names, your category (project management, email marketing, whatever it is), your integration partners, your competitors, your founders. Don&#8217;t leave Google to guess from a bare brand name.</p>



<p class="wp-block-paragraph"><strong>Use the sameAs property in your schema.</strong>&nbsp;Link your entity to its authoritative profiles: Wikipedia, Wikidata, LinkedIn, Crunchbase, official social accounts. This gives Google a verification anchor that removes ambiguity directly. Instead of inferring which entity you are, Google gets an explicit statement that the entity on your site is the same as a specific, known entity elsewhere. This is the entity-linking layer that templated plugin schema doesn&#8217;t handle, and it&#8217;s worth doing by hand for your most important pages.</p>
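


<p class="wp-block-paragraph">As a rough illustration, here is what a minimal Organization block with sameAs might look like. Everything in it is a placeholder assumed for the example: the company name, the example.com URLs, the profile URLs, and the Wikidata Q-number. Swap in your own entity&#8217;s real profiles before using it.</p>



<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://example.com/#organization",
  "name": "Example Software Co.",
  "url": "https://example.com/",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_Software_Co.",
    "https://www.wikidata.org/wiki/Q00000000",
    "https://www.linkedin.com/company/example-software-co/",
    "https://www.crunchbase.com/organization/example-software-co"
  ]
}</code></pre>



<p class="wp-block-paragraph">Only list profiles that exist and clearly belong to the entity. A sameAs link that points at the wrong profile adds ambiguity instead of removing it.</p>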



<p class="wp-block-paragraph"><strong>Match your schema type to your entity.</strong>&nbsp;A page with Organization schema declaring a company is read differently than a page with Recipe schema mentioning an ingredient. The schema type sets expectations that help Google interpret the content correctly.</p>



<p class="wp-block-paragraph"><strong>Get a Wikidata entry if your entity qualifies.</strong>&nbsp;A Wikidata item with a Q-number gives your entity a machine-readable identity in a database Google and the major AI systems all use. Once it exists, your sameAs schema can point directly to it. Note that Wikidata has notability requirements and entries that don&#8217;t meet them get deleted, so this applies to entities with genuine independent coverage, not every small business.</p>



<p class="wp-block-paragraph"><strong>Be consistent with naming and build external corroboration.</strong>&nbsp;Use the same entity name consistently across your site and your off-site presence. Google builds confidence in an entity&#8217;s identity when multiple trusted sources corroborate the same facts. Brand mentions, industry directory listings, and coverage in credible publications all reinforce entity recognition, even when they don&#8217;t link to you.</p>



<h2 class="wp-block-heading">The Takeaway</h2>



<p class="wp-block-paragraph">The same-name problem isn&#8217;t new and it isn&#8217;t going away. Google has been working on entity disambiguation since at least 2006, and the 2025 Knowledge Graph cleanup shows the company prioritizing clear, high-confidence entities for the AI era rather than backing off.</p>



<p class="wp-block-paragraph">Your job is to make the disambiguation easy. Establish clear context around your entities. Connect them explicitly to the knowledge bases Google already trusts. Be consistent. The less Google has to guess about which entity your content is about, the more reliably it can match your content to the right searches, in traditional results and in AI answers alike.</p>



<p class="wp-block-paragraph">When Google encounters your &#8220;Apple,&#8221; there should be no question which one you mean.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to Add Custom Schema to WordPress Pages (And Why You Should Own Your Schema, Not Rent It)</title>
		<link>https://theseopub.com/how-to-add-custom-schema-to-wordpress-pages/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 26 May 2026 12:57:00 +0000</pubDate>
				<category><![CDATA[SEO]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3515</guid>

					<description><![CDATA[Most WordPress sites running Yoast or Rank Math have schema handled to some degree. The plugins do real work. They output a graph that includes WebPage, WebSite, Organization, BreadcrumbList, and Article on posts. Rank Math goes further with built-in support for Product, Recipe, Event, LocalBusiness, and several other types out of the box. But here&#8217;s [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Most WordPress sites running Yoast or Rank Math have schema handled to some degree. The plugins do real work. They output a graph that includes WebPage, WebSite, Organization, BreadcrumbList, and Article on posts. Rank Math goes further with built-in support for Product, Recipe, Event, LocalBusiness, and several other types out of the box.</p>



<p class="wp-block-paragraph">But here&#8217;s the problem most site owners don&#8217;t think about. All of that schema lives inside the plugin&#8217;s database structure. The custom Product offers you configured. The event details you entered. The FAQ pairs you filled in. All of it stored in plugin-specific tables or post meta keys that only that plugin knows how to read.</p>



<p class="wp-block-paragraph">The moment you switch away from an SEO plugin, most of that schema work disappears with it. Yes, some plugins offer import options when you move from one to another, but not always. And what if a plugin is deprecated? What then?</p>



<p class="wp-block-paragraph">There&#8217;s a better way. This note covers a simple method to add custom schema to any WordPress page using a custom field and a small function in your child theme. The schema lives with the post, not with the plugin. It survives plugin changes, theme updates, and platform migrations. It gives you complete control over the JSON-LD without UI limitations, doesn&#8217;t require any Pro upgrades, and works with every modern schema type.</p>



<h2 class="wp-block-heading">Why Bother With Schema in the First Place</h2>



<p class="wp-block-paragraph">Three layers to the answer, and they&#8217;re all stronger now than they used to be.</p>



<h3 class="wp-block-heading">Layer 1: Google Rich Results</h3>



<p class="wp-block-paragraph">A lot of the arguments I see against schema mention that you do not need schema to rank in Google Search. While that is absolutely true, schema does enable visible enhancements in Google search results. These show up directly in the SERP, increase CTR, and take up more real estate. The current major rich result types still active in Google:</p>



<ul class="wp-block-list">
<li>Review stars on product pages, recipes, software, and local business listings (Review and AggregateRating schema)</li>



<li>Product availability, pricing, and stock indicators (Product schema with Offer)</li>



<li>Recipe details including cook time, calories, ratings, and image (Recipe schema)</li>



<li>Event listings with dates and locations (Event schema)</li>



<li>Video carousels (VideoObject schema)</li>



<li>Job postings in Google&#8217;s job board (JobPosting schema)</li>



<li>Breadcrumb paths in the SERP listing (BreadcrumbList schema)</li>



<li>Sitelinks search box (WebSite with SearchAction)</li>



<li>Local business info with hours, address, ratings (LocalBusiness schema)</li>



<li>Movie, TV, and book listings</li>
</ul>



<p class="wp-block-paragraph">These deliver immediate CTR benefits and more SERP real estate. The visible payoff.</p>



<h3 class="wp-block-heading">Layer 2: What Google Still Uses Schema For (Even Without Rich Results)</h3>



<p class="wp-block-paragraph">Google has been narrowing the rich results landscape. HowTo rich results were deprecated in September 2023. FAQ rich results were officially discontinued on May 7, 2026, with Search Console support being phased out through August 2026.</p>



<p class="wp-block-paragraph">But Google explicitly said in the FAQ deprecation notice that they will continue using FAQPage structured data to better understand pages. The visible rich result is gone. The semantic signal isn&#8217;t. FAQ and HowTo schema still help Google understand what your page is about. They just don&#8217;t earn a visual SERP enhancement in Google anymore.</p>



<h3 class="wp-block-heading">Layer 3: Other Platforms and AI Search</h3>



<p class="wp-block-paragraph">Again, many opponents of schema will tell you that you don’t need it to rank in Google Search, but Google is not the only platform that can send your website real visitors. These are a few examples beyond Google where schema can improve your visibility across the internet:</p>



<p class="wp-block-paragraph"><strong>Bing</strong> uses the same Schema.org vocabulary as Google. By extension, ChatGPT Search uses Bing&#8217;s index for retrieval, so Bing schema directly affects ChatGPT visibility.</p>



<p class="wp-block-paragraph"><strong>Pinterest Rich Pins</strong> use schema markup for three pin types: Product (price, availability, description), Recipe (ingredients, cook time, ratings), and Article (headline, description, author). For e-commerce, food, and publishing sites, Rich Pins drive measurable additional traffic.</p>



<p class="wp-block-paragraph"><strong>Voice assistants</strong> (Google Assistant, Siri, Alexa) use structured data to extract concise answers for spoken queries.</p>



<p class="wp-block-paragraph"><strong>AI search citation rates</strong> are where the argument lands hardest right now. The <a href="https://theseopub.com/ai-search-is-still-seo/">AirOps and Kevin Indig study</a> covered a few weeks ago found that pages with JSON-LD schema markup have a 6.5 percentage point citation advantage (38.5% vs 32%). The top-performing types: MedicalWebPage at 47%, BreadcrumbList at 46.2%, FAQPage at 45.6%, Organization at 44.3%, WebSite at 40.6%. FAQPage drives AI citation even though Google killed the rich result.</p>



<h2 class="wp-block-heading">What Yoast and Rank Math Actually Do (And Where They Fall Short)</h2>



<p class="wp-block-paragraph">To be fair, both plugins do real work and this note isn&#8217;t pretending otherwise.</p>



<p class="wp-block-paragraph"><strong>Yoast</strong> outputs a sophisticated schema graph with WebPage, WebSite, Organization, BreadcrumbList, Article, Person/Author, and Image, all linked through @id references. It supports nesting and linked entities. It includes FAQ and HowTo blocks that auto-add schema (though both rich results are now gone in Google). For schema types beyond the base graph, Yoast relies on third-party plugin integrations (Easy Digital Downloads, WP Recipe Maker, The Events Calendar) or developer access to their Schema API.</p>



<p class="wp-block-paragraph"><strong>Rank Math (free)</strong> covers more schema types out of the box than Yoast: Article, Book, Course, Event, JobPosting, LocalBusiness, Music, Person, Product, Recipe, Restaurant, Service, SoftwareApplication, Video. It includes a fill-in-the-blanks Schema Generator for these types.</p>



<p class="wp-block-paragraph"><strong>Rank Math Pro</strong> adds a Custom Schema Builder, Schema Templates, and the ability to extend existing schema with additional properties.</p>



<p class="wp-block-paragraph">So both plugins handle a lot. The issues come down to six things.</p>



<p class="wp-block-paragraph"><strong>1. Auto-populated schema often misses page-specific details.</strong> When Rank Math outputs Product schema for a page, it&#8217;s pulling from generic fields. Specific offer details, GTIN, brand information, multiple price points, return policies, individual reviews tied to ratings: these often aren&#8217;t captured by default or require manual entry in plugin-specific fields. The schema is technically present but isn&#8217;t communicating the full specificity of what&#8217;s on the page.</p>



<p class="wp-block-paragraph"><strong>2. Advanced features are gated behind Pro versions.</strong> Rank Math&#8217;s Custom Schema Builder, Schema Templates, and schema extension features all require Rank Math Pro. The free version handles common types but not custom or extended ones.</p>



<p class="wp-block-paragraph"><strong>3. Yoast requires either compatible third-party plugins or developer code for most specific types.</strong> If you want Product schema but you&#8217;re not using WooCommerce, Recipe schema but you&#8217;re not using WP Recipe Maker, Event schema but you&#8217;re not using The Events Calendar, you&#8217;re either installing additional plugins or writing code against Yoast&#8217;s Schema API.</p>



<p class="wp-block-paragraph"><strong>4. You can&#8217;t easily test new or rarely used schema types.</strong> Schema.org defines hundreds of types and properties, and the vocabulary keeps expanding. SEO plugins support a fraction of what&#8217;s available, focused on the most common types.&nbsp;</p>



<p class="wp-block-paragraph">If you want to test something less common (Dataset for data publishers, ClaimReview for fact-checking, Quiz for educational content, MedicalScholarlyArticle for medical publishing, DiscussionForumPosting, TouristAttraction, or any newer addition to the Schema.org vocabulary), the plugins typically can&#8217;t help.&nbsp;</p>



<p class="wp-block-paragraph">With custom JSON-LD, you can use any schema type you want regardless of whether your plugin supports it. This matters for niche industries, experimental implementations, and getting ahead of trends before everyone else catches on.</p>



<p class="wp-block-paragraph"><strong>5. You can&#8217;t easily add entity linking.</strong> Modern schema can link specific entities to their canonical references in Wikidata, Wikipedia, and Google&#8217;s Knowledge Graph using properties like sameAs, about, and mentions.&nbsp;</p>



<p class="wp-block-paragraph">This is how you tell search engines and AI systems exactly which &#8220;Apple&#8221; or &#8220;Mercury&#8221; or &#8220;Springfield&#8221; your content is referring to. Yoast and Rank Math handle entity linking for the organization or author of the site, but they don&#8217;t have practical UIs for linking the entities mentioned in your actual content.&nbsp;</p>



<p class="wp-block-paragraph">With custom schema, you control exactly which entities your content is about, with explicit references to authoritative entity databases. This is a deeper topic, and I&#8217;ll come back to it in its own note in the future.</p>
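


<p class="wp-block-paragraph">To make the idea concrete, here is a minimal sketch of entity linking for an article about Apple the company. The headline and the example.com URL are placeholders assumed for the example. The Wikipedia and Wikidata URLs are the public pages for those entities, but confirm any identifier yourself before relying on it.</p>



<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example headline about the latest iPhone",
  "url": "https://example.com/latest-iphone-article/",
  "about": {
    "@type": "Organization",
    "name": "Apple Inc.",
    "sameAs": [
      "https://en.wikipedia.org/wiki/Apple_Inc.",
      "https://www.wikidata.org/wiki/Q312"
    ]
  },
  "mentions": [
    {
      "@type": "Person",
      "name": "Tim Cook",
      "sameAs": "https://en.wikipedia.org/wiki/Tim_Cook"
    },
    {
      "@type": "City",
      "name": "Cupertino, California",
      "sameAs": "https://en.wikipedia.org/wiki/Cupertino,_California"
    }
  ]
}</code></pre>



<p class="wp-block-paragraph">Here about carries the main entity of the page and mentions carries the supporting entities, each tied to an external reference through sameAs.</p>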



<p class="wp-block-paragraph"><strong>6. Portability and control.</strong> This is the one that matters most over the long run.</p>



<p class="wp-block-paragraph">Schema generated through a plugin lives in the plugin&#8217;s database structure. The custom fields, the configured options, the per-page settings, all of it stored in plugin-specific tables or post meta keys that only that plugin knows how to read.</p>



<p class="wp-block-paragraph">The day you switch plugins, most of that schema work disappears with the old plugin. The Product offers you carefully configured in Rank Math&#8217;s Schema Generator. The FAQ pairs you entered through Yoast&#8217;s FAQ block. The event details you set up in The Events Calendar. All of it can be lost and leave you starting from zero.</p>



<p class="wp-block-paragraph">The same problem exists if the plugin changes how it stores data in a major update. Or if you migrate to a different platform entirely. Or if the plugin gets discontinued. Schema generated through a plugin is rented, not owned. The plugin holds the keys.</p>



<p class="wp-block-paragraph">Schema in a custom field is different. It lives with the post in WordPress&#8217;s native wp_postmeta table. It&#8217;s portable across themes, plugins, and even WordPress versions. Switch SEO plugins tomorrow and your schema is still there. Export your site and your schema goes with it. Migrate to a different platform and the schema can migrate with the content. The schema is owned, not rented.</p>



<p class="wp-block-paragraph">That&#8217;s the real argument for adding custom schema this way, even when your plugin technically supports the schema type you need.</p>



<h2 class="wp-block-heading">How to Add Custom Schema to WordPress</h2>



<p class="wp-block-paragraph">The implementation has six parts: disable plugin schema, set up a child theme (if you are not already using one, you really should be), add a custom field, add a function to your child theme, generate the schema, and validate it.</p>



<h3 class="wp-block-heading">Step 1: Disable Your SEO Plugin&#8217;s Schema Output</h3>



<p class="wp-block-paragraph">Before adding custom schema, disable your existing plugin&#8217;s schema output. Two reasons.</p>



<p class="wp-block-paragraph"><strong>To avoid duplicate or conflicting schema.</strong> If your plugin is outputting Article schema for a post and you add a custom JSON-LD block via the function below, you&#8217;ll have two Article schemas on the page. Google can handle this in some cases, but it&#8217;s not ideal, and it creates ambiguity about which schema is authoritative.</p>



<p class="wp-block-paragraph"><strong>To make the migration to custom schema complete.</strong> The whole point of moving to custom schema is portability. If you leave the plugin&#8217;s schema active, you still have schema living in the plugin&#8217;s database structure. Disabling it entirely means your schema is fully in custom fields, fully under your control, and fully portable.</p>



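<p class="wp-block-paragraph">For Yoast: go to Yoast SEO &gt; Settings &gt; Schema settings to disable schema output at the content type level. To disable it across the whole site, return false from the wpseo_json_ld_output filter, for example by adding <code>add_filter( 'wpseo_json_ld_output', '__return_false' );</code> to your child theme&#8217;s functions.php.</p>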



<p class="wp-block-paragraph">For Rank Math: go to Rank Math &gt; Dashboard &gt; Modules and toggle off the Schema module.</p>



<p class="wp-block-paragraph">If you are using a different plugin, the options will be similar. If you can’t find them, consult your plugin’s documentation.</p>



<p class="wp-block-paragraph">A practical note. Disabling plugin schema is a significant change. Make sure you have replacement schema ready before doing it across your whole site. Don&#8217;t disable on a Friday afternoon. Migrate page by page.&nbsp;</p>



<p class="wp-block-paragraph">Set up the function first, add custom schema to your highest-traffic pages, validate everything is working, then disable the plugin&#8217;s schema output. The goal is full custody of your schema, not a gap where pages have no schema at all.</p>



<h3 class="wp-block-heading">Step 2: Use a Child Theme</h3>



<p class="wp-block-paragraph">Editing functions.php directly on a parent theme means your changes get overwritten the next time the theme updates. Always use a child theme. If you don&#8217;t have one, create one or use a code snippets plugin like Code Snippets or WPCode to run the function instead. Either approach is fine, but I highly recommend using the child theme approach so you are not adding an additional plugin. Plugins add bloat to a site and potential vulnerabilities.</p>



<p class="wp-block-paragraph">The function below works the same way regardless of where it lives.</p>



<h3 class="wp-block-heading">Step 3: Add a Custom Field for Schema</h3>



<p class="wp-block-paragraph">In your page or post editor, scroll down to the Custom Fields section. If you don&#8217;t see it, you may need to enable it under Screen Options at the top of the editor, or install Advanced Custom Fields if your editor doesn&#8217;t expose custom fields easily.</p>



<p class="wp-block-paragraph">Create a custom field with the name custom_schema. You can use a different name if you want. Just be sure to modify the code below to match your field name. The value will be the full JSON-LD schema you want to output on that specific page, without the surrounding &lt;script&gt; tags, which the function in Step 4 adds for you.</p>



<figure class="wp-block-image size-large"><a href="https://theseopub.com/wp-content/uploads/2026/05/create-new-custom-field.png"><img fetchpriority="high" decoding="async" width="1024" height="352" src="https://theseopub.com/wp-content/uploads/2026/05/create-new-custom-field-1024x352.png" alt="" class="wp-image-3518" srcset="https://theseopub.com/wp-content/uploads/2026/05/create-new-custom-field-1024x352.png 1024w, https://theseopub.com/wp-content/uploads/2026/05/create-new-custom-field-300x103.png 300w, https://theseopub.com/wp-content/uploads/2026/05/create-new-custom-field-768x264.png 768w, https://theseopub.com/wp-content/uploads/2026/05/create-new-custom-field.png 1146w" sizes="(max-width: 1024px) 100vw, 1024px" /></a></figure>



<h3 class="wp-block-heading">Step 4: Add the Function to Your Child Theme&#8217;s functions.php File</h3>



<p class="wp-block-paragraph">You can find this file in the WordPress menu under <strong>Appearance &gt; Theme File Editor</strong>.</p>



<p class="wp-block-paragraph">Here&#8217;s the function:</p>



<pre class="wp-block-code"><code>/**
 * Output custom JSON-LD schema in the page head.
 * Reads from the 'custom_schema' custom field on each page or post.
 */
function seopub_output_custom_schema() {
    // Only fire on singular content (pages, posts, custom post types).
    if ( ! is_singular() ) {
        return;
    }

    $schema = get_post_meta( get_the_ID(), 'custom_schema', true );

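    if ( empty( $schema ) ) {
        return;
    }

    // Escape any "&lt;/" so a stray "&lt;/script>" in the field cannot close the
    // tag early. "&lt;\/" is a valid JSON escape, so valid schema is unchanged.
    $schema = str_replace( '&lt;/', '&lt;\/', $schema );

    echo "\n" . '&lt;script type="application/ld+json">' . "\n";
    echo $schema . "\n";
    echo '&lt;/script>' . "\n";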
}
add_action( 'wp_head', 'seopub_output_custom_schema' );</code></pre>



<p class="wp-block-paragraph">Paste the above code into the functions.php file. Remember to change the &#8216;custom_schema&#8217; part in the code if you used a different name for your custom schema field. </p>



<p class="wp-block-paragraph">In plain language, the function checks whether the current request is for a single page, post, or custom post type. If yes, it reads the value of the custom_schema custom field for that specific content. If there&#8217;s a value, it outputs it as a JSON-LD script tag in the page&#8217;s &lt;head&gt;. If there&#8217;s no value, the function exits silently.</p>



<p class="wp-block-paragraph">Schema only appears on pages where you&#8217;ve explicitly added it. No templated output for every page. No plugin database dependency.</p>



<h3 class="wp-block-heading">Step 5: Generate Your Schema</h3>



<p class="wp-block-paragraph">For each page where you want custom schema, build the appropriate JSON-LD. The easiest options:</p>



<ul class="wp-block-list">
<li><a href="https://schema.org">Schema.org</a> directly for the full reference of available types and properties</li>



<li>Schema Markup Generator from Merkle for common types with a fill-in-the-blanks interface</li>



<li>Schema App or other paid tools for more complex implementations</li>



<li>Or my favorite, just use Claude or ChatGPT. Give them the page URL or a content draft and tell them what schema you want on the page.</li>
</ul>



<p class="wp-block-paragraph">Whatever type you&#8217;re implementing, the schema needs to accurately reflect what&#8217;s actually on the page. Don&#8217;t claim review stars you don&#8217;t have. Don&#8217;t list product prices that aren&#8217;t real. Google has been cracking down on schema that doesn&#8217;t match page content.</p>



<p class="wp-block-paragraph">Paste the finished JSON-LD into the custom_schema custom field on the WordPress page or post where you want it to appear.</p>



<figure class="wp-block-image size-large"><a href="https://theseopub.com/wp-content/uploads/2026/05/insert-schema.png"><img decoding="async" width="1024" height="328" src="https://theseopub.com/wp-content/uploads/2026/05/insert-schema-1024x328.png" alt="" class="wp-image-3519" srcset="https://theseopub.com/wp-content/uploads/2026/05/insert-schema-1024x328.png 1024w, https://theseopub.com/wp-content/uploads/2026/05/insert-schema-300x96.png 300w, https://theseopub.com/wp-content/uploads/2026/05/insert-schema-768x246.png 768w, https://theseopub.com/wp-content/uploads/2026/05/insert-schema.png 1142w" sizes="(max-width: 1024px) 100vw, 1024px" /></a></figure>



<figure class="wp-block-image size-large"><a href="https://theseopub.com/wp-content/uploads/2026/05/insert-schema-2.png"><img decoding="async" width="1024" height="319" src="https://theseopub.com/wp-content/uploads/2026/05/insert-schema-2-1024x319.png" alt="" class="wp-image-3520" srcset="https://theseopub.com/wp-content/uploads/2026/05/insert-schema-2-1024x319.png 1024w, https://theseopub.com/wp-content/uploads/2026/05/insert-schema-2-300x93.png 300w, https://theseopub.com/wp-content/uploads/2026/05/insert-schema-2-768x239.png 768w, https://theseopub.com/wp-content/uploads/2026/05/insert-schema-2.png 1149w" sizes="(max-width: 1024px) 100vw, 1024px" /></a></figure>



<h3 class="wp-block-heading">Step 6: Validate Before Publishing</h3>



<p class="wp-block-paragraph">Always validate your schema before relying on it. Two free tools:</p>



<ul class="wp-block-list">
<li>Schema Markup Validator at <a href="https://validator.schema.org">validator.schema.org</a></li>



<li>Google&#8217;s Rich Results Test at <a href="https://search.google.com/test/rich-results">search.google.com/test/rich-results</a></li>
</ul>



<p class="wp-block-paragraph">If the JSON is malformed, both tools will catch it. The Rich Results Test will also tell you whether your schema is eligible for any of Google&#8217;s current rich result types.</p>



<h2 class="wp-block-heading">A Few Practical Examples</h2>



<p class="wp-block-paragraph">What this looks like in practice across common page types:</p>



<p class="wp-block-paragraph">A product page gets Product schema with name, image, description, brand, offers (price, availability, currency), and aggregateRating if you have eligible reviews.</p>
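


<p class="wp-block-paragraph">A minimal Product sketch along those lines could look like this. The product name, brand, URLs, price, and rating figures are all made up for the example, so replace them with what is really on your page.</p>



<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "image": "https://example.com/images/example-widget.jpg",
  "description": "A placeholder description of the example product.",
  "brand": {
    "@type": "Brand",
    "name": "Example Brand"
  },
  "offers": {
    "@type": "Offer",
    "url": "https://example.com/products/example-widget/",
    "price": "49.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "27"
  }
}</code></pre>



<p class="wp-block-paragraph">If the page has no eligible reviews, drop the aggregateRating block entirely rather than leaving placeholder numbers in it.</p>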



<p class="wp-block-paragraph">A service page gets Service or LocalBusiness schema with service areas, hours, contact info, and any relevant offers.</p>
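


<p class="wp-block-paragraph">For a local service page, a LocalBusiness sketch might be shaped like the one below. The business name, phone number, address, service area, and hours are placeholders assumed for illustration.</p>



<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Plumbing Co.",
  "url": "https://example.com/",
  "telephone": "+1-555-555-0100",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Example Street",
    "addressLocality": "Springfield",
    "addressRegion": "IL",
    "postalCode": "62701",
    "addressCountry": "US"
  },
  "areaServed": "Springfield, IL",
  "openingHoursSpecification": [
    {
      "@type": "OpeningHoursSpecification",
      "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
      "opens": "08:00",
      "closes": "17:00"
    }
  ]
}</code></pre>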



<p class="wp-block-paragraph">A blog post with frequently asked questions gets FAQPage schema with the actual Q&amp;A pairs. The rich result is gone in Google, but the schema still helps with understanding and AI citation.</p>
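


<p class="wp-block-paragraph">Here is a small FAQPage sketch. The two questions are illustrative ones drawn from this note. On a real page, each question and answer should match the Q&amp;A text visitors can see.</p>



<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Where is the custom schema stored?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "In a custom field saved with the post in the wp_postmeta table."
      }
    },
    {
      "@type": "Question",
      "name": "Does the schema survive switching SEO plugins?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. It lives with the post, not with the plugin."
      }
    }
  ]
}</code></pre>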



<p class="wp-block-paragraph">An event page gets Event schema with date, location, organizer, and ticket info.</p>
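


<p class="wp-block-paragraph">An Event block could be sketched as follows. The event name, dates, venue, organizer, and ticket price are hypothetical values chosen only to show the shape of the markup.</p>



<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@type": "Event",
  "name": "Example Conference",
  "startDate": "2026-09-15T09:00:00-05:00",
  "endDate": "2026-09-15T17:00:00-05:00",
  "eventAttendanceMode": "https://schema.org/OfflineEventAttendanceMode",
  "location": {
    "@type": "Place",
    "name": "Example Convention Center",
    "address": {
      "@type": "PostalAddress",
      "streetAddress": "456 Example Avenue",
      "addressLocality": "Springfield",
      "addressRegion": "IL",
      "postalCode": "62701",
      "addressCountry": "US"
    }
  },
  "organizer": {
    "@type": "Organization",
    "name": "Example Events Group",
    "url": "https://example.com/"
  },
  "offers": {
    "@type": "Offer",
    "url": "https://example.com/tickets/",
    "price": "199.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}</code></pre>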



<p class="wp-block-paragraph">A how-to article gets HowTo schema. Same situation as FAQ. Google killed the rich result, but the schema still helps AI systems and other platforms parsing the content.</p>
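


<p class="wp-block-paragraph">A bare-bones HowTo sketch, using two steps from this note as illustrative content, might look like this. Real markup should list every step that appears on the page.</p>



<pre class="wp-block-code"><code>{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to Add Custom Schema to a WordPress Page",
  "step": [
    {
      "@type": "HowToStep",
      "name": "Create the custom field",
      "text": "Create a custom field named custom_schema on the page or post."
    },
    {
      "@type": "HowToStep",
      "name": "Paste the schema",
      "text": "Paste the JSON-LD for the page into the field value and save."
    }
  ]
}</code></pre>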



<p class="wp-block-paragraph">Adding SEO entities to schema markup translates your website&#8217;s content into a language search engines understand. Instead of relying on keyword matching, this semantic approach accurately defines your brand&#8217;s people, places, and concepts, allowing you to control how machines interpret your data. (If you are unfamiliar with this practice, it is something we will cover in a future note.)</p>



<p class="wp-block-paragraph">The point isn&#8217;t to add schema for the sake of adding schema. It&#8217;s to communicate the specific structured information that&#8217;s actually present on the page, in a format machines can read, in a way that lives with your content rather than your plugin.</p>



<h2 class="wp-block-heading">The Takeaway</h2>



<p class="wp-block-paragraph">Yoast, Rank Math, and similar plugins handle schema. They do real work. This note isn&#8217;t arguing against using them for everything else they do.</p>



<p class="wp-block-paragraph">But the schema they generate lives in their database structures. The day you switch plugins or platforms, that schema work disappears with them. The Product offers you configured, the events you set up, the FAQ pairs you entered, the entity relationships you added. All of it tied to a plugin you may not be running in three years.</p>



<p class="wp-block-paragraph">Custom schema in a custom field is different. It lives with the post. It survives plugin changes, theme updates, and platform migrations. It gives you complete control over the JSON-LD without UI limitations or Pro upgrades. And it works for every modern schema type, including the ones that drive Google rich results, Pinterest Rich Pins, voice assistant answers, and AI search citation.</p>



<p class="wp-block-paragraph">A custom field, a small function in your child theme, and a few minutes per page. The schema you write today will still be there in five years, regardless of what plugins you&#8217;re running.</p>



<p class="wp-block-paragraph">If you&#8217;re going to invest the time to add custom schema, own it.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How AI Visibility Tools Actually Know What People Are Asking ChatGPT</title>
		<link>https://theseopub.com/how-ai-visibility-tools-actually-know-what-people-are-asking-chatgpt/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 19 May 2026 12:55:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3512</guid>

					<description><![CDATA[How AI Visibility Tools Actually Know What People Are Asking ChatGPT If you&#8217;ve used any AI visibility tool (Semrush&#8217;s AI Visibility Toolkit, Ahrefs Brand Radar, Otterly, Peec, SE Ranking, HubSpot AEO, Profound, Promptmonitor), you&#8217;ve probably seen claims like &#8220;13.5 million prompts tracked&#8221; or &#8220;239 million prompts in our database.&#8221; Most SEOs accept these numbers at [&#8230;]]]></description>
										<content:encoded><![CDATA[
<h1 class="wp-block-heading">How AI Visibility Tools Actually Know What People Are Asking ChatGPT</h1>



<p class="wp-block-paragraph">If you&#8217;ve used any AI visibility tool (Semrush&#8217;s AI Visibility Toolkit, Ahrefs Brand Radar, Otterly, Peec, SE Ranking, HubSpot AEO, Profound, Promptmonitor), you&#8217;ve probably seen claims like &#8220;13.5 million prompts tracked&#8221; or &#8220;239 million prompts in our database.&#8221; Most SEOs accept these numbers at face value without asking the obvious question.</p>



<p class="wp-block-paragraph">Where do those prompts come from? How does a tool know what real people are asking ChatGPT in private sessions?</p>



<p class="wp-block-paragraph">The answer involves clickstream data, third-party panels, and an infrastructure that&#8217;s been quietly powering SEO tools for over a decade. The methodology determines what the data actually means, and which tool you should trust for which question.</p>



<h2 class="wp-block-heading">What Clickstream Data Is</h2>



<p class="wp-block-paragraph">Clickstream data is the chronological record of every action a user takes online. Pages visited. Time on page. Clicks. Search queries entered. Results clicked. The path through a session from start to exit.</p>



<p class="wp-block-paragraph">The term goes back to the early web, when &#8220;clicks&#8221; described most of what users did.&nbsp;<a rel="noreferrer noopener" href="https://www.techtarget.com/searchcustomerexperience/definition/clickstream-analysis-clickstream-analytics" target="_blank">TechTarget</a>&nbsp;and&nbsp;<a rel="noreferrer noopener" href="https://matomo.org/blog/2024/04/clickstream-data/" target="_blank">Matomo</a>&nbsp;both define it in roughly the same way: a record of user activity that, when aggregated, reveals behavioral patterns.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://dataforseo.com/blog/the-hidden-gem-of-digital-marketing-clickstream-data-insights" target="_blank">DataForSEO</a>&nbsp;splits clickstream data into two forms. Aggregated data shows totals over time periods. Unaggregated data shows individual user journeys, click sequences, and visit durations. SEO tools mostly use the aggregated form, processed through algorithms that strip personally identifying information.</p>



<h2 class="wp-block-heading">How Clickstream Data Is Collected</h2>



<p class="wp-block-paragraph">There are two collection categories, and the second one is where SEO and AI visibility tools get their data.</p>



<p class="wp-block-paragraph"><strong>First-party clickstream data</strong>&nbsp;is collected by the site owner. You install tracking on your own site through Google Analytics, Hotjar, Amplitude, Matomo, or server log analysis, and you see what your own users do. This is the kind of data you have direct access to and complete control over.</p>



<p class="wp-block-paragraph"><strong>Third-party clickstream data</strong>&nbsp;is collected by data providers who recruit panels of users willing to have their browsing observed. The user installs some piece of software, agrees to data collection in the terms of service (sometimes prominently, sometimes not), and their activity gets aggregated into a panel that data providers sell to third parties.</p>



<p class="wp-block-paragraph">The software users install typically falls into a few categories:</p>



<ul class="wp-block-list">
<li>Browser extensions, often free utilities like coupon finders, ad blockers, or tab managers</li>



<li>Free or freemium antivirus and security software</li>



<li>Free VPN services</li>



<li>Free toolbars</li>



<li>Paid research panels with explicit opt-in</li>



<li>Less commonly today, ISP-level partnerships</li>
</ul>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://victorious.com/blog/clickstream-data/" target="_blank">Victorious</a>&nbsp;explicitly notes that &#8220;SEO tools like Ahrefs and Semrush typically obtain clickstream data by purchasing it from these third-party data providers.&#8221; The tools themselves don&#8217;t run the panels. They buy the data.</p>



<h2 class="wp-block-heading">How SEO Tools Have Used This Data for Over a Decade</h2>



<p class="wp-block-paragraph">This part is worth understanding because it sets up the AI visibility section. Clickstream data isn&#8217;t a new ingredient in SEO tools. It&#8217;s been powering features SEOs interact with every day for years.</p>



<p class="wp-block-paragraph">Keyword search volume estimates after Google obscured the real numbers in Keyword Planner. Competitor traffic estimates in Semrush Traffic Analytics, SimilarWeb, and similar tools. SERP click-through rate data. Keyword difficulty scoring. Audience demographics.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://www.semrush.com/kb/998-where-does-semrush-data-come-from" target="_blank">Semrush&#8217;s own KB article</a>&nbsp;is explicit about the source: &#8220;The data in our Traffic &amp; Market toolkit comes from our panel of over 200 million real, anonymized internet users across more than 190 countries and regions. We partner with hundreds of clickstream data providers to build this panel, which records billions of events on the internet each month.&#8221;</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://shahidshahmiri.com/how-does-ahrefs-get-its-data/" target="_blank">Shahid Shahmiri&#8217;s breakdown of Ahrefs&#8217; data sources</a>&nbsp;explains that Ahrefs runs three data pipelines in parallel: their own crawler (AhrefsBot) for link data, third-party clickstream panels for behavioral data, and Google Keyword Planner for keyword existence. The clickstream layer is what powers their traffic and volume estimates.</p>



<p class="wp-block-paragraph">Every time you&#8217;ve looked at a search volume number in Ahrefs or Semrush, you&#8217;ve been looking at clickstream-derived data. You just may not have known it.</p>



<h2 class="wp-block-heading">A Brief Word on Avast</h2>



<p class="wp-block-paragraph">In January 2020, a joint Vice and PCMag investigation revealed that Avast antivirus, with over 100 million users, was selling clickstream data through its subsidiary Jumpshot. The data was detailed enough to identify individuals despite being marketed as anonymized. Customers included major retailers, analytics firms, and SEO platforms. Avast announced it was shutting Jumpshot down within days of the investigation&#8217;s publication.</p>



<p class="wp-block-paragraph">This matters because it disrupted the clickstream data market for years. The supply hasn&#8217;t disappeared, but it&#8217;s been consolidated and diversified. Tools that depend on this data now talk about partnering with &#8220;hundreds of providers&#8221; rather than a single source, which is partly a hedge against any single provider blowing up the same way Jumpshot did.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://larryludwig.com/clickstream-data-explained/" target="_blank">Larry Ludwig&#8217;s piece on clickstream data providers</a>&nbsp;covers this history and is worth reading if you want the full context. The short version: the clickstream pipeline that powers SEO tools is real, but the sourcing is often deliberately opaque because the industry learned a lesson from Avast.</p>



<h2 class="wp-block-heading">How AI Visibility Tools Use Clickstream Data: Method 1</h2>



<p class="wp-block-paragraph">The first approach used by AI visibility tools is capturing real prompts from clickstream panel members who use AI platforms. When a panel member opens ChatGPT, Perplexity, Gemini, or Claude and types a prompt, that prompt, the response, and any cited sources get captured by the panel software and aggregated into the tool&#8217;s database.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://www.semrush.com/kb/1607-semrush-ai-visibility-data" target="_blank">Semrush&#8217;s AI Visibility Toolkit KB article</a>&nbsp;states the methodology directly. The exact quote: &#8220;We source billions of real prompts from AI search clickstream data and Google&#8217;s keyword dataset for AI Overviews.&#8221; The toolkit has 239 million prompts and responses across ChatGPT, Gemini, Google AI Overviews, and AI Mode.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://ahrefs.com/blog/chatgpt-visibility-tracking/" target="_blank">Ahrefs Brand Radar</a>&nbsp;uses the same model. Their database currently sits at 13.5 million existing prompts. From Ahrefs&#8217; own piece: &#8220;You can track your ChatGPT visibility across 13.5 million existing prompts inside Ahrefs Brand Radar database.&#8221;</p>



<p class="wp-block-paragraph">What this means in practice. When these tools show you &#8220;Topics&#8221; or &#8220;what people are asking&#8221; reports, they&#8217;re showing aggregated prompts from real users in their panels who happened to use AI platforms during the data collection window. It&#8217;s not synthetic. It&#8217;s not Google search data dressed up to look like prompt data. It&#8217;s actual prompts from a panel large enough to be statistically meaningful at scale.</p>



<h2 class="wp-block-heading">How AI Visibility Tools Use Clickstream Data: Method 2</h2>



<p class="wp-block-paragraph">The second approach is different. Many AI visibility tools don&#8217;t have access to clickstream data at all, or they use it only as a supplementary source. Instead, they run prompts defined by users (or suggested by the tool&#8217;s AI) on a schedule and capture the responses.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://otterly.ai/" target="_blank">Otterly</a>&nbsp;describes the methodology in their own words: &#8220;An AI visibility tracker works by automatically sending queries (search prompts) to AI search engines like ChatGPT, Perplexity, Google AI Overviews, and AI Mode, and analyzing the responses for brand mentions, citations, and source links.&#8221;</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://peec.ai/" target="_blank">Peec AI</a>&nbsp;runs prompts &#8220;once every 24 hours across your selected AI models.&#8221;&nbsp;<a rel="noreferrer noopener" href="https://seranking.com/chatgpt-visibility-tracker.html" target="_blank">SE Ranking&#8217;s ChatGPT Visibility Tracker</a>&nbsp;&#8220;scans ChatGPT answers for your target keywords and analyzes which of them end up with your brand being mentioned.&#8221;&nbsp;<a rel="noreferrer noopener" href="https://promptmonitor.io/" target="_blank">Promptmonitor</a>&nbsp;lets users &#8220;track specific prompts or questions in AI optimization.&#8221;&nbsp;<a rel="noreferrer noopener" href="https://www.hubspot.com/products/aeo" target="_blank">HubSpot AEO</a>&nbsp;suggests prompts based on company data, then tracks visibility across them.</p>



<p class="wp-block-paragraph">The methodology is essentially rank tracking applied to AI platforms. The user (or the tool) defines prompts. The tool runs them through APIs or by scraping the AI interface. The tool captures and analyzes the responses on a schedule.</p>
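<p class="wp-block-paragraph">As a rough illustration of that loop, here is a minimal Python sketch. It is not how any named tool is implemented. The function names, the brands, and the canned answers are all hypothetical, and the AI platform call is replaced with a stub so the example runs on its own. Scheduling is left out; in practice something like a daily job would call the tracking function.</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TrackedResult:
    prompt: str
    brands_mentioned: List[str]


def track_prompts(
    prompts: List[str],
    brands: List[str],
    ask: Callable[[str], str],
) -> List[TrackedResult]:
    """Run each defined prompt once and record which brands the answer mentions."""
    results = []
    for prompt in prompts:
        answer = ask(prompt).lower()
        # Simple case-insensitive substring match; real tools likely do more.
        mentioned = [b for b in brands if b.lower() in answer]
        results.append(TrackedResult(prompt, mentioned))
    return results


def mention_rate(results: List[TrackedResult], brand: str) -> float:
    """Share of tracked prompts whose answer mentioned the brand."""
    if not results:
        return 0.0
    hits = sum(1 for r in results if brand in r.brands_mentioned)
    return hits / len(results)


# Hypothetical stand-in for an AI platform. A real tool would call an API
# or scrape the interface here; these answers are invented for the example.
CANNED_ANSWERS: Dict[str, str] = {
    "best running shoes for flat feet": "Many runners like Acme Runners and Zoom Co.",
    "most durable trail shoes": "Zoom Co is a common pick for trails.",
}


def fake_ask(prompt: str) -> str:
    return CANNED_ANSWERS.get(prompt, "")


results = track_prompts(list(CANNED_ANSWERS), ["Acme Runners", "Zoom Co"], fake_ask)
print(mention_rate(results, "Acme Runners"))
print(mention_rate(results, "Zoom Co"))
```

<p class="wp-block-paragraph">Notice that the prompt list is an input. The output can only ever describe the prompts someone chose to put in it.</p>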



<p class="wp-block-paragraph">The strength of this approach is precision. You know exactly what was prompted because the tool prompted it. You can monitor specific prompts you care about over time. You can set up brand-specific or competitor-specific tracking and watch trends.</p>



<p class="wp-block-paragraph">The limitation is that you&#8217;re tracking prompts you defined, not necessarily prompts real users are entering. If you assumed the wrong prompts mattered, you&#8217;re tracking the wrong data.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://surferseo.com/blog/best-ai-visibility-tools/" target="_blank">Surfer SEO&#8217;s overview of AI visibility tools</a>&nbsp;makes the methodology distinction explicit: &#8220;Some rely on API responses, while others track what a real user sees in the interface. On top of that, AI answers vary depending on the model, the user&#8217;s location, language settings, and even the day/time the prompt is run.&#8221;</p>



<h2 class="wp-block-heading">Why Both Methodologies Have Value</h2>



<p class="wp-block-paragraph">Each approach answers different questions.</p>



<p class="wp-block-paragraph">Real prompt data (Method 1) tells you what real people are actually asking. This is useful for content strategy: discovering prompts you didn&#8217;t know existed, understanding the actual language users employ when talking to AI, identifying topics where there&#8217;s measurable search behavior. The limitation is that you&#8217;re seeing what was asked across the panel, not necessarily by your specific audience.</p>



<p class="wp-block-paragraph">Synthesized prompt data (Method 2) tells you how AI platforms answer specific prompts you care about. This is useful for monitoring: tracking whether your brand appears when potential customers ask specific questions, watching trends over time on defined prompts, comparing your visibility to competitors on the same prompts. The limitation is that you&#8217;re tracking prompts you assumed mattered, which may or may not reflect real user behavior.</p>



<p class="wp-block-paragraph">Most mature AI visibility tools combine both. Semrush, Ahrefs, and HubSpot AEO all offer &#8220;real prompt&#8221; databases for discovery alongside synthesized prompt tracking for monitoring. Smaller or newer tools tend to rely entirely on Method 2 because they don&#8217;t have access to clickstream panels at the scale required for meaningful Method 1 data.</p>



<p class="wp-block-paragraph">The practical implication for you. When you see numbers from these tools, ask which methodology they reflect. If a tool says &#8220;We tracked 50 prompts and you appeared in 12,&#8221; that&#8217;s Method 2. If a tool says &#8220;Across our database of 239 million prompts, you were mentioned in 1.2%,&#8221; that&#8217;s Method 1. They&#8217;re measuring different things, and conflating them leads to bad strategic decisions.</p>
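<p class="wp-block-paragraph">To make the difference concrete, here is a small Python sketch that works through the two example figures above. The variable names are mine, and the numbers are only the illustrative ones from this paragraph, not real results from any tool.</p>

```python
# Method 2: a hand-picked list of tracked prompts (figures from the example above).
tracked_prompts = 50
tracked_appearances = 12
method2_rate = tracked_appearances / tracked_prompts

# Method 1: a panel-derived database (figures from the example above).
database_prompts = 239_000_000
mentions_per_thousand = 12  # 1.2% written as 12 per 1,000 to keep integer math
method1_mentions = database_prompts * mentions_per_thousand // 1000

print(f"Method 2: {method2_rate:.0%} of {tracked_prompts} prompts you chose")
print(f"Method 1: about {method1_mentions:,} prompts out of {database_prompts:,}")
```

<p class="wp-block-paragraph">The two percentages have different denominators. One is a share of a short list you wrote. The other is a share of everything the panel captured, so putting them side by side tells you very little.</p>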



<h2 class="wp-block-heading">The Honest Caveats</h2>



<p class="wp-block-paragraph">Two worth flagging.</p>



<p class="wp-block-paragraph"><strong>Privacy and sourcing transparency.</strong>&nbsp;SEO tools are sometimes vague about their exact data sources, and some of that vagueness is deliberate. The Avast situation taught the industry that aggressive data collection can blow up publicly. Tools that buy from &#8220;hundreds of providers&#8221; have plausible deniability about any single provider&#8217;s practices. This isn&#8217;t necessarily wrong, but it&#8217;s worth knowing that the sausage-making is more complicated than the marketing copy suggests.</p>



<p class="wp-block-paragraph"><strong>Sample bias.</strong>&nbsp;Clickstream panels skew toward certain users. People who install free antivirus software, free VPNs, or browser extensions aren&#8217;t a representative sample of the entire internet. They tend to be more price-sensitive, more technically casual, and over-indexed in certain demographics and geographies. The data is meaningful but not perfectly representative. This applies to traditional clickstream-derived metrics (keyword volumes, traffic estimates) as much as it applies to AI prompt data. None of these numbers are precise. They&#8217;re best understood as directional intelligence, not absolute truth.</p>



<h2 class="wp-block-heading">The Takeaway</h2>



<p class="wp-block-paragraph">The infrastructure powering AI visibility tools isn&#8217;t new. It&#8217;s the same clickstream data pipeline that&#8217;s been informing SEO traffic estimates and keyword volumes for over a decade, now put to a new use. Understanding where the data comes from helps you read it more accurately.</p>



<p class="wp-block-paragraph">Tools that show you &#8220;real prompts&#8221; are showing you panel-derived data with the same strengths and limitations as Semrush&#8217;s traffic estimates or Ahrefs&#8217; search volume numbers. Tools that show you tracked prompts are showing you a rank-tracking methodology applied to AI platforms.</p>



<p class="wp-block-paragraph">Both are useful. Neither is magic. And the difference matters when you&#8217;re deciding which tool to trust for which question.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Google Just Told You to Stop Publishing Commodity Content</title>
		<link>https://theseopub.com/google-just-told-you-to-stop-publishing-commodity-content/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 12 May 2026 13:00:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3506</guid>

					<description><![CDATA[At Google Search Central Live in Toronto on April 21, 2026, Danny Sullivan drew a clear line between two types of content. Commodity content: generic, easily replicable, the same topics covered the same way across hundreds of sites. Non-commodity content: specific, experience-driven, original, proprietary insight. Google&#8217;s recommendation was direct. Stop publishing the first kind. Start [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">At Google Search Central Live in Toronto on April 21, 2026, Danny Sullivan drew a clear line between two types of content. Commodity content: generic, easily replicable, the same topics covered the same way across hundreds of sites. Non-commodity content: specific, experience-driven, original, proprietary insight. Google&#8217;s recommendation was direct. Stop publishing the first kind. Start publishing more of the second.</p>



<p class="wp-block-paragraph">This is essentially Information Gain repackaged as official, plain-language Google guidance.</p>



<h2 class="wp-block-heading">What Sullivan Actually Said</h2>



<p class="wp-block-paragraph">The &#8220;unique, non-commodity content&#8221; language wasn&#8217;t entirely new. John Mueller had used it in a Search Central blog post in May 2025. But Sullivan&#8217;s Toronto presentation gave it sharper definition with concrete industry examples that make the concept easier to apply.</p>



<p class="wp-block-paragraph">The interior designer example is the most quotable. Commodity content: &#8220;2024 Kitchen Trends You Need to See&#8221; with stock photos of green cabinets and brass hardware found on Pinterest. Non-commodity content: &#8220;Marble vs. Grape Juice: Why I Refused to Install Stone for a Family of Five,&#8221; a video showing actual stain tests with grape juice and turmeric to prove the point. Sullivan used similar contrasts for running stores and real estate.</p>



<p class="wp-block-paragraph">The pattern is consistent. Commodity content can be produced by anyone with no real experience in the space. Non-commodity content requires that someone actually did something, learned something, or has access to information that isn&#8217;t already published everywhere else.</p>



<h2 class="wp-block-heading">What &#8220;Commodity Content&#8221; Actually Means</h2>



<p class="wp-block-paragraph">Commodity content is content that is easy to reproduce. It usually covers a familiar topic in a familiar way, often using the same structure, the same talking points, and the same generalized advice found across dozens or hundreds of other pages. It is not necessarily wrong. It is not always low quality. But it is interchangeable. If one page disappeared, another could fill the gap with no real loss.</p>



<p class="wp-block-paragraph">Non-commodity content contains something that&#8217;s hard to replicate. Direct experience. Original analysis. Proprietary information. Specific examples. Practitioner judgment. Contextual insight. It gives the reader something more than a reorganized summary of public knowledge.</p>



<p class="wp-block-paragraph">The clearest self-test I&#8217;ve seen comes from Shaun Anderson at Hobo: &#8220;Would this be irrevocably lost if this page disappeared tomorrow?&#8221; If the answer is no, it&#8217;s commodity content.</p>



<p class="wp-block-paragraph">A second test, from Florian Krückel at SEO Kreativ, sharpens the same idea: &#8220;Could ChatGPT write this in 90 seconds, and would the result be essentially identical?&#8221; If yes, rewrite it or skip it.</p>



<p class="wp-block-paragraph">Both tests are useful. The first one focuses on what would be lost. The second one focuses on what&#8217;s already easy to produce. Either framing gets you to the same conclusion.</p>



<h2 class="wp-block-heading">The Information Gain Connection</h2>



<p class="wp-block-paragraph">A few weeks ago I wrote about&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/information-gain-the-patent-you-know-the-patent-you-dont/" target="_blank">Information Gain</a>&nbsp;and the Google patents behind it. The 2006 patent (US8140449B1) describes a system that scores documents based on how much novel content they contain relative to all other documents on the same topic. It&#8217;s the mechanism for measuring exactly what Sullivan is describing in plain language.</p>



<p class="wp-block-paragraph">The patent says: pages that introduce information nuggets and entity interactions absent from the rest of the corpus score better. Sullivan says: publish content that contains something other sites don&#8217;t have.</p>



<p class="wp-block-paragraph">Same idea. Different framing.</p>



<p class="wp-block-paragraph">What&#8217;s notable about the commodity content guidance is that it&#8217;s official Google language, not a patent that may or may not be in active use. The patent gave us the mechanism. This gives us the editorial test. Both are pointing at the same underlying truth: content that adds something new to the conversation has a structural advantage. Content that restates what&#8217;s already out there does not.</p>



<h2 class="wp-block-heading">Why This Matters More Now</h2>



<p class="wp-block-paragraph">Two things have changed that make commodity content more dangerous than it used to be.</p>



<p class="wp-block-paragraph">AI has lowered the cost of producing commodity content to nearly zero. Anyone can prompt ChatGPT to produce a competent, generic guide on any topic in seconds. That floods the index with interchangeable pages and forces Google to raise its quality bar to find anything worth surfacing. Google has talked about scaled content abuse and the &#8220;Crawled, currently not indexed&#8221; signal as quality flags. The bar isn&#8217;t moving up because Google decided to be picky. It&#8217;s moving up because the volume of commodity content has exploded.</p>



<p class="wp-block-paragraph">AI Overviews and answer engines are also very good at summarizing common knowledge. If your page is commodity content covering common knowledge, AI systems compress it into summaries with no need to send the user to your site. Non-commodity content, with specific anecdotes and proprietary insight, has the kind of citable detail that gets pulled directly into AI responses with attribution. Commodity content gets summarized away. Non-commodity content gets cited.</p>



<p class="wp-block-paragraph">This connects to the&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/ai-search-is-still-seo/" target="_blank">AirOps and Kevin Indig study</a>&nbsp;I covered last week. Pages with focused, specific content outperformed exhaustive guides. Pages whose headings closely matched the query outperformed pages with broad coverage. The commodity vs. non-commodity framing explains why. Specific is harder to commoditize than broad. Experience-driven is harder to replicate than synthesized.</p>



<h2 class="wp-block-heading">The Honest Caveat</h2>



<p class="wp-block-paragraph">&#8220;Commodity content&#8221; is not an officially confirmed ranking signal. It&#8217;s a strategic recommendation from Google, not a defined penalty or scoring mechanism. Some commodity content is necessary on most sites. A definitions page, a basic explainer, a foundational topic: those have their place.</p>



<p class="wp-block-paragraph"><em>The problem isn&#8217;t publishing any commodity content. The problem is building your entire content strategy on it.</em>&nbsp;If most of your pages could be replicated by any competitor in 90 seconds, your site doesn&#8217;t have a differentiated reason to exist in search results.</p>



<p class="wp-block-paragraph">The commodity content framing is a strategic lens, not a checklist. It tells you what kind of content is increasingly hard to win with, not what you can never publish.</p>



<h2 class="wp-block-heading">How to Apply This</h2>



<p class="wp-block-paragraph">This is a self-audit, not an editorial overhaul.</p>



<p class="wp-block-paragraph">Pull a list of your top pages. Read them. Ask the test questions: would this be irrevocably lost if it disappeared? Could ChatGPT write something essentially identical in 90 seconds? If a page fails the test, it doesn&#8217;t necessarily need to be deleted. It needs something added that only you could provide.</p>



<p class="wp-block-paragraph">Specific things you can add:</p>



<ul class="wp-block-list">
<li>Original data from your own work, your clients, your audits, or your industry</li>



<li>Specific examples with names, numbers, and outcomes, not generic case studies</li>



<li>Direct quotes from practitioners, customers, or subject matter experts</li>



<li>Failed experiments and what you learned from them</li>



<li>Photos, videos, or screenshots of actual work</li>



<li>Decisions you made that contradict standard advice, with the reasoning behind them</li>



<li>Industry-specific knowledge that requires real experience to know</li>
</ul>



<p class="wp-block-paragraph">The pattern across all of these is that they require something other than synthesis of public information. They require that you, or someone you have access to, actually did something or knows something that isn&#8217;t already on the first page of Google.</p>



<p class="wp-block-paragraph">The tactical move isn&#8217;t to publish less. It&#8217;s to make sure each page has at least one thing that wouldn&#8217;t appear on a competitor&#8217;s version of the same content. One specific data point. One real example with a name and a number. One opinion you can defend with experience. That&#8217;s the threshold between commodity and non-commodity, and it&#8217;s lower than people think.</p>



<h2 class="wp-block-heading">The Takeaway</h2>



<p class="wp-block-paragraph">Information Gain explained the mechanism. The 2006 patent, the entity interactions, the depth weighting: all of it describes how Google can measure whether a page contributes novel content to a topic. Sullivan&#8217;s Toronto talk explained the editorial test in plain language. Both are saying the same thing.</p>



<p class="wp-block-paragraph">If your page can be replaced by any of the other pages ranking for the same query, you&#8217;re not adding anything to the search results. You&#8217;re filling space.</p>



<p class="wp-block-paragraph">The cost of that has gone up. AI has made commodity content cheap to produce, which means there&#8217;s more of it, which means Google&#8217;s bar for what gets indexed and surfaced has risen. AI Overviews and answer engines compress commodity content into summaries that don&#8217;t link back. The middle of the distribution, content that&#8217;s fine but not differentiated, has gotten harder to win with.</p>



<p class="wp-block-paragraph">The fix is the same one it&#8217;s always been. Add something to the conversation that wouldn&#8217;t be there without you.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://www.seroundtable.com/google-commodity-content-41200.html" target="_blank">Read the Search Engine Roundtable coverage of Sullivan&#8217;s Toronto presentation.</a></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>AI Search Is Still SEO (Kevin Indig and AirOps Just Proved It)</title>
		<link>https://theseopub.com/ai-search-is-still-seo/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 05 May 2026 13:00:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3502</guid>

					<description><![CDATA[The AI search panic narrative has been everywhere for the past year. Everything is different now. Traditional SEO is dead. You need an entirely new playbook. The fundamentals don&#8217;t apply anymore. A new study from AirOps and Kevin Indig should put a lot of that to rest. The Fan-Out Effect&#160;analyzed 16,851 queries and 353,799 pages [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">The AI search panic narrative has been everywhere for the past year. Everything is different now. Traditional SEO is dead. You need an entirely new playbook. The fundamentals don&#8217;t apply anymore.</p>



<p class="wp-block-paragraph">A new study from AirOps and Kevin Indig should put a lot of that to rest.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://www.airops.com/report/the-fan-out-effect-what-happens-between-a-query-and-a-citation" target="_blank">The Fan-Out Effect</a>&nbsp;analyzed 16,851 queries and 353,799 pages across ChatGPT&#8217;s full retrieval pipeline. The findings are clear and the implications are direct. AI search is still SEO. The principles haven&#8217;t changed. A few specific tactics need adjusting, but anyone who told you to throw out your SEO playbook was wrong.</p>



<p class="wp-block-paragraph">This note covers the findings that matter most, including a few that validate things I have shared here over the past few months.</p>



<h2 class="wp-block-heading">Retrieval Rank Is the Whole Game</h2>



<p class="wp-block-paragraph">The single most important finding from the study. A page at position 1 in ChatGPT&#8217;s retrieval results has a 58% citation rate. By position 10, that drops to 14%. A 4x gap that no amount of content quality can close.</p>



<p class="wp-block-paragraph">ChatGPT doesn&#8217;t pull from some magical alternative source. It runs web searches, gets back ranked results, and cites from there. The retrieval system underneath is doing the heavy lifting. If you don&#8217;t rank well in traditional search, you don&#8217;t get cited in AI search.</p>



<p class="wp-block-paragraph">The study tested this against every other variable and the conclusion held. A page with perfect content relevance at rank 11 or worse got cited 21.5% of the time. A page with mediocre content relevance at rank 1 got cited 55.9% of the time. Rank overrides content quality.</p>



<p class="wp-block-paragraph">That&#8217;s the headline argument. The &#8220;AI search makes traditional SEO obsolete&#8221; narrative collapses under this finding. ChatGPT citations flow through the same retrieval mechanics that have always determined organic search visibility. Great SEO isn&#8217;t your obstacle in AI search. It&#8217;s your advantage.</p>



<h2 class="wp-block-heading">Heading Match Is the Primary On-Page Lever</h2>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://theseopub.com/semantic-distance/" target="_blank">Last week&#8217;s note covered semantic distance</a>&nbsp;and the Google patent that describes how heading structure creates semantic relationships on a page. That note explained the mechanics. This study quantifies the impact.</p>



<p class="wp-block-paragraph">Pages whose headings closely match the query are cited 41% of the time. Pages with weak heading matches get cited 29% of the time. That 12-point gap holds even after controlling for retrieval rank.</p>



<p class="wp-block-paragraph">The study compared heading match against every other content signal: word count, topical breadth, body copy depth, schema markup, readability. Heading structure was the strongest content predictor of citation. By a meaningful margin.</p>



<p class="wp-block-paragraph">This connects directly to what last week&#8217;s note covered. Headings aren&#8217;t just keyword placement opportunities. They&#8217;re semantic containers that define what a page is about. When the container clearly maps to the query someone is asking, AI systems and traditional search engines both reward it. When the container is vague or off-topic, both penalize it.</p>
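<p class="wp-block-paragraph">To make &#8220;heading match&#8221; concrete, here is a minimal sketch that scores how many query terms appear in a page&#8217;s closest heading. The study&#8217;s own scoring method isn&#8217;t described in this note, so the word-overlap measure, the stopword list, and the function names are all assumptions chosen for illustration.</p>

```python
import re

# Small illustrative stopword list (an assumption, not from the study).
STOPWORDS = {"a", "an", "the", "to", "of", "for", "in", "on", "and", "is", "how", "what"}

def tokens(text):
    """Lowercase word tokens with a few common stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def heading_match(query, headings):
    """Return (best_score, best_heading): the share of query terms found in the closest heading."""
    query_terms = tokens(query)
    if not query_terms or not headings:
        return 0.0, None
    scored = [(len(query_terms & tokens(h)) / len(query_terms), h) for h in headings]
    return max(scored, key=lambda pair: pair[0])
```

<p class="wp-block-paragraph">A heading like &#8220;Additional Information&#8221; scores zero against almost any query, which is the vague-container problem in numeric form. A real system would likely compare meaning rather than literal words, so treat this as a rough proxy only.</p>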



<h2 class="wp-block-heading">Heading Structure Has a Sweet Spot</h2>



<p class="wp-block-paragraph">The study also found a sweet spot for how many subheadings to use, and a counterintuitive pattern below it.</p>



<p class="wp-block-paragraph">For articles, the optimal range is 4 to 10 H2-H4 subheadings (33.2% citation rate). The strange finding: articles with 1 to 3 subheadings (28%) perform worse than articles with zero subheadings (30.1%). Half-measures are worse than no structure at all. Either commit to proper structure or don&#8217;t bother.</p>



<p class="wp-block-paragraph">The sweet spot also varies by page type. Articles do best with 4 to 10 subheadings. Product pages, oddly, perform best with zero subheadings (43.2%) and worst with 21 or more (25%). The &#8220;other&#8221; bucket (forums, landing pages) tracks the article pattern.</p>



<p class="wp-block-paragraph">The takeaway: don&#8217;t apply article-page heading structure to product pages. Product pages are typically focused on a single item and don&#8217;t need editorial scaffolding. Different page types have different optimal structures, and forcing the wrong structure on a page hurts more than it helps.</p>
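<p class="wp-block-paragraph">If you want to check where your own articles fall, a rough sketch like the one below counts H2-H4 subheadings using Python&#8217;s standard library. The bucket labels follow the article ranges quoted above. The &#8220;11+&#8221; catch-all and the function names are my own additions, not something the study defines.</p>

```python
from html.parser import HTMLParser

class SubheadingCounter(HTMLParser):
    """Counts opening h2, h3 and h4 tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in ("h2", "h3", "h4"):
            self.count += 1

def count_subheadings(html):
    """Return the number of h2-h4 headings found in the given HTML string."""
    parser = SubheadingCounter()
    parser.feed(html)
    return parser.count

def article_bucket(count):
    """Map a subheading count to an article range; '11+' is a catch-all added here."""
    if count == 0:
        return "0"
    if count <= 3:
        return "1-3"
    if count <= 10:
        return "4-10"
    return "11+"
```

<p class="wp-block-paragraph">Run it against article templates only. As noted above, product pages follow a different pattern, so the same buckets would mislead you there.</p>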



<h2 class="wp-block-heading">Domain Authority Doesn&#8217;t Translate</h2>



<p class="wp-block-paragraph">A few weeks ago&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/4-seo-metrics-youre-reading-wrong/" target="_blank">I wrote about how Domain Authority and similar metrics get misused</a>. This study delivers one of the most direct empirical contradictions of DA-based thinking I&#8217;ve seen.</p>



<p class="wp-block-paragraph">Always-cited pages have lower DA (53) than never-cited pages (56). Backlinks show a roughly 3x inverse gap: the always-cited pages average 1.1 million backlinks, while the never-cited pages average 3.2 million.</p>



<p class="wp-block-paragraph">Pages that get cited consistently have&nbsp;<em>fewer</em>&nbsp;links and&nbsp;<em>lower</em>&nbsp;DA than pages that never get cited.</p>



<p class="wp-block-paragraph">The site-type breakdown is even more damning. Five of the highest-DA site types in the study produce wildly different citation rates: YouTube (DA 100) at 2.4%, Reddit (DA 92) at 29.9%, Major News (DA 94) at 32%, Health Publishers (DA 90) at 46.4%, Wikipedia (DA 95) at 59.2%. Almost identical authority. Citation rates spanning 25x.</p>



<p class="wp-block-paragraph">DA tells you nothing about citation likelihood. Just like it tells you nothing about how Google evaluates content.</p>
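<p class="wp-block-paragraph">For anyone who wants to check the arithmetic behind that spread, here is a short worked example using only the figures quoted above. The variable names are mine.</p>

```python
# DA and citation rate (percent) by site type, as quoted above.
site_types = {
    "YouTube": (100, 2.4),
    "Reddit": (92, 29.9),
    "Major News": (94, 32.0),
    "Health Publishers": (90, 46.4),
    "Wikipedia": (95, 59.2),
}

da_values = [da for da, _ in site_types.values()]
rates = [rate for _, rate in site_types.values()]

# Ratio of the highest value to the lowest for each measure.
da_spread = max(da_values) / min(da_values)
citation_spread = max(rates) / min(rates)

print(f"DA spread: {da_spread:.2f}x")
print(f"Citation rate spread: {citation_spread:.1f}x")
```

<p class="wp-block-paragraph">The authority scores differ by about 1.1x while the citation rates differ by nearly 25x. That mismatch is the whole argument.</p>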



<h2 class="wp-block-heading">Length Isn&#8217;t the Answer</h2>



<p class="wp-block-paragraph">In the&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/4-seo-concepts-that-arent-helping-you/" target="_blank">recent note on SEO concepts that aren&#8217;t helping you</a>, I covered why word count chasing doesn&#8217;t work. The study confirms it.</p>



<p class="wp-block-paragraph">The citation sweet spot is 500 to 2,000 words. Pages over 5,000 words underperform pages under 500 words. Long-form padding actively hurts you in AI search.</p>



<p class="wp-block-paragraph">The reason is the same one that applies in traditional search. Word count itself does nothing. What helps is covering the topic with depth and specificity. What hurts is padding to hit a target. AI systems appear to be even less tolerant of filler than traditional search results, probably because they&#8217;re trying to extract specific, citable information rather than rank pages.</p>



<p class="wp-block-paragraph">If your content strategy revolves around hitting word count targets, that strategy is working against you in both traditional and AI search.</p>



<h2 class="wp-block-heading">Focused Beats Comprehensive</h2>



<p class="wp-block-paragraph">This finding partially complicates the standard SEO playbook. The &#8220;ultimate guide&#8221; approach to content has been a dominant strategy for years. The study suggests it actively hurts AI citation rates.</p>



<p class="wp-block-paragraph">Pages covering 26 to 50% of ChatGPT&#8217;s fan-out subtopics outperform pages covering 100% of them. When primary query relevance is held constant, exhaustive coverage actually reduces citation rate.</p>



<p class="wp-block-paragraph">The study&#8217;s interpretation: exhaustive coverage signals &#8220;generalist&#8221; content that addresses many topics without depth. Moderate coverage paired with strong primary relevance signals focused expertise.</p>



<p class="wp-block-paragraph">This loosely connects to information gain. The point isn&#8217;t to cover everything that has ever been written about a topic. The point is to cover the right things with depth. A page that nails one question outperforms a page that adequately addresses five. Fan-out subtopics aren&#8217;t a content checklist. They&#8217;re context.</p>



<p class="wp-block-paragraph">(<em>Side note:&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/how-to-find-information-gain-opportunities/" target="_blank">read the recent note on information gain here</a>.</em>&nbsp;Also, I just published a new video that expands on that note. You can watch it below or over on YouTube.)</p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Information Gain: What It Is and Where It Actually Helps" width="500" height="281" src="https://www.youtube.com/embed/31jaqXRU2-s?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<p class="wp-block-paragraph">If you&#8217;ve been building 5,000-word ultimate guides on the assumption that more comprehensive equals more rankable, this study says you should reconsider. Focused, deep coverage of the primary query is what gets cited.</p>



<h2 class="wp-block-heading">Schema Markup Is a Real Signal</h2>



<p class="wp-block-paragraph">Pages with JSON-LD schema markup have a 6.5 percentage point citation advantage (38.5% vs 32%). The study verified this isn&#8217;t explained by other factors. Schema and non-schema pages have similar word counts, heading counts, DA, and query match scores. The schema markup itself is contributing the lift.</p>



<p class="wp-block-paragraph">The top-performing schema types:</p>



<ul class="wp-block-list">
<li>MedicalWebPage: 47% citation rate</li>



<li>BreadcrumbList: 46.2%</li>



<li>FAQPage: 45.6%</li>



<li>Organization: 44.3%</li>



<li>WebSite: 40.6%</li>
</ul>



<p class="wp-block-paragraph">Schema markup helps AI systems parse and categorize page content. If you&#8217;ve been treating schema as optional, this is a reason to reconsider. It&#8217;s one of the few signals in the study that delivers a clear advantage independent of everything else.</p>
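<p class="wp-block-paragraph">As a rough illustration of what JSON-LD looks like, the sketch below builds two of the types from the list above (Organization and BreadcrumbList) and serializes them. Every name and URL is a placeholder, and the helper function is hypothetical. The output would go inside a <code>&lt;script type="application/ld+json"&gt;</code> element on the page.</p>

```python
import json

# All names and URLs below are placeholders, not real data.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://www.example.com/",
}

breadcrumbs = {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
        {"@type": "ListItem", "position": 1, "name": "Blog",
         "item": "https://www.example.com/blog/"},
        {"@type": "ListItem", "position": 2, "name": "Schema Guide",
         "item": "https://www.example.com/blog/schema-guide/"},
    ],
}

def to_json_ld(*nodes):
    """Serialize one or more schema.org nodes as a JSON array string."""
    return json.dumps(list(nodes), indent=2)

print(to_json_ld(organization, breadcrumbs))
```

<p class="wp-block-paragraph">Whatever you generate, run it through a validator before publishing. Markup that doesn&#8217;t match the visible page content is unlikely to help.</p>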



<h2 class="wp-block-heading">Write at a Higher Reading Level Than You Think</h2>



<p class="wp-block-paragraph">This one is genuinely counterintuitive. The &#8220;write for an 8th grader&#8221; advice has been floating around SEO content guidance for years. The study contradicts it directly.</p>



<p class="wp-block-paragraph">Writing at Flesch-Kincaid grade 16-17 (college level) performs best, with a 35.9% citation rate. Kindergarten-level writing performs worst at 29.6%. The signal peaks at college-level vocabulary and sentence structure, then tapers slightly above grade 18.</p>



<p class="wp-block-paragraph">The study&#8217;s interpretation is that expert-written content tends to use higher-grade vocabulary and more complex sentence structure, and AI systems appear to favor that signal as a marker of expertise.</p>



<p class="wp-block-paragraph">The practical takeaway: don&#8217;t dumb your content down past the level of expertise your audience expects. If you&#8217;re writing for practitioners, write at a practitioner level. If you&#8217;re writing for technical audiences, use the technical language they actually use. Oversimplifying for an imagined &#8220;8th grade reader&#8221; who doesn&#8217;t exist in your actual audience may be costing you visibility in AI search.</p>
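<p class="wp-block-paragraph">If you want a quick read on where your own copy lands, the Flesch-Kincaid grade level is simple to estimate: 0.39 times the average words per sentence, plus 11.8 times the average syllables per word, minus 15.59. The sketch below assumes plain text input (not HTML) and uses a crude vowel-group syllable counter, so expect it to drift from dedicated readability tools.</p>

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fk_grade_from_counts(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def fk_grade(text):
    """Estimate the Flesch-Kincaid grade level of plain text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)
```

<p class="wp-block-paragraph">Use the number as a sanity check against your audience, not as a target. Inflating sentences to reach grade 16 is the same mistake as padding to hit a word count.</p>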



<h2 class="wp-block-heading">The Takeaway</h2>



<p class="wp-block-paragraph">AI search is still SEO. The principles haven&#8217;t changed.</p>



<p class="wp-block-paragraph">Rank well in retrieval, because nothing else matters if you can&#8217;t be found. Use headings that match the query, with proper structure for your page type. Write focused content of appropriate length. Use schema markup. Write at the reading level your audience actually expects. Don&#8217;t chase domain authority, because it tells you nothing about citation likelihood.</p>



<p class="wp-block-paragraph">The &#8220;AI changes everything&#8221; narrative was wrong. The &#8220;you need a completely new playbook&#8221; narrative was wrong. A few tactics need adjusting (length targets are tighter, exhaustive coverage hurts more than it helps, expert-level writing matters more than it did), but the fundamentals still work.</p>



<p class="wp-block-paragraph">The fundamentals are still the work.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://www.airops.com/report/the-fan-out-effect-what-happens-between-a-query-and-a-citation" target="_blank">Read the full AirOps and Kevin Indig study.</a></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Semantic Distance: Why Your Heading Structure Matters More Than You Think</title>
		<link>https://theseopub.com/semantic-distance/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 28 Apr 2026 13:00:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3495</guid>

					<description><![CDATA[Most SEOs think about headings as places to put keywords. Put your target keyword in the H1, sprinkle related terms into H2s and H3s, and move on. That&#8217;s not wrong, but it misses what headings actually do from Google&#8217;s perspective. There&#8217;s a Google patent that describes how the search engine uses HTML structure &#8211; headings, [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Most SEOs think about headings as places to put keywords. Put your target keyword in the H1, sprinkle related terms into H2s and H3s, and move on. That&#8217;s not wrong, but it misses what headings actually do from Google&#8217;s perspective.</p>



<p class="wp-block-paragraph">There&#8217;s a Google patent that describes how the search engine uses HTML structure &#8211; headings, lists, tables, divs &#8211; to determine how semantically close terms on a page are to each other. That closeness directly affects how relevant Google considers the page for queries containing those terms. Once you understand the mechanics, it changes how you think about content organization.</p>



<h2 class="wp-block-heading">The Concept in 30 Seconds</h2>



<p class="wp-block-paragraph">Semantic distance is a measure of how far apart two meanings are. &#8220;Dog&#8221; and &#8220;cat&#8221; are semantically close. They share context: pets, fur, domestication. &#8220;Dog&#8221; and &#8220;carburetor&#8221; are semantically distant. Almost no shared context.</p>



<p class="wp-block-paragraph">Search engines use semantic distance to match intent, not just keywords. That&#8217;s the general idea, and it underpins a lot of how modern search works.</p>



<p class="wp-block-paragraph">But there&#8217;s a specific, on-page version of semantic distance that most SEOs aren&#8217;t thinking about: how Google interprets the structural distance between terms within a single page. Not the conceptual distance between words in a language model. The literal structural distance between terms as defined by your HTML.</p>



<h2 class="wp-block-heading">The Patent</h2>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://patents.google.com/patent/US7716216B1/en" target="_blank">US7716216B1</a>, &#8220;Document ranking based on semantic distance between terms in a document.&#8221; Filed 2004, granted 2010, assigned to Google. Inventors: Georges R. Harik and Monika H. Henzinger. A continuation patent (US8060501B1) was granted in 2011.&nbsp;<a rel="noreferrer noopener" href="https://www.seobythesea.com/2010/05/google-defines-semantic-closeness-as-a-ranking-signal/" target="_blank">Bill Slawski wrote the definitive breakdown</a>&nbsp;on SEO by the Sea.</p>



<p class="wp-block-paragraph">The patent describes a system that locates semantic structures in HTML documents (headings, lists, tables, divs, even elements styled with larger font sizes) and uses those structures to calculate distance values between terms. Those distance values feed into ranking scores that determine how relevant the page is for a given query.</p>



<p class="wp-block-paragraph">The key insight: the search engine doesn&#8217;t just count the number of words between two terms to figure out how close they are. It looks at the HTML structure and uses that structure to override simple word-count proximity.</p>



<h2 class="wp-block-heading">How It Works</h2>



<p class="wp-block-paragraph">The classic example from the patent makes this concrete. Imagine a page with the heading &#8220;Saturn Facts&#8221; and a list beneath it:</p>



<ul class="wp-block-list">
<li>Orbit: 10,759 Days</li>



<li>Rotation Period: 10.7 Hours</li>



<li>Mass: 568.5 x 10²⁴ kg</li>



<li>Volume: 82,713 x 10¹⁰ km³</li>



<li>Distance from the Sun: 1,434 x 10⁶ km</li>
</ul>



<p class="wp-block-paragraph">Two things happen under the patent&#8217;s logic.</p>



<p class="wp-block-paragraph">First, &#8220;Saturn&#8221; in the heading is considered semantically close to every item in the list, regardless of position. The word count between &#8220;Saturn&#8221; and the last list item doesn&#8217;t matter. The heading creates a semantic container, and everything inside that container is equally close to the heading. The page is equally relevant for &#8220;Saturn mass,&#8221; &#8220;Saturn volume,&#8221; and &#8220;Saturn distance from the sun.&#8221;</p>



<p class="wp-block-paragraph">Second, terms within the same list item are closer than terms across different list items. Here&#8217;s the counterintuitive part: &#8220;Saturn&#8221; and &#8220;Distance&#8221; (the heading and the last list item) are considered closer than &#8220;Days&#8221; and &#8220;Rotation&#8221; (the last word of item 1 and the first word of item 2), even though that second pair is visually adjacent on the page. The list boundary between items creates semantic separation that overrides physical proximity.</p>



<p class="wp-block-paragraph">The patent lays out three rules:</p>



<ol class="wp-block-list">
<li>Both terms in the same list item: close.</li>



<li>One term in a heading, one in any list item under that heading: approximately equally close, regardless of which list item.</li>



<li>Terms in different list items: farther apart than either of the above.</li>
</ol>
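<p class="wp-block-paragraph">The three rules can be sketched in a few lines of code. This is a toy model, not the patent&#8217;s implementation. The numeric distance values are assumptions picked only so that different list items end up farthest apart, and the relative order of the first two values is not something the rules above specify.</p>

```python
import re

# Illustrative values only. The rules say different list items are farthest
# apart; the exact numbers (and the order of the first two) are assumptions.
SAME_ITEM = 1
HEADING_TO_ITEM = 2
DIFFERENT_ITEMS = 5

def words(text):
    """Lowercase alphabetic words in a string."""
    return set(re.findall(r"[a-z]+", text.lower()))

def build_index(heading, items):
    """Map each word to the containers it appears in: 'heading' or a list item index."""
    index = {}
    for word in words(heading):
        index.setdefault(word, set()).add("heading")
    for i, item in enumerate(items):
        for word in words(item):
            index.setdefault(word, set()).add(i)
    return index

def semantic_distance(index, a, b):
    """Smallest structural distance between any occurrences of two words, or None."""
    best = None
    for ca in index.get(a, ()):
        for cb in index.get(b, ()):
            if ca == cb:
                d = SAME_ITEM
            elif "heading" in (ca, cb):
                d = HEADING_TO_ITEM
            else:
                d = DIFFERENT_ITEMS
            best = d if best is None else min(best, d)
    return best

saturn = build_index("Saturn Facts", [
    "Orbit: 10,759 Days",
    "Rotation Period: 10.7 Hours",
    "Mass: 568.5 x 10^24 kg",
    "Volume: 82,713 x 10^10 km^3",
    "Distance from the Sun: 1,434 x 10^6 km",
])
```

<p class="wp-block-paragraph">In this model, <code>semantic_distance(saturn, "saturn", "distance")</code> comes out smaller than <code>semantic_distance(saturn, "days", "rotation")</code>, which matches the counterintuitive result described above: the list boundary matters more than physical adjacency.</p>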



<p class="wp-block-paragraph">The patent also notes that Google looks beyond formal HTML heading tags. A larger font size used as a visual heading can be interpreted as a heading element even without an H1 or H2 tag.</p>



<p class="wp-block-paragraph">Slightly off-topic, but this is something I have been stressing with people for years. Google can understand heading structures from the layout even if proper H tags are missing. That&#8217;s why, when you &#8220;fix&#8221; the missing H tags on a page or correct their order, you usually don&#8217;t see a ranking improvement.</p>



<p class="wp-block-paragraph">Google is trying to understand the visual and structural hierarchy of the page, not just parse HTML tags.</p>



<h2 class="wp-block-heading">Why This Changes How You Think About Headings</h2>



<p class="wp-block-paragraph">Most SEOs treat headings as keyword opportunities. The patent reframes them as semantic containers that define relationships between every piece of content beneath them.</p>



<p class="wp-block-paragraph"><strong>Heading text creates relationships, not just labels.</strong>&nbsp;If your H2 says &#8220;Installation Costs by Material Type&#8221; and the paragraphs beneath discuss pricing for hardwood, tile, and carpet, Google considers &#8220;Installation Costs&#8221; semantically close to all three materials. That heading establishes a relationship between the cost concept and every material mentioned in the section.</p>



<p class="wp-block-paragraph">Now consider what happens if that H2 instead says &#8220;Additional Information.&#8221; The same content sits beneath it, but the semantic container is weak. Google gets far less signal about how &#8220;installation costs&#8221; relates to the content below. The heading is doing less work.</p>



<p class="wp-block-paragraph"><strong>What you group together under a heading matters.</strong>&nbsp;If you want Google to associate two concepts, put them under the same heading. If you split them across different sections with different headings, you increase the semantic distance between them. This is a content architecture decision that most people make based on readability alone, without considering the semantic consequences.</p>



<p class="wp-block-paragraph"><strong>Lists create equal-distance relationships.</strong>&nbsp;Every item in a list is equidistant from the heading above it. You can&#8217;t accidentally push a concept further from the main topic by placing it lower in a list. That&#8217;s structurally different from running paragraphs, where terms physically farther from the heading accumulate more word-count distance in a traditional proximity model.</p>



<p class="wp-block-paragraph"><strong>Heading hierarchy is semantic nesting.</strong>&nbsp;An H3 under an H2 inherits context from the H2. The H2 inherits from the H1. You&#8217;re building a tree of semantic relationships. A well-structured hierarchy tells Google not just what each section is about, but how sections relate to each other and to the page&#8217;s central topic.</p>



<h2 class="wp-block-heading">The Connection to Entity-Based SEO</h2>



<p class="wp-block-paragraph">If you&#8217;ve been following the&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/entity-analysis-systematized/" target="_blank">entity extraction</a>&nbsp;and&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/information-gain-the-patent-you-know-the-patent-you-dont/" target="_blank">information gain</a>&nbsp;notes, this concept slots in directly.</p>



<p class="wp-block-paragraph">Entities and their attributes are the building blocks. Semantic distance is how you control the relationships between those building blocks on the page. When you place an entity under a heading, you&#8217;re telling Google that entity is semantically close to the heading&#8217;s concept. When you place two entities under the same heading, you&#8217;re telling Google they&#8217;re related to each other.</p>



<p class="wp-block-paragraph">This is entity relationship management at the page level. Your heading structure isn&#8217;t just formatting. It&#8217;s the mechanism by which Google interprets which entities belong together and how they connect to the page&#8217;s central topic.</p>



<h2 class="wp-block-heading">What to Do With This</h2>



<p class="wp-block-paragraph"><strong>Review your heading text.</strong>&nbsp;Are your headings specific enough to create meaningful semantic containers? Generic headings like &#8220;Overview,&#8221; &#8220;More Info,&#8221; or &#8220;Details&#8221; create weak containers. Specific headings that name the concept being covered create strong ones.</p>



<p class="wp-block-paragraph"><strong>Check your content grouping.</strong>&nbsp;Are related concepts under the same heading? Are unrelated concepts accidentally sharing a section? Look at your most important pages and ask whether the content beneath each heading actually belongs together semantically.</p>



<p class="wp-block-paragraph"><strong>Use lists deliberately.</strong>&nbsp;When you have a set of items that should all be equally associated with a concept, a list under a clear heading is structurally stronger than scattering the same items across running paragraphs. The list structure guarantees equal semantic distance from the heading.</p>



<p class="wp-block-paragraph"><strong>Think about heading hierarchy as a relationship tree.</strong>&nbsp;Your H1 is the trunk. H2s are branches. H3s are sub-branches. Content under each heading is bound to it semantically. When you&#8217;re planning a page, think about which concepts should be siblings (under the same parent heading) and which should be nested (sub-heading under a parent).</p>



<p class="wp-block-paragraph"><strong>Don&#8217;t over-optimize.</strong>&nbsp;John Mueller has noted that heading hierarchy order doesn&#8217;t need to be perfect from Google&#8217;s perspective. The patent describes one ranking signal among many. Structure your content for humans first. But know that when you make a structural decision, it carries semantic weight.</p>



<h2 class="wp-block-heading">The Takeaway</h2>



<p class="wp-block-paragraph">Your page structure isn&#8217;t just organization. It&#8217;s communication. Every heading you write, every list you create, every section break is telling Google how the concepts on your page relate to each other. The patent is from 2004, but the logic holds. Google is still reading the structure of your pages to understand meaning. Give it a clear structure, and it has a better chance of understanding what your page is actually about.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>4 SEO Concepts That Aren&#8217;t Helping You</title>
		<link>https://theseopub.com/4-seo-concepts-that-arent-helping-you/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 21 Apr 2026 13:00:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3489</guid>

					<description><![CDATA[Last week&#8217;s note&#160;covered four SEO metrics that most people are reading at the wrong level. This week is the companion piece: four concepts that people spend real time and energy optimizing for but probably shouldn&#8217;t be. The difference between the two notes is this. Last week&#8217;s metrics have legitimate uses when applied correctly. This week&#8217;s [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://theseopub.com/4-seo-metrics-youre-reading-wrong/" target="_blank">Last week&#8217;s note</a>&nbsp;covered four SEO metrics that most people are reading at the wrong level. This week is the companion piece: four concepts that people spend real time and energy optimizing for but probably shouldn&#8217;t be.</p>



<p class="wp-block-paragraph">The difference between the two notes is this. Last week&#8217;s metrics have legitimate uses when applied correctly. This week&#8217;s concepts are either outdated, misunderstood at a fundamental level, or solving a problem that doesn&#8217;t exist anymore.</p>



<h2 class="wp-block-heading">Keyword Density</h2>



<p class="wp-block-paragraph">People still ask this question. &#8220;What percentage of my content should be the keyword?&#8221; Five percent? Three percent? Two?</p>



<p class="wp-block-paragraph">There is no target percentage. There hasn&#8217;t been one for a very long time.</p>



<p class="wp-block-paragraph">Google stopped relying on literal keyword matching years ago. It understands entities, relationships between entities, synonyms, and meaning. When you search &#8220;how to fix a leaking faucet,&#8221; Google doesn&#8217;t count how many times each ranking page says &#8220;leaking faucet.&#8221; It understands that the page is about plumbing repair, that a faucet is a fixture, that a leak is a malfunction, and that the searcher wants step-by-step instructions. It matches intent and topical coverage, not word frequency.</p>



<p class="wp-block-paragraph">Yet SEO plugins still flag keyword density. Beginners see a warning that their keyword only appears 1.2% of the time and start cramming it into sentences where it doesn&#8217;t belong. The result is content that reads awkwardly, and awkward content doesn&#8217;t help anyone.</p>



<p class="wp-block-paragraph">If you&#8217;re writing naturally about a topic and covering the relevant entities with appropriate depth, your target keyword will appear at a natural frequency. You don&#8217;t need to count it. If you&#8217;re worried about whether Google understands what your page is about, the answer is almost never &#8220;use the keyword more times.&#8221; It&#8217;s &#8220;cover the topic more thoroughly.&#8221; Those are very different things.</p>



<h2 class="wp-block-heading">PageSpeed Insights Score</h2>



<p class="wp-block-paragraph">People chase a perfect 100 in Google&#8217;s PageSpeed Insights tool like it&#8217;s a grade. It&#8217;s not.</p>



<p class="wp-block-paragraph">The Lighthouse score you see in PageSpeed Insights is a lab-based diagnostic tool. It runs a simulated test of your page under controlled conditions and produces a score. That score is useful for identifying specific performance issues: images that aren&#8217;t compressed, JavaScript that blocks rendering, layout shifts during load. It&#8217;s a debugging tool.</p>



<p class="wp-block-paragraph">What Google actually uses as a ranking signal (and a very weak one at that) is Core Web Vitals field data. That&#8217;s the real-world performance data collected from actual users visiting your site through Chrome. It measures three things: how fast the largest visible element loads (LCP), how quickly the page responds to interaction (INP), and how much the layout shifts unexpectedly during load (CLS). These are measured from real user sessions, not from a simulated lab test.</p>



<p class="wp-block-paragraph">A site can score 65 in Lighthouse but have perfectly good Core Web Vitals because real users on real connections experience the site just fine. A site can score 98 in Lighthouse but have poor field data because the lab simulation doesn&#8217;t reflect how the site actually performs for its audience.</p>



<p class="wp-block-paragraph">The Lighthouse score and Core Web Vitals field data are related but not the same thing. If you&#8217;re going to track page speed as part of your SEO work, look at the Core Web Vitals report in Google Search Console or the field data section in PageSpeed Insights (labeled &#8220;Discover what your real users are experiencing&#8221;). That&#8217;s what Google uses. The number at the top of the screen is for diagnosing problems, not for measuring ranking impact.</p>
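<p class="wp-block-paragraph">To show what &#8220;look at the field data&#8221; means in practice, here is a minimal sketch that rates the three metrics against the thresholds Google publishes for them at the time of writing (assessed at the 75th percentile of real user visits). The function names and the dictionary keys are my own, and you would need to supply the values yourself from Search Console or PageSpeed Insights.</p>

```python
# (good, poor) boundaries per metric. LCP and INP are in milliseconds,
# CLS is unitless. Values at or below "good" pass; values above "poor" fail.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "inp_ms": (200, 500),
    "cls": (0.1, 0.25),
}

def rate(metric, p75_value):
    """Rate a 75th-percentile field value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if p75_value <= good:
        return "good"
    if p75_value <= poor:
        return "needs improvement"
    return "poor"

def passes_core_web_vitals(field_data):
    """True only when all three metrics rate as good."""
    return all(rate(m, field_data[m]) == "good" for m in THRESHOLDS)
```

<p class="wp-block-paragraph">Notice that the Lighthouse score appears nowhere in this check. A page can pass all three with a mediocre lab score, which is exactly the point.</p>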



<h2 class="wp-block-heading">Word Count</h2>



<p class="wp-block-paragraph">The idea that longer content ranks better refuses to die. It comes from correlation studies that found pages ranking in the top positions tended to have more words. The conclusion people drew was that writing longer pages would improve rankings.</p>



<p class="wp-block-paragraph">The problem is that correlation isn&#8217;t causation, and the actual cause is straightforward. Longer pages tend to rank better because they tend to cover more entities, answer more questions, address more aspects of the search intent, and provide more information gain. Those things help with rankings. The word count itself does nothing. Google has said this explicitly. There is no minimum word count for ranking, and adding words doesn&#8217;t help unless those words add substance.</p>



<p class="wp-block-paragraph">A 3,000-word page that pads its length with filler, restated points, and generic advice performs worse than a 1,200-word page that covers the topic with depth and specificity. If you read&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/information-gain-the-patent-you-know-the-patent-you-dont/" target="_blank">the information gain note</a>, this should click. Google can measure whether a page contributes novel information relative to other pages on the same topic. More words is not more information gain. More novel, specific information is more information gain, regardless of how many words it takes to deliver it.</p>



<p class="wp-block-paragraph">The practical consequence of chasing word count is that people dilute their content. They add paragraphs restating things they already said. They include sections on tangentially related topics just to hit a number. Every paragraph that restates what the other ranking pages say, or what your own page already said, dilutes the ratio of useful-to-redundant content. That&#8217;s the opposite of what you want.</p>



<p class="wp-block-paragraph">Write until you&#8217;ve covered the topic thoroughly. Stop when you&#8217;ve said what needs to be said. If that&#8217;s 800 words, publish 800 words. If it&#8217;s 2,500 words, publish 2,500 words. The number is an outcome of thorough coverage, not a target to aim for.</p>



<h2 class="wp-block-heading">&#8220;Toxic&#8221; Links and Disavow Obsession</h2>



<p class="wp-block-paragraph">Third-party SEO tools have a feature that scans your backlink profile and flags links as &#8220;toxic.&#8221; The flags show up in red. There&#8217;s usually a score. It feels urgent. People spend hours compiling disavow files to submit to Google, rejecting links from sites they&#8217;ve never heard of.</p>



<p class="wp-block-paragraph">Most of the time, this is wasted effort.</p>



<p class="wp-block-paragraph">Google has said repeatedly that its algorithms are very good at identifying and ignoring low-quality links on their own. John Mueller has addressed this directly more than once. The system doesn&#8217;t need your help to figure out that a spammy comment link on a random blog isn&#8217;t a genuine endorsement of your site. Google just ignores it.</p>



<p class="wp-block-paragraph">The disavow tool exists for specific situations. If you&#8217;ve received a manual penalty related to unnatural links, you may need to disavow the links that caused it. If you previously participated in a paid link scheme and want to clean it up, the disavow tool is appropriate. These are deliberate, known problems where you&#8217;re telling Google &#8220;I know about these specific links and I want you to ignore them.&#8221;</p>
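


<p class="wp-block-paragraph">For those specific cases, the file you upload is plain text: one entry per line, a <code>domain:</code> prefix to cover a whole domain, a full URL to cover a single page, and <code>#</code> for comment lines. Here is a minimal Python sketch for assembling one from links you already know are a problem. The function name, domains, and URL are hypothetical placeholders, not a recommendation to disavow anything.</p>

```python
def build_disavow_file(domains, urls, note=""):
    """Build disavow file text: '#' comment lines, 'domain:' entries for
    whole domains, and full URLs for individual pages, one per line."""
    lines = []
    if note:
        lines.append(f"# {note}")
    # De-duplicate and sort so the file is stable between runs.
    lines.extend(f"domain:{d}" for d in sorted(set(domains)))
    lines.extend(sorted(set(urls)))
    return "\n".join(lines) + "\n"


# Hypothetical entries: links from a campaign you knowingly took part in.
text = build_disavow_file(
    domains=["paid-links.example", "link-network.example"],
    urls=["https://blog.example/sponsored-post/"],
    note="Links from a paid campaign we ran and want ignored",
)
print(text)
```

<p class="wp-block-paragraph">The point of building the list by hand is that every entry is a link you can explain, which is the opposite of bulk-exporting whatever a tool painted red.</p>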



<p class="wp-block-paragraph">What the disavow tool is not for is going through every link a third-party tool paints red and rejecting it preemptively. Those tools use their own proprietary scoring to determine what&#8217;s &#8220;toxic.&#8221; Their criteria don&#8217;t necessarily match what Google considers problematic. A link from a low-DA site with a foreign-language domain might look suspicious to a tool&#8217;s algorithm but be a perfectly legitimate link from a real site in another country. Disavowing it doesn&#8217;t help you. In some cases, people accidentally disavow links that were actually passing value to their site.</p>



<p class="wp-block-paragraph">The risk isn&#8217;t just wasted time. It&#8217;s the possibility of removing links that were helping. If you haven&#8217;t received a manual penalty and you aren&#8217;t cleaning up a link scheme you knowingly participated in, you almost certainly don&#8217;t need to touch the disavow tool. Let Google&#8217;s algorithms handle the noise. They&#8217;ve been doing it for years.</p>



<p class="wp-block-paragraph">If anything, treat a &#8220;toxic&#8221; flag as a prompt to take a closer look at that link. Nothing more.</p>



<h2 class="wp-block-heading">The Pattern Across Both Notes</h2>



<p class="wp-block-paragraph">Last week&#8217;s note and this one share the same underlying problem. People anchor to a number or a concept because it feels concrete and measurable, and they optimize for it without asking whether it actually connects to how search engines work.</p>



<p class="wp-block-paragraph">The fix is always the same question. Does this thing I&#8217;m spending time on directly influence how Google evaluates my site? If the answer is no, or if the answer is &#8220;only in a very specific context that doesn&#8217;t apply to what I&#8217;m doing,&#8221; that time is better spent elsewhere.</p>



<p class="wp-block-paragraph">There&#8217;s no shortage of things in SEO that actually matter and actually respond to effort. Spend your time there.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>4 SEO Metrics You&#8217;re Reading Wrong</title>
		<link>https://theseopub.com/4-seo-metrics-youre-reading-wrong/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 14 Apr 2026 13:00:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3485</guid>

					<description><![CDATA[There are metrics in SEO that feel important but lead to bad decisions when applied without context. The numbers themselves aren&#8217;t useless. The problem is how people look at them. Each of the four metrics below has a specific, narrow context where it&#8217;s genuinely useful. Outside that context, it&#8217;s noise. The pattern is always the [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">There are metrics in SEO that feel important but lead to bad decisions when applied without context. The numbers themselves aren&#8217;t useless. The problem is how people look at them.</p>



<p class="wp-block-paragraph">Each of the four metrics below has a specific, narrow context where it&#8217;s genuinely useful. Outside that context, it&#8217;s noise. The pattern is always the same: a metric that means something at one level of granularity becomes meaningless when zoomed out.</p>



<h2 class="wp-block-heading">Click-Through Rate in Google Search Console</h2>



<p class="wp-block-paragraph">This is the one I see most often in forums and Slack groups. Someone pulls up their site&#8217;s CTR in Search Console, sees 2.1%, and panics. Or they look at a specific page&#8217;s CTR, see 3.4%, and start rewriting title tags.</p>



<p class="wp-block-paragraph">The problem is what that number actually represents. Site-wide CTR is an average across every query the site appeared for, including queries where the site ranked on page 5, page 8, page 12. Those impressions where you ranked 47th and nobody clicked? They&#8217;re in that average, dragging the number down. A site could be performing brilliantly for its important queries and still show a low aggregate CTR because it also has thousands of impressions for queries where it barely appeared.</p>



<p class="wp-block-paragraph">Page-level CTR has the same issue. A single page might rank #3 for its target keyword but also show up at position 40 for a dozen tangentially related queries. Those low-ranking impressions pull the page&#8217;s average CTR down even though the page is performing exactly as it should for the query that matters.</p>
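


<p class="wp-block-paragraph">To see how much those stray impressions distort the picture, here is a minimal Python sketch with made-up numbers. The queries, positions, impressions, and clicks are hypothetical, and it assumes the aggregate figure is simply total clicks divided by total impressions.</p>

```python
# Hypothetical rows for one page: (query, average position, impressions, clicks).
rows = [
    ("target keyword", 3, 1000, 100),
    ("tangential query a", 40, 2500, 0),
    ("tangential query b", 38, 1500, 2),
]


def ctr(clicks, impressions):
    """Click-through rate as a percentage."""
    return 100.0 * clicks / impressions


def aggregate_ctr(rows):
    """CTR over all rows combined: total clicks / total impressions."""
    total_clicks = sum(clicks for _, _, _, clicks in rows)
    total_impressions = sum(impressions for _, _, impressions, _ in rows)
    return ctr(total_clicks, total_impressions)


target_ctr = ctr(rows[0][3], rows[0][2])
page_ctr = aggregate_ctr(rows)
print(f"Target query CTR: {target_ctr:.1f}%")
print(f"Page-level CTR:   {page_ctr:.1f}%")
```

<p class="wp-block-paragraph">In this made-up case the page earns a 10% CTR on the query it was built for, yet the page-level number comes out near 2% because the low-ranking impressions are counted in the same total.</p>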



<p class="wp-block-paragraph">CTR is useful in exactly one context: a single query, evaluated relative to its average position. If a query is averaging position 2 and has a 2% CTR, something is wrong. Maybe the title tag doesn&#8217;t match the intent. Maybe a featured snippet or AI overview is stealing the click. Maybe the SERP is dominated by ads. That&#8217;s a real signal worth investigating. But you can only see it at the individual query level. The aggregate number tells you nothing actionable.</p>
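


<p class="wp-block-paragraph">That query-level check can be sketched in a few lines of Python. Everything here is an assumption for illustration: the benchmark CTRs are placeholder values (not published Google figures), the rows are invented, and <code>underperforming_queries</code> is a hypothetical helper. Swap in benchmarks from your own data before trusting the output.</p>

```python
# Hypothetical CTR benchmarks by rounded position. These are placeholders,
# not published Google figures; substitute values from your own data.
EXPECTED_CTR = {1: 28.0, 2: 15.0, 3: 10.0, 4: 7.0, 5: 5.0}


def underperforming_queries(rows, ratio=0.5):
    """Return queries whose CTR is below `ratio` times the benchmark
    for their rounded average position. Queries ranking outside the
    benchmark table are skipped rather than judged."""
    flagged = []
    for query, position, impressions, clicks in rows:
        expected = EXPECTED_CTR.get(round(position))
        if expected is None or impressions == 0:
            continue
        actual = 100.0 * clicks / impressions
        if expected * ratio > actual:
            flagged.append((query, position, round(actual, 1), expected))
    return flagged


# Hypothetical per-query rows: (query, average position, impressions, clicks).
rows = [
    ("emergency plumber dallas", 2.1, 4000, 80),  # 2.0% CTR at position 2
    ("drain cleaning cost", 3.2, 1000, 110),      # 11.0% CTR at position 3
    ("what is a p-trap", 41.0, 9000, 3),          # too deep to judge
]

for query, position, actual, expected in underperforming_queries(rows):
    print(f"{query}: position {position}, CTR {actual}% vs ~{expected}% benchmark")
```

<p class="wp-block-paragraph">Only the first query gets flagged. The position-41 query is skipped entirely, because a low CTR that deep in the results is expected and tells you nothing.</p>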



<h2 class="wp-block-heading">Average Position in Google Search Console</h2>



<p class="wp-block-paragraph">Same mistake, different metric. People look at their site&#8217;s overall average position, see something like 28.4, and think their site is performing terribly.</p>



<p class="wp-block-paragraph">But consider what that number actually averages. A site that ranks #1 for 10 valuable queries and #80 for 200 low-relevance queries it barely targets may show an average position in the 30s or 40s. That&#8217;s not a struggling site. That&#8217;s a site performing well for its important terms and showing up incidentally for a bunch of things it never optimized for (or that are still too difficult for it to rank for).</p>
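


<p class="wp-block-paragraph">A rough Python sketch of that arithmetic, assuming the site-wide figure is weighted by impressions rather than being a simple average across queries. The impression counts are made up for illustration; the query counts and positions come from the example above.</p>

```python
def weighted_average_position(groups):
    """Impression-weighted average position.

    groups: list of (number_of_queries, position, impressions_per_query).
    """
    total_impressions = sum(n * imps for n, _, imps in groups)
    weighted_sum = sum(n * imps * pos for n, pos, imps in groups)
    return weighted_sum / total_impressions


def simple_average_position(groups):
    """Unweighted average across queries, for comparison."""
    total_queries = sum(n for n, _, _ in groups)
    return sum(n * pos for n, pos, _ in groups) / total_queries


# Hypothetical impressions: the 10 valuable queries get far more
# impressions each than the 200 incidental ones.
groups = [
    (10, 1, 500),   # 10 queries at position 1, 500 impressions each
    (200, 80, 25),  # 200 queries at position 80, 25 impressions each
]

weighted = weighted_average_position(groups)
simple = simple_average_position(groups)
print(f"Impression-weighted: {weighted:.1f}")
print(f"Simple average:      {simple:.1f}")
```

<p class="wp-block-paragraph">With these made-up impressions the weighted figure lands around 40, and the simple average is worse still, even though every query the site actually cares about sits at #1.</p>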



<p class="wp-block-paragraph">Average position only means something at the individual query level, and even there it&#8217;s imperfect because it&#8217;s averaged across whatever fluctuations happened during the date range you&#8217;re looking at. A query that ranked #3 for 20 days and #15 for 10 days will show an average position around 7, which doesn&#8217;t reflect either of the positions it actually held.</p>
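


<p class="wp-block-paragraph">The same arithmetic in Python, as a quick sketch. It assumes the query gets roughly the same number of impressions every day, so each day counts equally toward the average.</p>

```python
# (position, number of days held) for one query over a 30-day range.
periods = [(3, 20), (15, 10)]

total_days = sum(days for _, days in periods)
average_position = sum(pos * days for pos, days in periods) / total_days
held_positions = sorted({pos for pos, _ in periods})

print(f"Reported average position: {average_position:.1f}")
print(f"Positions actually held:   {held_positions}")
```

<p class="wp-block-paragraph">The reported average is a position the query never occupied on any single day, which is why the trend over time matters more than any one reading.</p>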



<p class="wp-block-paragraph">The useful application is tracking a specific query&#8217;s average position over time to spot trends. Is your target keyword gradually improving? Gradually declining? Holding steady? That&#8217;s useful. The site-wide number is not.</p>



<h2 class="wp-block-heading">Domain Authority (and Domain Rating and Authority Score)</h2>



<p class="wp-block-paragraph">Domain Authority is Moz&#8217;s metric. Domain Rating is Ahrefs&#8217;. Authority Score is Semrush&#8217;s. They all attempt to approximate the overall strength of a domain on a 0-100 scale. And they&#8217;re all misused in the same three ways.</p>



<p class="wp-block-paragraph"><strong>Misuse 1: Judging link quality.</strong>&nbsp;People look at the DA of a site linking to them and use it as a proxy for link value. &#8220;I got a link from a DA 72 site&#8221; sounds impressive. But link value doesn&#8217;t come from the strength of the overall domain. It comes from the strength, relevance, and authority of the specific page linking to you. A link from a strong, well-linked page on a DA 30 niche site can pass more value than a link from a page with zero backlinks on a DA 90 site. When you evaluate a link, you need to look at the linking page, not the domain&#8217;s aggregate score.</p>



<p class="wp-block-paragraph"><strong>Misuse 2: Gauging ranking difficulty.</strong>&nbsp;People look at the DA of the top 10 results for a keyword and conclude that if they&#8217;re all DA 70+, the keyword is too competitive. This tells you almost nothing useful. DA is a domain-level metric. What matters for ranking is the strength of the specific pages at the top: their backlink profiles, their content relevance, their topical authority. A page on a DA 90 domain with no links and thin content is beatable. A page on a DA 40 domain with strong links and deep, relevant content is not. The domain number is a distraction from what actually determines the ranking.</p>



<p class="wp-block-paragraph"><strong>Misuse 3: Tracking your own DA.</strong>&nbsp;I see this constantly. People track their site&#8217;s DA over time as if it&#8217;s a KPI. No search engine on the planet uses Domain Authority. It&#8217;s a third-party metric calculated by a third-party tool using that tool&#8217;s own methodology and data. It doesn&#8217;t factor into Google&#8217;s ranking algorithm. It doesn&#8217;t appear in any Google patent. Google has said explicitly that they don&#8217;t use it. Tracking your own DA is tracking someone else&#8217;s estimate of how authoritative your site might be, updated on someone else&#8217;s schedule, using criteria that don&#8217;t match how Google actually evaluates authority. Your time is better spent tracking metrics that directly reflect performance: rankings for target queries, organic traffic, conversions.</p>



<p class="wp-block-paragraph"><strong>And a warning about link sellers.</strong>&nbsp;DA is the favorite metric of people selling links, and that should tell you something. When you see someone advertising that they can get you links on DA 90+ sites, what they&#8217;re usually selling are profile links, forum signatures, or directory listings. These are pages with no real content, no backlinks of their own, and often no link path from the root domain of the site. They don&#8217;t benefit from the overall strength of the domain because there&#8217;s no link equity flowing to them. These links have zero authority.</p>



<p class="wp-block-paragraph">The DA number is the entire sales pitch because it sounds impressive and most buyers don&#8217;t know enough to question it. Here&#8217;s my mantra on this: anyone selling links based on DA is either incompetent or a scammer. Incompetent, because they don&#8217;t understand how link equity actually flows through sites. Or a scammer, because they do understand it and are using DA to make worthless links sound valuable. Either way, you shouldn&#8217;t be buying from them.</p>



<h2 class="wp-block-heading">Number of Search Results</h2>



<p class="wp-block-paragraph">People search a keyword in Google, see &#8220;About 12,400,000 results&#8221; at the top of the page, and conclude the keyword is extremely competitive. Or they see &#8220;About 8,200 results&#8221; and conclude it&#8217;s easy pickings. Both conclusions are wrong.</p>



<p class="wp-block-paragraph">The number of indexed pages that contain words related to a query tells you nothing about the strength of the pages at the top of the results. A query could return 50 million results, but if the top 5 are weak, low-authority pages with thin content, that query is very winnable. A query could return 3,000 results, but if the top 5 are deeply authoritative pages backed by strong link profiles, good luck.</p>



<p class="wp-block-paragraph">Think of it like a marathon. If you&#8217;re the 3rd fastest runner in the field, it doesn&#8217;t matter whether there are 4,000 or 40,000 participants. You&#8217;re finishing 3rd either way. The total number of runners in the race tells you nothing about how fast the people ahead of you are running. What matters is the competition at the front, not the size of the field behind them.</p>



<p class="wp-block-paragraph">If you want to evaluate keyword difficulty, look at what&#8217;s actually ranking in the top 5 to 10 positions. Look at their content quality, backlink profiles, topical authority, and how well they match the search intent. That&#8217;s the competition. The number at the top of the SERP is just how many pages Google found that were tangentially related to the words you typed.</p>



<h2 class="wp-block-heading">The Pattern</h2>



<p class="wp-block-paragraph">Every one of these mistakes follows the same structure. A metric that means something at a specific, narrow level gets applied at a broad level where it loses all context. CTR means something for a single query at a known position. Average position means something for a single query over time. Link strength means something at the page level. Competitive difficulty means something when you evaluate the actual pages ranking, not the total count.</p>



<p class="wp-block-paragraph">The fix is always the same question: does this number, at this level, actually tell me what I&#8217;m trying to learn? If the answer is no, stop looking at it and zoom in until it does.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Long Title Tags, Automated (Free Claude Skill + ChatGPT GPT)</title>
		<link>https://theseopub.com/long-title-tags-automated/</link>
		
		<dc:creator><![CDATA[Mike Friedman]]></dc:creator>
		<pubDate>Tue, 07 Apr 2026 13:16:53 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">https://theseopub.com/?p=3480</guid>

					<description><![CDATA[Last February I shared a note about using longer title tags to improve Google rankings. It was based on research from Joy Hawkins at Sterling Sky and Joel Headley, a former Google employee, and backed up by results I was seeing on my own client sites. You can&#160;read that original note here. That note got [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Last February I shared a note about using longer title tags to improve Google rankings. It was based on research from Joy Hawkins at Sterling Sky and Joel Headley, a former Google employee, and backed up by results I was seeing on my own client sites. You can&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/do-longer-title-tags-help-with-google-rankings/" target="_blank">read that original note here</a>.</p>



<p class="wp-block-paragraph">That note got a bigger response than I expected. A lot of people tried it and saw results. I&#8217;ve been using this approach on every client site since, and it&#8217;s become one of the most consistently effective on-page changes I make.</p>



<p class="wp-block-paragraph">So I turned the process into a tool. It&#8217;s available as both a free Claude skill and a free ChatGPT GPT. Give it a topic, a brand name, and the page content (or a brief if the page doesn&#8217;t exist yet), and it builds a strategically long, multi-segment title tag designed to rank for multiple search intents per page.</p>



<h2 class="wp-block-heading">Why Long Title Tags Work (The Evidence)</h2>



<p class="wp-block-paragraph">The standard SEO advice is to keep title tags under 60 characters to avoid truncation. That advice is outdated, and the evidence against it is strong.</p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://www.sterlingsky.ca/how-long-title-tags-help-with-ranking-on-google/" target="_blank">Joy Hawkins&#8217; team at Sterling Sky</a>&nbsp;tested title tags exceeding 200 characters across multiple pages and documented noticeable ranking improvements. Joel Headley, who spent years at Google, tested thousands of healthcare websites. He injected neighborhood names into title tags and saw a 15% increase in visibility across the sites he tested.</p>



<p class="wp-block-paragraph">The key insight is that Google reads the entire title tag for ranking purposes, even when it truncates what it displays in search results. Google has also confirmed that when it rewrites a displayed title (which it does frequently), the original title tag is still used for ranking. So the title tag&#8217;s job is to help Google understand what the page is about and match it to queries. Display is secondary.</p>



<p class="wp-block-paragraph">That changes the calculus. If Google reads the whole thing but only shows part of it, you should optimize for ranking, not for what fits in a search snippet.</p>



<p class="wp-block-paragraph">Since adopting this approach across my client sites, I&#8217;ve consistently seen ranking improvements when expanding title tags from the traditional 50-60 character range to 150-250 characters. Not every page sees a dramatic jump, but the direction has been reliably positive. The&nbsp;<a rel="noreferrer noopener" href="https://theseopub.com/do-longer-title-tags-help-with-google-rankings/" target="_blank">original note</a>&nbsp;includes specific before-and-after examples from my own work.</p>



<h2 class="wp-block-heading">How the Tool Works</h2>



<p class="wp-block-paragraph">The tool builds title tags using a multi-segment architecture. Each segment between the hyphens functions as a near-standalone title tag targeting a slightly different search intent. Instead of one short title trying to capture a single query, you get three or four segments that each target a distinct variation of what someone might search for.</p>



<p class="wp-block-paragraph">Give it a topic, a brand name, and the page content, and it handles the rest. The tool accepts the content as a URL (it will fetch and review the page), pasted text, or an attached document. It scans the actual content before building segments to make sure every segment is supported by what&#8217;s on the page.</p>



<p class="wp-block-paragraph">This is the part that matters most. A title tag is a promise to both users and search engines. If Segment 2 says &#8220;affordable&#8221; but the page never discusses pricing, that&#8217;s a misalignment. If a segment references a location the business doesn&#8217;t actually serve, that&#8217;s a problem. The tool checks for this and won&#8217;t include intents or claims the content doesn&#8217;t support.</p>



<p class="wp-block-paragraph"><strong>If the content doesn&#8217;t exist yet</strong>, the tool runs in Draft Mode. You give it a topic or content brief instead of a live page, and it builds the title tag based on the intended scope. The output is labeled as a draft with a reminder to validate it against the final content before publishing. This is useful when you&#8217;re planning a content calendar or topical map and want title tags ready before the pages are written.</p>



<p class="wp-block-paragraph"><strong>What it outputs:</strong></p>



<ul class="wp-block-list">
<li>The full title tag (targeting 150-250 characters)</li>



<li>Character count</li>



<li>A rationale for each segment explaining which search intent it targets and why</li>



<li>A content alignment status (verified against actual content, or flagged as draft)</li>
</ul>



<p class="wp-block-paragraph"><strong>It also includes a Local SEO Mode.</strong>&nbsp;If you mention locations, service areas, or neighborhoods, it automatically switches to injecting geo-modifiers into the segments. This is especially useful for service area businesses that need to target multiple locations without creating dozens of thin pages.</p>



<h2 class="wp-block-heading">The Segment Architecture</h2>



<p class="wp-block-paragraph">Every title tag the tool builds follows this structure:</p>



<p class="wp-block-paragraph"><strong>Segment 1 (Primary).</strong>&nbsp;Targets the main keyword. This is the most important segment because it&#8217;s what Google is most likely to display. Lead with your strongest keyword here.</p>



<p class="wp-block-paragraph"><strong>Segment 2.</strong>&nbsp;Targets a secondary intent, a related query variation, or reframes the topic for a different type of searcher. This should be meaningfully different from Segment 1, not just the same keywords rearranged.</p>



<p class="wp-block-paragraph"><strong>Segment 3.</strong>&nbsp;Targets a tertiary intent. This could be a how-to framing, a benefit statement, an objection-handling angle, or a long-tail variation.</p>



<p class="wp-block-paragraph"><strong>Segment 4 (Optional).</strong>&nbsp;Only used when a genuinely distinct intent exists that isn&#8217;t covered by the first three. The tool won&#8217;t force a fourth segment just to add length.</p>



<p class="wp-block-paragraph"><strong>Brand.</strong>&nbsp;Always last.</p>



<p class="wp-block-paragraph">Each segment is separated by a hyphen, and each one should read as a complete, natural phrase. Not a keyword fragment. Not a comma-separated list of terms. A readable title that could stand on its own.</p>
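


<p class="wp-block-paragraph">One way to picture the structure is as a small Python helper. This is only a sketch: <code>build_title_tag</code> is a hypothetical function, not the skill&#8217;s actual code, and it joins segments with a plain spaced hyphen.</p>

```python
from typing import List


def build_title_tag(segments: List[str], brand: str, separator: str = " - ") -> str:
    """Join three or four intent segments and put the brand last."""
    cleaned = [segment.strip() for segment in segments if segment.strip()]
    if len(cleaned) not in (3, 4):
        raise ValueError("expected three segments, plus an optional fourth")
    if not brand.strip():
        raise ValueError("brand is required and always goes last")
    return separator.join(cleaned + [brand.strip()])


title = build_title_tag(
    segments=[
        "SKU Generator",                                      # primary keyword
        "Create SKUs On Demand For Free",                     # secondary intent
        "Effortlessly Build SKUs For Your Entire Inventory",  # tertiary intent
    ],
    brand="ACME Corp",
)
print(title)
print(len(title), "characters")
```

<p class="wp-block-paragraph">The helper only handles assembly and order. Deciding what each segment should say, and whether the page supports it, is the part that takes judgment.</p>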



<h2 class="wp-block-heading">What It Avoids</h2>



<p class="wp-block-paragraph">The difference between a strategically long title tag and keyword stuffing is structure. The tool enforces several rules:</p>



<p class="wp-block-paragraph">No content misalignment. Every segment must be supported by what&#8217;s actually on the page. If the content doesn&#8217;t cover a topic, the title tag won&#8217;t promise it.</p>



<p class="wp-block-paragraph">No exact keyword repetition across segments. It uses synonyms, conditional synonyms (words that aren&#8217;t dictionary synonyms but function as synonyms in context), and reframings instead.</p>



<p class="wp-block-paragraph">No fragments. &#8220;Best cheap fast plumber&#8221; is not a segment. &#8220;Affordable Emergency Plumber For Your Home&#8221; is.</p>



<p class="wp-block-paragraph">No forced segments. If three segments plus the brand cover the intent space, it stops at three.</p>



<p class="wp-block-paragraph">No pipes. Segments are separated by hyphens, not pipes. No real reason. I just hate pipes.</p>
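


<p class="wp-block-paragraph">A few of these rules are mechanical enough to check in code. The Python sketch below is a hypothetical linter, not part of the tool. It only covers pipes, segments repeated word for word (a deliberately narrow reading of the repetition rule), and segment count. Content alignment and fragment detection still need a person or a model reading the page.</p>

```python
def lint_title_tag(title, separator=" - "):
    """Return a list of rule violations for a multi-segment title tag."""
    problems = []
    if "|" in title:
        problems.append("uses a pipe separator")
    segments = [s.strip().lower() for s in title.split(separator)]
    if len(set(segments)) != len(segments):
        problems.append("repeats a segment verbatim")
    # Three or four intent segments plus the brand means 4 or 5 parts.
    if len(segments) not in (4, 5):
        problems.append("expected 3 or 4 segments plus the brand")
    return problems


good = (
    "Emergency Plumber in Dallas - 24/7 Plumbing Repair and Drain Services"
    " - Fast Emergency Plumbing near Fort Worth and Arlington - Acme Plumbing"
)
piped = "Emergency Plumber | Acme Plumbing"
repeated = "Emergency Plumber - Emergency Plumber - Fast Plumbing Help - Acme Plumbing"

for candidate in (good, piped, repeated):
    print(lint_title_tag(candidate))
```

<p class="wp-block-paragraph">An empty list means none of the mechanical rules were broken. It says nothing about whether the segments read naturally or match the page.</p>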



<h2 class="wp-block-heading">Examples</h2>



<p class="wp-block-paragraph">Here are a few examples showing the before (traditional short title) and after (multi-segment title).</p>



<h3 class="wp-block-heading">Standard Mode: Product Tool Page</h3>



<p class="wp-block-paragraph"><strong>Before:</strong>&nbsp;Free SKU Generator &#8211; ACME Corp</p>



<p class="wp-block-paragraph"><strong>After:</strong>&nbsp;SKU Generator &#8211; Create SKUs On Demand For Free &#8211; Effortlessly Build SKUs For Your Entire Inventory &#8211; ACME Corp&nbsp;<em>(110 characters)</em></p>



<p class="wp-block-paragraph">Segment 1 targets the head term &#8220;SKU generator.&#8221; Segment 2 targets &#8220;create SKUs free.&#8221; Segment 3 targets the use-case intent of building SKUs for an entire inventory. Three different types of searchers, one title tag.</p>



<h3 class="wp-block-heading">Standard Mode: Informational Guide</h3>



<p class="wp-block-paragraph"><strong>Before:</strong>&nbsp;How To Build AI Agents &#8211; ACME Corp</p>



<p class="wp-block-paragraph"><strong>After:</strong>&nbsp;Learn How To Build AI Agents &#8211; Free Guide For Building AI Agents From Beginning To Implementation &#8211; Avoid These 5 Mistakes In Building Your AI Agent &#8211; ACME Corp&nbsp;<em>(160 characters)</em></p>



<p class="wp-block-paragraph">Segment 1 targets the how-to query. Segment 2 targets people searching for a comprehensive guide. Segment 3 targets the mistake-avoidance angle, a distinct informational intent.</p>



<h3 class="wp-block-heading">Local SEO Mode: Service Business</h3>



<p class="wp-block-paragraph"><strong>Before:</strong>&nbsp;Emergency Plumber &#8211; Acme Plumbing</p>



<p class="wp-block-paragraph"><strong>After:</strong>&nbsp;Emergency Plumber in Dallas &#8211; 24/7 Plumbing Repair and Drain Services &#8211; Fast Emergency Plumbing near Fort Worth and Arlington &#8211; Acme Plumbing&nbsp;<em>(141 characters)</em></p>



<p class="wp-block-paragraph">Segment 1 targets the primary local query &#8220;emergency plumber Dallas.&#8221; Segment 2 broadens to service-type variations without a geo-modifier. Segment 3 picks up secondary locations with natural phrasing. Three geo-targets, two service variations, one title tag.</p>



<h2 class="wp-block-heading">How to Get It</h2>



<p class="wp-block-paragraph">The tool is available in two formats. Same instructions, same output, just different platforms.</p>



<h3 class="wp-block-heading">Claude Skill</h3>



<ol class="wp-block-list">
<li>Download the skill file: <strong><a href="https://theseopub.com/wp-content/uploads/title-tag-builder.skill">title-tag-builder.skill</a></strong></li>



<li>Open Claude.ai</li>



<li>Go to <strong>Settings &gt; Capabilities</strong></li>



<li>Scroll to the Skills section and upload the file</li>



<li>Toggle the skill <strong>ON</strong></li>



<li>Start a new chat</li>
</ol>



<p class="wp-block-paragraph">To use it: &#8220;Using the title tag skill, build a title tag for [topic]. Brand is [brand name].&#8221; Then provide the page content by pasting a URL, the text, or attaching a document. If the page doesn&#8217;t exist yet, tell it to use Draft Mode and provide a topic or brief instead.</p>



<p class="wp-block-paragraph">For Local SEO Mode, just include locations in your request: &#8220;Using the title tag skill, build a title tag for emergency plumbing services in Dallas, Fort Worth, and Arlington. Brand is Acme Plumbing.&#8221;</p>



<h3 class="wp-block-heading">ChatGPT GPT</h3>



<p class="wp-block-paragraph"><strong><a href="https://chatgpt.com/g/g-69d3c60ee3308191ab1f625d82fe5d54-extended-title-tag-creator">Open the GPT in ChatGPT</a></strong></p>



<p class="wp-block-paragraph">Same instructions, same output. Use whichever platform you prefer. If you have both, the only difference is that Claude skills persist across chats while a GPT is a separate conversation each time. That, and I think Claude just does a better job at everything 😉.</p>



<h2 class="wp-block-heading">One More Thing</h2>



<p class="wp-block-paragraph">This tool handles individual title tags well. But if you&#8217;re doing a full site audit or building title tags for an entire content cluster, the real leverage comes from thinking about title tags sitewide. Search engines aggregate title tags across your site to understand your overall topicality. The tool includes semantic SEO principles like conditional synonyms, hypernym-hyponym pairing, and sitewide n-gram awareness to help with this, but that&#8217;s a topic for a deeper note down the road.</p>



<p class="wp-block-paragraph">For now, try it on a few pages and see what happens. The&nbsp;<a rel="noreferrer noopener" href="https://www.sterlingsky.ca/how-long-title-tags-help-with-ranking-on-google/" target="_blank">original case study from Sterling Sky</a>&nbsp;has the full evidence if you want to dig in before testing.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
