From Guesswork to Architecture: How to Take Control of Your Google Classification

Last Verified April 20, 2026

The HCU research says something uncomfortable:

“It’s not WE who determine whether our topics are on-point or not, it is Google.”

Classification happens to you. You publish, hope, and wait. Google decides what your site is about.

But here’s what the research missed — and it changes everything.

Groundbreaking HCU Research: How Google’s Classification System Really Works

In April 2024, forensic SEO specialist Carolyn Holzman published groundbreaking research that reframes everything we thought we knew about Google’s Helpful Content System.

The research, titled “Decoding Google’s Helpful Content System: Analyzing Data Supported With Field Observation of the HCS,” wasn’t based on speculation or theory. It was built on years of indexation research, controlled testing, and field data from live business sites.

Her central hypothesis challenges the conventional understanding:

“Google doesn’t only index, rank, and serve individual pages on a site. The Helpful Content System polices a site-wide factor based on the topical nucleus of a site.”

In other words, the Helpful Content System isn’t really about “helpful content” at all, at least not in the way most people think. It’s about topical coherence at the domain level. And that coherence isn’t just a Google signal. It’s also how AI systems decide which sites to surface as authorities on their own, which is why topical coherence works as an authority signal in AI’s discovery process.

This distinction matters enormously. Because if you’ve been trying to recover from HCU by making your content “more helpful” or “more people-first,” you’ve been solving the wrong problem. This explains why page-level fixes stopped working—the system evaluates coherence at the domain level, not individual page quality.

What the Research Actually Found

Holzman’s research combines controlled testing with real-world case studies. The findings are striking.

Finding #1: How Topical Theming Improves Google Indexation by Up to 113%

When Holzman shifted her test sites from random topics to topically themed content, indexation rates improved dramatically:

Test Condition	Before Theming	After Theming	Relative Improvement
Desktop (Simple Keywords)	63%	94%	~50%
Desktop (JavaScript)	68%	88.8%	~30%
Mobile (Simple Keywords)	46%	98%	~113%

The content itself didn’t change in quality. What changed was the topical coherence of the site as a whole.
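If you want to check the arithmetic yourself, here is a minimal sketch. It assumes the improvement column in the table above is a relative change, (after - before) / before, rather than a difference in percentage points. The function name `relative_improvement` is ours for illustration and does not come from the research.

```python
def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Indexation rates (in percent) as reported in the table above.
results = {
    "Desktop (Simple Keywords)": (63.0, 94.0),
    "Desktop (JavaScript)": (68.0, 88.8),
    "Mobile (Simple Keywords)": (46.0, 98.0),
}

for condition, (before, after) in results.items():
    points = after - before  # absolute gain in percentage points
    pct = relative_improvement(before, after)
    print(f"{condition}: +{points:.1f} points, {pct:.0f}% relative improvement")
```

Under that assumption the mobile row works out to roughly 113%, which matches the headline figure. The absolute gain on that row is 52 percentage points, and the two numbers describe different things.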

Finding #2: Why HCU Recovery Requires Site-Wide Content Fixes

One of the most striking case studies involved an indoor hobby site with 400 pages of content. After losing approximately 70% of traffic, the site underwent extensive updates.

Here’s what the team discovered:

“They had to complete significant improvements across ALL 400 problem pages before improvements could be seen. Normally, improvements would have been seen once a page had been reworked.”

Pages fixed in November 2022 didn’t recover until February 2023. Even new pages launched in January 2023 didn’t perform until February 2023—after the majority of site updates were complete.

The old playbook—fix a page, see improvement in 24-48 hours, move to the next—no longer works.

All top pages showed recovery on the same dates, regardless of when individual work was completed. The system had to be coherent before any individual page benefited. This timing pattern isn’t arbitrary—it reflects the mathematics behind content authority decay, where signals accumulate and expire on predictable cycles that operate at the domain level, not the page level.

Finding #3: How Google’s Site Classification Overrides Your Content Intent

Perhaps the most uncomfortable finding comes from a roofing company case study.

The site was populated with local content pages before the roofing service pages were complete. When Google first indexed the site, local content outnumbered roofing content.

The result? Google classified it as a “city site,” not a roofing site. The roofing service pages were indexed but not served—9 out of 12 service pages were invisible to searchers looking for roofing services. This is a structural visibility problem — and it extends beyond topic density alone: DOM clarity and topical classification control are equally critical factors in whether Google reads your content the way you intend.

It took three months after adding nine more roofing-related pages—making roofing “the topic with the most pages”—before the site began generating roofing queries.
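The “topic with the most pages” idea is easy to picture with a quick count. The sketch below is only an illustration of that heuristic, not a description of how Google’s classifier works internally. The 12 service pages and the 9 added pages come from the case study. The number of local pages (15) is a made-up figure, because the research only says local content outnumbered roofing content.

```python
from collections import Counter


def dominant_topic(page_topics: list[str]) -> tuple[str, float]:
    """Return the topic with the most pages and its share of all pages."""
    counts = Counter(page_topics)
    topic, pages = counts.most_common(1)[0]
    return topic, pages / len(page_topics)


# From the case study: 12 roofing service pages, later joined by 9 more.
# ASSUMPTION: 15 local pages is a hypothetical number for illustration;
# the source only says local content outnumbered roofing content.
at_first_index = ["roofing"] * 12 + ["local"] * 15
after_additions = ["roofing"] * (12 + 9) + ["local"] * 15

for label, pages in [("first index", at_first_index), ("after additions", after_additions)]:
    topic, share = dominant_topic(pages)
    print(f"{label}: dominant topic = {topic} ({share:.0%} of pages)")
```

With these assumed counts, the dominant topic flips from local to roofing once the nine pages are added. That is the shift the case study describes.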

Holzman summarizes:

“It’s not WE who determine whether our topics are on-point or not, in the end, it is Google.”

Finding #4: The Critical Gap in Current SEO Tools

Here’s the admission that stopped me in my tracks:

“There is no software tool that can take HC measurements because at this time on-page tools measure only one page at a time or one page against other domain’s pages that are ranking for the same term.”

The Helpful Content System operates at the site-wide level, evaluating topical cohesion across your entire domain. But every existing SEO tool analyzes content page-by-page. That blind spot has real consequences—and it’s compounding: there’s a hidden technical cause of visibility collapse that page-level tools are structurally incapable of detecting.

This mismatch means businesses have had no way to see what the HCS classifier actually evaluates. This gap has only widened as AI-driven content evaluation systems have become central to how both search engines and generative AI platforms assess authority. It’s worth noting that not all site types face this classification challenge equally—understanding how news sites bypass HCS classifiers reveals just how deliberately the system operates on non-news sites.

Beyond Topical Coherence: Why Semantic Relationships Matter More

Holzman’s research establishes that topical coherence is a critical site-wide factor. But here’s what the research doesn’t address—and where the real opportunity lies.

Topical coherence is about having enough content on a topic for Google to recognize it as your domain’s focus. It’s necessary for passing Google’s classifier. But passing the classifier is only the starting point—the next skill is learning to forward-engineer your domain signal rather than reverse-engineer what already ranks.

Having a bunch of posts on the same topic, however, isn’t the same as demonstrating expertise. That distinction is the foundation of expertise architecture that signals topical authority, a structural shift most SEO strategies haven’t made yet.

The Difference: Topical Coherence vs. Semantic Authority

Consider the difference:

Topically coherent but disconnected:

50 posts about marketing automation
Each post is good on its own
No explicit connections between them
Google sees: “This site writes about marketing automation”
AI sees: “Surface coverage—I’ll cite them if they have a specific fact I need”

Topically coherent AND semantically connected:

50 posts about marketing automation
Post on segmentation explicitly explains it’s a prerequisite for personalization
Post on automation explicitly builds on the nurturing framework from an earlier post
Post on metrics explicitly connects back to the goals established in strategy posts
Google sees: “This site has comprehensive, structured expertise in marketing automation”
AI sees: “This is someone who understands marketing automation as a system—I’ll cite them as an authority”

The difference is what I call semantic relationship clarity—the explicit connections between ideas that demonstrate how you think, not just what you write about. Building those explicit connections in practice means creating semantic relationship links that signal topical authority to both search engines and AI systems.

Topical coherence gets you past Google’s gate. Semantic relationship clarity gets you cited as an authority.

The Control Factor: From Guesswork to Architecture

Here’s where this gets actionable.

The research reveals a frustrating reality: Google classifies your site, and you’re at its mercy. Without visibility into what the classifier sees, you’re left guessing.

But that’s only true if you leave your expertise implicit. The alternative is engineering classification through signal architecture—structuring your content so the signals Google and AI systems read are the ones you deliberately emit.

Making Your Expertise Explicit: The Key to Controlled Classification

When you just publish posts on related topics—without explicitly showing how they connect—you’re gambling. Google sees scattered content. It infers relationships. It classifies you based on patterns it detects.

You’re at its mercy.

But when you make your semantic relationships explicit?

When your post on segmentation explicitly states: “Segmentation is a prerequisite for personalization because you can’t personalize content until you understand who you’re personalizing it for. The research outputs from our [buyer persona methodology] become the decision inputs for strategic content planning…”

When your post on automation explicitly builds: “This workflow design expands on the nurturing framework we established in [previous post], adding the trigger logic that determines when prospects move between stages…”

When your post on metrics explicitly connects: “These KPIs directly measure the strategic goals we outlined in [strategy post], creating accountability for the outcomes that matter…”

Google sees a methodology. A system. Comprehensive expertise.

It classifies you as you intended — because you showed it how your ideas connect.

The Shift in Mindset

This is the fundamental shift:

Old Approach	New Approach
Publish → hope Google infers connections → wait to see what happens	Map your expertise → make relationships explicit → shape your classification
React to algorithm changes	Architect your expertise to be understood
Hope for the best	Make your methodology visible
Classification happens to you	Classification reflects your intent

You’re not powerless. You just need to stop leaving your expertise invisible.

Practical Application: How VizzEx Solves the HCU Classification Problem

The research validates what we’ve been building with VizzEx. If you’re exploring how VizzEx fits into the broader landscape of content intelligence tools built for topical architecture, that context matters for understanding why the approach is different.

If the Helpful Content System evaluates topical coherence at the site-wide level, you need tools that can see your content at the site-wide level. Page-by-page analysis isn’t enough.

How VizzEx Addresses the Site-Wide Analysis Gap

VizzEx was designed from the ground up to do what Holzman’s research says no tool could do yet—analyze your blog horizontally across all content, not vertically one page at a time. That distinction — how AI evaluates your content as a connected system rather than as isolated documents — is the conceptual foundation the entire platform is built on.

Here’s what that looks like in practice:

See what the classifier likely sees. VizzEx shows you your topic clusters and their relative density—revealing what Google likely thinks your site is about, not what you think it’s about.

Identify disconnected content. VizzEx finds isolated posts and content islands that aren’t contributing to your topical authority—the exact problem the HCU research identifies.

Get specific connection recommendations. Not vague advice to “add more internal links.” VizzEx provides specific semantic linking opportunities with reasoning for why each connection matters—the relationship context AI needs to see. If you’re evaluating how this compares to keyword-based linking tools, semantic linking tools built for this classification era operate on fundamentally different logic.

Make your methodology visible. This process starts with mapping semantic relationships across your content to understand how your posts connect before you can make those connections explicit. VizzEx maps the relationships between your posts, revealing where methodology connections are missing and how to build the explicit bridges that demonstrate expertise.
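To make “isolated posts” and “content islands” concrete, here is a minimal sketch of the underlying idea. It is not VizzEx’s implementation. It assumes your posts and internal links can be exported as simple lists, treats every link as a two-way connection, and groups posts into connected components. The post slugs and the function name `content_islands` are hypothetical.

```python
from collections import defaultdict


def content_islands(posts: list[str], links: list[tuple[str, str]]) -> list[set[str]]:
    """Group posts into connected components, treating links as undirected."""
    neighbors: dict[str, set[str]] = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen: set[str] = set()
    islands: list[set[str]] = []
    for post in posts:
        if post in seen:
            continue
        # Depth-first walk: collect every post reachable from this one.
        stack, component = [post], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        islands.append(component)
    # Largest group first, so the main cluster leads the list.
    return sorted(islands, key=len, reverse=True)


# Hypothetical blog: four connected posts and two with no links at all.
posts = ["segmentation", "personalization", "nurturing", "automation", "metrics", "webinar-recap"]
links = [
    ("segmentation", "personalization"),
    ("automation", "nurturing"),
    ("nurturing", "personalization"),
]

islands = content_islands(posts, links)
isolated = [next(iter(group)) for group in islands if len(group) == 1]
print(f"{len(islands)} groups; isolated posts: {isolated}")
```

In this toy example, the metrics and webinar recap posts sit in groups of one. A link graph like this only shows that a connection exists. It says nothing about whether the link explains the relationship, which is the part that takes editorial judgment.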

The goal isn’t to game Google’s classifier. It’s to make the expertise you already have visible and mappable—so classification reflects your actual authority, not the algorithm’s best guess. That progression—from implicit expertise to structured, machine-readable authority—is what defines semantic architecture for search and AI systems at scale.

The Research Behind the Tool

Here’s what makes this moment significant: The research that validates this approach was conducted independently of VizzEx’s development.

Carolyn Holzman spent years building her indexation research project and analyzing live site data. She published her HCU findings in April 2024. In 2025, I built VizzEx based on my own experience with semantic content analysis and AI discoverability.

When I reached out to show Carolyn what we’d built, she got excited—because VizzEx was solving exactly the problem her research had identified.

Now we’ve joined forces. Carolyn is a collaborator, advisor, and partner in VizzEx. Her forensic SEO expertise and ongoing research inform how we develop the platform.

This isn’t a case of building research to validate a product. It’s independent research that happened to validate an approach—and the researcher joining the team because the tool fills the gap her work identified.

Taking Control of Your Classification

The research says Google classifies your site — and you’re at its mercy.

But that’s only true if you leave the connections implicit.

When you make your semantic relationships explicit:

Google sees a methodology, not scattered posts
AI recognizes comprehensive expertise, not surface coverage
Classification reflects your intent, not algorithmic inference

From guesswork to architecture.

From hoping to showing.

From scattered content to demonstrated methodology.

Your expertise exists. Now make it visible.

VizzEx is currently in beta. If you’re ready to see your blog the way Google’s classifier likely sees it (topic clusters, connectivity scores, semantic gaps and all), apply for the VizzEx beta program.

Continue Reading

This article is part of our series on Google’s Helpful Content System, based on Carolyn Holzman’s independent research and its implications for content strategy.

Upcoming Articles in This Series:

Why Your Page-by-Page SEO Recovery Strategy No Longer Works
Your Site’s Topic Isn’t What You Think It Is: How Google’s HCS Classifier Determines Your Niche
The Math Behind HCU: How Topic Density Determines What Google Serves
The Paradigm Shift SEOs Are Missing: Relevance Is Now a Domain Factor
See What Google’s HCS Classifier Sees—Before It Classifies Your Site
Google Says Recovery Takes Months. Every Day You Wait Adds to Your Timeline.

About the Research and Methodology

About the research: “Decoding Google’s Helpful Content System: Analyzing Data Supported With Field Observation of the HCS” was published by Carolyn Holzman through Vertmontly, Inc. in April 2024. Holzman is a forensic SEO specialist who researches how search is evolving and hosts a podcast on her findings. She is now a collaborator and partner at VizzEx.

Frequently Asked Questions

Why didn't my site recover from the HCU update even after improving my content?

Holzman's research suggests that recovery happens at the site level, not the page level. In her case study of a 400-page indoor hobby site, the team 'had to complete significant improvements across ALL 400 problem pages before improvements could be seen.' Normally, improvements would have been seen once a page had been reworked. Instead, all top pages showed recovery on the same dates, regardless of when individual work was completed. The site had to be coherent as a whole before any individual page benefited.

How does Google actually classify what my website is about?

According to Holzman's hypothesis, Google doesn't only index, rank, and serve individual pages on a site. The Helpful Content System polices a site-wide factor based on the topical nucleus of a site. The roofing company case study illustrates this: when local content outnumbered roofing content at first index, Google classified the site as a 'city site,' not a roofing site, leaving 9 out of 12 service pages invisible to searchers looking for roofing services.

What is the difference between topical coherence and semantic authority?

Topical coherence is about having enough content on a topic for Google to recognize it as your domain's focus. But having a bunch of posts on the same topic isn't the same as demonstrating expertise. A topically coherent but disconnected site leads Google to see 'This site writes about marketing automation,' while a topically coherent AND semantically connected site signals a demonstrated methodology that both Google and AI systems recognize as authoritative.

Why can't my SEO tools detect Helpful Content System problems?

In Holzman's words: 'There is no software tool that can take HC measurements because at this time on-page tools measure only one page at a time or one page against other domain's pages that are ranking for the same term.' The Helpful Content System operates at the site-wide level, evaluating topical cohesion across your entire domain, but every existing SEO tool analyzes content page-by-page. This mismatch means businesses have had no way to see what the HCS classifier actually evaluates.

How much can topical theming improve Google indexation rates?

When Holzman shifted her test sites from random topics to topically themed content, indexation rates improved dramatically. Mobile simple keyword indexation rose from 46% to 98%, a relative improvement of approximately 113%. The content itself didn't change in quality. What changed was the topical coherence of the site as a whole.