Article No. 65

Keyword Research: The Complete Process, From Seed Terms to Prioritized List

Keyword research isn’t really about finding “more traffic.” It’s about matching the language real people use when they search to the pages you actually build, so that content decisions are based on demonstrated demand instead of guesses about what sounds important. Done well, it’s a filtering process: a wide net that generates a large raw list, followed by several rounds of narrowing until what’s left is a short, defensible set of terms worth building content around.

This guide walks through that process end to end, at a level that lets you run it competently on your own. It doesn’t go deep on any single stage. Reading search volume data, finding keywords your tools don’t show you, the long-tail/short-tail/head-term split, branded versus non-branded segmentation, commercial intent classification, fixing overlap between your own pages, and running a full competitor gap analysis are each substantial enough to need their own treatment, and each has one. Consider this the map; the other guides are the terrain.

What Keyword Research Actually Accomplishes

Every page on a well-run site should exist because real people are searching for something that page answers, not because someone decided the topic sounded relevant. Keyword research is the step that turns “we should probably write about X” into “here is the specific language people use, roughly how often, and what they expect to find when they search it.” Skip it, and content gets built on assumptions about your own industry’s vocabulary, which is frequently different from the vocabulary your customers actually use.

A common gap: a business describes its own service with internal or technical terminology, while customers search using plainer, more problem-focused language. A plumbing company might think of a service as “hydro jetting,” while the person searching describes the problem as “why does my drain keep backing up.” Both matter, but if the page only speaks the internal term, it never surfaces for the more common way people actually search. Keyword research exists to catch that gap before a page is built, not after it fails to rank.

The Five-Step Process

1. Seed Keyword Generation

Start with the terms you already know matter: your core services or products, the categories a customer would use to describe what you do, and the problems those products or services solve. Then add two sources most people skip:

  • Your own Google Search Console data. The Performance report shows queries you’re already getting impressions for, including terms you never deliberately targeted. This is real demand, already happening, not a guess.
  • Customer language, pulled from support tickets, sales call notes, reviews, and how customers describe the problem in their own words, which is often not the industry term you’d default to.

A handful of good seeds (10 to 30) is enough to start. The expansion step is where the list grows.

To pull seeds from Search Console specifically: open the Performance report, add a query filter or just sort the Queries tab by impressions, and scan for terms that are getting real impression volume but aren’t the primary target of any existing page. Each of those is a query Google already associates with your site, which makes it a stronger starting seed than a term you’re guessing at from scratch.
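If you export the Queries data instead of scanning it by eye, the same check can be scripted. What follows is a minimal sketch, not an official workflow: the column names, the sample rows, the 100-impression threshold, and the list of already-targeted terms are all assumptions made up for illustration, so adjust them to match your own export.

```python
import csv
import io

# Hypothetical Search Console export. Column names and numbers are
# assumptions for illustration; match them to your actual file.
GSC_EXPORT = """query,impressions,clicks
drain cleaning,4200,180
why does my drain keep backing up,950,12
hydro jetting,310,25
plumber salary,40,0
"""

# Hypothetical set of terms that existing pages already target.
ALREADY_TARGETED = {"drain cleaning", "hydro jetting"}


def candidate_seeds(export_text, targeted, min_impressions=100):
    """Return (query, impressions) pairs that clear the impression
    threshold and are not the target of an existing page, highest first."""
    rows = csv.DictReader(io.StringIO(export_text))
    kept = []
    for row in rows:
        query = row["query"].strip().lower()
        impressions = int(row["impressions"])
        if impressions >= min_impressions and query not in targeted:
            kept.append((query, impressions))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


seeds = candidate_seeds(GSC_EXPORT, ALREADY_TARGETED)
print(seeds)
```

The threshold is a judgment call. A small site may want it lower, since even a few dozen impressions on an untargeted query can be a useful signal.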

2. Expansion

From each seed, generate the fuller universe of related terms. There are two broad approaches:

Tool-based expansion runs your seeds through Google Keyword Planner, Ahrefs, or Semrush, each of which returns related terms, questions, and variations at scale by matching your seed against their own keyword databases. Enter “drain cleaning” and you’ll get back everything from close variants to adjacent services, sorted by whatever volume or relevance metric that particular tool uses.

Non-tool expansion mines Google’s own SERP features (People Also Ask, related searches, autocomplete) and human-language sources like forums and review sites, which surface phrasing that keyword databases under-report, particularly newer or hyper-specific language that hasn’t accumulated enough search volume to register in a third-party tool’s dataset yet.

Both matter, and they catch different things. Tools are fast and comprehensive for terms that already have enough search volume to be tracked. SERP and community mining catches real demand before it’s built up enough volume to show anywhere. The full method for this second category, including how to do it systematically rather than by screenshotting a few PAA boxes, is covered in a dedicated guide on finding the keywords your tools don’t show you.
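Because the two approaches overlap, it helps to merge their output into one list while remembering where each term came from. Here is one possible sketch. The source names and sample terms are hypothetical, and the normalization (lowercasing and collapsing whitespace) is deliberately crude.

```python
from collections import defaultdict


def normalize(term):
    """Lowercase and collapse whitespace so near-identical strings merge."""
    return " ".join(term.lower().split())


def merge_expansions(sources):
    """Map each normalized term to the set of source names that surfaced it."""
    merged = defaultdict(set)
    for source_name, terms in sources.items():
        for term in terms:
            merged[normalize(term)].add(source_name)
    return dict(merged)


# Hypothetical expansion output, keyed by where each list came from.
expansions = {
    "keyword_tool": ["Drain Cleaning", "drain cleaning cost", "hydro jetting"],
    "people_also_ask": ["why does my drain keep backing up", "drain  cleaning cost"],
    "forums": ["why does my drain keep backing up"],
}

merged = merge_expansions(expansions)

# Terms no keyword tool returned: the demand that databases under-report.
missing_from_tools = sorted(
    term for term, found_in in merged.items() if "keyword_tool" not in found_in
)
print(missing_from_tools)
```

Keeping the source alongside each term makes it easy to see later which phrasing came only from SERP or community mining.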

3. Filtering Criteria

Expansion produces noise along with signal, often a lot of it. Filter in this order:

  1. Relevance first. Does this term actually describe something you offer or answer, not just something loosely related? A landscaping company’s expansion list will include terms like “landscape architect salary” or “landscaping business ideas,” both real searches with real volume, neither one relevant to a company selling landscaping services. Cut these before they waste time in later steps.
  2. Volume and difficulty second. Is there enough real demand, and is ranking for it realistic given your current authority?
  3. Intent third. Does the term match what the page is meant to do (inform, compare, or convert)?

Order matters here. Filtering by volume before relevance means wasting effort scoring difficulty and intent on terms that should have been cut in the first pass.
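The ordering can be made concrete as three passes, each run only on what survived the previous one. This is a sketch under stated assumptions: the relevance flag stands in for a manual judgment, and the volume figures, difficulty scores, and thresholds are invented for illustration rather than taken from any tool.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    term: str
    relevant: bool    # manual judgment: do we actually offer or answer this?
    volume: int       # hypothetical tool-reported monthly estimate
    difficulty: int   # hypothetical 0-100 difficulty score
    intent: str       # "inform", "compare", or "convert"


def filter_keywords(keywords, min_volume, max_difficulty, page_intent):
    """Apply the three filters in order: relevance, volume/difficulty, intent."""
    relevant = [k for k in keywords if k.relevant]
    viable = [
        k for k in relevant
        if k.volume >= min_volume and k.difficulty <= max_difficulty
    ]
    return [k for k in viable if k.intent == page_intent]


# Hypothetical expansion list for a landscaping company.
raw_list = [
    Keyword("landscape architect salary", False, 5000, 40, "inform"),
    Keyword("landscaping business ideas", False, 2400, 35, "inform"),
    Keyword("landscaping services near me", True, 800, 35, "convert"),
    Keyword("landscape design ideas", True, 3000, 70, "inform"),
    Keyword("backyard landscaping cost", True, 400, 30, "compare"),
]

service_page_terms = filter_keywords(
    raw_list, min_volume=100, max_difficulty=50, page_intent="convert"
)
print([k.term for k in service_page_terms])
```

Note that the two highest-volume irrelevant terms are dropped in the first pass, before any volume or difficulty comparison is made.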

Reading what a reported volume number actually means, including why the same keyword shows different volume across tools and when a keyword worth targeting shows no volume data at all, is its own subject with its own guide. So is classifying intent accurately instead of guessing from modifier words alone, which has a dedicated guide as well.

4. Prioritization Framework

Once the list is filtered down to genuinely relevant, viable terms, prioritize with a simple, defensible logic rather than sorting by volume alone:

Priority = Relevance x Opportunity x Feasibility

  • Relevance: how directly the term maps to something you actually offer
  • Opportunity: realistic traffic and business value if you ranked
  • Feasibility: your current ability to rank, given site authority and competitive strength on that specific term

A high-volume term with weak relevance or unrealistic feasibility should rank below a lower-volume term that scores well on all three. This is the same underlying logic used later when evaluating competitor gap data, just applied to your own keyword list first.
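As a rough illustration of the formula, the sketch below scores each factor on a 0 to 1 scale and multiplies them. The scale and every score here are assumptions chosen for the example. The guide does not prescribe how to quantify the three factors.

```python
def priority(relevance, opportunity, feasibility):
    """Priority = Relevance x Opportunity x Feasibility, each scored 0 to 1."""
    return relevance * opportunity * feasibility


# Hypothetical scores: (relevance, opportunity, feasibility).
candidates = {
    "drain cleaning": (0.6, 0.9, 0.2),            # high volume, hard to rank
    "emergency drain cleaning": (1.0, 0.5, 0.7),  # lower volume, strong fit
    "hydro jetting cost": (0.9, 0.3, 0.8),
}

ranked = sorted(
    candidates,
    key=lambda term: priority(*candidates[term]),
    reverse=True,
)
print(ranked)
```

Multiplying rather than adding means a near-zero score on any one factor pulls the whole priority down, which is why the high-opportunity head term ends up last here.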

5. Mapping Keywords to Pages

The final step is assigning keywords to content, and the rule that matters most here is: one keyword cluster per page, not one keyword per page. Group closely related terms with shared intent onto a single, well-built page instead of creating near-duplicate pages for each variation. “Emergency drain cleaning,” “24-hour drain cleaning,” and “same-day drain cleaning” are three different keywords with three different volumes, but they represent the same search intent and belong on the same page, not three thin pages competing with each other.

Skipping this step, or mapping casually without checking what already exists on the site, is the single most common cause of keyword cannibalization, where two of your own pages end up competing for the same query instead of reinforcing each other. That diagnosis-and-fix process has its own guide, but the cheapest fix is avoiding it here, before either page exists.
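One way to check a keyword map before anything gets built is to look for clusters that have been assigned to more than one page. The sketch below assumes you have already grouped keywords into named clusters by hand. The cluster names and URL paths are hypothetical.

```python
from collections import defaultdict

# Hypothetical map: keyword -> (cluster name, assigned page path).
keyword_map = {
    "emergency drain cleaning": ("urgent-drain-cleaning", "/emergency-drain-cleaning"),
    "24-hour drain cleaning": ("urgent-drain-cleaning", "/emergency-drain-cleaning"),
    "same-day drain cleaning": ("urgent-drain-cleaning", "/same-day-drain-cleaning"),
    "hydro jetting": ("hydro-jetting", "/hydro-jetting"),
}


def clusters_split_across_pages(mapping):
    """Return clusters whose keywords are assigned to more than one page."""
    pages_by_cluster = defaultdict(set)
    for cluster, page in mapping.values():
        pages_by_cluster[cluster].add(page)
    return {
        cluster: sorted(pages)
        for cluster, pages in pages_by_cluster.items()
        if len(pages) > 1
    }


conflicts = clusters_split_across_pages(keyword_map)
print(conflicts)
```

Any cluster that shows up in the result is a candidate for consolidation onto a single page.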

The Tool Landscape, Honestly

None of the major keyword tools show you Google’s actual, exact search data. Knowing what each one really provides matters more than knowing its feature list.

Tool | What it actually gives you | Real limitation
Google Keyword Planner | Free, built on Google’s own ad-auction data | Accounts without active, ongoing ad spend see broad volume ranges (0, 1-100, 100-1K, 1K-10K, and so on) instead of exact numbers, a change Google confirmed in 2016 (https://searchengineland.com/google-officially-throttling-keyword-planner-data-low-spending-adwords-accounts-255795) that still applies
Ahrefs / Semrush | Third-party, clickstream-informed estimates, not Google’s own numbers | These are modeled approximations built from panels of real user behavior data blended with other sources, not a census; treat them as directionally useful, not precise
Google Search Console | Not a discovery tool by design, but the single best source of data on terms you’re already getting impressions for | Only shows what’s already happening, so it can’t surface demand for terms you haven’t touched at all

The distinction between Keyword Planner’s first-party (but ad-driven) data and Ahrefs/Semrush’s third-party (but broader) estimates matters because it’s easy to describe either one as showing “real Google numbers.” Neither shows exact, complete search counts. Google Keyword Planner shows Google’s actual ad-auction data, which is real but reduced to broad ranges for accounts without spend; Ahrefs and Semrush show statistical estimates built from external data sources, not numbers pulled directly from Google’s servers. The full mechanics of how those estimates are built, and how to judge whether a specific number in front of you is trustworthy, are covered in the guide to reading search volume data.

Where This Guide Stops

This piece covers the process. It deliberately doesn’t cover, in any depth:

  • Reading volume data correctly (what a reported number actually represents, why it varies by tool, and when to distrust it)
  • Finding keywords standard tools don’t surface at all (GSC mining, community language, systematic autocomplete harvesting)
  • The long-tail versus short-tail versus head-term taxonomy, and how to build a portfolio across all three
  • Branded versus non-branded segmentation, and why conflating the two distorts your own performance reporting
  • Commercial intent classification, using signals rather than a fixed list of “buyer” words
  • Fixing keyword overlap between your own pages once it’s already happened
  • Running a full competitor gap analysis, from identifying real search competitors through turning gap data into a content plan

Each of those is a genuinely different skill, and cramming all of them into one “complete guide” is exactly the failure mode this piece is trying to avoid. Follow the process above to get from a blank page to a prioritized keyword list, then use the deeper guides for whichever stage needs more precision.
