Article No. 92

SEO Glossary Part 6: Core Web Vitals to Directory

Sixth entry in the running SEO glossary, continuing alphabetically from “Core Web Vitals” to “directory.” This segment leans more technical than Part 5, covering crawling, indexing, and performance vocabulary.

Core Web Vitals

Core Web Vitals are a specific set of page-experience metrics Google uses as a ranking input: Largest Contentful Paint (LCP), which measures loading performance; Interaction to Next Paint (INP), which measures responsiveness to user interaction; and Cumulative Layout Shift (CLS), which measures visual stability during load.

Current recommended thresholds are an LCP of 2.5 seconds or less, an INP of 200 milliseconds or less, and a CLS of 0.1 or less, each assessed at the 75th percentile of page loads. INP is a relatively recent addition: it officially replaced First Input Delay (FID) as the responsiveness metric on March 12, 2024, because FID measured only the delay before a page’s first interaction and so missed how responsive the page was across an entire visit. Google measures these metrics using real-world field data collected from Chrome users through the Chrome User Experience Report (CrUX), not lab simulations alone, and scores mobile and desktop separately since real-world conditions differ significantly between them.

Lab tools like Lighthouse and the lab section of PageSpeed Insights run a simulated test on a single device and connection profile, which is useful for debugging a specific change but doesn’t reflect the range of conditions actual visitors experience. CrUX field data, by contrast, aggregates real Chrome user sessions over a rolling 28-day period. That is why a page can show a passing Lighthouse score while still failing its CrUX assessment for the same metric, particularly on sites with heavy lower-end mobile traffic that a lab test on a fast connection won’t surface.
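As a rough illustration of how a field assessment differs from a single lab run, the sketch below takes a list of per-visit measurements, finds the 75th percentile, and rates it. The "good" limits are the thresholds given above; the "poor" limits (4 seconds, 500 milliseconds, 0.25) are the boundaries published on web.dev and are an addition to this glossary. The sample values and the nearest-rank percentile method are assumptions made for the example, not a description of how CrUX computes its figures.

```python
import math

# (good limit, poor limit) per metric. Values at or below the first number are
# "good"; values above the second are "poor". The poor limits are taken from
# web.dev and do not appear in the article text.
THRESHOLDS = {
    "LCP": (2500, 4000),   # milliseconds
    "INP": (200, 500),     # milliseconds
    "CLS": (0.1, 0.25),    # unitless score
}


def percentile_75(samples):
    """Return the 75th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


def rate(metric, value):
    """Classify a value as 'good', 'needs improvement', or 'poor'."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


# Hypothetical LCP samples (ms): five fast loads and three slow ones.
lcp_samples = [1800, 2100, 1900, 2300, 5200, 2000, 4800, 3900]
p75 = percentile_75(lcp_samples)
print(f"LCP p75 = {p75} ms: {rate('LCP', p75)}")
```

In this sample most individual loads are comfortably under 2.5 seconds, yet the 75th percentile is not, which is the same pattern that lets a fast lab run pass while the field assessment fails.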

Cost Per Click (CPC)

Cost per click is a paid advertising metric measuring how much an advertiser pays each time someone clicks their ad. In organic SEO work, CPC data is frequently borrowed as a proxy for commercial intent and competition, since a keyword advertisers are willing to pay heavily for typically converts well. CPC varies enormously by industry and intent: highly competitive, high-value categories like legal services or insurance can run into the tens of dollars per click or higher, while broad informational queries often cost well under a dollar, when they carry meaningful ad competition at all.

Using CPC as an organic keyword-prioritization signal has a real limitation: it reflects advertiser willingness to pay, not search volume or organic ranking difficulty. A keyword can carry a high CPC simply because a handful of advertisers are locked in a bidding war over a small number of monthly searches, which makes it a poor target for an organic content strategy despite the impressive-looking dollar figure.

Crawl Budget

Crawl budget refers to the number of pages a search engine’s crawler will request from a given site within a given period, shaped by a mix of the site’s overall authority, how much load the server can comfortably handle, and how much genuinely new or updated content the crawler expects to find. This concept matters almost exclusively for large sites, generally those with thousands to millions of pages, where inefficient crawling can mean some pages get visited rarely or not at all. Crawl traps, such as infinite URL parameter combinations or faceted navigation generating near-duplicate pages, are a common way sites waste crawl budget on low-value URLs instead of pages that actually need attention. XML sitemaps help by giving the crawler a list of the URLs the site owner considers important.
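One way to spot a parameter-driven crawl trap is to collapse crawled URLs onto a canonical form and count how many variants land on each one. The sketch below is a minimal version of that idea; the parameter names in FACET_PARAMS and the example URLs are hypothetical, and a real site would need its own list of parameters that only filter or sort.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that only filter or sort an existing listing page.
FACET_PARAMS = {"color", "size", "sort"}


def canonical_form(url):
    """Drop facet parameters and sort the rest so variants collapse together."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FACET_PARAMS
    )
    # The fragment is discarded because crawlers do not request it.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def variant_counts(urls):
    """Count how many crawled URLs collapse onto each canonical form."""
    return Counter(canonical_form(url) for url in urls)


# Hypothetical URLs pulled from a crawl log.
crawled = [
    "https://example.com/shoes?color=red&sort=price",
    "https://example.com/shoes?sort=price&color=blue",
    "https://example.com/shoes?size=9",
    "https://example.com/shoes",
    "https://example.com/about",
]
for form, count in variant_counts(crawled).most_common():
    print(count, form)
```

A canonical form that absorbs a large share of all requests is a candidate for parameter handling or canonical tags, since the crawler is spending requests on near-duplicates of one page.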

A useful diagnostic for whether crawl budget is actually a problem on a given site is comparing the number of pages in the sitemap against the number of pages Search Console’s crawl stats report shows being requested regularly; a large, sustained gap between those two numbers is a stronger signal of a real crawl budget issue than any general concern about site size alone. For most small and mid-sized sites, crawl budget genuinely isn’t a limiting factor, and time spent chasing it is often better spent on content quality or technical errors that block crawling entirely.
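The sitemap-versus-crawl comparison described above can be sketched in a few lines. This example assumes the set of recently crawled URLs has already been exported from server logs or from Search Console; the sample sitemap, the URLs, and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a standard XML sitemap."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def crawl_gap(sitemap, crawled):
    """Return the sitemap URLs never requested and the share that was crawled."""
    missing = sitemap - crawled
    coverage = 1 - len(missing) / len(sitemap) if sitemap else 0.0
    return missing, coverage


# Hypothetical four-URL sitemap.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/a</loc></url>
  <url><loc>https://example.com/b</loc></url>
  <url><loc>https://example.com/c</loc></url>
</urlset>"""

# Hypothetical set of URLs the crawler requested during the period.
crawled_recently = {"https://example.com/", "https://example.com/a"}

missing, coverage = crawl_gap(sitemap_urls(SAMPLE_SITEMAP), crawled_recently)
print(f"crawled {coverage:.0%} of sitemap URLs; never requested: {sorted(missing)}")
```

A single low reading means little on its own. The signal the text describes is a gap that stays large across several reporting periods.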

Crawler

A crawler, also called a spider or bot, is automated software that systematically browses websites, following links to discover content and gather information for a search engine’s index. Googlebot runs in both desktop and mobile variants, with the mobile crawler being the primary one used for indexing under Google’s mobile-first indexing approach. Crawlers capable of rendering JavaScript execute a page’s scripts to see content the way a browser would, but this rendering step happens separately from and later than the initial crawl, and the rendering engine used may lag slightly behind the very latest Chromium release.

Not every crawler behaves the same way, which matters when interpreting server logs. Search engine crawlers identify themselves through a user-agent string and generally respect robots.txt, while many SEO tools also run their own crawlers to audit sites, and malicious scraper bots frequently spoof a legitimate crawler’s user-agent string to avoid being blocked. Verifying that traffic claiming to be Googlebot is actually coming from a legitimate Google IP range, something Google documents how to check, is a standard step before trusting log data that shows unusual Googlebot behavior.
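A minimal sketch of the IP-range check follows. The ranges in TRUSTED_RANGES are placeholder blocks reserved for documentation, not real Google ranges; in practice the list would be loaded from the ranges Google publishes. The function names are hypothetical, and the user-agent test is a simple substring match.

```python
import ipaddress

# Placeholder networks from the reserved documentation blocks. These are NOT
# Google's ranges and must be replaced with the published list before use.
TRUSTED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def claims_googlebot(user_agent):
    """True when the user-agent string presents itself as Googlebot."""
    return "googlebot" in user_agent.lower()


def is_verified_googlebot(ip, user_agent, ranges=TRUSTED_RANGES):
    """True only when the claim and the source address both check out."""
    if not claims_googlebot(user_agent):
        return False
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)


UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical log entries: (source IP, user-agent).
for ip, agent in [("192.0.2.15", UA), ("203.0.113.9", UA), ("192.0.2.15", "curl/8.0")]:
    print(ip, is_verified_googlebot(ip, agent))
```

Requests that claim to be Googlebot but fail the range check are the ones to exclude before drawing conclusions from a log.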

CSS (Cascading Style Sheets)

CSS is the language that controls a webpage’s visual presentation, including layout, color, typography, and responsive behavior, keeping presentation logically separate from the content structure defined in HTML. Render-blocking CSS, meaning stylesheets the browser must download and parse before it can display anything, directly affects Largest Contentful Paint, since the page can’t paint its largest visible element until blocking resources finish loading. Common performance techniques include inlining the small amount of “critical” CSS needed for the initial viewport and deferring everything else. Layout shifts, the behavior CLS measures, are frequently a CSS problem in disguise: when images and embedded elements lack explicit width and height attributes, the browser cannot reserve space for them in advance and shifts surrounding content once the asset loads. This is one of the most common and most fixable causes of a failing CLS score.
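A quick audit for the missing-dimension problem can be sketched with Python's built-in HTML parser. The example below lists image, iframe, and video tags that lack either a width or a height attribute. It is a heuristic under stated assumptions: it checks attributes only, so elements whose space is reserved purely through CSS would be reported too. The class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class UnsizedMediaFinder(HTMLParser):
    """Collect media tags that lack an explicit width or height attribute."""

    WATCHED_TAGS = {"img", "iframe", "video"}

    def __init__(self):
        super().__init__()
        self.unsized = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.WATCHED_TAGS:
            return
        names = {name for name, _ in attrs}
        # Both attributes must be present for the browser to reserve space.
        if not {"width", "height"} <= names:
            source = dict(attrs).get("src") or "(no src)"
            self.unsized.append((tag, source))


def find_unsized_media(html):
    """Return (tag, src) pairs for media elements without both dimensions."""
    finder = UnsizedMediaFinder()
    finder.feed(html)
    return finder.unsized


# Hypothetical page fragment.
page = (
    '<img src="hero.png" width="600" height="400">'
    '<img src="banner.png">'
    '<iframe src="https://example.com/embed" width="560"></iframe>'
)
for tag, source in find_unsized_media(page):
    print(f"<{tag}> missing dimensions: {source}")
```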

Deep Link

A deep link is a hyperlink pointing to a specific page within a site’s structure rather than to the homepage, giving users and crawlers direct access to a particular piece of content. Building internal (and earning external) deep links helps distribute authority throughout a site’s architecture instead of concentrating it entirely on the homepage or top-level category pages, and it strengthens topical relationships by connecting semantically related pages to each other directly. From a link-earning perspective, deep links are also generally harder to acquire than homepage links, since an external site has to find and value a specific inner page rather than just referencing a brand’s homepage, which is part of why a strong volume of deep, content-level backlinks is often a better authority signal than a large number of homepage-only links.
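To see how a backlink profile splits between homepage and inner-page targets, the target URLs can be classified by their path. The sketch below treats any URL with a non-empty path as a deep link, which is a simplifying assumption (a homepage served at a path such as /index.html would be miscounted). The function names and URLs are hypothetical.

```python
from urllib.parse import urlsplit


def is_deep_link(url):
    """True when the URL points at an inner page rather than the site root."""
    return urlsplit(url).path.strip("/") != ""


def deep_link_share(target_urls):
    """Fraction of link targets that are inner pages (0.0 for an empty list)."""
    if not target_urls:
        return 0.0
    deep = sum(1 for url in target_urls if is_deep_link(url))
    return deep / len(target_urls)


# Hypothetical backlink targets exported from a link report.
targets = [
    "https://example.com/",
    "https://example.com",
    "https://example.com/blog/post",
    "https://example.com/guides/seo/",
]
print(f"{deep_link_share(targets):.0%} of backlinks point at inner pages")
```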

De-index

De-indexing is the removal of a page from a search engine’s index. It can happen intentionally (through a noindex meta tag, an X-Robots-Tag HTTP header, or a manual removal request) or unintentionally (through a manual action, algorithmic penalty, or technical error). It’s worth keeping the mechanisms distinct: a noindex directive prevents indexing while still allowing the page to be crawled, whereas a robots.txt disallow blocks crawling but doesn’t guarantee that a URL already in the index will be removed from it. The two also interfere with each other: a crawler that is blocked by robots.txt never fetches the page, so it never sees a noindex directive placed on it. Deliberate, strategic de-indexing of low-value pages, like thin internal search results or duplicate parameter URLs, is a legitimate way to improve a site’s overall quality signals rather than something to avoid by default.
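The difference between the two mechanisms can be sketched with Python's standard library. The example checks a page for a noindex directive in either the robots meta tag or the X-Robots-Tag header, then checks whether robots.txt would let the crawler fetch the page at all, since a directive on a page the crawler cannot fetch is never read. The parsing is deliberately simplified (it ignores user-agent-scoped header values and assumes the header key is spelled exactly "X-Robots-Tag"), and the function names, rules, and URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if tag == "meta" and name in {"robots", "googlebot"}:
            content = attributes.get("content") or ""
            self.directives.update(part.strip().lower() for part in content.split(","))


def has_noindex(html, headers):
    """True when the meta tag or the X-Robots-Tag header carries noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = headers.get("X-Robots-Tag", "")
    header_directives = {part.strip().lower() for part in header_value.split(",")}
    return "noindex" in parser.directives or "noindex" in header_directives


def noindex_is_visible(url, robots_txt, html, headers, agent="Googlebot"):
    """A noindex only takes effect if the crawler is allowed to fetch the page."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url) and has_noindex(html, headers)


# Hypothetical robots.txt and page.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/"
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(noindex_is_visible("https://example.com/public/page", ROBOTS_TXT, PAGE, {}))
print(noindex_is_visible("https://example.com/private/page", ROBOTS_TXT, PAGE, {}))
```

The second call shows the conflict: the page carries a noindex, but the disallow rule keeps the crawler from ever reading it.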

Directory

A web directory is a site that organizes links to other websites into categorized lists, a format that predates modern search engines and once served as a primary way people navigated the web. DMOZ (the Open Directory Project), historically the most influential general web directory, shut down in March 2017 after roughly two decades of operation, which effectively ended whatever residual SEO value general-purpose directory submission still carried. Quality, niche, or genuinely local directories can still provide legitimate citations, particularly for local SEO. Most broad, low-effort web directories, however, offer minimal ranking value today, and submitting to large numbers of them risks resembling a link scheme rather than a legitimate citation strategy.

Quick reference

Term | One-line definition
Core Web Vitals | Google's page-experience ranking metrics: LCP, INP, and CLS
Cost Per Click (CPC) | What an advertiser pays per ad click, sometimes borrowed as an intent signal
Crawl Budget | How many pages a crawler will request from a site in a given period
Crawler | Automated software that discovers and indexes web content
CSS | The language controlling a page's visual presentation, layout, and styling
Deep Link | A link pointing to a specific inner page rather than the homepage
De-index | Removal of a page from a search engine's index, intentional or not
Directory | A site organizing links to other sites into categorized lists

Sources cited: web.dev, “Interaction to Next Paint is officially a Core Web Vital”; Google Search Central, crawl budget management; Search Engine Land, DMOZ closure coverage
