
Noindex Tag

An HTML meta tag or HTTP header telling search engines not to include a page in their index, used for thin, duplicate, or private content pages.

What Is the Noindex Tag?

The noindex tag is a directive that instructs search engine crawlers not to include a specific page in the search index. It can be implemented as an HTML meta tag in the page's <head> section — <meta name="robots" content="noindex"> — or as an HTTP response header: X-Robots-Tag: noindex. Either method tells Google, Bing, and other crawlers: "Crawl this page if you like, but do not include it in your index and do not show it in search results."
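Both forms of the directive are easy to check for programmatically. Below is a minimal sketch, using only Python's standard library, of how a tool might detect noindex in either a page's HTML or its HTTP response headers (the function name `is_noindexed` is illustrative, not a real crawler API):

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content values of <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if (attrs.get("name") or "").lower() == "robots":
                self.directives.append((attrs.get("content") or "").lower())


def is_noindexed(html, headers=None):
    """True if the page carries a noindex directive in its HTML or headers."""
    # HTTP header form: X-Robots-Tag: noindex
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    # Meta tag form: <meta name="robots" content="noindex">
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in d for d in parser.directives)
```

Either signal alone is enough: a crawler honors the directive whether it arrives in the markup or in the response headers.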

Unlike robots.txt, which blocks crawlers from accessing URLs entirely, the noindex tag allows crawlers to visit the page — it just prevents indexation. This distinction matters: a page blocked by robots.txt can't be read, so any canonical tags, links, or other signals on that page are invisible to crawlers. A page with a noindex tag can be read and its signals processed; it just won't appear in search results.

When Google encounters a consistent noindex directive on a page, it will eventually drop that page from the index. "Eventually" is important — the deindexing doesn't happen instantly; it occurs when Googlebot next crawls the page and processes the updated directive.

Why the Noindex Tag Matters for Marketers

Not every page on a website should be indexed. Indexing the wrong pages actively harms SEO performance in several ways.

First, quality dilution: search engines evaluate domain quality holistically. Sites with large percentages of thin, low-value, or boilerplate pages may have their average quality rating depressed — making it harder for high-quality pages to rank well. Noindexing low-value pages concentrates quality signals on the content that matters.

Second, crawl budget efficiency: search engines allocate a limited crawl budget to each domain based on its authority and size. If crawlers spend that budget on paginated archives, tag pages, internal search result pages, and admin pages, they have less capacity to discover and re-crawl your important content. Crawlers must still fetch a page to see its noindex directive, but over time they revisit noindexed pages far less often, redirecting crawl activity toward high-value URLs.

Third, keyword cannibalization prevention: multiple thin, indexed pages on the same topic compete against each other in rankings. Noindexing duplicates, near-duplicates, and thin variations consolidates ranking potential on a single authoritative page.

How to Implement Noindex Tags

  1. Identify pages that should not be indexed. Common candidates: thank-you pages, checkout confirmation pages, password-protected pages, admin interfaces, search results pages, tag archives, author archives (if thin), printer-friendly versions, and staging content.
  2. Implement the meta tag. In the <head> of the target page, add: <meta name="robots" content="noindex, follow">. The "follow" part tells crawlers they can still follow links on the page — useful for pages that link to important content you want discovered.
  3. Use HTTP headers for non-HTML resources. PDFs and other document types can't contain meta tags — use the X-Robots-Tag HTTP header instead.
  4. Do not include noindexed pages in your XML sitemap. Submitting noindex pages to a sitemap sends contradictory signals. Remove them from sitemap generation in your CMS or sitemap plugin settings.
  5. Verify crawlability is maintained. A common mistake is adding both noindex and a robots.txt block to the same URL. If Googlebot is blocked from crawling the page, it cannot see the noindex directive and the page may remain indexed.
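The conflict described in step 5 can be caught before it costs you. Here is a sketch, assuming you already have the site's robots.txt text and a list of URLs you intend to noindex (the function name and arguments are illustrative):

```python
from urllib.robotparser import RobotFileParser


def check_noindex_conflicts(robots_txt, noindexed_urls, user_agent="Googlebot"):
    """Flag noindexed URLs that robots.txt also blocks.

    A crawler can never fetch a blocked page, so it never sees the
    noindex directive and the page may stay in the index.
    """
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return [url for url in noindexed_urls if not rp.can_fetch(user_agent, url)]
```

Any URL this returns needs its robots.txt block removed so crawlers can actually read the noindex directive.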

How to Measure Noindex Implementation

Google Search Console's Page indexing report (formerly the Coverage report) is the primary measurement tool. It lists URLs that are not indexed, with reasons including "Excluded by 'noindex' tag." Monitor this list to confirm intentional noindex pages appear as expected and no high-value pages are accidentally excluded.

Conduct periodic audits using Screaming Frog — crawl the full site and filter URLs by meta robots status to verify noindex is applied only where intended and not inadvertently applied to important content pages.
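As a complement to a dedicated crawler, the same filter can be sketched in a few lines of Python, given page HTML you have already fetched. This compares actual noindex status against an intended list to surface both accidental exclusions and missing directives (the regex is deliberately naive; a production audit should parse HTML properly):

```python
import re

# Naive meta-robots check: assumes name comes before content in the tag.
META_NOINDEX = re.compile(
    r'<meta[^>]+name=["\']robots["\'][^>]+content=["\'][^"\']*noindex',
    re.IGNORECASE,
)


def audit_noindex(pages, intended_noindex):
    """Compare actual noindex status against the intended list.

    pages: dict mapping URL -> HTML source from a site crawl
    intended_noindex: set of URLs that should carry the directive
    """
    noindexed = {url for url, html in pages.items() if META_NOINDEX.search(html)}
    return {
        "accidental": sorted(noindexed - intended_noindex),  # important pages excluded
        "missing": sorted(intended_noindex - noindexed),     # directive not applied
    }
```

Anything in the "accidental" bucket is exactly the failure mode the audit exists to catch: an important page silently dropped from search.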

Noindex and AI Search Visibility

The noindex tag carries direct implications for AI search visibility. AI retrieval systems, including those powering Perplexity, Google's AI Overviews, and similar tools, draw from Google's index. A page with a noindex directive will not be indexed by Google and will therefore not be available to AI systems that retrieve from Google's crawl. If you're working to ensure your brand and content appear in AI-generated answers, every page you want cited must be indexable. Audit your noindex implementation carefully so that high-quality, authoritative content isn't accidentally excluded from the indexed web, and from AI search visibility with it.
