Canonical Tag
In one line
The canonical tag tells search engines which URL is the master version when multiple URLs serve the same or near-duplicate content.
Going deeper
The canonical tag goes in `<head>` as `<link rel="canonical" href="https://example.com/path">`. It tells search engines which URL is the master version when the same content is reachable through multiple paths. Proposed in 2009 by Google, Microsoft and Yahoo, it is now supported by every major search engine and, reportedly, by most AI crawlers. The need is structural: ecommerce categories surface the same product through three or four paths, `?utm=...` tracking parameters can fork one URL into endless variants, and www / non-www / http / https variants exist on almost every site. Duplicate URLs are normal; canonicals tell engines which one to count.
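To check what a page actually declares, the tag can be read straight out of the markup. The following is a minimal sketch using only Python's standard library; the names `CanonicalFinder` and `find_canonicals` are illustrative, and the rule that only tags inside `<head>` count is taken from the placement described above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            # rel may hold several space-separated tokens, or be missing entirely
            rels = (attr.get("rel") or "").lower().split()
            if "canonical" in rels and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html):
    """Return every canonical href declared in the document head, in order."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

A page should yield exactly one entry. An empty list means no canonical is declared, and more than one means the template is emitting conflicting tags.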
Three operating rules. **First, self-reference everywhere by default**: `https://example.com/a` should canonicalise to `https://example.com/a`. That one line resolves the bulk of accidental duplication from tracking parameters, session IDs and case-variant URLs. **Second, use absolute URLs**, not relative `/path`. Relative paths technically work but risk protocol or subdomain confusion. **Third, align everything**: internal links, sitemap and hreflang must all point at the same URL, and the body content at that URL must match what the duplicates serve. If the signals disagree, Google may ignore your canonical and pick one itself.
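The first two rules can be sketched as a small helper that turns whatever URL was requested into the self-referencing, absolute canonical a template would emit. This is an illustration under stated assumptions, not a drop-in implementation: the parameter lists, the preferred host `example.com`, and the decision to lower-case the path are all hypothetical and only safe if the site really serves those variants as the same page.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed tracking/session parameters; adjust to what the site actually uses.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"gclid", "fbclid", "sessionid"}


def self_canonical(requested_url, preferred_host="example.com"):
    """Build an absolute, self-referencing canonical for the requested URL."""
    parts = urlsplit(requested_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_NAMES
    ]
    # Force https and one host; lower-casing the path is a site-specific assumption.
    path = parts.path.lower() or "/"
    return urlunsplit(("https", preferred_host, path, urlencode(kept), ""))


def is_absolute(href):
    """True only for hrefs that carry both a scheme and a host."""
    parts = urlsplit(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

With these assumptions, `http://www.example.com/A?utm_source=x&page=2` maps to `https://example.com/a?page=2`: the tracking parameter is dropped, while `page=2` survives because it changes the content.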
This is also where the worst SEO accidents happen. The classic disaster is a CMS template that points every page's canonical at the homepage: one line can pull nearly the whole site out of the index. Pagination is the second landmine: if `/blog?page=2` canonicalises to `/blog`, Google may stop treating page 2 as a page in its own right, and posts that are linked only from page 2 onward can go undiscovered or drop out of the index. Each paginated page should self-reference. And combining `noindex` with a canonical sends contradictory signals: one says to drop the page, the other says to consolidate it into another URL. Neither signal can be trusted, so don't pair them.
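These three accidents are mechanical enough to catch with a script. The sketch below assumes a hypothetical crawl export: a list of dicts with `url`, `canonical` (or `None` when the tag is missing) and `noindex` keys. It also assumes pagination uses a `page` query parameter, as in the `/blog?page=2` example; the function name and finding labels are illustrative.

```python
from urllib.parse import urlsplit, parse_qs


def audit_canonicals(pages, homepage="https://example.com/"):
    """Flag the three canonical accidents in a hypothetical crawl export."""
    findings = []

    # 1. Every non-homepage URL canonicalises to the homepage.
    non_home = [p for p in pages if p["url"] != homepage]
    to_home = [p for p in non_home if p["canonical"] == homepage]
    if non_home and len(to_home) == len(non_home):
        findings.append(("all-pages-canonical-to-homepage", homepage))

    for p in pages:
        # 2. A paginated URL that does not self-reference.
        query = parse_qs(urlsplit(p["url"]).query)
        if "page" in query and p["canonical"] != p["url"]:
            findings.append(("paginated-page-not-self-canonical", p["url"]))
        # 3. noindex paired with a declared canonical.
        if p["noindex"] and p["canonical"] is not None:
            findings.append(("noindex-with-canonical", p["url"]))

    return findings
```

Running it against a fresh crawl after every template change is a cheap guard, since all three problems usually come from a single line in a shared layout.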
Google has stated explicitly that canonicals are a hint, not a directive. If the body content doesn't actually match the declared canonical, Google may pick a different URL itself. You see this in Search Console as `Duplicate, Google chose different canonical than user`. Reviewing that report monthly is one of the fastest ways to surface structural site issues, because they tend to show up here before they show up in traffic.
From a GEO angle, canonicals are an anti-fragmentation device. When the same content sits at five URLs, AI engines have no clear signal for which one to cite, and citations can end up scattered, so domain-level authority never accumulates. Canonicalising onto one URL lets SEO equity and GEO citation eligibility build on the same asset. Where a duplicate URL doesn't need to stay live, pair the canonical with a 301 redirect so the consolidation holds even if a crawler ignores the hint.
Related terms
Sitemap.xml
An XML sitemap is a structured file that lists the important URLs of a site so search engines can discover and crawl them efficiently — its impact grows with site size.
robots.txt
robots.txt is the text file at a site's root that tells search engines and AI crawlers which paths they may or may not crawl — a long-standing web standard.
hreflang
hreflang is the tag that tells search engines which version of a page to serve to users by language and region when you have multiple localised URLs.
Keyword Cannibalization
Keyword cannibalization happens when multiple pages on the same site target the same query and end up competing with each other, dragging all of their rankings down.
Internal Linking
Internal linking is the practice of connecting pages within the same domain — and it shapes crawl efficiency, how authority flows between pages, and how users move through the site.