A canonical tag (also "canonical link" or "canonical URL") is used in the HTML code of a website to point to the original page when the same content appears at more than one URL. It counteracts the problem of "duplicate content", which arises when two or more URLs have the same or very similar content. To avoid showing users several results with the same content, search engines filter out duplicates. This can have negative effects on rankings and should therefore be avoided wherever possible. Pages that point to the original page with a canonical tag are generally not indexed themselves.
A canonical URL is therefore the preferred or representative URL among several URL versions with the same or similar content (duplicate content). It is communicated to search engines by adding a link element with rel="canonical" (the "canonical tag") to the head of the non-canonical URLs.
Contents
- Avoidance of Duplicate Content
- In which cases is the use of Canonical Tags recommended?
- What does a Canonical Tag look like and how is it used?
- Advantages of Canonical Tags
- Meta Canonical to another URL
- HTTP Canonical to another URL
Avoidance of Duplicate Content
- Define a URL standard and set up a 301 redirect from all identical (secondary) pages to the standard variant.
- Set up an XML sitemap to show search engines which is the default variant of the URL.
- Make sure that all internal as well as external backlinks point to the same URL variant.
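- When using parameters that generate duplicate content, exclude them in the "Parameter handling" of Google Webmaster Tools (Website configuration > Settings > Parameter handling). Webmaster Tools has since been renamed Google Search Console, and Google retired this parameter tool in 2022.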
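The first point in the list, redirecting every secondary variant to one standard URL, can be sketched in Python as follows. This is only a minimal illustration: the standard scheme and host, the list of host aliases, and the treatment of index.html are assumptions chosen for the example, and a real site would define its own rules in the web server or application.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this example: the site's chosen standard variant.
STANDARD_SCHEME = "https"
STANDARD_HOST = "www.yoursite.com"
HOST_ALIASES = {"yoursite.com", "www.yoursite.com"}


def redirect_target(url: str):
    """Return the standard URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in HOST_ALIASES:
        return None  # not one of our hosts, nothing to do
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # /index.html and / are the same page
    standard = urlunsplit((STANDARD_SCHEME, STANDARD_HOST, path, parts.query, ""))
    return None if standard == url else standard


print(redirect_target("http://yoursite.com/index.html"))
print(redirect_target("https://www.yoursite.com/"))
```

The first call yields the standard variant to redirect to, while the second returns None because the URL already is the standard variant.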
In which cases is the use of Canonical Tags recommended?
Canonical tags are recommended if you do not have the administrative access needed to set up a 301 redirect, or if duplicate content is unavoidable because of the system, for example when a redirect is not wanted on a page such as a print version or a filter page in an online shop.
What does a Canonical Tag look like and how is it used?
A canonical tag is placed in the <head> of a page that produces duplicate content, alongside the other meta tags.
For example, the page
http://www.yoursite.com/products.html?session_id=xyz
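would have the following canonical tag, which points to the version without the session ID:
<link rel="canonical" href="http://www.yoursite.com/products.html" />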
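A tag for a URL like this could also be derived programmatically. The Python sketch below removes parameters that are assumed not to change the page content and prints the resulting link element. The parameter list and the function names are hypothetical and only serve the illustration.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change the page content.
DUPLICATE_PARAMS = {"session_id", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Return the URL without duplicate-generating parameters and without fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DUPLICATE_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the link element that belongs in the <head> of the page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


print(canonical_tag("http://www.yoursite.com/products.html?session_id=xyz"))
# prints: <link rel="canonical" href="http://www.yoursite.com/products.html" />
```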
Advantages of Canonical Tags:
- The change can be made directly on a single page - virtually "on the spot", which is why it is often referred to as a "mini 301".
- No programming knowledge is necessary, and there is no need to configure the web server.
- The reputation of an (old) page is fully inherited (as with a "301").
- The Google PageRank is no longer distributed among several pages with similar or the same content, but is bundled on a main page defined by Canonical Tag.
- It is now possible to define an alternative default page without redirecting to it - for example, when redirection is not even desired (such as when using session IDs or tracking parameters).
- It is a "standard" that is supported across various search engines.
Why should every page be labelled (canonical tag on itself)?
In this way, you can prevent potential duplicate content problems before they arise. For example, advertising partners may later link to your site with tracking parameters attached to the URLs, and search engines may thus become aware of new addresses.
Another possibility is that scrapers copy your website, so that it appears on the web a second time in whole or in part. In this case, the canonical tag is copied as well and still refers to your domain.
Canonical Tag plus Disallow, Noindex and/or Nofollow?
According to the motto "better safe than sorry", you might be inclined to block duplicate content URLs via robots.txt and the "Disallow" directive and/or to provide the duplicate content pages with the Noindex and Nofollow attributes.
If the duplicate content URLs contain attached parameters, you could also use the "parameter handling" in Google Webmaster Tools.
Is all this necessary to get on top of duplicate content? No!
Google recommends setting only the canonical tag, and not using a method that blocks or excludes the URL instead of or in addition to it.
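One reason the combination is counterproductive is that a crawler that is not allowed to fetch a page cannot read the canonical tag on it. The Python sketch below illustrates this with the standard library's robots.txt parser. The robots.txt content and the /print/ path are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the print versions of all pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawler_can_read(url: str) -> bool:
    """True if a generic crawler may fetch the URL and thus see its canonical tag."""
    return parser.can_fetch("*", url)


print(crawler_can_read("http://www.yoursite.com/print/products.html"))  # False
print(crawler_can_read("http://www.yoursite.com/products.html"))  # True
```

The blocked print version can never pass its canonical hint on to the crawler, so the Disallow rule works against the tag rather than reinforcing it.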
Meta Canonical to another URL
Meta Canonical tags are also located in the <head> of the page and are used to indicate the preferred version of a URL. This signals to the crawlers that only the preferred version of the URL should be indexed and all authority and value should be attributed to the preferred, canonical version.
For example, by using the canonical tag, we can pass the value of the second and third URLs from the following block to URL #1 and ensure that only URL #1 is indexed. This is the correct use of canonical tags.
http://www.example.com/shirts (canonical)
http://www.example.com/shirts?size=medium&color=red
http://www.example.com/shirts?size=large&color=green
However, the canonical tag can be misused if it is used to consolidate authority from different URLs to one unrelated URL. For example, if canonical tags are placed on multiple established product pages to direct authority to the landing page for a new, unrelated product, this would be a misuse of this tag.
We recommend checking the canonical tags and making sure that they actually point to the clean (without parameters) version of a relevant URL.
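Such a check could be automated along the following lines. The Python sketch parses a page, collects its canonical tags, and reports whether exactly one exists and whether it is free of parameters. The class and function names are hypothetical, and the check does not judge whether the target URL is actually relevant, which still needs a human review.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and attr.get("href"):
            self.hrefs.append(attr["href"])


def check_canonical(html: str) -> str:
    """Return "ok" or a short description of the problem found."""
    finder = CanonicalFinder()
    finder.feed(html)
    if len(finder.hrefs) != 1:
        return f"expected exactly one canonical tag, found {len(finder.hrefs)}"
    if urlsplit(finder.hrefs[0]).query:
        return "canonical URL still contains parameters"
    return "ok"


good = '<html><head><link rel="canonical" href="http://www.example.com/shirts"></head></html>'
bad = '<html><head><link rel="canonical" href="http://www.example.com/shirts?size=medium"></head></html>'
print(check_canonical(good))  # ok
print(check_canonical(bad))  # canonical URL still contains parameters
```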
HTTP Canonical to another URL
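HTTP canonicals relate to meta canonicals in the same way that HTTP Nofollow and Noindex directives relate to their meta tag counterparts: the canonical URL is sent in an HTTP response header instead of in the HTML head. The HTTP method is just another way to use canonical tags and should be checked and cleaned up in the same way as meta canonicals.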
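An HTTP canonical is sent as a Link response header with rel="canonical", which is useful above all for files that have no HTML head, such as PDFs. The Python sketch below formats such a header and extracts the canonical URL from one for checking. The function names are hypothetical, and the extraction is a simplification that assumes rel is the first parameter after the URL.

```python
import re


def canonical_link_header(canonical_url: str) -> str:
    """Format the Link response header that declares a canonical URL."""
    return f'Link: <{canonical_url}>; rel="canonical"'


def canonical_from_header(header_value: str):
    """Return the canonical URL from a Link header value, or None if absent.

    Simplified: only matches when rel directly follows the <URL> part.
    """
    match = re.search(r'<([^>]*)>\s*;\s*rel="?canonical"?', header_value, re.IGNORECASE)
    return match.group(1) if match else None


header = canonical_link_header("http://www.example.com/shirts")
print(header)
# prints: Link: <http://www.example.com/shirts>; rel="canonical"
print(canonical_from_header(header))
# prints: http://www.example.com/shirts
```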