A Beginner’s Guide to ‘Canonical’ Tags

  • May 13, 2020
  • by Alex Ali

There are many cases where you’ll have different URLs leading to the same (or almost identical) pages. Search engines, however, may treat those pages as ‘duplicate’ content, which can hurt how they rank. That means you need to tell them which version of those pages is the original or primary one.

That’s where canonical tags come in. These are simple code snippets that tell search engines which version of a page or URL they should consider as being the ‘canonical’ (or main) option. With canonical tags, you can avoid penalties for duplicate content. Plus, they save you from having to implement complex redirects.

In this article, we’re going to talk about what canonical tags are and why you might need to use them. Then we’ll teach you three ways to specify canonical pages, in order to avoid Search Engine Optimization (SEO) issues. Let’s get to work!

An Introduction to Canonical Tags

When you’re running a website, there are a lot of situations where you might end up with different versions of the same page, or one page that’s accessed using several unique URLs. E-commerce platforms, for example, often generate URLs on the fly. That means you can have a product with several URLs leading towards it.

The problem is that having multiple URLs for a single page can confuse search engines, leading them to think that you’re displaying duplicate content. That can have a negative impact on your SEO, although how much it really affects your site is up for discussion.

What you can do in these situations is use canonical tags to tell search engines which page or URL is your ‘go-to’ version. Other URLs will remain usable, but you’ll avoid any potential SEO issues. Plus, setting up these tags can also help you collect analytics more accurately, because most tools will only track canonical pages.

3 Ways to Specify Canonical Pages

There are several ways you can tell search engines which pages they should consider canonical, and each approach has its pros and cons. Let’s run through the main options, so you can select the one that best meets your needs.

1. Use the rel=canonical Tag

The rel=canonical tag is a simple bit of code that you can add to any page’s HTML, which points search engines towards the canonical version. Let’s say, for example, that we have three identical pages with the following URLs:

  1. mystore.com/shop/white-sneakers
  2. mystore.com/shop/product-id=1
  3. mystore.com/shop/2019/white-sneakers

In this scenario, we have an e-commerce platform that generates multiple URLs for the same product, and we want to set the first option as the primary URL. To do that, we’ll need to add rel=canonical tags to pages two and three, pointing towards the first address.

To add those tags, you’ll need to open the HTML file for one of the duplicate pages and locate the <head> section (near the top of the file). Then, add the following snippet between the <head> and </head> tags. Use the full address, including the scheme, so that search engines don’t read it as a relative path:

<link rel="canonical" href="https://mystore.com/shop/white-sneakers" />

Now save the changes to the document, and repeat the process with all the other non-canonical pages.
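If you have many duplicate pages, it may help to generate the tag once and reuse it. The following is only a minimal Python sketch of that idea: the function name `canonical_link_tag`, the `https://` scheme, and the list of duplicate URLs are assumptions made for illustration, not part of any particular platform.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Build the <link> element that points at the canonical URL."""
    # escape() keeps characters such as & and " from breaking the attribute
    return '<link rel="canonical" href="{}" />'.format(
        escape(canonical_url, quote=True)
    )


# Hypothetical URLs: the primary page and its two duplicates.
canonical_url = "https://mystore.com/shop/white-sneakers"
duplicate_urls = [
    "https://mystore.com/shop/product-id=1",
    "https://mystore.com/shop/2019/white-sneakers",
]

tag = canonical_link_tag(canonical_url)
for url in duplicate_urls:
    # Every duplicate page gets the same tag in its <head> section
    print(url, "->", tag)
```

The same tag goes into every duplicate page, so only the canonical address needs to be maintained in one place.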

This approach can be time-consuming, depending on how many canonical pages you want to designate and how many duplicate URLs you have on your hands. However, it’s simple from a technical perspective, and the additional code should barely make a difference to the file size of your HTML documents (and as a result, their performance).
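Once the tags are in place, you may want to confirm that a page really declares the canonical URL you expect. Here is a rough sketch using Python’s built-in `html.parser` module. The class and function names, as well as the sample markup, are hypothetical, and a real check would fetch the page’s HTML first.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens; compare in lowercase
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html_text):
    """Return the first canonical URL found, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical markup for one of the duplicate pages.
sample_page = """
<html>
  <head>
    <title>White Sneakers</title>
    <link rel="canonical" href="https://mystore.com/shop/white-sneakers" />
  </head>
  <body></body>
</html>
"""

print(find_canonical(sample_page))
```

If the function returns `None`, the page has no canonical tag at all, which is usually the first thing worth checking.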

2. Specify Canonical Pages Using HTTP Headers

If for some reason you don’t have access to a page’s <head> tags, or you’re dealing with files that don’t have them (such as PDFs), you’ll need to use a different approach. In those cases, you can use HTTP headers instead.

This method requires you to have access to your website’s .htaccess file, which is the per-directory configuration file used by Apache web servers. You’ll usually find it within your website’s root folder, and one common way to reach it is through File Transfer Protocol (FTP). If you don’t already have a client set up, we recommend using FileZilla.

Once you have a client installed, connect to your site using your FTP credentials, then locate and open the .htaccess file:

[Image: Editing the .htaccess file.]

There should already be some code within, which will vary depending on your server’s configuration. Unless you know what you’re doing, we don’t recommend tinkering with this file too much.

Instead, make a copy just in case, and then add the following snippet to the end of the original file:

<Files "example.pdf">
Header add Link "<http://mystore.com/shop/>; rel=\"canonical\""
</Files>

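What this code does is send a Link header along with example.pdf, naming the URL inside the angle brackets as the canonical version of that file. In other words, the PDF is the duplicate, and the address in the header is the one search engines should index. The Header directive relies on Apache’s mod_headers module, so the snippet only works when that module is enabled. You can repeat this process for other types of files as well, and since the header travels with the server’s response rather than inside the file, it shouldn’t impact file sizes.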
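To see what a search engine reads from that header, it can help to pull the canonical URL out of a Link header value yourself. The sketch below is a simplified Python parser written for illustration: the function name is an assumption, and it assumes URLs contain no commas, which a complete parser would need to handle.

```python
import re


def canonical_from_link_header(header_value):
    """Return the URL marked rel="canonical" in a Link header value, or None.

    Simplified: entries are split on commas, so URLs with commas are not supported.
    """
    for part in header_value.split(","):
        # Each entry looks like: <url>; param=value; param=value
        match = re.match(r"\s*<([^>]*)>\s*(.*)", part, re.S)
        if not match:
            continue
        url, params = match.group(1).strip(), match.group(2)
        rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.I)
        if rel and "canonical" in rel.group(1).lower().split():
            return url
    return None


# The header value produced by the .htaccess snippet above.
header = '<http://mystore.com/shop/>; rel="canonical"'
print(canonical_from_link_header(header))
```

In practice you would read the header from a real response (for example with your browser’s developer tools) and pass its value to a function like this one.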

3. Use Your Sitemap to Specify Canonical Pages

Sitemaps are relatively straightforward documents, which include URLs of all the pages that make up a website. You can submit a sitemap to search engines if you want to make sure they don’t miss any of your pages.

Another upside to using a sitemap is that Google treats the URLs you include within it as your preferred (canonical) versions. To build on our earlier example, let’s look once more at three URLs for a theoretical e-commerce product:

  1. mystore.com/shop/white-sneakers
  2. mystore.com/shop/product-id=1
  3. mystore.com/shop/2019/white-sneakers

Instead of adding rel=canonical tags to pages two and three, you can simply include the first link within your sitemap, leaving out the rest. This involves a lot less work than the two approaches we’ve looked at so far.

The downside to this technique is that Google only takes this as a ‘suggestion’. Including a link in your sitemap gives you a pretty good chance that the search engine will consider it canonical, but since the process is automated, mistakes can still happen. With that in mind, we recommend using a combination of this method and rel=canonical tags for particularly important pages.
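As a rough illustration of the sitemap approach, here is a short Python sketch that writes a sitemap containing only the canonical URL. The function name `build_sitemap` and the `https://` scheme are assumptions for this example. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML (as a string) listing only the given canonical URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Only the first (canonical) URL is listed; the two duplicates are left out.
sitemap_xml = build_sitemap(["https://mystore.com/shop/white-sneakers"])
print(sitemap_xml)
```

Leaving the duplicate URLs out of the list is the whole technique here: the sitemap only ever mentions the version you want indexed.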

Conclusion

Duplicate content is something search engines can hold against your site, even if the exact impact is debatable. By using canonical tags, you can prevent situations where search engines look at different versions of your site’s URLs and think they’re dealing with duplicate pages.

The good news is that specifying canonical pages is easy. There are three approaches you can take, depending on your needs:

  1. Use the rel=canonical tag.
  2. Specify canonical pages using HTTP headers.
  3. Use your sitemap to inform search engines about canonical pages.

Do you have any questions about how to use the techniques we’ve introduced? Let’s talk about them in the comments section below!

Image credit: Pixabay.

The A2 Posting