Getting a Handle on Your Canonicals

Have you ever worried about duplicate content?  If you haven’t, you’re definitely in the minority.

Everyone who runs a website has at one point or another worried about whether or not their site is being silently penalized in the search engines for having duplicate content.

Just to be sure everyone’s on the same page, let’s first define what duplicate content REALLY is.

When two pages (that is, two URLs) on your own website serve identical content, this is considered duplicate content.  And chances are you have duplicate content and just don’t know it.

Here’s why that would happen…

Let’s say that you have a website located at http://www.Example.com.

You create a new web page on your site and have a few people link to that page.

  • Person #1 links to you using http://www.Example.com.
  • Person #2 links to you using http://Example.com (notice the lack of the “www”).
  • Person #3 links to you using http://www.Example.com/ (notice the trailing slash).
  • And Person #4 links to you using http://www.Example.com/index.html (notice the “index.html”).

To YOU and to the people visiting your site, this is all the same page.

To the SEARCH ENGINES however, these are all different URLs. Surprised?

So as far as the search engines are concerned, you have four pages on your site that all contain exactly the same content.  Hence the duplicate content issue.

You see, search engines don’t assume that these URLs all lead to the same page (I’m shaking my head as I say that because it’s still beyond me), but the fact remains that they treat each one as a separate address.
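
To make that concrete, here’s a minimal Python sketch of the kind of normalization you’d have to do yourself to see these four links as one page.  The preferred host and the “index.html” default document are assumptions made for this example; your own site may use different ones.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: the "www" host is the preferred one and
# "index.html" is the default document served for a directory.
PREFERRED_HOST = "www.example.com"
BARE_HOST = "example.com"
DEFAULT_DOCUMENTS = ("index.html",)


def normalize(url: str) -> str:
    """Collapse common variants of a URL into one preferred form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()  # host names are case-insensitive
    if host == BARE_HOST:
        host = PREFERRED_HOST
    path = parts.path or "/"  # an empty path means the root page
    for document in DEFAULT_DOCUMENTS:
        if path.endswith("/" + document):
            path = path[: -len(document)]  # keep the trailing slash
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


variants = [
    "http://www.Example.com",
    "http://Example.com",
    "http://www.Example.com/",
    "http://www.Example.com/index.html",
]

# Four different strings, one page.
print({normalize(url) for url in variants})  # {'http://www.example.com/'}
```

Compared as plain text, the four links are four different strings.  Only after a step like this do they collapse into a single address.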

So how do you fix this issue?  Well, thankfully, there are a few ways to fix it.

  1. When you link internally, that is, link to other pages within your own website, consistently use the same URL.  ALWAYS use http://www.Example.com or http://Example.com, whichever one you prefer.  Choose one and stick with it.
  2. You can also use a 301 redirect to point to your preferred way of linking to your URL.  For example, if you prefer http://www.Example.com then set up a 301 redirect on http://Example.com, http://www.Example.com/, http://www.Example.com/index.html, etc.  (I’ve listed below the most common ways people would link to your site).
  3. Within your Google Webmaster Tools area you can TELL Google what your preferred way of linking is.
  4. When you submit a sitemap for your website, be sure that all of the URLs within that sitemap use your preferred way of linking.
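
If you’re curious what a 301 redirect does behind the scenes, here’s a rough sketch written as a tiny Python WSGI application.  Most sites set redirects up in their web server configuration instead, so treat this as an illustration only.  The name redirect_app, the preferred host and the “/index.html” default document are all made up for this example.

```python
# Assumptions for this sketch: "www.example.com" is the preferred host and
# "/index.html" is the default document that should collapse to "/".
PREFERRED_HOST = "www.example.com"
DEFAULT_DOCUMENTS = {"/index.html"}


def redirect_app(environ, start_response):
    """301-redirect non-preferred URLs; answer normally for the preferred one."""
    host = environ.get("HTTP_HOST", "").lower()
    path = environ.get("PATH_INFO") or "/"
    preferred_path = "/" if path in DEFAULT_DOCUMENTS else path

    if host != PREFERRED_HOST or path != preferred_path:
        location = "http://" + PREFERRED_HOST + preferred_path
        query = environ.get("QUERY_STRING", "")
        if query:
            location += "?" + query  # keep any query string intact
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"This is the preferred URL.\n"]
```

To try it locally you could serve redirect_app with make_server from Python’s built-in wsgiref.simple_server module and request the page under a different host name.  The important part is the “301 Moved Permanently” status together with the Location header.  That pair is what tells browsers and search engines the page has permanently moved to your preferred URL.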

Some potential issues are:

  1. If you don’t have direct control over the webhost that administers your files, you’ll have to have someone else place that 301 redirect on the URLs you want.
  2. A lot of free web hosts don’t let you create a 301 redirect.
  3. Session IDs on a website can create a huge duplicate content issue.  Since each page may be accessed with a different session ID in the URL, that page may be indexed multiple times, even though it’s the same page.
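
For the session ID problem, one possible approach is to strip the session parameter before a URL is written into a link or a sitemap.  Here’s a minimal Python sketch of that idea.  The parameter names in SESSION_PARAMS are only guesses at common ones, so check what your own software actually puts in its URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; replace them with whatever your site uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url: str) -> str:
    """Return the URL without any session ID query parameters."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_session_id("http://www.example.com/page.html?sid=abc123&color=red"))
# http://www.example.com/page.html?color=red
```

Two visitors with two different session IDs now end up with the same link, so there’s only one version of the page to index.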

Now if all else fails and you simply can’t implement a 301 redirect to your preferred way of linking, then consider placing the canonical link element on the individual page.

In the head of the individual web page you’d place the following:

<head>
<link rel="canonical" href="http://www.Example.com/page.html" />
</head>

Note the closing slash (“ />”) at the end of the link tag, after “page.html”.  It’s required if your pages are written as XHTML; plain HTML accepts the tag with or without it.
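
Once the tag is in place, it’s worth checking that a page really declares the canonical URL you expect.  Here’s a small Python sketch that does this with the standard library’s HTML parser.  CanonicalFinder and find_canonical are names invented for this example, and fetching the page from your server is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values and may have no value
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html_text: str):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


page = '<head><link rel="canonical" href="http://www.Example.com/page.html" /></head>'
print(find_canonical(page))  # http://www.Example.com/page.html
```

If this prints None for one of your pages, the tag is either missing or written in a way the parser can’t read.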

The bottom line is, using the canonical link element on your website is EXTREMELY beneficial to you.  But first and foremost, use the four methods listed above, and if those fail, then use the canonical link element.

This article summarizes what Matt Cutts said in his 20-minute presentation.

And, as promised, here’s a list of URLs that are all different in the search engines’ eyes and that might cause duplicate content issues:

  • www.Example.com
  • Example.com
  • www.Example.com/
  • example.com/
  • www.example.com/index.html
  • example.com/index.html
  • www.example.com/Home.aspx
  • example.com/Home.aspx
