Webforumz Newsletter - September 2007
Tutorials
Using 301 redirects to remove duplicate content
Diego Alto raised an interesting point in August's edition in his SEO Quicklist article - that of "The duplicate homepage".
Having read this, you may now be worried about your own site having duplicate content. In summary, the following four URLs all serve the same content, BUT Google considers them to be four different pages. You will therefore have the same content indexed under four different URLs.
- http://1840dsgn.co.uk/
- http://www.1840dsgn.co.uk/
- http://1840dsgn.co.uk/index.htm
- http://www.1840dsgn.co.uk/index.htm
This becomes a problem when search engines penalise you for having duplicate content.
So what can you do?
Today I will talk you through using a .htaccess file and Apache's mod_rewrite module to ensure your homepage is only indexed once. We will set up a rewrite rule that redirects anyone visiting http://1840dsgn.co.uk/ to the www version of the site, http://www.1840dsgn.co.uk/. This will be set up as a 301 (permanent) redirect, so the search engine spiders will index only the www version of the page.
Warning: This method only works on Apache servers with mod_rewrite enabled, which is the usual setup on Linux-based hosting
The .htaccess file
.htaccess is a file that lives in the root directory of your website. It has no extension and must be named .htaccess for it to function correctly.
You may already have a .htaccess file, so check on your remote host before creating a new one. If you do, download it onto your local system. If you don't, create a new file and name it .htaccess. This can be done in any text or code editing program. Once you have created or located your file, open it for editing.
Setting up your .htaccess file
If you have created a new file, or if this code is not in your existing file then add it on a new line:
RewriteEngine On
Your file is now ready for rewriting commands.
Redirecting from a non-www domain to a www domain
To create a permanent 301 redirect from your non-www domain to your www domain add the following code on a new line of your .htaccess file:
#301 redirect to www
RewriteCond %{HTTP_HOST} ^1840dsgn\.co\.uk
RewriteRule (.*) http://www.1840dsgn.co.uk/$1 [R=301,L]
We begin with a comment, #301 redirect to www, so we know what this bit of script is doing when we revisit the file.
Our second line of code sets the condition that must be met for the redirect to kick in, a bit like an if statement in a programming language. It tells the server: IF the hostname of the request (the part of the URL immediately following http://) starts with 1840dsgn.co.uk, then apply the rewrite rule that follows.
The final line of our code performs the redirect. First we capture the path requested on the non-www URL using the regular expression (.*). For example, if http://1840dsgn.co.uk/portfolio.php is entered, we capture portfolio.php (in a .htaccess file the leading slash is removed before the pattern is tested). The captured text is automatically stored in a back-reference called $1. We then redirect to the www URL with $1 attached to the end. Finally we add [R=301,L]: R=301 tells the server to send the browser a permanent 301 redirect, and L marks this as the last rule, so once it matches the server stops processing further rewrite rules.
Remember, when implementing this for your own site, to replace the URLs with your own domain!
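To make the logic of the condition and the rule easier to follow, here is a rough Python sketch that imitates them. It is only an illustration and not how Apache works internally: the names HOST_CONDITION, PATH_PATTERN, TARGET and simulate_redirect are made up for this example, and the dots in the hostname are escaped so that they match literal dots.

```python
import re

# Hypothetical stand-ins for the two .htaccess directives (illustration only).
HOST_CONDITION = re.compile(r"^1840dsgn\.co\.uk")  # like RewriteCond %{HTTP_HOST}
PATH_PATTERN = re.compile(r"(.*)")                 # like the RewriteRule pattern
TARGET = "http://www.1840dsgn.co.uk/"              # like the RewriteRule substitution


def simulate_redirect(host, path):
    """Return (status, location) if the redirect would fire, otherwise None."""
    # In a .htaccess file in the site root, the leading slash is removed
    # before the pattern is tested, so we do the same here.
    relative_path = path[1:] if path.startswith("/") else path

    # The condition: does the requested hostname start with the non-www domain?
    if not HOST_CONDITION.search(host):
        return None

    # The rule: capture the whole path as $1 and append it to the www URL.
    captured = PATH_PATTERN.match(relative_path).group(1)
    return 301, TARGET + captured
```

Calling simulate_redirect("1840dsgn.co.uk", "/portfolio.php") returns a 301 pointing at the www version of the same page, while a request that already uses www.1840dsgn.co.uk returns None, because the condition is not met and no redirect takes place.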
Linking back to your homepage
The second problem to solve is the duplicate content served by http://www.1840dsgn.co.uk/ and http://www.1840dsgn.co.uk/index.htm. Again these are the same page under different URLs. Fortunately, solving this problem is a lot easier than setting up a 301 redirect! Just ensure that when you link back to your homepage you link using href="/" rather than href="/index.htm". Alternatively, use absolute links to your homepage and link to href="http://www.1840dsgn.co.uk/".
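If your site has a lot of pages, checking every homepage link by hand gets tedious. The following is a minimal Python sketch, assuming your pages are available as HTML text, that lists any links (the href attribute of anchor tags) pointing at the index file instead of "/". The class name HomepageLinkFinder and the helper find_index_links are invented for this example, and the list of index file names is an assumption you should adjust for your own site.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed index file paths; change these to match your own site.
INDEX_PATHS = ("/index.htm", "/index.html")


class HomepageLinkFinder(HTMLParser):
    """Collects href values that point at the homepage's index file."""

    def __init__(self):
        super().__init__()
        self.offending = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # urlsplit gives us the path for both relative and absolute links.
            if name == "href" and value and urlsplit(value).path in INDEX_PATHS:
                self.offending.append(value)


def find_index_links(html_text):
    """Return the list of hrefs in html_text that should be changed to '/'."""
    finder = HomepageLinkFinder()
    finder.feed(html_text)
    return finder.offending
```

Any link the function returns is a candidate for changing to "/" or to the full www address of your homepage.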
Testing
To test whether your rewrite rule has been successful you can use a Server Header checking tool. An example is SEO Consultants Server Header Check.
Visit the tool and enter your non-www URL. I entered http://1840dsgn.co.uk and got the following response:
SEO Consultants Directory Check Server Headers - Single URI Results
Current Date and Time: 2007-08-05T03:29:34-0700
User IP Address: 82.69.37.203
#1 Server Response: http://1840dsgn.co.uk
HTTP Status Code: HTTP/1.1 301 Moved Permanently
Date: Sun, 05 Aug 2007 10:29:34 GMT
Server: Apache/1.3.33
Location: http://www.1840dsgn.co.uk/
Connection: close
Content-Type: text/html; charset=iso-8859-1
Redirect Target: http://www.1840dsgn.co.uk/
The important part here is 301 Moved Permanently, which shows that our 301 redirect has been successful. The response also tells you where the page redirects to (Location), the content type of the response and the server software you are running.
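If you would rather run the same check yourself, the sketch below does roughly what a header checking tool does, using only Python's standard library. It is a minimal example under a few assumptions: the function names fetch_status_and_location and is_permanent_redirect_to are made up here, the site is reached over plain http, and a HEAD request is used so that only the headers are fetched.

```python
import http.client


def fetch_status_and_location(host, path="/"):
    """Send a HEAD request and return (status code, Location header or None)."""
    # http.client does not follow redirects, so we see the 301 itself.
    connection = http.client.HTTPConnection(host, timeout=10)
    try:
        connection.request("HEAD", path)
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def is_permanent_redirect_to(status, location, expected_prefix):
    """True if the response is a 301 whose Location starts with expected_prefix."""
    return (
        status == 301
        and location is not None
        and location.startswith(expected_prefix)
    )


# Example use (needs network access):
#   status, location = fetch_status_and_location("1840dsgn.co.uk")
#   print(is_permanent_redirect_to(status, location, "http://www.1840dsgn.co.uk/"))
```

A result of True means the non-www address answers with a permanent redirect to the www address, which is the same thing the tool output above confirms.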
Troubleshooting
If you have implemented the code above correctly and it is still not working, check with your hosting company that mod_rewrite is enabled. If it is not, they will usually be able to enable it for you.
And we're done!
The 301 redirect is a dependable way to ensure you do not get penalised for serving up duplicate content on multiple URLs.