First and foremost let me get the following out of the way:
- I admire SEOmoz.
- I admire Rand Fishkin and his team. (I love his yellow shoes)
As the vast majority of you would agree, SEOmoz is a great company with some fantastic resources on search engine optimisation and social media. I have been using SEOmoz for over a year now and have always found its tools useful and, to some extent, very accurate. However, a few days ago I noticed something really shocking.
The campaign in question was set up at root domain level which, according to SEOmoz, works as follows:
Track at root domain level. Track all the different subdomains within this root domain.
Example: The root domain seomoz.org has www.seomoz.org, guides.seomoz.org, and pro.seomoz.org all as subdomains within the root domain. If we discover pages on any of the subdomains during our crawl, they’ll be included in the data we display.
Take a look at the following snapshot.
Above is an overview of crawl diagnostics for a campaign I have been working on very recently. As you can see in the graph, SEOmoz reports roughly 200 duplicate content issues. If you drill down into the “duplicate page content” area you will see the following:
In this case the client (or their developer) had decided to roll out a new section without keeping us in the loop, and this had resulted in some duplicate content issues. If you look closely, you will see the dates when the issue started and so on. I looked into this report and straight away I could tell what the problem was.
You should not trust SEOmoz’s duplicate content report
I trusted SEOmoz’s duplicate content report because it had always been accurate to some extent. This time around, to my surprise, a manual audit of the website showed that every page of this website had 4 separate duplicates, something along the following lines:
- http://sub.domain.co.uk/section/service
- http://sub1.domain.co.uk/section/service
- http://sub2.domain.co.uk/section/service
I only found this issue by carrying out some manual checks on Google: all the duplicate pages were indexed and cached. I didn’t really know what to make of it. I kept thinking, why didn’t SEOmoz report this? So I started looking into this further and, to my surprise, I found a 5th duplicate.
The 5th URL was something along the lines of http://testing.domain.co.uk/section/service. At this point I had found 5 different URLs serving the same content, exactly the same content. And what was SEOmoz reporting? Nothing! Absolutely nothing about such a blatant issue!
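For anyone who wants to repeat this kind of manual check, here is a minimal sketch of how exact-match duplicates can be confirmed once you have the page bodies to hand. It has nothing to do with how SEOmoz or Google detect duplicates; the function names, the whitespace normalisation and the sample URLs are all assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict


def normalise(html: str) -> str:
    """Collapse whitespace and lowercase so trivial differences do not hide duplicates."""
    return " ".join(html.split()).lower()


def find_duplicate_groups(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose normalised body produces the same SHA-256 digest.

    `pages` maps a URL to the HTML already fetched for it (hypothetical input;
    fetching is deliberately left out of this sketch).
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalise(body).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicate sets.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Illustrative data only: placeholder hosts modelled on the pattern above.
sample_pages = {
    "http://sub.domain.co.uk/section/service": "<h1>Service</h1>  <p>Same copy</p>",
    "http://sub1.domain.co.uk/section/service": "<h1>Service</h1> <p>Same copy</p>",
    "http://testing.domain.co.uk/section/service": "<h1>Service</h1> <p>Same copy</p>",
    "http://sub.domain.co.uk/section/other": "<h1>Other</h1> <p>Different copy</p>",
}

duplicate_groups = find_duplicate_groups(sample_pages)
```

Hashing only catches exact matches after normalisation, which is all that was needed in this case; near-duplicates would need a fuzzier comparison.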
I figured SEOmoz does not check Google’s index when it comes to duplicate content and only reports duplicate content based on its own crawls. I still could not believe my own theory, so I raised this issue on Twitter:
5 hours later SEOmoz replied
This is insane!
How can you rely purely on your own data without double checking it against what is out there in the search engine indices? Clearly SEOmoz’s approach is totally flawed and very misleading. If I had relied on their tool I would have been totally screwed.
Obviously SEOmoz has missed a trick here, but this raises another important issue. If Google can find and index all of the above-mentioned URLs serving exact-match duplicate content and SEOmoz cannot do the same, then surely the SEOmoz crawler itself is inaccurate? So what is the point of providing me with a “duplicate content report”?
SEOmoz should fix this issue, otherwise their service is pretty much useless as far as reporting duplicate content is concerned. At this point I have lost confidence in their analysis and reporting.
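To make the gap concrete, the comparison I ended up doing by hand can be sketched as a simple set difference between the hosts a crawler reports and the hosts that show up in a search engine's index. This is a rough illustration only: the function name is hypothetical, and both URL lists are assumed to have been collected by other means (a crawl export and manual searches, for example).

```python
from urllib.parse import urlsplit


def hosts_missing_from_crawl(crawled_urls, indexed_urls):
    """Return hosts present in the indexed URL list but absent from the crawl.

    Both arguments are plain lists of absolute URLs gathered elsewhere
    (an assumption of this sketch; no crawling or searching happens here).
    """
    crawled_hosts = {urlsplit(u).hostname for u in crawled_urls}
    indexed_hosts = {urlsplit(u).hostname for u in indexed_urls}
    # Drop None, which urlsplit returns for URLs without a host part.
    missing = {h for h in indexed_hosts - crawled_hosts if h}
    return sorted(missing)


# Illustrative data only, using the placeholder hosts from this post.
crawled = ["http://sub.domain.co.uk/section/service"]
indexed = [
    "http://sub.domain.co.uk/section/service",
    "http://sub1.domain.co.uk/section/service",
    "http://testing.domain.co.uk/section/service",
]

missing_hosts = hosts_missing_from_crawl(crawled, indexed)
```

Any host that lands in `missing_hosts` is one the crawl never reached, so a duplicate content report built on that crawl cannot mention it.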