
Here is Google’s definition of the “site:” advanced operator:

If you include site: in your query, Google will restrict your search results to the site or domain you specify. For example, [ admissions site:www.lse.ac.uk ] will show admissions information from London School of Economics’ site and [ peace site:gov ] will find pages about peace within the .gov domain. You can specify a domain with or without a period, e.g., either as .gov or gov.

Note: Do not include a space between the “site:” and the domain.
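To make the syntax concrete, here is a minimal Python sketch of a helper that assembles such a query. The name `build_site_query` is hypothetical and the function only formats a string; it does not call Google or any API.

```python
def build_site_query(terms: str, site: str) -> str:
    """Return a search query that uses the site: operator.

    Hypothetical helper for illustration only; it just formats text.
    """
    site = site.strip()
    if not site:
        raise ValueError("site must not be empty")
    if any(ch.isspace() for ch in site):
        raise ValueError("site must not contain whitespace")
    # No space is allowed between "site:" and the domain.
    operator = f"site:{site}"
    terms = terms.strip()
    return f"{terms} {operator}" if terms else operator


print(build_site_query("admissions", "www.lse.ac.uk"))
print(build_site_query("peace", "gov"))
```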

A few days ago I noticed something weird about the site: operator but didn’t have time to investigate it thoroughly. Today, one of my colleagues noticed the same thing. When you use the site: operator, Google should “restrict your search results to the site or domain you specify”. That turns out to be fiction!

Google can be exploited by content farms and other rogue websites because its so-called “advanced operator” is actually very dumb.

As you can see in the screenshot above, Google returns urlpulse.co.uk for “site:aaa-cars.co.uk”. Although this might not look like a big deal on the surface, someone out there can and will exploit this to hijack branded searches.
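If you have a list of result URLs from a site: query, one way to spot this kind of leak is to compare each URL's host against the domain you asked for. The sketch below is only an illustration: the function names are hypothetical, the example URLs are made up, and it assumes you already collected the URLs by some other means (it does not query Google).

```python
from typing import List
from urllib.parse import urlsplit


def host_matches_site(url: str, site: str) -> bool:
    """Return True if the URL's host is `site` or a subdomain of it."""
    # URLs need a scheme (http:// or https://) for the host to be parsed.
    host = (urlsplit(url).hostname or "").lower()
    site = site.lower().strip().lstrip(".")  # accept ".gov" as well as "gov"
    return bool(site) and (host == site or host.endswith("." + site))


def off_site_results(urls: List[str], site: str) -> List[str]:
    """Return the URLs that do not belong to the requested site."""
    return [u for u in urls if not host_matches_site(u, site)]


# Hypothetical result list for the query "site:aaa-cars.co.uk".
results = [
    "http://www.aaa-cars.co.uk/",
    "http://urlpulse.co.uk/some-page",
]
print(off_site_results(results, "aaa-cars.co.uk"))
# → ['http://urlpulse.co.uk/some-page']
```

Anything this check flags is a page that the operator, as documented, should not have returned.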

These types of pages could be harmful to the affected website. We don’t know exactly how Google handles this case, but a rogue site that controls such pages could conceivably turn them into 301 redirects, or place a canonical tag on them that points somewhere else.

Other sites that have been similarly affected:

site:ammotorservices.co.uk

site:combedowngarage.co.uk

Other Google Operator Weirdness

Here are some more Google operators playing tricks. In this instance, it is the definitions that appear in the results, and they are not much use.

Below we are checking for the word “scuffs” that we know exists in a URL using the “inurl:” operator.

Now compare that with the singular (truncated) version of the word, “scuff”, and look at the result returned in the screengrab below.

Clearly it is only picking up whole words from the URL and ignoring truncated versions of the word.
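The difference between whole-word and substring matching can be sketched in a few lines of Python. This is only a model of the behaviour observed above, not Google's actual implementation; the function names and the example URL are hypothetical, and splitting on non-alphanumeric characters is an assumption about what counts as a “word” in a URL.

```python
import re
from typing import Set


def url_words(url: str) -> Set[str]:
    """Split a URL into lowercase words on any non-alphanumeric character."""
    return {w for w in re.split(r"[^a-z0-9]+", url.lower()) if w}


def inurl_whole_word(url: str, term: str) -> bool:
    """Model of the observed behaviour: match only complete words."""
    return term.lower() in url_words(url)


def inurl_substring(url: str, term: str) -> bool:
    """Naive alternative: match the term anywhere in the URL."""
    return term.lower() in url.lower()


# Hypothetical URL that contains the word "scuffs".
url = "http://www.example.com/repairs/alloy-wheel-scuffs.html"

print(inurl_whole_word(url, "scuffs"))  # → True
print(inurl_whole_word(url, "scuff"))   # → False
print(inurl_substring(url, "scuff"))    # → True
```

Under this model, “scuff” fails to match because the URL only contains the word “scuffs”, which is consistent with what the screengrabs show.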

And now, in the screenshot below, we’re using some different words, but their definitions are not showing.

My good colleague Dom Calisto is updating this post….