Teh Xiggeh


Supplemental Hell

Posted in Search Engines by Xiggeh on July 25, 2006

On 27 August 2003 GoogleGuy announced a new feature in Google – supplemental results. Although Google had done a great job of removing irrelevant pages from queries that returned a large number of results, they discovered that useful information was not being displayed for more obscure queries. To rectify this Google introduced the ‘supplemental index’ (SI). The supplemental index contained pages not recognised as useful enough to be returned in standard results, but not spammy or irrelevant enough to be shunned entirely.

Nate Tyler (Google Media Contact) explained: “The supplemental is simply a new Google experiment. As you know we’re always trying new and different ways to provide high quality search results.”

In fact this kind of thinking wasn’t new. In June 2000 Inktomi introduced a smaller index of authority sites, and a larger index with ‘the rest’ of the web, similar to Google’s main and supplemental indices.

Few webmasters complained when the SI was introduced in 2003, but trouble started to creep in on 24 January 2006 – a substantial part of Google’s main index began shifting into the supplemental index. It was a serious bug.

With little acknowledgement from Google, forums exploded with speculation about why this happened. I’ve done a bit of research on the topic and present here some facts I’ve discovered, along with my own (equally unproven) speculation.

What is the supplemental index (SI)?

First let’s get Google’s official view on supplemental results when they were launched in 2003:

“Supplemental sites are part of Google’s auxiliary index. We’re able to place fewer restraints on sites that we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index. The index in which a site is included is completely automated; there’s no way for you to select or change the index in which your site appears. Please be assured that the index in which a site is included does not affect its PageRank.”

The official line is rather vague. Let’s see what else we knew about the supplemental index before the bug(s) were introduced in 2006;

  • Pages could be moved to SI without being crawled (Google support pages)
  • Pages could be moved to SI after being crawled (personal experience)
  • “Some” of the moving process “happens in the crawl/index cycle” (Matt Cutts)
  • SI does not affect PR (Matt Cutts)
  • SI is not affected by PR (personal experience)
  • Under certain circumstances SI results will be listed above results from the main index
  • The SI is held separately from the main index
  • The SI has dedicated crawlers, running on different cycles and agendas from the main index crawlers and AdSense crawlers

Those are some very good observations from the community, plus a few grains of wheat amongst the chaff from Google. Gathering them took the dedicated SE community, and Google’s spokesbots, just over three years. So what’s the problem? Big Daddy…

Big Daddy – Supplemental Hell

Between November 2005 and April 2006 Google rolled out Big Daddy – new software on a new architecture – across its worldwide datacentres. Servers were taken offline, upgraded, and brought back up. A datacentre took around 10 days to upgrade.

Matt Cutts originally said “changes on Big Daddy are relatively subtle (less ranking changes and more infrastructure changes)”, but once the rollout had started it became clear the changes were a bit more disruptive than planned.

An interesting live commentary of the rollout can be found at the WMW forums (24 January 2006).

Over a period of weeks a substantial part of Google’s main index was placed in the supplemental index, and the problems haven’t completely cleared up yet.

In March Google “identified and changed” a “threshold” on Big Daddy which brought many of the supplemental pages back into the main index. At the end of March Google did the same again, telling Big Daddy to crawl more pages.

Getting out of the supplemental index

To quote Matt Cutts on this one:

“In general, the best way I know of to move sites from more supplemental to normal is to get high-quality links (don’t bother to get low-quality links just for links’ sake).”

From my experience over the last few months this advice works great. A site that went supplemental for 8 weeks shot right to the top for popular search terms using this method.

However it won’t work for everyone. You should do your own research (there’s plenty of advice out there, good and bad) to find your own solution.
