Google gets a Caffeine Hit
On the 8th of June 2010 Google announced a new web indexing system called Caffeine. In basic terms, Google has rewritten the infrastructure that deals with how they collect sites for their index. On a side note, here is a cool video that Matt Cutts created to explain how Google works.
So what does Google Caffeine actually mean?
This type of Google infrastructure update is nothing new. Back in 2005 Google did a similar infrastructure update called “Big Daddy”. It has happened again because of the growing volume of social media content on the internet, and Google needs to meet the demand. Google’s old way of indexing sites was via index layers, also known as the main index and supplementary indexes. Their new way of indexing is to analyze small portions of the web and update their search index on a continuous basis, which is why they needed to improve their infrastructure.
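To make the difference between the two approaches concrete, here is a minimal toy sketch in Python. It is not Google’s code or design: the ToyIndex class, its method names and the example URLs are all made up for illustration. It only contrasts rebuilding a whole index layer in one batch with updating the index one page at a time.

```python
from collections import defaultdict


class ToyIndex:
    """A tiny inverted index: word -> set of page URLs (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.pages = {}  # url -> text currently indexed

    def rebuild(self, all_pages):
        """Old style: throw the layer away and re-analyze every page."""
        self.postings.clear()
        self.pages.clear()
        for url, text in all_pages.items():
            self.update(url, text)

    def update(self, url, text):
        """Continuous style: (re)index a single page as soon as it is crawled."""
        # Remove the page's old words first so stale entries do not linger.
        for word in self.pages.get(url, "").lower().split():
            self.postings[word].discard(url)
        self.pages[url] = text
        for word in text.lower().split():
            self.postings[word].add(url)

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


index = ToyIndex()
# Batch rebuild: nothing new is searchable until the whole pass finishes.
index.rebuild({"a.example": "coffee news", "b.example": "tea news"})
# Incremental update: the new page is searchable straight away.
index.update("c.example", "fresh coffee post")
print(index.search("coffee"))  # ['a.example', 'c.example']
```

In this sketch a new page becomes searchable after a single update call, without waiting for the next full rebuild, and that is the gist of the continuous approach described above.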
Below is a great image from the Google Blog which shows the difference between their old way of indexing and their new way.
With Google Caffeine they have improved Googlebot’s speed and the amount of data that can be stored, among many other improvements.
Google Caffeine lets Google index web pages on an amazing scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel.
Here’s a quote from Google’s Blog about the amount of data Caffeine consumes: “If this [Caffeine processed pages] were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles.” Pretty amazing, you have to agree.
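If you want to sanity check the numbers in that quote, the arithmetic is easy to reproduce. The short Python snippet below uses only the figures quoted above plus the standard miles-to-metres conversion; the variable names are mine, and the results are rough because the quote itself says “nearly” and “more than”.

```python
TOTAL_STORAGE_GB = 100_000_000  # "nearly 100 million gigabytes"
IPOD_COUNT = 625_000            # "625,000 of the largest iPods"
STACK_MILES = 40                # "more than 40 miles"
METERS_PER_MILE = 1609.344      # standard conversion factor

# Storage each iPod would need to hold.
gb_per_ipod = TOTAL_STORAGE_GB / IPOD_COUNT
# Length each iPod would contribute to the end-to-end stack.
cm_per_ipod = STACK_MILES * METERS_PER_MILE * 100 / IPOD_COUNT

print(f"{gb_per_ipod:.0f} GB per iPod")  # 160 GB per iPod
print(f"{cm_per_ipod:.1f} cm per iPod")  # 10.3 cm per iPod
```

So the quote works out to about 160 GB and roughly 10 cm per device, which is plausible for a large-capacity iPod of the time.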
This is a major rollout, as it’s estimated that Google owns over 1 million servers. Here are some other interesting facts: it’s estimated that Google owns 2% of the world’s servers, and that they install 100,000 servers per quarter.
What does this mean for SEO?
Well, Google has stated that this isn’t an algorithmic change, so in principle nothing should change SEO-wise. But with all these pages being indexed faster, there is going to be a whole lot more competition for your site, which makes it more important to keep on top of your SEO.