Google has just rolled out a new web indexing system, dubbed Caffeine, which it claims will provide 50% fresher search results by continuously updating its database to keep up with the explosion of real-time content. "Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was possible ever before," explained Google software engineer Carrie Grimes.
Under the old index, when you ran a search, Google would scan the various layers of its index. These layers were prioritized by importance, so it would search one group of high-priority sites and then work its way down to lower-priority groups. Each layer was updated on its own schedule and at a different rate; the primary one, for example, was refreshed every two weeks. With Caffeine, Google drops the layered architecture and instead analyzes small portions of the web and updates the index on a continuous basis. This means that recently published content is added much sooner than before.
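To make the contrast concrete, here is a minimal toy sketch in Python of the two approaches described above. It is not Google's implementation: the class names, the layer names and the 14- and 30-day refresh intervals are all assumptions chosen for illustration. It only shows why a page published into a batch-refreshed layer waits for the next refresh, while a continuously updated index makes it searchable right away.

```python
from dataclasses import dataclass, field


@dataclass
class LayeredIndex:
    """Toy batch index: each layer is rebuilt only on its own schedule."""

    # Hypothetical refresh intervals in days, keyed by layer name.
    refresh_days: dict[str, int]
    pending: dict[str, list[str]] = field(default_factory=dict)
    searchable: set[str] = field(default_factory=set)

    def publish(self, layer: str, url: str) -> None:
        # New pages wait in a queue until the layer's next refresh.
        self.pending.setdefault(layer, []).append(url)

    def tick(self, day: int) -> None:
        # On a layer's refresh day, everything queued for it becomes searchable.
        for layer, interval in self.refresh_days.items():
            if day % interval == 0:
                self.searchable.update(self.pending.pop(layer, []))


@dataclass
class ContinuousIndex:
    """Toy continuous index: each page is added as soon as it is processed."""

    searchable: set[str] = field(default_factory=set)

    def publish(self, url: str) -> None:
        self.searchable.add(url)


if __name__ == "__main__":
    layered = LayeredIndex(refresh_days={"primary": 14, "secondary": 30})
    continuous = ContinuousIndex()

    layered.publish("primary", "example.com/news")
    continuous.publish("example.com/news")

    layered.tick(day=1)  # day 1 is not a refresh day for any layer
    print("example.com/news" in layered.searchable)
    print("example.com/news" in continuous.searchable)

    layered.tick(day=14)  # the primary layer's refresh day
    print("example.com/news" in layered.searchable)
```

In this toy model the page is invisible to the layered index until day 14, whereas the continuous index can return it immediately after publication.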
According to Google, Caffeine is capable of adding hundreds of thousands of pages to the Google index per second, and hundreds of thousands of gigabytes of information per day. This is the biggest change to the search engine's methodology in four years, and one that reportedly improves its ability to scale with the rapid growth of information online.