Tuesday, September 14, 2010

Behind the Instant


Google released its Instant Search feature on September 8, 2010, creating quite a frenzy over the web. Yahoo claimed that they did it first, five years back. Creative hackers are building Google Instant Search-like mash-ups on other Google services (YouTube Instant, Google Maps).

Google Instant Search redefines the way people use search. People used to click a search button and manually navigate across search results to find what they wanted. With Instant Search, they can keep refining the query until they see relevant results.

This changes how SEO experts work on keyword optimization for search engines, and it influences ad placement revenue for searches, since it marginally reduces the cases where the user needs to visit the next page to navigate through search results.

Though Google Instant Search is just ajaxifying the current Google site, what really amazes me is how much they have to scale behind the scenes to deliver results as you type. I think performance tuning or scalability is like a magic act made up of a series of tricks: well performed, repeatable tricks. The rest of this post is my attempt to explain the magic.

Some facts from Google:
  • Instant Search drives a 20x increase in query traffic (already at a billion queries a day)
  • They didn't scale just by increasing the number of servers; instead, they found new ways to squeeze more out of their existing infrastructure.
  • They have used 15 new technologies in building this infrastructure.

How it works:

As you type the query, the browser requests a list of suggestions from the server. As soon as what you type matches one of the suggestions, the actual search results are fetched for that suggestion and the web page is updated.
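To make the flow concrete, here is a minimal TypeScript sketch of that suggest-then-search loop. The endpoint names (/suggest, /search), the response shape, and the helper functions are assumptions made for illustration, not Google's actual API:

```typescript
// A minimal sketch of the suggest-then-search flow; the endpoints and
// response shapes are assumptions, not Google's real API.

interface SuggestResponse {
  suggestions: string[];
}

async function onKeystroke(partialQuery: string): Promise<void> {
  // 1. Every keystroke asks the server for suggestions.
  const res = await fetch(`/suggest?q=${encodeURIComponent(partialQuery)}`);
  const { suggestions }: SuggestResponse = await res.json();

  // 2. If the typed text matches the top suggestion, fetch the real
  //    results for that suggestion and repaint the page.
  const top = suggestions[0];
  if (top !== undefined && matches(partialQuery, top)) {
    const results = await fetch(`/search?q=${encodeURIComponent(top)}`);
    renderResults(await results.json());
  }
}

// Placeholder helpers for the sketch.
function matches(typed: string, suggestion: string): boolean {
  return suggestion.startsWith(typed.trim());
}
function renderResults(data: unknown): void {
  console.log('update the results pane with', data);
}
```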

Let's say we are querying the web for 'man on wire'* using Google Instant Search. Here is the trace of HTTP requests between the browser and the site.


*The query has no significance; it is just a random example.

The traffic can be classified into the JSON data and the fetching of the actual web page. As you can see, most of the traffic here is JSON data, which corresponds to the AJAX calls that happen as you type the query in the search box. The JSON data comprises possible suggestions for the query you just started typing, and it includes the search result data only if there is a potential match among the suggestions.
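For illustration, the suggest payload might look something like the following. The field names and structure are guesses based on the behavior described above, not Google's actual wire format:

```typescript
// Illustrative shape of the suggest JSON payload; all field names here are
// assumptions for the sketch, not Google's actual wire format.

interface InstantSuggestPayload {
  query: string;         // the partial query typed so far, e.g. "man on w"
  suggestions: string[]; // completions, e.g. ["man on wire", "man on the moon"]
  // Present only when the typed text matches a suggestion closely enough:
  results?: {
    title: string;
    url: string;
    snippet: string;
  }[];
}
```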

What really happens?

  • Only suggestions are fetched from the server as you type, not the real search results.
  • Search results are retrieved only for the top matching suggestion, not for the actual keyword you type. Even if you start typing 'ma', the search results are fetched only for the top suggestion (say, 'map quest'). The search results for the keyword 'ma' are fetched only after you hit the 'Search' button.
  • Results are retrieved either when you pause for several milliseconds or when your query matches the top suggestion on the list (a whole-word match, not just a character match); see the sketch below.
  • Google most likely caches or pre-caches many interim results.
By restricting the retrieval of search results to suggestions, Google gets an incredible advantage in caching search results that can be shared among many users. Now the permutations are limited to the suggestions that can appear for a user query, rather than the possible permutations of whatever the user could type next.
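Here is a rough sketch of the two triggers described in the list above: fetching results either after a brief pause or on a whole-word match against the top suggestion. The timing value and helper names are assumptions, not Google's actual parameters:

```typescript
// Sketch of the two fetch triggers: a pause, or a whole-word match against
// the top suggestion. The delay value is a guess, not Google's real timing.

let pauseTimer: ReturnType<typeof setTimeout> | undefined;

function onQueryChange(typed: string, topSuggestion: string): void {
  if (pauseTimer !== undefined) clearTimeout(pauseTimer);

  // Trigger 1: whole-word match against the top suggestion
  // ("man on" matches "man on wire"; "man on w" does not).
  const suggestionWords = topSuggestion.split(' ');
  const typedWords = typed.trim().split(' ');
  const wordMatch = typedWords.every((w, i) => suggestionWords[i] === w);
  if (wordMatch) {
    fetchResults(topSuggestion);
    return;
  }

  // Trigger 2: the user paused briefly while typing.
  pauseTimer = setTimeout(() => fetchResults(topSuggestion), 300);
}

// Placeholder for the actual results fetch.
function fetchResults(query: string): void {
  console.log('fetching results for', query);
}
```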

HTTP trace log when searching for the same query 'man on wire' a second time.


If you type the same query within minutes, Google is smart enough to load it from the local cache rather than going to the server to fetch it again. As you can see in the above table, there is no JSON traffic and there is just a 204 server response. 204 means 'the server successfully processed the request, but it is not returning any content'.
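A minimal sketch of what such a browser-side cache could look like, assuming a simple in-memory map and a made-up freshness window:

```typescript
// Sketch of the browser-side query cache implied above: repeated queries
// within a short window are served locally instead of re-fetched.
// The TTL and cache policy are assumptions for illustration.

const queryCache = new Map<string, { data: unknown; fetchedAt: number }>();
const CACHE_TTL_MS = 5 * 60 * 1000; // assume results stay fresh for 5 minutes

async function search(query: string): Promise<unknown> {
  const hit = queryCache.get(query);
  if (hit !== undefined && Date.now() - hit.fetchedAt < CACHE_TTL_MS) {
    return hit.data; // no JSON traffic at all: served from the local cache
  }
  const res = await fetch(`/search?q=${encodeURIComponent(query)}`);
  const data = await res.json();
  queryCache.set(query, { data, fetchedAt: Date.now() });
  return data;
}
```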

Layers of cache:


In order to scale, the search results have to be cached at various layers.

Google introduced various new layers of cache:
  • a prioritized search-query cache (like hot trends)
  • a user-specific cache, so they can give you more personalized results
  • a general result cache
  • miscellaneous other layers of caching (even browser-level caches)
Caching is the most common contributor to instant search results, but it is not without its own drawback of becoming stale. Google must have massively improved its existing cache techniques in order to serve results instantly.
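Conceptually, a layered lookup over the caches listed above might be wired up as follows; the layer interface and ordering are assumptions for the sketch, not Google's actual design:

```typescript
// Sketch of a layered cache lookup: each layer is consulted in order, and a
// miss falls through to the next one. The interfaces are assumptions.

type Results = { query: string; items: string[] };

interface CacheLayer {
  name: string;
  get(query: string, userId?: string): Results | undefined;
}

function lookup(layers: CacheLayer[], query: string, userId?: string): Results {
  for (const layer of layers) {
    const hit = layer.get(query, userId);
    if (hit !== undefined) {
      console.log(`served '${query}' from ${layer.name}`);
      return hit;
    }
  }
  // Every layer missed: fall through to the full search backend.
  return runFullSearch(query);
}

// Placeholder for the real search backend.
function runFullSearch(query: string): Results {
  return { query, items: [] };
}

// The ordering would mirror the list above, e.g.:
// lookup([hotTrendsCache, userCache, generalCache], 'man on wire', 'user42');
```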

How does Google keep search results relevant?


The sooner the results are fetched (as you type), and the more layers of cache there are, the more likely the search results are stale by the time you consume them. Google revamped its indexing infrastructure recently to provide more accurate and real-time results. Remember when they integrated real-time tweets into the search results?

Google mostly used the MapReduce paradigm to build its index until recently. MapReduce helps handle tons of data, but it has to process that data sequentially, in batches. Building a real-time search engine using MapReduce alone is not going to work well, so they added additional indexing techniques with the announcement of Caffeine.
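To see why the batch style struggles with freshness, here is a toy contrast between rebuilding an inverted index from scratch and folding in a single changed document incrementally, which is roughly the idea behind the Caffeine approach. This illustrates the concept only; it is not Google's implementation:

```typescript
// Toy contrast of the two indexing styles: a batch rebuild reprocesses every
// document, while an incremental update touches only the changed document.

type InvertedIndex = Map<string, Set<string>>; // term -> document ids

// Batch: rebuild the whole index from scratch (MapReduce-style full pass).
function rebuild(docs: Map<string, string>): InvertedIndex {
  const index: InvertedIndex = new Map();
  for (const [id, text] of docs) addDoc(index, id, text);
  return index;
}

// Incremental: fold a single new or changed document into the live index,
// so fresh content becomes searchable without reprocessing the corpus.
function addDoc(index: InvertedIndex, id: string, text: string): void {
  for (const term of text.toLowerCase().split(/\s+/)) {
    if (!index.has(term)) index.set(term, new Set());
    index.get(term)!.add(id);
  }
}
```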

Summary:


Google must have vastly improved its indexing, caching, and other layers of its search infrastructure (like the Google File System and clustering techniques) to be able to serve us results instantly. I assume they would even have used expertise from the recently cancelled Google Wave project.

We can probably expect Google to release research papers for at least some of the 15 new technologies that power Google Instant Search.

