grack.com

I tried out Google Desktop Search today and decided to take a deeper look at how it works and how it integrates into your day-to-day use of the machine.  This information all comes from reverse-engineering and from watching file and registry activity.  None of it is guaranteed to be correct.

From looking at some of the PDB file references, I think the internal name of this Google search engine is “Total Recall”.  This fits with the replacement string returned from Google (“”) and the port number registry key “trs_port”.

The search utility consists of three main applications and a number of “information provider” plugins.  The main applications are:

  • The Google Desktop Search main application.  This provides the UI for configuration of the Google search programs and launches them as necessary.
  • The indexing service.  This program runs a small HTTP server on port 4664, receiving desktop search requests and outputting search results.
  • The crawler service.  This program runs in the background, indexing local files that exist on your disk.
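
A quick way to check whether the indexing service is up is to see if anything accepts TCP connections on that port.  This is only a minimal sketch: the helper name `is_listening` is mine, and it assumes the server is bound to the loopback address, which I have not confirmed.  It says nothing about what the server actually speaks once connected.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 4664, timeout: float = 0.5) -> bool:
    """Return True if something accepts a TCP connection on host:port.

    Port 4664 is the one observed for the indexing service; the loopback
    host is an assumption.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "listening" if is_listening() else "not reachable"
    print(f"Port 4664 on localhost is {state}")
```

A successful connection only shows that some process owns the port, so it is worth confirming the owner with a tool such as netstat before drawing conclusions.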

The plugins are:

  • A Winsock 1/2 protocol filter.  It intercepts requests to www.google.com, www.google.ca, etc. and adds a “Desktop” link to the search page, as well as placing the local search results in with the remote search results.
  • An IE-specific BHO (browser helper object).  The BHO indexes the pages you visit and takes a screenshot to store as a thumbnail for later.
  • Microsoft Word/Excel/PowerPoint plugins.  Their exact purpose is unknown at this time, but they are probably used to index Office files.

The Winsock 1/2 interception is one of the cooler parts of the Google Desktop Search application.  Each request you make runs through this filter.  Whenever a Google search is performed, the interception layer sends the request to the local indexing server and merges the local results with the web search results.  I verified this by running WinDump on the machine and comparing the response Google sent over the wire with the page that Firefox actually received.
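
For anyone who wants to repeat the comparison, here is a rough sketch of the idea in Python.  It assumes you have already saved two text files by hand: the response body as captured on the wire and the page source as the browser saw it.  The function name `injected_lines` and the sample markup are made up for illustration; this is not what Google's pages actually look like.

```python
import difflib
from typing import List


def injected_lines(wire_body: str, browser_body: str) -> List[str]:
    """Return lines that appear in the browser's copy but not in the wire capture.

    Anything returned here was added somewhere between the network and the
    browser, which is where a Winsock filter would sit.
    """
    diff = difflib.ndiff(wire_body.splitlines(), browser_body.splitlines())
    # ndiff prefixes lines unique to the second sequence with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


if __name__ == "__main__":
    # Hypothetical sample data, not real Google output.
    wire = "<html>\n<a>Web</a>\n</html>"
    browser = "<html>\n<a>Web</a>\n<a>Desktop</a>\n</html>"
    for line in injected_lines(wire, browser):
        print("added locally:", line)
```

A line-based diff is crude for HTML that arrives as one long line, so real captures may need to be split on tags first.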

The BHO uses GoogleDesktopAPI2.dll to add pages to the indexing service.  To take screenshots, it uses the GetDC function to get a device context for the IE window and grabs the current bitmap from it.  Because this copies whatever is on screen at that moment, you’ll notice that if any windows are obscuring the IE window when the screenshot is grabbed, they’ll appear in your thumbnails.

GoogleDesktopAPI2.dll has a number of unnamed exports.  Each of the search plugins imports these functions by ordinal and calls into them.  So far, none of them have been decoded.
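
For reference, looking up a function by ordinal rather than by name can be sketched from Python with ctypes, which accepts an integer index on a loaded DLL object.  This is Windows-only, the helper name `resolve_by_ordinal` is mine, and the calling convention (stdcall, via `WinDLL`) is an assumption.  Since the argument lists are unknown, the sketch only resolves the function and does not call it.

```python
import ctypes
import sys


def resolve_by_ordinal(dll_path: str, ordinal: int):
    """Load a DLL and return the function exported at the given ordinal.

    Raises RuntimeError on non-Windows platforms. On Windows, ctypes raises
    OSError if the DLL cannot be loaded and AttributeError if the ordinal
    does not exist.
    """
    if sys.platform != "win32":
        raise RuntimeError("ordinal lookup via ctypes.WinDLL requires Windows")
    dll = ctypes.WinDLL(dll_path)  # assumes stdcall; unverified for this DLL
    return dll[ordinal]            # integer index means lookup by ordinal


if __name__ == "__main__":
    # The path and ordinal are placeholders; adjust them to your install.
    try:
        func = resolve_by_ordinal("GoogleDesktopAPI2.dll", 1)
        print("resolved:", func)
    except (RuntimeError, OSError, AttributeError) as exc:
        print("could not resolve:", exc)
```

Calling a resolved function without knowing its signature is a good way to crash the process, so stop at inspection until the arguments have been worked out.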

More info as it comes!
