The Grub is Back

Google has a huge target taped to its back - everyone sees it as the search engine to beat. With over 50% of search traffic going through Google, it is the obvious one to go after. How does Google do it? Well, one of its secrets is commodity computer hardware - not expensive Sun machines or highly specialized Cray supercomputers, but inexpensive x86 machines running Linux (most likely Google's own flavor of Linux). We are talking about thousands - actually over 100,000 - machines crawling and serving up search results 24 hours a day, 7 days a week.

So, how does anyone compete with this horsepower? Obviously, Microsoft and Yahoo can afford to build out the same way. But back in the day, LookSmart had something that could approach the horsepower of Google's back end. Its secret weapon was Grub - a distributed web crawler:

People who choose to download and run the client will assist in building the Web’s largest, most accurate database of URLs. This database will be used to improve existing search engines’ results by increasing the frequency at which sites are crawled and indexed.
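
To give a rough feel for what a distributed crawler client does, here is a minimal sketch in Python. It is not Grub's actual protocol, which isn't described here. The `Coordinator` class, the `run_client` function, and the idea of reporting a content hash are all my own assumptions for illustration. A central server hands out batches of URLs, and each volunteer client fetches them and reports back a digest so the server can tell whether a page has changed.

```python
from collections import deque
from dataclasses import dataclass, field
from hashlib import sha256
from typing import Callable, Deque, Dict, List


@dataclass
class Coordinator:
    """Hypothetical central server: hands out URLs and collects results."""
    pending: Deque[str]
    results: Dict[str, str] = field(default_factory=dict)

    def next_batch(self, size: int) -> List[str]:
        """Pop up to `size` URLs off the work queue."""
        batch: List[str] = []
        while self.pending and len(batch) < size:
            batch.append(self.pending.popleft())
        return batch

    def report(self, url: str, digest: str) -> bool:
        """Store the digest; return True if it differs from the last one seen."""
        changed = self.results.get(url) != digest
        self.results[url] = digest
        return changed


def run_client(coord: Coordinator,
               fetch: Callable[[str], bytes],
               batch_size: int = 2) -> int:
    """Volunteer client loop (assumed design): ask for work, fetch, report.

    `fetch` is injected so the sketch runs without touching the network.
    Returns the number of pages crawled.
    """
    crawled = 0
    while True:
        batch = coord.next_batch(batch_size)
        if not batch:
            break  # no work left
        for url in batch:
            body = fetch(url)
            coord.report(url, sha256(body).hexdigest())
            crawled += 1
    return crawled


# Stand-in for the real web, so nothing here makes a network request.
fake_web = {
    "http://example.com/a": b"page a",
    "http://example.com/b": b"page b",
    "http://example.com/c": b"page c",
}
coordinator = Coordinator(deque(fake_web))
pages_crawled = run_client(coordinator, fake_web.__getitem__)
```

With many clients pulling batches from the same queue, each URL can be revisited more often than a single crawler could manage, which is the "increasing the frequency at which sites are crawled" part of the pitch.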

One day a couple of years ago, LookSmart just closed Grub down. However, yesterday it was announced that the Grub project had been sold to Wikia. WikiaSearch plans on using Grub to help create an open source search engine:

Transparency - Openness in how the systems and algorithms operate, both in the form of open source licenses and open content + APIs.
Community - Everyone is able to contribute in some way (as individuals or entire organizations), strong social and community focus.
Quality - Significantly improve the relevancy and accuracy of search results and the searching experience.
Privacy - Must be protected, do not store or transmit any identifying data.

Well, I certainly wish them good luck and I plan on trying the Grub client. However, it looks like the client isn’t fully functional at this time.
