
Open Search Platforms take root

Google’s announcement that they are discontinuing support for their Search API has added urgency to the development of open search platforms. 

Alexa and Nutch approach the problem from different directions, and each is interesting in its own way.

Alexa provides a hosted, scalable search service that you pay for: you can access their full-text index of the web, distribute computations across their cloud, and create new indices to capture the results of those computations. Costs are modest. Alexa describes its goal this way:

Our goal is to give unparalleled and unlimited access to search. Just think of it… where else can you:

  • Take the reins of a Web crawler and direct it to crawl specific pages on specific domains and collect specific document types
  • Mine the documents in the crawl and generate custom indices
  • Reorder search results and create custom verticals
  • Use your own advertising solution

This is by no means a complete list. I just put it together to illustrate a point.

Where other search engines may give you access to their search results, they will tie your hands. You won’t be able to access the raw documents in their crawl, create your own index, reorder the results or even use your own advertising solution. In some extreme cases they will only provide results if you give over part of your page to their ajax script. Why would these search giants create search solutions that are obviously limited and of little use to inventors? Because they are not interested in helping to create their next competitor.

Alexa on the other hand… that’s exactly what we are here to do. We are here to build a platform for you. We are designing our services to be consumed and manipulated by developers and inventors. We fully expect that the next great search engine will be unimaginable to us and won’t be based on a plain vanilla search index from one of the big boys. It will be built and based on a new idea and it will require the kind of access that only Alexa can provide.

You can get started in the new and revamped Developer’s Corner.

Nutch, on the other hand, is a fully open source web search engine; the work of creating a cloud to run Nutch on is up to you or a partner:

Web search is a basic requirement for internet navigation, yet the number of web search engines is decreasing. Today’s oligopoly could soon be a monopoly, with a single company controlling nearly all web search for its commercial gain. That would not be good for users of the internet.

Nutch provides a transparent alternative to commercial web search engines. Only open source search results can be fully trusted to be without bias. (Or at least their bias is public.) All existing major search engines have proprietary ranking formulas, and will not explain why a given page ranks as it does. Additionally, some search engines determine which sites to index based on payments, rather than on the merits of the sites themselves. Nutch, on the other hand, has nothing to hide and no motive to bias its results or its crawler in any way other than to try to give each user the best results possible.
