You are currently browsing the Bogle’s Blog weblog archives.

Beyond 411 for the iPhone is now available in the App Store

The local search app Beyond411 (aka Berry411) is now available as a free download for the iPhone in the App Store; just search for the abbreviation “b411”. This is just an initial version and doesn’t have all of the features of the BlackBerry version. Be forgiving of its faults, and please pass on your bug reports and suggestions for improvement.

Stock Prediction, Web Sentiment, and Search Engines: a privacy thought experiment

If Microsoft or Google found a way to predict and profit in the stock market by mining search logs, would it be a violation of their privacy policy?

This is not an outlandish scenario. The research paper Stock Prediction using Web Sentiment describes “a novel way to do stock prediction using web sentiment”. The authors do textual analysis of financial message boards to extract how people are feeling about different stocks, correlate past sentiments with actual stock performance, and predict future stock values based on current sentiments.
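To make the idea concrete, here is a minimal Python sketch of that kind of pipeline. It is not the paper's method: the word lists, the `sentiment_score` function, and the one-day lag are all assumptions made up for illustration. It scores each post by counting positive and negative words, then measures how well each day's sentiment correlates with the next day's price change.

```python
from math import sqrt

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"buy", "bullish", "strong", "up"}
NEGATIVE = {"sell", "bearish", "weak", "down"}


def sentiment_score(post: str) -> int:
    """Count positive words minus negative words in one message-board post."""
    words = post.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def lagged_correlation(daily_sentiment, daily_close):
    """Correlate sentiment on day t with the price return from day t to t+1."""
    returns = [(b - a) / a for a, b in zip(daily_close, daily_close[1:])]
    # The last day's sentiment has no following return, so drop it.
    return pearson(daily_sentiment[:-1], returns)
```

A correlation near 1 on historical data would be the kind of signal the authors describe; a real system would need far more careful text analysis and out-of-sample testing than this toy version.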

The authors claim a strong correlation between web sentiment and future stock prices. Suppose for the sake of argument that you could actually predict and profit in the stock market with this strategy.

In principle, it seems that your ability to predict the stock market might be even better if you had access to web search logs: users would presumably reveal things in their private searches that they wouldn’t state publicly. (To venture into even murkier privacy waters, consider also mining private emails.)

Google and Microsoft, of course, do have access to this data via their various web properties, and are growing increasingly sophisticated in their ability to mine this data.

They already analyze and expose aggregate search trends, for example through Google Trends. At this instant, Google Trends tells us that Copenhagen, Christmas music, and Tiger Woods updates are on people’s minds.

Suppose a talented Google engineer decided to create “Google Stock Trends” as a 20% project. I can’t find anything in the Google privacy policy that would prevent that kind of aggregation, since it doesn’t reveal any personally identifiable information.

Stock trend analysis seems to fall within the purposes for which Google is allowed to use personal information, including the vague “development of new services”.

At the same time, it seems a violation of trust that others might profit from personal stock knowledge I reveal through a search engine query. I might not even know they were doing this mining if Google Stock Trends were for internal use only.

Presumably a reputable company would never actually create such a tool. Should their privacy policies state more explicitly that this kind of value extraction and data mining isn’t allowed on searches that we think are private?
