Stock Prediction, Web Sentiment, and Search Engines: a privacy thought experiment
If Microsoft or Google found a way to predict and profit in the stock market by mining search logs, would it be a violation of their privacy policy?
This is not an outlandish scenario. The research paper Stock Prediction using Web Sentiment describes “a novel way to do stock prediction using web sentiment”. The authors do textual analysis of financial message boards to extract how people are feeling about different stocks, correlate past sentiments with actual stock performance, and predict future stock values based on current sentiments.
The authors claim a strong correlation between web sentiment and future stock prices. Suppose for the sake of argument that you could actually predict and profit in the stock market with this strategy.
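As a rough illustration of the "correlate past sentiment with later performance" step, here is a minimal sketch. It is not the paper's method: it assumes you already have one sentiment score and one stock return per day, and it simply measures the Pearson correlation between the sentiment on one day and the return `lag` days later. The function names and the numbers in the usage lines are made up for the example.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(sentiment, returns, lag=1):
    """Correlate sentiment on day t with the return on day t + lag.

    Both inputs are hypothetical daily series aligned by index.
    """
    if lag < 1 or lag >= len(sentiment):
        raise ValueError("lag must be between 1 and len(sentiment) - 1")
    return pearson(sentiment[:-lag], returns[lag:])


if __name__ == "__main__":
    # Toy, invented numbers purely to show the call; not real market data.
    daily_sentiment = [0.2, -0.1, 0.4, 0.0, -0.3, 0.5]
    daily_returns = [0.000, 0.010, -0.004, 0.015, 0.001, -0.012]
    print(lagged_correlation(daily_sentiment, daily_returns, lag=1))
```

A high value here would only say that sentiment and next-day returns moved together in the sample; whether that is enough to trade on is exactly the assumption the thought experiment asks you to grant.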
In principle, your ability to predict the stock market might be even better if you had access to web search logs: users would presumably reveal things in their private searches that they wouldn’t state publicly. (To venture into even murkier privacy waters, consider also mining private emails.)
Google and Microsoft, of course, do have access to this data via their various web properties, and are growing increasingly sophisticated in their ability to mine this data.
They already analyze and expose aggregate search trends, for example through Google Trends. At this moment, Google Trends tells us that Copenhagen, Christmas music, and Tiger Woods updates are on people’s minds.

Suppose a talented Google engineer decided to create “Google Stock Trends” as a 20% project. I can’t find anything in the Google privacy policy that would prevent that kind of aggregation, since it doesn’t reveal any personally identifiable information.
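To make "that kind of aggregation" concrete, here is a minimal sketch of what such a tool might compute. Everything in it is an assumption for illustration: the log format (pairs of user id and query text), the ticker watchlist, and the `min_users` threshold are hypothetical, and nothing here describes how any real search engine stores or processes its logs.

```python
def aggregate_ticker_mentions(log_rows, tickers, min_users=3):
    """Count distinct users whose queries mention each ticker.

    log_rows:  iterable of (user_id, query) pairs (hypothetical format).
    tickers:   set of upper-case ticker symbols to look for.
    min_users: tickers seen from fewer distinct users are dropped,
               so very small groups do not appear in the output.
    """
    users_per_ticker = {}
    for user_id, query in log_rows:
        words = set(query.upper().split())
        for ticker in tickers & words:
            users_per_ticker.setdefault(ticker, set()).add(user_id)
    # Only the counts leave this function; user ids are discarded.
    return {
        ticker: len(users)
        for ticker, users in users_per_ticker.items()
        if len(users) >= min_users
    }


if __name__ == "__main__":
    # Invented rows purely to show the call.
    rows = [
        ("u1", "sell msft now"),
        ("u2", "MSFT earnings"),
        ("u3", "msft layoffs"),
        ("u1", "goog split"),
    ]
    print(aggregate_ticker_mentions(rows, {"MSFT", "GOOG"}, min_users=3))
```

The output contains no user identifiers, which is why it would plausibly pass a policy worded around personally identifiable information, yet the signal in it still comes entirely from queries that users believed were private.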
Stock trend analysis seems to fall within the purposes for which Google says it may use personal information, which include the vague “development of new services”.
At the same time, it would feel like a violation of trust if others profited from stock knowledge that I revealed through a search engine query. I might not even know they were doing this mining, if Google Stock Trends were for internal use only.
Presumably a reputable company would never actually create such a tool. Should their privacy policies be more explicit that this kind of value extraction and data mining isn’t allowed on searches that we think are private?