Back in high school in the ’70s, my favorite physics teacher told me about the difference between data and information. Data was raw, unprocessed, and unlinked to any meaning. Information was what data became when it was linked to meaning. That distinction was important then, and it matters even more today in 2011.
I also remember that data was precious: hard to find, hard to gather, and often hard even to measure. I remember writing a history paper about the success of various economic systems under different governmental structures; for example, was capitalism more effective when governed democratically? My conclusions in the paper were drawn from a hard-won set of 72 discrete data points that I had gleaned from many printed sources.
How the world has changed! Perhaps historians will look back on the latter half of the 20th century and conclude that the biggest cultural change we experienced was the utter explosion of data, and the ease of public access to it. Today, we are overwhelmed by the data at our fingertips. We need giant server farms, powered by giant power plants nearby, just to index all that data. Some of the greatest companies of the millennium are data indexing and delivery companies. We can look at data and manipulate it with our smartphones. We can access it with voice commands or finger swipes. We are drowning in data. The growth of data accessible on the web is exponential.
The mere existence of the web has accelerated this explosion. Increasingly, the data sources on the web are automatic sensors, measuring some aspect of the environment (e.g. wind speed at an offshore buoy) or some parameter of your personal health (e.g. continuous blood pressure readings). These streams of data are also growing exponentially, facilitated by the availability of the web. It has been noted that the data flow from these sensors is now larger than the human-authored data on the web. I’m sure it will rapidly dwarf our human stuff.
So now I think a lot about that difference between data and information. When I search on an ambiguous string (think “man eating shark”: am I curious about a man or a fish?), the problem is that the search engine typically doesn’t know what I “mean”. Until it understands “meaning”, the web is mostly just unstructured data that it is indexing.
So one of my strongest investment interests lies in the area of “big data”. What kind of value can be unlocked by finding meaning in vast, easily accessible pools of data? Startups can create and sell tools that find the information in the cloud of data out there. Or they can sell the information they have found in that sea themselves. This is even better if the data has currency, i.e. it is continually being generated. And the rise of cloud computing only furthers my interest. The compute power to search the sea of data is now available to anyone. Effectively unlimited computing resources are available to anyone with the money and the algorithm.
Tim O’Reilly recently blogged, “…data was a secret hiding in plain sight … it was the key to competitive advantage in the Internet era…” I couldn’t agree more. Find a meaning that has value and a dataset that holds that information, and you have a business. Gather it, illuminate it, transform it – find its meaning. Do something more than just index it.
There’s value hiding in plain sight on the internet. You don’t need to invent or make or sell a “gadget”. Just find the Information in the Data – find the Meaning.