When I was in high school back in the '70s, I remember my favorite physics teacher telling me about the difference between data and information. Data was raw, unprocessed, and unlinked to any meaning. Information was what data became when it was linked to meaning. This was an important distinction that has even greater meaning today in 2011.
I also remember that data was precious, hard to find and gather, and often even hard to measure. I remember writing a history paper about the success of various economic systems governed by different governmental structures, e.g. was capitalism more effective when governed democratically? I remember my conclusions in the paper were drawn from a hard-won set of 72 discrete data points that I had gleaned from many printed sources.
How the world has changed! Perhaps historians will look back on the latter half of the 20th century and conclude that the biggest cultural change we experienced was the utter explosion of data, and the ease of public access to it. Today, we are overwhelmed by the data at our fingertips. We need giant server farms powered by nearby giant power plants just to index all that data. Some of the greatest companies of the millennium are data indexing and delivery companies. We can look at it and manipulate it with our smartphones. We can access it with voice commands or finger swipes. We are drowning in data. The growth of data accessible on the web is exponential.
Just having a web has accelerated this explosion. Increasingly, those data sources on the web are just automatic sensors, measuring some aspect of the environment (e.g. wind speed at an offshore buoy), or some parameter about your personal health (e.g. continuous blood pressure readings). These streams of data are also growing exponentially, facilitated by the availability of the web. It has been noted that the data flow from these sensors is now larger than the human-authored data on the web. I'm sure it will rapidly dwarf our human stuff.
So now I think a lot about that difference between data and information. When I search on an ambiguous string (think “man eating shark”: am I curious about a man or a fish?), the problem is that the search engine typically doesn't know what I “mean”. Until it understands “meaning”, the web is mostly just unstructured data that it is indexing.
So one of my strongest investment interests lies in the area of “big data”. What kind of value can be unlocked by finding meaning in vast, easily accessible pools of data? Startups can create and sell tools that find the information in the cloud of data out there. Or they can sell the information they've found in that sea. This is even better if the data has currency, i.e. it is continually being generated. And the rise of cloud computing only furthers my interest. Now compute power to search the sea of data is available to anyone. Infinite computing resources are available to anyone with the money and the algorithm.
Tim O'Reilly recently blogged: “…data was a secret hiding in plain sight … it was the key to competitive advantage in the Internet era…” I couldn't agree more. Find a meaning that has value and a dataset that holds that information and you have a business. Gather it, illuminate it, transform it – find its meaning. Do something more than just index it.
There's value hiding in plain sight on the internet. You don't need to invent or make or sell a “gadget”. Just find the Information in the Data – find the Meaning.
John Bush says
We are a software development and internet marketing company based in the UAE. We are an established organization already working with many international clients in internet and mobile marketing. We also have a strong team of in-house programmers, which we have recently expanded for the sole purpose of data search and social media related projects. We have had initial discussions with a few local investors but we are now spreading the net to your part of the world, as we appreciate you have far more experience in seed funding and startup funding for tech companies. We currently have two of our directors at the San Diego Internet Marketing conference; they are due to fly back out of LA at the end of the week and it would be great if they could meet up with you or one of your team whilst there to give you an overview of some of the upcoming projects we are looking to develop.
Rather than go into too much detail now, I can give you Amit’s (our Managing Director) contact details and you can hopefully set up a meeting.
Please drop me an email and we can talk some more if it’s something you are interested in.
Tony Bove says
Great blog post. I also think about “what kind of value can be unlocked by finding meaning in vast, easily accessible pools of data” in the context of (1) recommending new music to music lovers based on musical influences that can be derived from existing music databases, and (2) uncovering relationships between public policies and unintended results — such as a rise in criminal activity among youths in an area where school funding has been cut. I could think of many more interesting and useful applications of unlocking value from vast pools of information.
Jaron Lanier had something interesting to say about vast amounts of data (and whether our current methods of analyzing data are useful) recently, in “The Hazards of Nerd Supremacy” in The Atlantic (Feb. 8, 2011):
“One problem is that information in oceanic magnitudes can confuse and confound as easily as it can clarify and empower, even when the information is correct. There is vastly more financial data set down in the world’s computers than there ever has been before, including publically accessible data, and yet the economy is a mess. How can this be, if information is the solution? A sufficiently copious flood of data creates an illusion of omniscience, and that illusion can make you stupid. Another way to put this is that a lot of information made available over the internet encourages players to think as if they had a God’s eye view, looking down on the whole system.”
Check out that article for more on Wikileaks and how “nerd supremacists” think.
Ted Driscoll says
Great reply, Tony. Jaron’s article is very interesting reading. I especially liked “The ideology that drives a lot of the online world — not just Wikileaks but also mainstream sites like Facebook — is the idea that information in sufficiently large quantity automatically becomes Truth.” I think this is even a bigger issue. I think it is what empowers a Glenn Beck… He must be right because he’s on the TV and millions are watching him. The one law in life that I grow more fond of as I get older is the Law of Unforeseen Consequences. Already we see the empowering of fringe groups by frictionless communication and publication.
Thanks for adding some stimulating thoughts to this thread.