Much has been written about the three V’s of Big Data. In fact, according to their Deja VVVu blog post earlier this year, Gartner introduced these V’s (volume, velocity and variety) over 11 years ago. At Objectivity, we think Gartner was spot on at the time, but things change. We have been in the Big Data business for more than 22 years, and we have seen other businesses and players conquer Big Data and move the landscape beyond simple management to real-time analysis of complex data. More than a decade later, Gartner’s original set of V’s is missing what companies are realizing today: Value and Veracity.

In our DBTA webinar this January, Leon Guzenda, founder of Objectivity, discussed the 5 V’s of Big Data in his presentation, and recently IBM joined the discussion as well.

Let’s start with Value. Yes, there is a lot of big data out there: the many types of logs (Splunk) from M2M systems, location data, photo/video data, and so on. At Objectivity we believe that inside your data there are relationships, some explicit and some hidden implicitly within the data, and in those relationships lies the true Value of your data. Examples include telephone call detail records (CDRs, with from/to subscriber numbers), network logs (TCP/IP logs, with source and destination IP addresses), and web logs (clickstream data). Extracting just these from/to columns is enough to build a very nice graph. The question then is how to use this information to get commercial value out of it. Lots of people are collecting and storing big data, but what’s the point if you don’t have a plan for how to use it? What’s its commercial value? How do you manage it? How do you know what you’ve got and where it is? Do you keep it forever, delete it, or something in between?
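To make the graph idea concrete, here is a minimal sketch of turning the from/to columns of call detail records into a simple call graph. The records, phone numbers and the `build_call_graph` helper are all made up for illustration; a real system would read from actual CDR files and would likely use a graph database rather than in-memory dictionaries.

```python
from collections import defaultdict

# Hypothetical call detail records: (from_subscriber, to_subscriber, seconds).
cdrs = [
    ("555-0101", "555-0102", 120),
    ("555-0101", "555-0103", 45),
    ("555-0102", "555-0103", 300),
    ("555-0101", "555-0102", 60),
]


def build_call_graph(records):
    """Return {caller: {callee: number_of_calls}} using only the from/to columns."""
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee, *_ in records:
        graph[caller][callee] += 1
    return graph


graph = build_call_graph(cdrs)

# Print every edge with its weight (how many times one number called the other).
for caller, callees in graph.items():
    for callee, count in callees.items():
        print(f"{caller} -> {callee}: {count} call(s)")
```

The same shape works for network logs (source and destination IP addresses) or clickstream data (page to page); only the two columns you pull out change.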

Now on to Veracity. Veracity came about partly because of the idea that eventual consistency of data may be good enough for some Big Data. In transactional systems (e.g. most relational database systems), transactional consistency is essential. If I pay money from my bank account to your bank account, then both accounts have to balance at the end of the transaction; you can’t afford any loss of data or inconsistency. In the Big Data world things may be more relaxed. Does it matter, and what is the cost, if you lose one geo-location record out of hundreds or one web click out of thousands? On the other hand, the concept of “pedigree and lineage” is very important, say in the intelligence community, where large volumes of human intelligence (voice, e-mails, text messages) are analyzed, handled and assessed by many different human analysts.
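For readers who want to see what the bank-account case looks like in practice, the sketch below uses Python’s built-in `sqlite3` module. The `accounts` table, the names and the `transfer` helper are assumptions made up for this example, not anything from a real banking system. The point is only that both updates succeed together or neither is applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # An overdraft violates the CHECK constraint and undoes the credit too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)       # enough funds, so this commits
after_ok = balances(conn)
failed = transfer(conn, "alice", "bob", 500)  # overdraft, so this rolls back
after_failed = balances(conn)
print(ok, after_ok, failed, after_failed)
```

A store that settles for eventual consistency gives up this all-or-nothing guarantee, which is often an acceptable trade for a stray web click but not for a payment.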

So when you are evaluating your Big Data investment, bring your thinking forward a decade and add two more V’s to the paradigm: Volume, Velocity, Variety, Value and Veracity. Adding these last two V’s helps you understand the why and how of your Big Data investment, as well as how to build an architecture that can scale and continue to provide new opportunities over time. In the meantime, we’ll try to persuade Gartner to update their assumptions.