2016 is the year that we’ve finally entered the era of the Internet of Things (IoT). Since the beginning of this year, I’ve seen and heard more and more customers and industry leaders discuss technologies that can store, process, and analyze large amounts of real-time streaming data from sensors and IoT devices.
Organizations in the Industrial IoT space especially are seeking new IoT technologies to solve their technical challenges and add significant business value. Industries such as manufacturing, logistics, telecommunications, and oil and gas have been successfully building IoT applications for configuration management, predictive maintenance, supply chain optimization, and many other critical use cases.
Last week’s Strata + Hadoop World conference in San Jose, Calif., ended on a perfect note! Over the years the event has evolved from merely discussing the power of Big Data analytics to actually implementing emerging technologies to discover relationships within that data.
After gathering my notes from the conversations that I had with the attendees who visited our booth, I can sum up my Strata experience with these two takeaways:
Graph databases are the key to extracting more value from Big Data
A lack of scalability is the primary limitation of other graph technologies
It’s no secret that the Oil and Gas industry is cautious and calculated when adopting new technology. There are more than a few reasons for this, but the most fundamental, in my opinion, is the absolute requirement for safe operation combined with the sheer amount of inertia, in terms of investment, in existing technologies. It’s the analog of turning an oil tanker – it’s going to take a while.
That said, when there is uptake of a particular set of technologies, the consumption and demand come at an astounding pace and scale. Without question, Oil and Gas is one of the largest industries in the world, if not the largest. According to Wikipedia, six of the ten companies with the highest revenue in the world are Oil and Gas companies. Profit margins, on the other hand, are another matter, especially in the current low-price oil environment. The pressure from that side of the business is actively driving an increased appetite for new technologies that reduce costs.
At the start of the new year, in his blog post “Peacocking,” my colleague Nick Quinn, principal engineer here at Objectivity, examined signaling in recommendation applications as a use case for graph databases. He used the peacock’s plumage as an analogy for how we use online dating sites to express mutual romantic interest.
I felt that since it’s the week before Valentine’s Day, it would be timely to revisit the topic of online dating as a treasure trove for big data, and specifically, relationship and graph analytics. This time, however, I’m approaching it from a woman’s perspective—because let’s face it, no matter how flashy male peacocks look, it’s their female counterparts who possess the most decision-making power in selecting a mate.
It’s no surprise that the online dating ecosystem generates massive volumes of data. According to datascience@berkeley, five of the largest sites (eHarmony, Match, Zoosk, Chemistry, and OkCupid) had between 15 and 30 million members each. Online dating apps are even more impressive, with Tinder leading the pack at an estimated 50 million users making 1 billion swipes and 12 million matches per day!
In the majority of today’s commercial software applications that require data persistence, a significant portion of time is spent designing and integrating the database with the application. This task typically involves:
Designing a large, normalized database schema for a Relational Database Management System (RDBMS), typically captured in an Entity Relationship Diagram (ERD).
Implementing the database with associated Views, Stored Procedures, Triggers, Constraints, etc.
Implementing a mapping layer between the database tables and the application class model (either manual code or an ORM such as Hibernate or Entity Framework).
Performing iterations over the previous items as changes are made to the schema.
Maintaining the database by monitoring and modifying disk space, index usage, transaction log size, etc.
These tasks are complex and demand a significant investment of development resources; they typically call for senior-level application and database developers.
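To make the mapping-layer step above concrete, here is a minimal sketch of hand-written object-relational mapping code, using Python’s built-in sqlite3 module. The Customer table and class are hypothetical examples invented for illustration, not part of any particular application; an ORM such as Hibernate or Entity Framework generates this kind of plumbing for you, but at the cost of configuring and maintaining the ORM itself.

```python
import sqlite3
from dataclasses import dataclass

# Application class model: a plain object the rest of the code works with.
@dataclass
class Customer:
    id: int
    name: str
    email: str

def create_schema(conn):
    # In a full RDBMS this DDL would come out of the ERD-driven design phase.
    conn.execute("""
        CREATE TABLE customer (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,
            email TEXT NOT NULL
        )""")

def save_customer(conn, customer):
    # Mapping layer, write side: application object -> table row.
    conn.execute(
        "INSERT INTO customer (id, name, email) VALUES (?, ?, ?)",
        (customer.id, customer.name, customer.email),
    )

def load_customer(conn, customer_id):
    # Mapping layer, read side: table row -> application object.
    row = conn.execute(
        "SELECT id, name, email FROM customer WHERE id = ?",
        (customer_id,),
    ).fetchone()
    return Customer(*row) if row else None

conn = sqlite3.connect(":memory:")
create_schema(conn)
save_customer(conn, Customer(1, "Ada Lovelace", "ada@example.com"))
print(load_customer(conn, 1))
```

Every table needs a pair of functions like these, and every schema change ripples through them — which is exactly the iteration-and-maintenance burden the list above describes.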