Oil and Gas meets Lambda: A Guest Blog by CGG GeoSoftware

It’s no secret that the Oil and Gas industry is cautious and calculated when adopting new technology. There are more than a few reasons for this, but the most fundamental in my opinion are the absolute requirement for safe operation combined with the sheer amount of inertia, in terms of investment, in existing technologies. It’s the analog of turning an oil tanker – it’s going to take a while.

That said, when a particular set of technologies is taken up, consumption and demand come at an astounding pace and scale. Without question, Oil and Gas is one of the largest industries globally, if not the largest. According to Wikipedia, six of the ten companies with the highest revenue in the world are Oil and Gas companies. Profit percentage, on the other hand, especially in the current low-price oil environment, is another matter. Pressure from that side of the business is actively driving an increased appetite for new technologies that reduce costs.


Using Apache Spark and ThingSpan To Relieve Network Congestion

Telecommunications voice and data networks are natural examples of graph structures: equipment of many types, often from hundreds of manufacturers, must work in harmony to reliably and efficiently transport information for millions of users at a time. Objectivity products have been used at the heart of fiber optic switches, cellular wireless and low earth satellite systems, long-term alarm correlation systems and in network planning applications.

Dealing with problems (alarms) or overloads has traditionally involved taking individual pieces of equipment offline and re-routing the traffic via other nodes. In this example, we’ll look at an apparently simple situation and show how the combination of Spark SQL and ThingSpan’s advanced graph navigation can be used to quickly diagnose and solve an equipment overload situation. We start by loading Location, Equipment and Link (plus loading percentages) objects and connections into ThingSpan, producing the following graph in Figure 1.
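To make the idea concrete, here is a minimal sketch in plain Python rather than Spark SQL or ThingSpan, neither of which is used here. The equipment names, load percentages, the 90% threshold and both helper functions are hypothetical; the sketch only illustrates flagging an overloaded link and then navigating the graph for a route that avoids it.

```python
from collections import deque

# Hypothetical links: (equipment A, equipment B, load percentage).
LINKS = [
    ("Switch_1", "Switch_2", 95),
    ("Switch_1", "Switch_3", 40),
    ("Switch_3", "Switch_2", 35),
    ("Switch_2", "Switch_4", 50),
]
OVERLOAD_THRESHOLD = 90  # assumed cut-off, not taken from the article


def overloaded(links, threshold):
    """Return the links whose load is at or above the threshold."""
    return [(a, b, load) for a, b, load in links if load >= threshold]


def alternate_route(links, source, target, threshold):
    """Breadth-first search for the fewest-hop path that uses only
    links below the threshold. Returns None if no such path exists."""
    adjacency = {}
    for a, b, load in links:
        if load < threshold:
            adjacency.setdefault(a, []).append(b)
            adjacency.setdefault(b, []).append(a)
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


hot_links = overloaded(LINKS, OVERLOAD_THRESHOLD)
detour = alternate_route(LINKS, "Switch_1", "Switch_2", OVERLOAD_THRESHOLD)
print(hot_links)  # [('Switch_1', 'Switch_2', 95)]
print(detour)     # ['Switch_1', 'Switch_3', 'Switch_2']
```

In a real deployment the filter step would more naturally be a SQL query over the link data and the detour search a graph navigation query; the two-step shape is the point of the sketch.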


Making Offers They Can’t Refuse: Using Spark and Objectivity’s ThingSpan to Increase Retail Product Sales

Retailers have deployed advanced business intelligence tools for decades in order to determine what to sell and to whom, when, where and at what price. Much of the transactional data was too voluminous for smaller retailers to keep for long, putting them at a disadvantage against the industry giants and more agile web-based retailers. The falling prices of commodity storage and processors are making it possible to keep data longer. This data can also be combined with external sources, such as information gathered from social networks, then analyzed by more powerful machine learning technologies and other tools.

In this blog, we will look at how any retailer—traditional or online—might identify slow-moving products and use their own sales transaction data in conjunction with social media information about bloggers who have mentioned or bought a product in order to identify and target potential buyers.
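As a rough illustration of that workflow, the sketch below uses plain Python with made-up data. The product names, the unit threshold, the follower lists and both helper functions are assumptions for illustration only, including the idea that a blogger's followers are the potential buyers; none of it is Spark or ThingSpan code.

```python
from collections import Counter

# Hypothetical sales transactions: (product, units sold).
SALES = [("kettle", 120), ("toaster", 4), ("blender", 75), ("toaster", 3)]

# Hypothetical social media data: who mentioned what, and who follows whom.
MENTIONS = [
    ("blogger_a", "toaster"),
    ("blogger_b", "kettle"),
    ("blogger_c", "toaster"),
]
FOLLOWERS = {
    "blogger_a": ["user_1", "user_2"],
    "blogger_b": ["user_9"],
    "blogger_c": ["user_2", "user_3"],
}


def slow_movers(sales, max_units):
    """Return products whose total units sold is at or below max_units."""
    totals = Counter()
    for product, units in sales:
        totals[product] += units
    return sorted(p for p, total in totals.items() if total <= max_units)


def potential_buyers(product, mentions, followers):
    """Return followers of every blogger who mentioned the product."""
    targets = set()
    for blogger, mentioned in mentions:
        if mentioned == product:
            targets.update(followers.get(blogger, []))
    return targets


slow = slow_movers(SALES, max_units=10)
targets = potential_buyers("toaster", MENTIONS, FOLLOWERS)
print(slow)             # ['toaster']
print(sorted(targets))  # ['user_1', 'user_2', 'user_3']
```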


Detecting Financial Fraud Using GraphX and ThingSpan

Many articles and blogs, including our own, have shown how graph databases can be used to look at financial transaction data to see if particular individuals or organizations are involved in money laundering or other kinds of fraud. In this blog I will expand on this issue and explain how institutions can use Objectivity’s ThingSpan and GraphX, running on Apache Spark, to detect financial fraud more quickly and efficiently.

In a typical scenario, investigators are trying to determine the money trail initiated by the perpetrator(s). This is a very simple navigational query using a graph database, along the lines of “Starting at the Person_X vertex, perform a transitive closure using Financial_Transaction edges and any kind of vertex.”
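For readers unfamiliar with the term, a transitive closure from a starting vertex is simply everything reachable by repeatedly following edges of the chosen type. The sketch below shows the idea in plain Python with a breadth-first search. The edge list, the vertex names other than Person_X, and the function itself are hypothetical; this is not ThingSpan or GraphX query syntax.

```python
from collections import deque

# Hypothetical edge list: (source vertex, edge type, target vertex).
EDGES = [
    ("Person_X", "Financial_Transaction", "Account_A"),
    ("Account_A", "Financial_Transaction", "Company_B"),
    ("Company_B", "Financial_Transaction", "Account_C"),
    ("Person_X", "Lives_At", "Address_1"),
    ("Person_Y", "Financial_Transaction", "Account_D"),
]


def transitive_closure(edges, start, edge_type):
    """Return every vertex reachable from start by following only
    edges of edge_type, regardless of the vertex types involved."""
    adjacency = {}
    for source, kind, target in edges:
        if kind == edge_type:
            adjacency.setdefault(source, []).append(target)
    reached = set()
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency.get(vertex, []):
            if neighbor != start and neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached


money_trail = transitive_closure(EDGES, "Person_X", "Financial_Transaction")
print(sorted(money_trail))  # ['Account_A', 'Account_C', 'Company_B']
```

Note that Address_1 is excluded because it is connected by a different edge type, and Account_D is excluded because it is not reachable from Person_X.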

That is great if we already know that Person_X is of interest, but what if all we have is a huge graph of recent and historic financial transactions garnered from multiple sources, such as banks, exchanges, real estate transfers, etc.? The problem evolves from being a simple query to being a Big Data analytics one. We are now interested in pattern-finding, not path-following.


Online Dating: Relationship Analytics in the Real World

At the start of the new year, my colleague Nick Quinn, our principal engineer here at Objectivity, examined signaling in applications that provide recommendations as a use case for graph databases in his blog, “Peacocking.” He used the peacock’s plumage as an analogy to how we use online dating sites to express mutual romantic interest.

I felt that since it’s the week before Valentine’s Day, it would be timely to revisit the topic of online dating as a treasure trove for big data, and specifically, relationship and graph analytics. This time, however, I’m approaching it from a woman’s perspective—because let’s face it, no matter how flashy male peacocks look, it’s their female counterparts who possess the most decision-making power in selecting a mate.

It’s no surprise that the online dating ecosystem is generating massive volumes of data. According to datascience@berkeley, five of the largest sites (eHarmony, Match, Zoosk, Chemistry, and OkCupid) had between 15 and 30 million members each. Online dating apps are even more impressive, with Tinder leading the pack at an estimated 50 million users making 1 billion swipes and 12 million matches per day!


Is Your Database Schema Too Complex?


In the majority of today’s commercial software applications that require data persistence, a significant portion of time is spent designing and integrating the database with the application. This task typically involves:

Designing a large, normalized database schema in a Relational Database Management System (RDBMS) using a tool such as an Entity Relationship Diagram (ERD).
Implementing the database with associated Views, Stored Procedures, Triggers, Constraints, etc.
Implementing a mapping layer between the database tables and the application class model (either manual code or an ORM such as Hibernate or Entity Framework).
Performing iterations over the previous items as changes are made to the schema.
Maintaining the database by monitoring and modifying disk space, index usage, transaction log size, etc.

These tasks are very complex and require a significant investment of development resources to perform; they typically require senior-level application and database developers.
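To give a feel for the mapping-layer item in the list above, here is a minimal hand-written sketch using Python's built-in sqlite3 module and an in-memory database. The Customer class, the customer table and the save and load helpers are hypothetical examples, not code from any particular application; an ORM would generate or hide this kind of code, but the translation between rows and objects still has to happen somewhere.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Application-side class; the table below is its relational twin."""
    customer_id: int
    name: str
    city: str


def create_schema(conn):
    # The schema has to be kept in step with the Customer class by hand.
    conn.execute(
        "CREATE TABLE customer ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )


def save(conn, customer):
    # Object -> row: every attribute is copied into a column explicitly.
    conn.execute(
        "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
        (customer.customer_id, customer.name, customer.city),
    )


def load(conn, customer_id):
    # Row -> object: column order must match the constructor arguments.
    row = conn.execute(
        "SELECT id, name, city FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return None if row is None else Customer(*row)


conn = sqlite3.connect(":memory:")
create_schema(conn)
save(conn, Customer(1, "Ada", "London"))
loaded = load(conn, 1)
print(loaded)  # Customer(customer_id=1, name='Ada', city='London')
```

Even for a single three-column class, adding or renaming a field means touching the class, the schema and both helper functions, which is the iteration cost the list describes.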
