ThingSpan Performance Blog Series – Part I

Graph Analytics with Billions of Daily Financial Transaction Events

Introduction

In this first blog of the series we explain how ThingSpan delivers solutions that meet the demanding needs of today’s complex analytics systems by combining big and fast data sources. In particular, we focus on ThingSpan’s ability to scale up as data volumes increase and to scale out to meet response time requirements.

The requirement for this use case was to ingest one billion financial transaction events in under 24 hours (about 12,000 transactions a second) while performing complex query and graph operations simultaneously. ThingSpan met this requirement, sustaining the ingest rate while running the complex queries at the same time.

It is important to note that the graph in this case is added to incrementally as new events are ingested. In this way the graph is always up to date with the latest data (nodes) and connections (edges), and is always available for query in a transactionally consistent state. We call this a living graph.

We used Amazon EC2 nodes for the tests and found that the “m4.2xlarge” instance type gave the best resources for scalability, performance, and throughput. We surpassed the performance requirements using a configuration of 16 “m4.2xlarge” nodes. For this test we used the native POSIX file system.

Full details of the proof of concept are available in the ThingSpan in Financial Services White Paper.

Test Description

Data

In our example use case, approximately one billion financial transaction events occur each day. Each event produces a subgraph that creates or merges 8-9 vertices and creates...
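The ingest figures above can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation, not part of the original test: it assumes the load is spread evenly over the 16 nodes, which the excerpt does not state, and all variable names are illustrative.

```python
# Back-of-the-envelope check of the ingest requirement described above.
EVENTS_PER_DAY = 1_000_000_000      # one billion transaction events
SECONDS_PER_DAY = 24 * 60 * 60      # 86,400 seconds
NODES = 16                          # cluster size reported for the test

# Sustained rate needed to finish in exactly 24 hours.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY

# ASSUMPTION: events are distributed evenly across the nodes.
events_per_second_per_node = events_per_second / NODES

# Each event creates or merges 8-9 vertices, per the test description.
vertex_ops_low = events_per_second * 8
vertex_ops_high = events_per_second * 9

print(f"cluster-wide: {events_per_second:,.0f} events/s")
print(f"per node:     {events_per_second_per_node:,.0f} events/s")
print(f"vertex ops:   {vertex_ops_low:,.0f} to {vertex_ops_high:,.0f} per second")
```

The exact break-even rate is a little under the "about 12,000" quoted above, so hitting 12,000 events per second leaves a small margin for finishing in under 24 hours.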
In the beginning

In the beginning there was data. Then Codd (and Date) created relational database systems, and then there was structured query language (SQL). SQL was good for queries by values of data, and queries where you knew what you were looking for. You could answer the known questions. Data was neatly organized into rows (records) and columns (fields) of tables. You could even query across tables using “joins” if you knew what to join.
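As a small illustration of a query "across tables", the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their contents are hypothetical, invented here only to show a join where the join key is known in advance.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# The join works because we already know that orders.customer_id
# refers to customers.id: a "known question" in the sense above.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(rows)
conn.close()
```

The query only works because the relationship between the two tables was designed in up front, which is the limitation the paragraph above points to.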

How Smart Are Your Connected Devices? Using Spark and ThingSpan to Provide IIoT Predictive Analytics for Smart Homes.

The Industrial Internet of Things covers a very wide range of devices and systems that interact with one another or dedicated services over the Internet. Although such systems have been deployed by specialist companies, such as building control system suppliers, there has been a recent upsurge in interest in developing unified protocols and standards for IIoT infrastructure. IIoT covers a wide range of disciplines, but they can be grouped as follows:

Infrastructure:
IIoT Cloud Platforms
Network Infrastructure & Sensors
Configuration Management
IIoT Cybersecurity
Techniques:
Big Data Analytics
Machine Learning
Application Sectors:
Manufacturing & Supply Chain
Extraction & Heavy Industry
Utilities and Smart Grid/City/Home
Transportation & Fleet.

The infrastructure and techniques have a lot in common with the consumer/retail IoT domain. In this first look at applying Spark and ThingSpan to IIoT applications, we will therefore examine a simple Smart Home application, because the techniques it employs apply to both domains.

Optimize your Infrastructure within the Internet of Things

2016 is the year that we’ve finally entered the era of the Internet of Things (IoT). Since the beginning of this year, I’ve seen and heard more and more customers and industry leaders discuss technologies that can store, process, and analyze large amounts of real-time streaming data from sensors and IoT devices.

Organizations within the Industrial IoT especially are seeking new IoT technologies to solve their technical challenges and add significant business value. Industries, such as manufacturing, logistics, telecommunications, and oil and gas, have been successfully building IoT applications for configuration management, predictive maintenance, supply chain optimization, and many other critical use cases.

Using Spark and ThingSpan for Intelligence Analytics

Human Intelligence (HUMINT) consists of a huge graph of connected snippets of information about criminals and terrorists, plus analyst reports and a wealth of background information. In this example, we will deal with data that is primarily about telephone metadata, which includes Call Detail Records and the people involved in the calls.

We will look for suspicious patterns of calls and, if we find any, we will try to determine whether any of the people involved has been sighted near a potential target, such as an important government facility.
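The two-step approach described above (find a call pattern, then cross-check the people involved against sightings) can be sketched in plain Python. This is not the Spark and ThingSpan implementation from the post. The records, the sightings, and the definition of a "suspicious" pattern (a relay, where A calls B and B calls someone else within ten minutes) are all assumptions made up for illustration.

```python
from collections import defaultdict

# HYPOTHETICAL call detail records: (caller, callee, start_minute).
calls = [
    ("alice", "bob", 0),
    ("bob", "carol", 3),
    ("dave", "erin", 50),
    ("erin", "frank", 200),
]

# HYPOTHETICAL sightings: person -> set of locations where they were seen.
sightings = {"carol": {"gov_facility"}, "dave": {"airport"}}

WINDOW_MINUTES = 10  # ASSUMPTION: relay calls happen within ten minutes


def relay_chains(calls, window):
    """Yield (a, b, c) where a calls b and b calls c within `window` minutes."""
    outgoing = defaultdict(list)
    for caller, callee, start in calls:
        outgoing[caller].append((callee, start))
    for a, b, t1 in calls:
        for c, t2 in outgoing.get(b, []):
            if c != a and 0 <= t2 - t1 <= window:
                yield (a, b, c)


def people_near(chains, sightings, target):
    """Return the people in any chain who were sighted at `target`, sorted."""
    involved = {person for chain in chains for person in chain}
    return sorted(p for p in involved if target in sightings.get(p, set()))


chains = list(relay_chains(calls, WINDOW_MINUTES))
flagged = people_near(chains, sightings, "gov_facility")
print(chains)
print(flagged)
```

In a graph database the same idea would be expressed as a path query over call edges followed by a hop to sighting nodes. The nested loops here only stand in for that traversal on a toy dataset.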

Oil and Gas meets Lambda: A Guest Blog by CGG GeoSoftware

It’s no secret that the Oil and Gas industry is cautious and calculated when adopting new technology. There are more than a few reasons for this, but the most fundamental in my opinion are the absolute requirement for safe operation combined with the sheer amount of inertia, in terms of investment, in existing technologies. It’s the analog of turning an oil tanker – it’s going to take a while.

That said, when there is uptake of a particular set of technologies, consumption and demand come at an astounding pace and scale. Without question, Oil and Gas is one of the largest industries globally, if not the largest. According to Wikipedia, six of the ten companies with the highest revenue in the world are Oil and Gas companies. Profit margin, on the other hand, especially in the current low-price oil environment, is another matter. Pressure from that aspect of the business is actively driving an increased appetite for new technologies that reduce costs.