Advanced Placement Configurations

Introduction

InfiniteGraph 3.0 lets you configure the placement of your data in a unique way. In particular, you can configure the placement of your distributed data by type and by related type. Consider the use case where an insurance company wants to place all instances of hospital records in a main storage location and then store all doctor and patient records with their corresponding hospital (a minimal sketch of this routing appears at the end of this excerpt). Or consider the use case where an international investment company wants to store all trade data for the NYSE (New York Stock Exchange) in a remote data center in New York, and trade data for the TSE (Tokyo Stock Exchange) in a remote data center in Tokyo, Japan. With a distributed database, data localization is the process of storing data where it makes the most sense, which is usually local to the applications that access it most frequently. A major reason to localize your data is that it reduces the "distance" you need to travel to retrieve it. Other reasons for creating and implementing a custom placement model in InfiniteGraph include:

- Improving query performance (we place it, we find it)
- Improving indexing performance
- Potentially improving read/write performance due to reduced lock contention
- Maintaining and organizing data in a logical manner

Survey of the Problem

Managed data localization has become important with the onset of distributed databases because data is of no value unless it can be accessed. Terms like sharding, partitioning, and custom placement have become part of our vocabulary. Without the ability to manage placement strategies at an administrative level,...
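As an aside, here is a minimal, database-agnostic sketch of the type-based and related-type routing described above. It is illustrative only: the type names and location identifiers are assumptions borrowed from the insurance example, and it does not use the actual InfiniteGraph placement API.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: routes records to storage locations by type,
// and co-locates "related" types (Doctor, Patient) with their Hospital.
// Type and location names are hypothetical, taken from the example above.
public class PlacementRouter {
    // Type-based rule: every Hospital record goes to the main storage location.
    private final Map<String, String> typeRules = new HashMap<>();
    // Related-type rule: Doctor and Patient records follow their Hospital's location.
    private final Map<String, String> hospitalLocations = new HashMap<>();

    public PlacementRouter() {
        typeRules.put("Hospital", "main-storage");
    }

    // Record where a hospital was placed so related records can follow it.
    public void recordHospitalPlacement(String hospitalId, String location) {
        hospitalLocations.put(hospitalId, location);
    }

    // Decide where a record of a given type should live.
    public String placementFor(String type, String relatedHospitalId) {
        if (typeRules.containsKey(type)) {
            return typeRules.get(type);
        }
        // Doctor and Patient records are co-located with their hospital.
        return hospitalLocations.getOrDefault(relatedHospitalId, "default-storage");
    }

    public static void main(String[] args) {
        PlacementRouter router = new PlacementRouter();
        String hospitalLoc = router.placementFor("Hospital", null);
        router.recordHospitalPlacement("hospital-42", hospitalLoc);
        System.out.println(router.placementFor("Doctor", "hospital-42"));  // main-storage
        System.out.println(router.placementFor("Patient", "hospital-42")); // main-storage
    }
}

The same shape of rule (type maps to location; related type follows its parent) is what a placement configuration would express declaratively rather than in application code.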

Gartner’s Missing V’s: Value and Veracity

Much has been written about the three V's of Big Data. As a matter of fact, Gartner claimed these V's (volume, velocity, and variety) over 11 years ago, according to their Deja VVVu blog post earlier this year. At Objectivity, we think Gartner was spot on with their revelation at the time... but things change. Since we have been in the Big Data business for more than 22 years, we know that there are other businesses and players that have been conquering Big Data and have now evolved that landscape beyond simple management to real-time analysis of complex data as well. More than a decade later, Gartner's original set of V's is missing what companies are realizing today: Value and Veracity. In our DBTA webinar this January, Leon Guzenda, founder of Objectivity, discusses the 5 V's of Big Data in his presentation, and recently IBM joined the discussion as well.

Let's start with Value. Yes, there is a lot of big data out there, e.g. the many types of logs (Splunk) from M2M systems, location data, photo/video data, etc. At Objectivity we believe that there are relationships hidden within your data, either explicitly or implicitly, and in those relationships lies the true Value of your data. Examples include telephone call detail records (CDRs, from/to subscriber #), network logs (TCP/IP logs, source and destination IP addresses), and web logs (clickstream data). Extracting these columns of data can build a very nice graph (a sketch follows this excerpt). The question then is how to utilize this information to get commercial value out of it. The point...
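To make the "columns to graph" point concrete, here is a minimal sketch in plain Java (not the InfiniteGraph API) that turns CDR-style from/to subscriber columns into a directed call graph. The field names and sample numbers are assumptions for illustration.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal sketch: build a call graph from CDR-style rows (fromSubscriber, toSubscriber).
// Plain Java adjacency map, not the InfiniteGraph API; column names are assumptions.
public class CallGraph {
    private final Map<String, Set<String>> calls = new HashMap<>();

    // Each CDR row contributes one directed edge: caller -> callee.
    public void addCall(String fromSubscriber, String toSubscriber) {
        calls.computeIfAbsent(fromSubscriber, k -> new HashSet<>()).add(toSubscriber);
    }

    public Set<String> callees(String subscriber) {
        return calls.getOrDefault(subscriber, Set.of());
    }

    public static void main(String[] args) {
        CallGraph g = new CallGraph();
        // Two example rows: (from, to) subscriber numbers.
        g.addCall("555-0100", "555-0199");
        g.addCall("555-0100", "555-0123");
        System.out.println(g.callees("555-0100")); // relationships extracted from the columns
    }
}

The same two-column pattern applies to the other examples above: TCP/IP logs (source/destination IP addresses) and clickstream data (referrer/page).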

Objectivity Goes Open-Use!

This has been a busy few weeks at Objectivity, with the recent announcement of version 3.1 of InfiniteGraph. As usual, we focused on performance enhancements, making 3.1 the fastest distributed commercial graph database. But we also announced our first open-use (free, commercially supported open source tools) offering, tIGOutput. tIGOutput is a new output connector that enables you to easily and quickly import data into InfiniteGraph from sources such as Cassandra, HBase, or MySQL using Talend data integration products. Talend is a provider of open source software whose products provide an extensible set of tools to access, transform, and integrate data. With InfiniteGraph's open-use connector, you can now import data from any source supported by Talend. To learn more about tIGOutput and how it can help you find paths within your big data, please visit our...
Meaningful Visualizations of Connected Data

Introduction

I recently watched a TED talk by David McCandless on The Beauty of Data Visualization. It was all about finding meaning in data sets by visualizing them in creative ways. This was done mostly by aggregating scraped data from different sources on the internet and displaying it in different, interesting, or useful formats. As a visual thinker, this speaks to me. I see the value in showing meaning by appealing to the visual brain, because it is easier for me to comprehend something that way. Imagine the federal budget. So much money is spent each year that the dollar amount is virtually incomprehensible. Likewise, it is difficult to know how to ask the right questions or to avoid jumping to the wrong conclusions, but with a visual aid like Jess Bachman's "Death & Taxes" poster, the concepts become much easier to digest. Like the federal budget, most data sources are static and boring. This makes it supremely important to use the right visualization toolkit to show value and meaning to the consumer. Data that is connected or graph-like in nature requires some kind of graph visualization tool. Since InfiniteGraph is a graph database and we offer a simple but powerful visualization tool, IG Visualizer, it made me think of different use cases with connected data and how we could show meaning using visualization.

Survey of the problem

Of course, there are many visualization tools that could be used to do some fantastic (and very cool!) visualizations. One thing that these visualization tools can do very well is give context to...

Big Data And The Trough of Disillusionment – Graph Databases to the Rescue

A few weeks ago, the folks here at Objectivity had a good laugh reading Arik Hesseldahl's accurate portrayal of the Big Data craze in his article "Has Big Data Reached Its Moment of Disillusionment?" In the article, Hesseldahl turns to Gartner's description of the Hype Cycle and the painful unfulfilled promises of Big Data technologies like Hadoop. The Hype Cycle goes like this: a new technology that promises to fundamentally "change everything" gets talked up incessantly in the press, at industry events, and often in research reports. At some point the chatter peaks and expectations reach a fever pitch. Soon, maybe a year or two after it all started to build, after some money has been spent and everything that was supposed to have changed for the better actually hasn't, the narrative focus turns negative. What seemed so brilliant and earthshaking 18 months ago seems, in retrospect, to have been an ill-advised waste of time, money, and attention.

Funny or not, we are sitting in the middle of the Hype Cycle as a market space, where companies are learning to manage Big Data but have no idea what to do with it. Hesseldahl points out that Gartner analyst Svetlana Sicular argues in a blog post that companies are struggling with a basic problem: what questions do you attempt to answer with your data in the first place? "Several days ago, a financial industry client told me that framing a right question to express a game-changing idea is extremely challenging," Sicular wrote. "First, selecting a question from multiple candidates; second, breaking it down to many...