Everything Goes Better With Bacon

Everything Goes Better With Bacon

Contributed by Nick Quinn, Senior Software Developer, InfiniteGraph

Email: nquinn@objectivity.com

 

Whenever someone considers a large movie database like Internet Movie Database, or IMDB, inevitably the classic six degrees of Kevin Bacon problem comes up.  It is a famous problem posed like this, “…any individual involved in the Hollywood, California film industry can be linked through his or her film roles to actor Kevin Bacon within six steps” [http://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon].  This problem even helped Kevin Bacon begin his own social charity organization called SixDegrees.org linking people with charities that they might be interested in.

Below is an example of how InfiniteGraph can be used to store and navigate through large sets of connected data like the IMDB.  In the example, I will show how to both find the links between various actors and Kevin Bacon, but also how to output the navigation results in various formats including JSON and GraphML.  Note: Custom navigator plugins and custom formatter plugins (including the default JSON/GraphML formatters) can be created and used in any InfiniteGraph (2.1) graph database instance.  See the InfiniteGraph developer wiki for more details and examples of how to write and use custom plugins (http://wiki.infinitegraph.com).

Here is a visualization of the actors connected to Kevin Bacon within just two degrees of separation (up to 1500 connections).

 

 

As you can see it can be rather tricky to find all of the connections, but if you go up to even 4 degrees, you may be overwhelmed with the results which is why the navigational features that InfiniteGraph offers are so powerful!

To create a navigator plugin that finds all paths between two actors. I first wrote a path and result qualifier which qualifies the paths based on certain given criteria. By using the navigator plugin, I can reuse this code and dynamically set field values (using the Parameter annotation) at runtime. Below is an example of the path qualifier, set using the PathQualifier annotation, which filters out all paths that are longer than the degree value.

package com.ig.samples.bacon;

import com.infinitegraph.navigation.Path;
import com.infinitegraph.navigation.Qualifier;
import com.infinitegraph.plugins.Parameter;
import com.infinitegraph.plugins.PathQualifier;

@PathQualifier
public class BaconPQ implements Qualifier
{
@Parameter(required=true)
public int degree = 3;

public boolean qualify(Path path)
{
if(path.size() <= degree)
return true;
else
return false;
    }
}

 

Below is an example of the result qualifier which uses the ResultQualifier annotation and filters out all paths that don’t have target vertex id as a final destination of the navigation. Notice that unless the targetId field value is set, the result qualifier will never qualify any results.

package com.ig.samples.bacon;

import com.infinitegraph.Vertex;
import com.infinitegraph.navigation.Path;
import com.infinitegraph.navigation.Qualifier;
import com.infinitegraph.plugins.ElementId;
import com.infinitegraph.plugins.ResultQualifier;

@ResultQualifier
public class BaconRQ implements Qualifier
{
@ElementId(required=true)
public long targetId;

public boolean qualify(Path path)
{
Vertex currentV = path.getFinalHop().getVertex();

if(currentV.getId() == targetId)
                return true;
else
                return false;    
    }
} 

 
If I jar up these two classes and import them into the InfiniteGraph application, I can use them to navigate through my data.  I call my navigator plugin jar, “BaconNavigator.jar”.

 
Here is how I begin to write the application to access and use my navigator plugin.  I connect to my IG instance of the IMDB.  I execute my navigator given a few different targets: Tom Cruise, Tom Hanks, and Emma Stone.

public static void main(String[] args) throws Exception
{
BaconSample sample = new BaconSample();
GraphDatabase database = null;
Map&#60String,String&#62 overrideMap = null;
try
{
overrideMap = new HashMap&#60String,String&#62();
overrideMap.put("IG.BootFilePath", "../");
//Get IMDB Connection
database = GraphFactory.open("Samples.IMDBGraphDB", "../Samples.properties", overrideMap);

//Output the navigations to json/graphml formats
sample.executeBaconNavigator(database, "Cruise, Tom", "Hanks, Tom", "Stone, Emma (III)");
}
finally
{
if(overrideMap != null)
overrideMap.clear();

logger.info("Closing connection.");
if(database != null)
database.close();
}
} 

 
Continuing the example, here is the code which pulls the vertex references for Kevin Bacon and all of my target actors from the manual index called “Person_names_index”. Also, it imports the navigator plugin, retrieves an instance of the navigator bundle from the plugin manager and sets the field value “targetId” with the value for a single target vertex id.  Finally, I can output the results of each navigation from Kevin Bacon to a given target vertex in JSON and GraphML format.

package com.ig.samples;

import java.io.FileOutputStream;[…]

public class BaconSample
{
public static final Logger logger = LoggerFactory.getLogger(BaconSample.class);

public void executeBaconNavigator(GraphDatabase database, String... targets) throws Exception
{
Transaction tx = null;
try
{
tx = database.beginTransaction();
// Get Kevin Bacon Vertex
GenericIndex&#60?&#62 actorsIndex = IndexManager.getGenericIndexByName("Person_names_index");
Vertex baconVertex = (Vertex) actorsIndex.getSingleResult("name", "Bacon, Kevin (I)");
Assert.assertNotNull("Unable to find Kevin Bacon.", baconVertex);

PluginManagementFactory factory = database.getPluginManagementFactory();
//Import Navigator Plugin
PluginManager&#60NavigatorBundle&#62 navManager = factory.getManager(NavigatorBundle.class);
boolean success = navManager.importPlugin("../BaconNavigator.jar", true);
if (!success)
{
logger.error("Unable to import navigator plugin");
return;
}
NavigatorBundle bundle = navManager.getBundle("BaconNavigator");
for(String target:targets)
{
//Get Target Vertex
Vertex targetVertex = (Vertex) actorsIndex.getSingleResult("name", target);
Assert.assertNotNull("Unable to find target vertex for "+target+".",targetVertex);
//Set target id as field value
bundle.setFieldValue(ResultQualifier.class, "targetId", targetVertex.getId());
//Output to JSON/GraphML file
String lastName = target.substring(0, target.indexOf(","));
if (bundle.isValid())
{
logger.info("Outputting navigation results to file.");
output(factory, baconVertex, bundle, "Bacon2" + lastName + ".txt", FormatterBundle.DEFAULT_JSON);
output(factory, baconVertex, bundle, "Bacon2" + lastName + ".xml", FormatterBundle.DEFAULT_GRAPHML);
}
}
}
finally
{
if (tx != null && !tx.isComplete())
{
tx.complete();
}
}
}
\\…
}

 
Finally, I am able to write the method which takes my navigator bundle, executes the navigator plugin on a start vertex (Kevin Bacon) and outputs the results of the navigation to file.  Because of the default formatter plugins packaged with the IG core, I can output the navigation results via standard JSON or GraphML.  Note: An output stream instance must be set in the format handler for the formatter to write. The output stream does not have to be to a file, but if not defined, an exception may be thrown.   The FormatterBridge class allows you to wrap a FormatHandler instance as a NavigationResultHandler.

public void output(PluginManagementFactory factory, Vertex startVertex, NavigatorBundle bundle, String fileName, String formatterPluginName) throws Exception
{
FileOutputStream fstream = null;
try
{
fstream = new FileOutputStream(fileName);
// Get Formatter Bundle
PluginManager&#60FormatterBundle&#62 formatManager = factory.getManager(FormatterBundle.class);
FormatterBundle formatter = formatManager.getBundle(formatterPluginName);
//Set output to file
formatter.getFormatHandler().setOutputStream(fstream);

//Run Navigator to output results to formatted file
Navigator navigator = startVertex.navigate(bundle.getGuide(), bundle.getPathQualifier(), bundle.getResultQualifier(), new FormatterBridge(formatter.getFormatHandler()));
navigator.start();
}
finally
{
if (fstream != null)
{
fstream.close();
}
}
}

Here is a snippet of the JSON and GraphML results from the Kevin Bacon to Tom Hanks navigation.

{"_type":"vertex","_id":"1970337852817561","name":{"value":"Bacon, Kevin (I)","type":"string"}}
{"_type":"vertex","_id":"1970337754251476","name":{"value":"Going to Pieces: The Rise and Fall of the Slasher Film","type":"string"},"year":{"value":2006,"type":"int"}}
{"_type":"edge","_weight":"0","_style":"undirected","_id":"1688862783242314","_outV":"1970337852817561","role":{"value":"Actor","type":"string"},"_inV":"1970337754251476"}
{"_type":"vertex","_id":"1970355046514947","name":{"value":"Hanks, Tom","type":"string"}}
{"_type":"edge","_weight":"0","_style":"undirected","_id":"1688875681841358","_outV":"1970355046514947","role":{"value":"Actor","type":"string"},"_inV":"1970337754251476"}
{"_type":"vertex","_id":"1970337728233639","name":{"value":"Boffo! Tinseltown's Bombs and Blockbusters","type":"string"},"year":{"value":2006,"type":"int"}}
Bacon, Kevin (I)Going to Pieces: The Rise and Fall of the Slasher Film2006ActorHanks, TomActorBoffo! Tinseltown's Bombs and Blockbusters2006ActorActorBeyond All Boundaries2009ActorProducerActorApollo 131995ActorActor

 

Here are the rest of the navigation results in both JSON and GraphML formats: baconbaked.zip

Enjoy your bacon!

 

Information courtesy of

The Internet Movie Database

(http://www.imdb.com).

Used with permission.

How To Use A Graph Database to Integrate And Analyze Relational Exports

Contributed by Todd Stavish, Chief Systems Engineer for Objectivity, Inc.

Graph databases can be used to analyze data from disparate datasources. In this use-case, three relational databases have been exported to CSV. Each relational export is ingested into its own sharded sub-graph to increase performance and avoid lock contention when merging the datasets.

Visit Todd Stavish’s Blog for the  full article

 

Tom Sawyer Perspectives, Version 4.0, Introduces an Advanced, Data-driven, Graphics Architecture and Integration With InfiniteGraph

The Tom Sawyer release of Perspectives 4.0, integrates with Objectivity’s graph database, InfiniteGraph, and offers advanced, data-driven, graphics architecture with greater flexibility at run-time to enhance the user experience for both building and using visualization and analysis applications. To read the full announcement visit Tom Sawyer Software.

InfiniteGraph Tutorial: Prescription Analytics (National Drug Code, NDC)

The Prescription Analytics tutorial  shows how InfiniteGraph is able to ingest drug data from the FDA’s National Drug Code (NDC) directory, creates fictitious patient information, and runs sample programs to determine whether a patient with particular allergies can safely take a particular drug. For more information on additional sample applications visit the InfiniteGraph Wiki. To download the free version of InfiniteGraph and try this out on your own visit our download page!

InfiniteGraph Prescription Analytics Tutorial [Click to download]

 

http://youtu.be/ZVjLw1TeiQg

Objectivity Expands European Management Team With Former Sones Founder Mauricio Matthesius

Objectivity Expands European Management Team With Former Sones Founder Mauricio Matthesius

SUNNYVALE, CA, Apr 02, 2012 — Objectivity announced today the addition of former Sones founder Mauricio Matthesius, to their European management team. Matthesius joins Objectivity as Director of Sales and Partner Management for Germany, Austria, Switzerland and Eastern Europe as the company aggressively expands its EMEA operations.

“Mauricio has been on the ground in Europe evangelizing the promise of the graph database and its ability to discover opportunities within data,” said Matthew Wallis, Managing Director Europe and Middle East, Objectivity. “He is an excellent fit for our European team and brings great value to our operations.”

Prior to Objectivity, Matthesius was Founder and Managing Director of Sones, an open source graph database initiative based in Germany. Matthesius previously served as an SAP Data Warehouse consultant at both Integrata and Vattenfall Europe, as well as a position at Mitsubishi Daimler Chrysler GmbH. He was also an SAP Business Intelligence Trainer for customers like T-Systems, BASF and Deutsche Post. He is completing a Ph.D. in Information Systems with a focus on automatic information valuation at Technical University Ilmenau.

“As both the leading object and graph database provider, we are experiencing great demand across Europe. And one of Objectivity’s key differentiators is our deep experience and ability to work closely with our customers as they discover their data,” said Jay Jarrell, president and CEO, Objectivity. “Mauricio is a perfect match for our team and our steadfast focus on providing customers with a wealth of understanding about the opportunities within Big Data. We are very excited to have him join the team.”

About Objectivity, Inc.

Since 1988 Objectivity has been the Enterprise NoSQL leader, helping customers harness the power of Big Data for decades. With our products ObjectivityDB, InfiniteGraph, we help companies manage and discover the relationships between complex data in order to make better real-time decisions. Our customers include: AWD Financial, CUNA Mutual, Draeger Medical, Ericsson, LMS Medical, IPL. Discover your data — visit www.objectivity.com

# # #

Note to editors: Objectivity, Inc., Objectivity/DB and InfiniteGraph are trademarks of Objectivity, Inc. All other company, organization, product or alliance names mentioned herein remain the property of their respective owners.

For more information contact:

Press Contact: Objectivity Public Relations
Email: public-relations@objectivity.com