Introduction

Imagine an employee network the way you might imagine the connections between highly trafficked websites. Of course, not all of us work for “highly trafficked” companies, but many employees feel drawn towards them. It is interesting to consider whether this attraction is justified. Likewise, many employees imagine the freedom that becoming an independent contractor might bring, but they may neglect to think of the higher taxes and shorter contracts that might come along with it. It would be interesting to see the flow of employees in and out of highly trafficked companies and to compare it with the flow in and out of companies that are not highly trafficked. It is also interesting to think about how a person might be pulled from company to company. I think that there are many values that tend to be perceived as strengths of a smaller company, such as collaboration in smaller teams, higher visibility for individual contributions, flexibility, and the chance to participate in different roles. Likewise, there are many values that trend among larger companies, such as better salaries, more overall market visibility, and work on cutting-edge technologies. For this reason, I believe that employees are pulled, as if by gravity, to companies of about the same size as their current one. On the other hand, if there is a trend, I would say that it is probably towards larger companies because of perceived job stability. But am I right? Does the data support this?

Survey of the Problem

How do we study complex and highly connected networks? Often, it is impossible to analyze the entire graph all at once to pull relevant statistics. Some companies, like Google, may use indexes to rank and categorize webpages the instant they are added. This requires a high investment of time. Others analyze smaller subsets of the data at a time and aggregate the results. This may lead to inaccurate results because outliers may lie in a part of the graph that is never touched. A better way might be to navigate through the graph, following only paths that are relevant (filtered) and touching only nodes that are interesting (qualified). InfiniteGraph’s new Distributed Navigation API, using GraphViews and Policies, allows you not only to configure and perform these types of navigation easily, but also to execute them in real time. InfiniteGraph also offers a visualization toolkit called the IG Visualizer that allows you to visualize the connected data. Here is a graph of the small company, Krustyco, and its employees, past and present, up to 100 connections. From this visualization, we can deduce that many, if not all, of the employees that leave Krustyco go to Midsized and Large companies.

Krustyco diagram

Describing the Model

Within my dataset, I have three types of Employees: Salaried, Contract (or Independent), and Temporary. I also have three types of Companies: Small, Midsized, and Large. I also have one type of relationship edge: WorkFor. The edge can represent employment that is past (it has an end date) or current (no end date). In my model, I didn’t include other types of relationships between employees, such as coworkers, members of the same network, or friends, but that would be interesting to include. Also, an employee always stays the same type of employee (i.e. a Salaried employee doesn’t become a Contract employee, and a Temporary employee doesn’t become a Salaried employee). This doesn’t really represent reality, but for simplicity’s sake, this is how we will keep it. Most of the data is synthetic (names of employees and companies), but some of it, such as company employee counts and growth percentages, is modeled after real company data. Finally, some of the data attempts to approximate real values (salary amounts, hourly rates, and contract lengths).
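To make the model concrete, here is a minimal plain-Java sketch of the three employee types, the three company sizes, and the single WorkFor edge. It is an illustration only: the real classes would be InfiniteGraph persistent vertex and edge types, and the enum names, field names, date strings, "BigCo", and the `isCurrent` helper are all assumptions rather than parts of the original schema. An empty end date marks current employment, which mirrors the `end==""` predicate used in the navigation code later in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins for the graph model. All names are illustrative assumptions.
final class EmployeeNetworkModel {
    enum EmployeeType { SALARIED, CONTRACT, TEMPORARY }
    enum CompanySize { SMALL, MIDSIZED, LARGE }

    static final class Company {
        final String name;
        final CompanySize size;
        final boolean closed;

        Company(String name, CompanySize size, boolean closed) {
            this.name = name;
            this.size = size;
            this.closed = closed;
        }
    }

    static final class Employee {
        final String name;
        final EmployeeType type; // never changes in this simplified model
        final List<WorkFor> jobs = new ArrayList<>();

        Employee(String name, EmployeeType type) {
            this.name = name;
            this.type = type;
        }

        // Records a WorkFor edge from this employee to the given company.
        WorkFor workFor(Company company, String start, String end) {
            WorkFor edge = new WorkFor(this, company, start, end);
            jobs.add(edge);
            return edge;
        }
    }

    // The only edge type: an employee works, or worked, for a company.
    static final class WorkFor {
        final Employee employee;
        final Company company;
        final String start;
        final String end; // empty string means the employment is current

        WorkFor(Employee employee, Company company, String start, String end) {
            this.employee = employee;
            this.company = company;
            this.start = start;
            this.end = end;
        }

        boolean isCurrent() {
            return end.isEmpty();
        }
    }

    public static void main(String[] args) {
        Company krustyco = new Company("Krustyco", CompanySize.SMALL, false);
        Company bigCo = new Company("BigCo", CompanySize.LARGE, false); // hypothetical company
        Employee pat = new Employee("Pat", EmployeeType.SALARIED);
        pat.workFor(krustyco, "2009-01-05", "2011-06-30"); // past employment
        pat.workFor(bigCo, "2011-07-11", "");              // current employment
        for (WorkFor job : pat.jobs) {
            System.out.println(job.company.name + " current=" + job.isCurrent());
        }
    }
}
```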

Designing the Solution

The answer to whether people tend to move from small companies to large ones might lie in the relative “distance” (a measure of the trend) from a small company to a large one. I would say that this cannot be measured by looking at an individual; instead, it has to be looked at from a higher level, by analyzing the potential for flow from a small company to a large company. In graph theory, you can think of this as the “connectedness” of a small company to a large company. In our model, we can measure this quickly and easily using InfiniteGraph’s Query and Navigation API.

In this example, we measure the flow from small companies to large companies, and vice versa, through their employees. This should allow us to consider how the rate of employees moving from small to large companies compares with the rate from large to small companies. We also study the connectedness of a small company to a large company by degrees of separation through current and past employment. This allows us to consider how employees at a small company might be influenced to think about larger ones. Finally, we analyze employee data from closed small companies to see what the impact of a closure on the employees might be.
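Degrees of separation between two companies can be illustrated with a breadth-first search over an undirected employment graph, where every WorkFor edge counts as one hop. This is a plain-Java sketch on a hypothetical in-memory adjacency map, not InfiniteGraph code; apart from Krustyco, the vertex names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class DegreesOfSeparation {
    // Returns the fewest hops from start to goal, or -1 if they are not connected.
    static int hops(Map<String, List<String>> graph, String start, String goal) {
        Map<String, Integer> distance = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(goal)) {
                return distance.get(current);
            }
            for (String next : graph.getOrDefault(current, Collections.<String>emptyList())) {
                if (!distance.containsKey(next)) { // first visit is the shortest route
                    distance.put(next, distance.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    // Adds an undirected WorkFor edge between an employee and a company.
    static void link(Map<String, List<String>> graph, String employee, String company) {
        graph.computeIfAbsent(employee, k -> new ArrayList<>()).add(company);
        graph.computeIfAbsent(company, k -> new ArrayList<>()).add(employee);
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        link(graph, "alice", "Krustyco");
        link(graph, "alice", "MidCo");
        link(graph, "bob", "MidCo");
        link(graph, "bob", "LargeCo");
        // Route: Krustyco - alice - MidCo - bob - LargeCo
        System.out.println("hops = " + hops(graph, "Krustyco", "LargeCo"));
    }
}
```

In this toy graph, two companies that share an employee are 2 hops apart, and each intermediate employer adds 2 more.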

As you can see from the code below, you can use InfiniteGraph’s Query API to get all small companies and use the Navigation API around small and large companies to get their workforce. You can also use simple predicates to filter the edge and neighbor objects being returned. In this example, the past employees of small and large companies are collected, and their connections to various companies are sorted by date. This allows us to find the average number of hops from small to large companies and from large to small companies.

	@Override
	public void run()
	{
		String graphName = "EmployeeNetwork";
		Transaction tx = null;
		GraphDatabase graph = null;
		PrintWriter writer = null;
		
		float ave_length_to_small = 0;
		float ave_length_to_large = 0;
		int path_count = 0;
		long employee_count = 0;
		
		try
		{
			graph = GraphFactory.open(graphName);
			tx = graph.beginTransaction();
			
			writer = new PrintWriter(new FileOutputStream(new File("./" + graphName + ".txt"), true));
	
			System.out.println("Analyzing movement from Small-->Large companies");
			// Starts at each small company and finds unique paths to a large company
			Query<SmallCompany> findSmallCompanies = graph.createQuery(SmallCompany.class.getName(), "!closed");
			Iterator<SmallCompany> smallIterator = findSmallCompanies.execute();
			
			while(smallIterator.hasNext())
			{
				SmallCompany small = smallIterator.next();
				writer.write("Analyzing Small Company: " + small.getName() + "...");
				// Looking at each employee who has left the company
				for(Hop employeeHop : small.getNeighborHops(new EdgePredicate(WorkFor.class, "end!=\"\"")))
				{
					employee_count++;
					Employee start = (Employee) employeeHop.getVertex();
					Date current_end_date = EmployeeData.sdf.parse(((WorkFor)employeeHop.getEdge()).getEnd());
					// sort hops to company by start date
					Map<Long, Company> sortedMap = sortToMap(current_end_date, start);
					// compute average length to large companies, if path(s) exists
					int path_length = 0;
					for(Company company : sortedMap.values())
					{
						path_length++;
						if(company instanceof LargeCompany)
						{
							if(ave_length_to_large == 0) 	ave_length_to_large = path_length;
							else						ave_length_to_large = (ave_length_to_large * path_count + path_length) / (float)(path_count + 1);
							path_count++;
							break;
						}
					}
					sortedMap.clear();
				}
				writer.write("Average length to large is " + ave_length_to_large + " for " + path_count + " number of employees out of " + employee_count + " total. \n");
				writer.flush();
			}
			tx.commit();
			
			tx = graph.beginTransaction();
			System.out.println("Analyzing movement from Large-->Small companies");
			// Starts at each large company and finds unique paths to a small company
			Query<LargeCompany> findLargeCompanies = graph.createQuery(LargeCompany.class.getName(), "!closed");
			Iterator<LargeCompany> largeIterator = findLargeCompanies.execute();
			path_count = 0;
			employee_count = 0;
			while(largeIterator.hasNext())
			{
				LargeCompany large = largeIterator.next();
				writer.write("Analyzing Large Company: " + large.getName() + "...");
				// Looking at each employee who has left the company
				for(Hop employeeHop : large.getNeighborHops(new EdgePredicate(WorkFor.class, "end!=\"\"")))
				{
					employee_count++;
					Employee start = (Employee) employeeHop.getVertex();
					Date current_end_date = EmployeeData.sdf.parse(((WorkFor)employeeHop.getEdge()).getEnd());
					Map<Long, Company> sortedMap = sortToMap(current_end_date, start);
					// compute average length to small companies, if path(s) exists
					int path_length = 0;
					for(Company company : sortedMap.values())
					{
						path_length++;
						if(company instanceof SmallCompany)
						{
							if(ave_length_to_small == 0) 	ave_length_to_small = path_length;
							else							ave_length_to_small = (ave_length_to_small * path_count + path_length) / (float)(path_count + 1);
							path_count++;
							break; // stop at the first small company so each employee is counted once
						}
					}
					sortedMap.clear();
				}
				writer.write("Average length to small is " + ave_length_to_small + " for " + path_count + " number of employees out of " + employee_count + " total. \n");
				writer.flush();
			}
			tx.commit();
		}
		catch(Exception e)
		{
			e.printStackTrace();
		}
		finally
		{
			if(writer != null)
			{
				writer.flush();
				writer.close();
			}
			if(tx != null && !tx.isComplete())
				tx.complete();
		}

	}
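The core of the listing above is two small steps: order an employee's later jobs by start date, then fold the number of jobs taken before reaching the target company size into a running mean. The sketch below reproduces that logic in plain Java without InfiniteGraph. The `Job` class, the numeric timestamps, and the helper names are assumptions; in particular, the original `sortToMap` helper is not shown in this post, so the `TreeMap` ordering here is only a guess at its behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

final class HopAverage {
    enum Size { SMALL, MIDSIZED, LARGE }

    // Hypothetical stand-in for one WorkFor edge: company size plus start time.
    static final class Job {
        final Size size;
        final long startMillis;

        Job(Size size, long startMillis) {
            this.size = size;
            this.startMillis = startMillis;
        }
    }

    private double mean = 0;
    private int count = 0;

    // Orders the jobs that started at or after the departure time by start time.
    static SortedMap<Long, Size> sortToMap(long leftAtMillis, List<Job> jobs) {
        SortedMap<Long, Size> sorted = new TreeMap<>();
        for (Job job : jobs) {
            if (job.startMillis >= leftAtMillis) {
                sorted.put(job.startMillis, job.size);
            }
        }
        return sorted;
    }

    // Counts jobs up to and including the first one at the target size; 0 if never reached.
    static int hopsTo(Size target, SortedMap<Long, Size> sorted) {
        int hops = 0;
        for (Size size : sorted.values()) {
            hops++;
            if (size == target) {
                return hops;
            }
        }
        return 0;
    }

    // Incremental mean: newMean = (oldMean * n + x) / (n + 1).
    void add(int hops) {
        mean = (mean * count + hops) / (count + 1);
        count++;
    }

    double mean() { return mean; }

    int count() { return count; }

    public static void main(String[] args) {
        HopAverage toLarge = new HopAverage();
        List<Job> first = Arrays.asList(new Job(Size.LARGE, 200L));
        List<Job> second = Arrays.asList(new Job(Size.MIDSIZED, 150L), new Job(Size.LARGE, 300L));
        for (List<Job> history : Arrays.asList(first, second)) {
            int hops = hopsTo(Size.LARGE, sortToMap(100L, history));
            if (hops > 0) {
                toLarge.add(hops); // only employees who reached a large company count
            }
        }
        System.out.println(toLarge.mean() + " over " + toLarge.count() + " paths");
    }
}
```

The incremental form avoids storing every path length, and it gives the right answer for the first sample without a special case for an empty average.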

Using this strategy, we find that the average number of steps for an employee going from a small company to a large company is 1.1923, and the average number of steps from a large company to a small company is 1.1955. They are almost identical, but before you jump to a conclusion on the basis of these results, consider how many paths I found out of the number possible. In other words, consider the ratio of employees that left a small company to join a large one, or left a large one to join a small one. These numbers tell a different story.

                # of paths   total # of employees   ratio
Small-->Large         8256                   9731    0.85
Large-->Small        14939                 318953    0.05
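The ratio column is simply the number of paths found divided by the number of departed employees examined. A quick check of that arithmetic, using the counts from the table (the class and method names are only illustrative):

```java
import java.util.Locale;

final class FlowRatio {
    // Fraction of departed employees for whom a path to the target company size was found.
    static double ratio(long paths, long employees) {
        return (double) paths / employees;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT, "Small-->Large %.2f", ratio(8256, 9731)));
        System.out.println(String.format(Locale.ROOT, "Large-->Small %.2f", ratio(14939, 318953)));
    }
}
```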

From these numbers, we can see that the trend is to gravitate from smaller companies towards larger ones by a significant margin. This is still a bit misleading because there are a lot more positions at larger companies, so there are more opportunities there and, likewise, fewer opportunities at smaller companies. There are also other factors, including hiring standards, the amount of recruiting, and name recognition or prestige, to name a few. Even given these factors, these numbers do seem to confirm that, for many employees, there is a drive to move from smaller companies to larger ones.

Further Analysis using GraphViews and Policies

In version 3.0 and later, InfiniteGraph’s Navigation API was enhanced with two new features that are useful for configuring distributed navigation. We can use policies to configure things like maximum results, maximum memory use, maximum depth, and even a fanout limit. We can also use a new feature called GraphViews to define which vertex and edge types we want to include in or exclude from a navigation. Along with simplifying your code, this limits the paths that are explored and therefore improves the performance of your navigational queries.
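To show what these settings do conceptually, the sketch below applies a vertex filter (standing in for a GraphView), a depth limit, a result limit, and a no-revisit rule to a depth-first search over a hypothetical in-memory graph. It is plain Java written for illustration and makes no claim about how InfiniteGraph implements GraphView or its policies; every class, field, and method name in it is an assumption, and treating "no revisit" as a single visited set shared across the whole search is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BoundedNavigation {
    final Map<String, List<String>> edges = new HashMap<>();
    final Set<String> excluded = new HashSet<>(); // vertices hidden by the "view"
    final int maxDepth;   // longest path allowed, in hops
    final int maxResults; // stop collecting after this many paths

    BoundedNavigation(int maxDepth, int maxResults) {
        this.maxDepth = maxDepth;
        this.maxResults = maxResults;
    }

    // Adds an undirected edge between two vertices.
    void link(String a, String b) {
        edges.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        edges.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns up to maxResults paths from start to any vertex in targets.
    List<List<String>> paths(String start, Set<String> targets) {
        List<List<String>> results = new ArrayList<>();
        List<String> path = new ArrayList<>(Collections.singletonList(start));
        Set<String> visited = new HashSet<>(Collections.singletonList(start));
        walk(start, 0, targets, path, visited, results);
        return results;
    }

    private void walk(String current, int depth, Set<String> targets,
                      List<String> path, Set<String> visited, List<List<String>> results) {
        if (results.size() >= maxResults) {
            return; // result-count limit reached
        }
        if (depth > 0 && targets.contains(current)) {
            results.add(new ArrayList<>(path)); // qualified path found
            return;
        }
        if (depth >= maxDepth) {
            return; // depth limit reached
        }
        for (String next : edges.getOrDefault(current, Collections.<String>emptyList())) {
            if (excluded.contains(next) || !visited.add(next)) {
                continue; // filtered out by the view, or already visited
            }
            path.add(next);
            walk(next, depth + 1, targets, path, visited, results);
            path.remove(path.size() - 1);
        }
    }

    public static void main(String[] args) {
        BoundedNavigation nav = new BoundedNavigation(12, 1000);
        nav.link("Krustyco", "alice");
        nav.link("alice", "LargeCo");
        nav.link("Krustyco", "tempWorker");
        nav.link("tempWorker", "OtherLargeCo");
        nav.excluded.add("tempWorker"); // e.g. leave temporary employees out of the view
        Set<String> targets = new HashSet<>(Arrays.asList("LargeCo", "OtherLargeCo"));
        for (List<String> path : nav.paths("Krustyco", targets)) {
            System.out.println(path);
        }
    }
}
```

Because the excluded vertex is never expanded, the only path reported in this toy graph is the one through the remaining employee; the limits matter once the graph is too large to traverse in full.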

Using policies to restrict the number of results to 1000, to limit the depth to 12 steps (to avoid traversing the entire dataset), and to ensure that we do not revisit nodes, we can perform navigational queries that measure the connectedness from each small company to a large company. In the sample below, we perform the connectedness analysis from each small company to large companies, and we then analyze what happens to employees after small companies close.

	@Override
	public void run()
	{
		String graphName = "EmployeeNetwork";
		Transaction tx = null;
		GraphDatabase graph = null;
		PrintWriter writer = null;
		
		try
		{
			graph = GraphFactory.open(graphName);
			tx = graph.beginTransaction();
			
			writer = new PrintWriter(new FileOutputStream(new File("./" + graphName + ".txt"), true));
			
			// Starts at each small company and finds connectedness to a large company
			System.out.println("Analyzing connectedness from Small-->Large companies");
			Query<SmallCompany> findSmallCompanies = graph.createQuery(SmallCompany.class.getName(), "!closed");
			Iterator<SmallCompany> smallIterator = findSmallCompanies.execute();
			
			while(smallIterator.hasNext())
			{
				SmallCompany small = smallIterator.next();
				writer.write("Analyzing Small to Large connectedness: " + small.getName() + "...");
				// Use Graph View to filter out all temporary and contract employees
				GraphView view = new GraphView();
				view.excludeClass(graph.getTypeId(ContractEmployee.class.getName()));
				view.excludeClass(graph.getTypeId(TemporaryEmployee.class.getName()));
				// Use policies to restrict paths from being visited twice and to restrict result depth, count
				PolicyChain chain = new PolicyChain();
				chain.addPolicy(new NoRevisitPolicy());
				chain.addPolicy(new MaximumResultCountPolicy(1000));
				chain.addPolicy(new MaximumPathDepthPolicy(12));
				
				// Looking at each company and trace its unique paths to large companies 
				Navigator navigator = small.navigate(view, Guide.SIMPLE_DEPTH_FIRST, Qualifier.ANY, new VertexPredicate(LargeCompany.class, "!closed"), chain, new CompanyDataCollector(writer));
				navigator.start();
			}
			tx.commit();
			
			tx = graph.beginTransaction();
			System.out.println("Analyzing movement from Closed Small-->? companies");
			// Starts at each closed small company and analyzes paths to their current job through small companies
			Query<SmallCompany> findClosedSmallCompanies = graph.createQuery(SmallCompany.class.getName(), "closed");
			Iterator<SmallCompany> smallClosedIterator = findClosedSmallCompanies.execute();
			
			// view allows us to restrict paths to only include certain types like small companies that are not closed to qualify
			GraphView view = new GraphView();
			view.excludeClass(graph.getTypeId(MidsizedCompany.class.getName()));
			view.excludeClass(graph.getTypeId(LargeCompany.class.getName()));
			view.includeClass(graph.getTypeId(SmallCompany.class.getName()), "!closed");
			
			// policies allow us to avoid processing duplicate data
			PolicyChain chain = new PolicyChain();
			chain.addPolicy(new NoRevisitPolicy());
			
			// initialize data collection
			EmployeeDataCollector collector = new EmployeeDataCollector(writer);
			
			while(smallClosedIterator.hasNext())
			{
				SmallCompany small = smallClosedIterator.next();
				writer.write("Analyzing Small Closed Company: " + small.getName() + "...");
				// Looking at each employee who was part of the company
				for (Hop hop : small.getNeighborHops())
				{
					Employee start = (Employee) hop.getVertex();
					GraphView employeeView = new GraphView();
					// single out this employee history
					employeeView.excludeClass(graph.getTypeId(Employee.class.getName()), "name!=\"" + start.getName() + "\"");
					// find a path from this employee to wherever they ended up
					Navigator navigator = start.navigate(view, Guide.SIMPLE_DEPTH_FIRST, Qualifier.ANY, new EdgePredicate(WorkFor.class, "end==\"\""), chain, collector);
					navigator.start();
				}
			}
			collector.flush();
			tx.commit();
		}
		catch(Exception e)
		{
			e.printStackTrace();
		}
		finally
		{
			if(writer != null)
			{
				writer.flush();
				writer.close();
			}
			if(tx != null && !tx.isComplete())
				tx.complete();
		}
	}

From this analysis, we find that the average “distance” from a small to a large company is 2 hops. We can also find the following general statistics about employees who leave closed small companies but continue to work at small companies. The average salary is approximately 133K/year and the average age of employment is about 45-46, so the average employee that leaves a small company that closes can look forward to a bright future of employment. Also, the average temporary hourly wage is $10.30. Further investigation might lead us to compare this analysis with the trend towards midsized companies, or with the trend to move to a comparably sized company (small --> small or large --> large).

For more information about InfiniteGraph, or to contact us, feel free to visit our website and our wiki. Happy Trails!
