Telecommunications voice and data networks are natural examples of graph structures: equipment of many types, often from hundreds of manufacturers, must work in harmony to transport information reliably and efficiently for millions of users at a time. Objectivity products have been used at the heart of fiber-optic switches, cellular wireless and low-earth-orbit satellite systems, long-term alarm correlation systems, and network planning applications.
Dealing with problems (alarms) or overloads has traditionally involved taking individual pieces of equipment offline and re-routing the traffic via other nodes. In this example, we’ll look at an apparently simple situation and show how the combination of Spark SQL and ThingSpan’s advanced graph navigation can be used to quickly diagnose and solve an equipment overload situation. We start by loading Location, Equipment and Link objects (with their load percentages) and their connections into ThingSpan, producing the graph shown in Figure 1.
Fig. 1: Telecommunications graph
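To make the data model concrete, here is a minimal sketch of the three object types being loaded, written as plain Python dataclasses. The field names, identifiers, and load values are illustrative assumptions; ThingSpan's actual schema definitions are not shown in this article.

```python
from dataclasses import dataclass

# Sketch of the three object types loaded into the graph store.
# Field names and values are assumptions for illustration only.
@dataclass
class Location:
    name: str

@dataclass
class Equipment:
    id: str
    location: str  # name of the Location this equipment sits in

@dataclass
class Link:
    id: int
    a: str       # equipment id at one end
    b: str       # equipment id at the other end
    load: float  # current load as a percentage of capacity

# A few records in the spirit of Figure 1 (values are assumptions).
locations = [Location("San Jose"), Location("New York")]
equipment = [Equipment("E21", "San Jose"), Equipment("E40", "New York")]
links = [Link(6, "E21", "E40", 95.0)]
```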
The next step is to look for equipment that is overloaded. The definition of “overloaded” will very much depend on the type and availability of a particular node or link, but in our case we’ll simply use Spark SQL to look for any Link (a green diamond in our diagram) that has a Load figure above 90%. That situation might have been signaled to the network management system by the individual box, but in many cases it’s best to get an overall picture of the state of the network before deciding on the least risky and most efficient way of dealing with problems.
ThingSpan automatically generates the DataFrames needed for Spark SQL to execute the query in parallel over all of the Link data. In our simplified case, we discover in Figure 2 that only Link 6 is overloaded.
Fig. 2: Identification of overloaded Link
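In the real system this filter runs as a Spark SQL query (on the order of `SELECT id FROM links WHERE load > 90`) executed in parallel over the DataFrames that ThingSpan generates. The sketch below applies the same predicate to a plain dictionary of link records; the link ids, endpoints, and load figures are illustrative assumptions.

```python
# Toy link table standing in for the ThingSpan-generated DataFrame.
# Ids, endpoints, and loads are assumptions for demonstration only.
links = {
    1: {"ends": ("E40", "E33"), "load": 40.0},
    2: {"ends": ("E33", "E32"), "load": 55.0},
    6: {"ends": ("E32", "E21"), "load": 95.0},
}

# Equivalent of: SELECT id FROM links WHERE load > 90
overloaded = [link_id for link_id, rec in links.items() if rec["load"] > 90.0]
print(overloaded)  # only Link 6 exceeds the 90% threshold
```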
We now need to figure out why it’s overloaded and what to do about it. The link is handling traffic on behalf of equipment E21, E22, E31, E32 and E33, but are they the actual cause of the overload?
We can find out by using ThingSpan to traverse the graph and find the leaf nodes, i.e. the producers and consumers of the traffic. This query is simple and fast because ThingSpan’s representation of nodes and connections is optimized for navigational and pathfinding queries.
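The traversal itself amounts to walking outward from the overloaded link until nodes with no further connections are reached. The sketch below shows that idea as a breadth-first search over a hypothetical adjacency map; ThingSpan's own traversal API is not reproduced here, and the wiring between the equipment nodes is an assumption loosely based on Figures 2 and 3.

```python
from collections import deque

# Hypothetical adjacency for the sub-network feeding Link 6.
# Node names follow the figures; the wiring is an assumption.
adjacency = {
    "E40": ["E33"],
    "E33": ["E40", "E31", "E32"],
    "E32": ["E33", "E22"],
    "E31": ["E33", "E21"],
    "E21": ["E31"],
    "E22": ["E32"],
}

def leaf_nodes(start, adjacency):
    """BFS from `start`; degree-1 nodes are the producers/consumers."""
    seen, leaves, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        if len(adjacency[node]) == 1 and node != start:
            leaves.append(node)
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(leaves)

print(leaf_nodes("E33", adjacency))  # ['E21', 'E22', 'E40']
```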
In Figure 3 below, it is immediately apparent that E2 and E3 in San Jose, Calif., are consuming information drawn from E40 in New York. [Ed: Maybe they’re streaming the latest 8K UHDTV episode of “Programmers of New York” from MovieFlix.]
Fig. 3: Telco pathfinding
Luckily, the solution is very simple. Switching on Link 5 will route the traffic from E31 directly to E21 on a dedicated circuit instead of over a link shared with other traffic. This relieves the load on Link 6, as seen in Figure 4 below, and solves the problem.
Fig. 4: Network congestion resolved
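The remediation step can be sketched as follows: activate the spare Link 5 and shift the E31-to-E21 share of the traffic off the shared Link 6. The load figures, the 30-point share attributed to E31, and the assumption that both links have equal capacity (so percentage points transfer one-for-one) are all illustrative, not measurements from the example network.

```python
# Toy link table; loads and endpoints are assumptions for illustration.
links = {
    5: {"ends": ("E31", "E21"), "load": 0.0, "active": False},  # spare circuit
    6: {"ends": ("E32", "E21"), "load": 95.0, "active": True},  # shared trunk
}
e31_share = 30.0  # assumed share of Link 6's load due to E31 -> E21 traffic

def activate_dedicated_circuit(links, spare_id, shared_id, shifted_load):
    """Move `shifted_load` percentage points from the shared link to the
    spare, assuming both links have equal capacity."""
    links[spare_id]["active"] = True
    links[spare_id]["load"] += shifted_load
    links[shared_id]["load"] -= shifted_load

activate_dedicated_circuit(links, spare_id=5, shared_id=6, shifted_load=e31_share)
print(links[6]["load"])  # 65.0 -- back under the 90% threshold
```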
This unique combination of Spark SQL parallel queries and ThingSpan parallel graph traversal provides a highly scalable, robust solution to the problem. Telecommunication companies strive to maintain “five nines” (99.999%) or “six nines” (99.9999%) reliability. In this system, shown below in Figure 5, the triply redundant HDFS storage provides the necessary data availability while Spark running on YARN handles service availability.
Fig. 5: ThingSpan architecture
Although we’ve used Spark SQL to help deal with a simple overload situation in this example, the combination of Spark MLlib and ThingSpan could be used to run extensive simulations of network usage scenarios. This would make it possible to predict when and where such overloads are likely to occur, enabling cost-effective and timely provisioning of equipment, rather than having to react to emergencies.
CTMO and Founder