Telecommunications voice and data networks are natural examples of graph structures: equipment of many types, often from hundreds of manufacturers, must work in harmony to reliably and efficiently transport information for millions of users at a time. Objectivity products have been used at the heart of fiber optic switches, cellular wireless and low Earth orbit satellite systems, long-term alarm correlation systems and network planning applications.

Dealing with problems (alarms) or overloads has traditionally involved taking individual pieces of equipment offline and re-routing the traffic via other nodes. In this example, we’ll look at an apparently simple situation and show how the combination of Spark SQL and ThingSpan’s advanced graph navigation can be used to quickly diagnose and resolve an equipment overload. We start by loading Location, Equipment and Link objects (with their load percentages) and the connections between them into ThingSpan, producing the graph shown in Figure 1.

Fig. 1: Telecommunications graph

The next step is to look for equipment that is overloaded. The definition of “overloaded” will very much depend on the type and availability of a particular node or link, but in our case we’ll simply use Spark SQL to look for any Link (a green diamond in our diagram) that has a Load figure above 90%. That situation might have been signaled to the network management system by the individual box, but in many cases it’s best to get an overall picture of the state of the network before deciding on the least risky and most efficient way of dealing with problems.

ThingSpan automatically generates the DataFrames needed for Spark SQL to execute the query in parallel over all of the Link data. In our simplified case, we discover in Figure 2 that only Link 6 is overloaded.
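
The overload check above can be sketched in a few lines. This is a minimal illustration, not ThingSpan's API: Python's built-in sqlite3 module stands in for Spark SQL so the example runs without a cluster, and the table name Link, the column load_pct and all load values except the overloaded Link 6 are assumptions made up for the sketch. A similar WHERE clause could be issued through Spark SQL against the Link DataFrame.

```python
import sqlite3

# Hypothetical link loads in percent. Only Link 6 exceeds 90%, matching the
# outcome described in the text; every other value is invented.
LINK_LOADS = [(1, 35.0), (2, 48.0), (3, 22.0), (4, 61.0), (5, 0.0), (6, 95.0)]

# sqlite3 is a stand-in for Spark SQL here; the table and column names are
# assumptions, not ThingSpan's schema.
OVERLOAD_QUERY = "SELECT id, load_pct FROM Link WHERE load_pct > ? ORDER BY id"


def find_overloaded_links(rows, threshold=90.0):
    """Return (id, load_pct) for every link whose load exceeds the threshold."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Link (id INTEGER PRIMARY KEY, load_pct REAL)")
        conn.executemany("INSERT INTO Link VALUES (?, ?)", rows)
        return conn.execute(OVERLOAD_QUERY, (threshold,)).fetchall()
    finally:
        conn.close()


overloaded = find_overloaded_links(LINK_LOADS)
print(overloaded)  # [(6, 95.0)]
```

Keeping the threshold as a parameter reflects the point made earlier that the definition of “overloaded” depends on the type of node or link.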

Fig. 2: Identification of overloaded Link

We now need to figure out why it’s overloaded and what to do about it. The link is handling traffic on behalf of equipment E21, E22, E31, E32 and E33, but are they the actual cause of the overload?

We can find out by using ThingSpan to traverse the graph and find the leaf nodes, i.e. the producers and consumers of the traffic. This is a very simple and fast query because of the highly efficient way that ThingSpan handles the representation of the nodes and connections. It is optimized for navigational and pathfinding queries.
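
To make the idea of navigating to the leaf nodes concrete, here is a small breadth-first traversal in plain Python. It is only a sketch of the technique, not ThingSpan's navigation engine: the edge list is a hypothetical, heavily reduced topology (the real graph is the one in Figure 1), and the function names are invented for this example.

```python
from collections import defaultdict, deque

# Hypothetical, simplified topology around the overloaded link. Node names
# follow the article, but the connections are assumptions for illustration.
EDGES = [
    ("L6", "E21"), ("L6", "E31"),
    ("E21", "E2"), ("E21", "E3"),
    ("E31", "E40"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def find_leaf_nodes(adjacency, start):
    """Breadth-first search from start, collecting nodes with one neighbour."""
    seen = {start}
    queue = deque([start])
    leaves = []
    while queue:
        node = queue.popleft()
        neighbours = adjacency[node]
        if node != start and len(neighbours) == 1:
            leaves.append(node)  # an endpoint: a producer or consumer of traffic
        for nxt in sorted(neighbours):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(leaves)


leaves = find_leaf_nodes(build_adjacency(EDGES), "L6")
print(leaves)  # ['E2', 'E3', 'E40']
```

The traversal starts at the overloaded link and works outward, so intermediate equipment is passed through and only the endpoints are reported.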

In Figure 3 below, it is immediately apparent that E2 and E3 in San Jose, Calif., are consuming information drawn from E40 in New York. [Ed: Maybe they’re streaming the latest 8K UHDTV episode of “Programmers of New York” from MovieFlix.]

Fig. 3: Telco pathfinding

Luckily, the solution is very simple. Switching on Link 5 routes the traffic from E31 directly to E21 on a dedicated circuit instead of a link shared with other traffic. This relieves the load on Link 6, as shown in Figure 4 below, and solves the problem.

Fig. 4: Network congestion resolved
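
The effect of the re-route can be checked with simple arithmetic. The sketch below recomputes link loads before and after moving one flow onto Link 5. The capacities, traffic volumes and flow names are all invented for illustration; the article does not give the real figures, only that Link 6 was above 90% and that the E31 to E21 traffic moved to Link 5.

```python
# Hypothetical capacities in arbitrary traffic units.
CAPACITY = {"L5": 100.0, "L6": 100.0}

# Hypothetical flows: each carries a volume over a single link.
FLOWS = {
    "E31->E21": {"volume": 40.0, "link": "L6"},
    "other traffic": {"volume": 55.0, "link": "L6"},
}


def link_loads(flows, capacity):
    """Return the load of each link as a percentage of its capacity."""
    used = {link: 0.0 for link in capacity}
    for flow in flows.values():
        used[flow["link"]] += flow["volume"]
    return {link: 100.0 * used[link] / capacity[link] for link in capacity}


def reroute(flows, flow_name, new_link):
    """Return a copy of flows with one flow moved to a different link."""
    updated = {name: dict(flow) for name, flow in flows.items()}
    updated[flow_name]["link"] = new_link
    return updated


before = link_loads(FLOWS, CAPACITY)
after = link_loads(reroute(FLOWS, "E31->E21", "L5"), CAPACITY)
print(before)  # {'L5': 0.0, 'L6': 95.0}
print(after)   # {'L5': 40.0, 'L6': 55.0}
```

Because reroute returns a copy, the same function can be used to try out several candidate re-routes before committing to one.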

This combination of Spark SQL parallel queries and ThingSpan parallel graph traversal provides a highly scalable, robust solution to the problem. Telecommunication companies strive to maintain “five nines” (99.999%) or “six nines” (99.9999%) reliability. In this system, shown below in Figure 5, the triply redundant HDFS storage provides the necessary data availability while Spark running on YARN handles service availability.

Fig. 5: ThingSpan architecture
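
To put the “five nines” and “six nines” targets in perspective, the short calculation below converts an availability percentage into the downtime it allows per year. It assumes a 365-day year and is plain arithmetic, independent of ThingSpan, HDFS or YARN; the function name is invented for this sketch.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability):
    """Minutes of downtime per year permitted by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = allowed_downtime_minutes(0.99999)
six_nines = allowed_downtime_minutes(0.999999)

print(f"five nines: {five_nines:.2f} minutes per year")     # 5.26 minutes
print(f"six nines: {six_nines * 60:.1f} seconds per year")  # 31.5 seconds
```

In other words, a five nines service may be unavailable for little more than five minutes a year, which is why redundancy in both storage and services matters.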

Although we’ve used Spark SQL to help deal with a simple overload situation in this example, the combination of Spark MLlib and ThingSpan could be used to run extensive simulations of network usage scenarios. This would make it possible to predict when and where such overloads are likely to occur, enabling cost-effective and timely provisioning of equipment, rather than having to react to emergencies.

Leon Guzenda

CTMO and Founder
