HOW DYNAMIC METADATA [DATA ABOUT DATA] CAN SAVE THE DAY AND RAISE PRODUCTIVITY ALMOST IMMEDIATELY
If you think that you’re doing everything that you can with your metadata (data about the data), you’re probably wrong. Let’s face it, metadata, as most people think of it today, is boring and tedious to collect. It’s also difficult to keep it up to date. It’s too often thought of as fodder for researchers.
Its uses in governance and compliance systems are important, but, other than reducing risk, they don’t seem to add much value. The problem is that we’ve tended to focus on static metadata, because it is easy to implement with any DBMS, though handling provenance and lineage properly can pose issues.
Two findings show how much this costs:
1: Employees typically waste about 15-35% of their time trying to find the right data. [IDC 2010]
2: Data Scientists, who command high salaries, typically spend 50-90% of their time finding, extracting and cleaning up data before they can start their real tasks. [IBM Cloud Blog 2017]
SOME LESSONS LEARNED
BAD: THE $125 MILLION MARS CLIMATE ORBITER FIASCO
The spacecraft was launched in 1998. It travelled 416 million miles in 286 days before it plunged too deep into the atmosphere of Mars, where its propulsion system overheated. It either broke up there or carried on into space, where it may still be in orbit around the Sun. Subsequent investigations found an engineering process problem. Two components in the propulsion system controls were using different metadata.
An aerospace company’s module output its values in pounds-force, an Imperial unit. A NASA module expected its input in newtons, the metric unit. One pound-force is approximately 4.45 newtons, so the NASA module was working with values that were far too low, causing it to turn up the thrust, which led to the spacecraft’s loss.
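The mismatch is easier to see in code. The following is a minimal sketch, not a reconstruction of the actual flight or ground software: the `Force` class, the `thruster_input_newtons` function and the unit labels are all hypothetical names chosen for illustration. It shows one way of carrying the unit as metadata with the value, so that a consumer converts explicitly or fails loudly instead of silently misreading a number.

```python
from dataclasses import dataclass

# Conversion factors to newtons. One pound-force is defined as
# exactly 4.4482216152605 N.
TO_NEWTONS = {
    "N": 1.0,
    "lbf": 4.4482216152605,
}


@dataclass(frozen=True)
class Force:
    """A value that carries its unit as metadata (hypothetical type)."""
    value: float
    unit: str

    def to_newtons(self) -> float:
        # An unknown unit is rejected instead of being silently trusted.
        if self.unit not in TO_NEWTONS:
            raise ValueError(f"unknown force unit: {self.unit!r}")
        return self.value * TO_NEWTONS[self.unit]


def thruster_input_newtons(reading: Force) -> float:
    """Hypothetical consumer that always works in newtons."""
    return reading.to_newtons()


# The producer reports 10 pounds-force and says so in the metadata.
reading = Force(value=10.0, unit="lbf")
si_value = thruster_input_newtons(reading)

# Without the unit metadata, the consumer would have used the raw number.
raw_value = reading.value
underestimate_factor = si_value / raw_value
```

With the unit attached, the consumer works with about 44.5 newtons. Reading the bare number 10 as newtons would understate the force by a factor of roughly 4.45.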
GOOD: METADATA CAN SAVE TIME AND MONEY
Businesses that have successfully researched, collected and exploited their metadata have reported:
- Faster response to change.
- Increased productivity.
- Significant top line revenue growth.
- Reductions in bottom line costs and operational complexity.
- Lower risks through better compliance.
STATIC vs DYNAMIC METADATA
In the business world one could consider most financial trading systems as having static (rarely changing) metadata describing sometimes extremely fast flowing data. A system with a dynamic metadata capability would allow a user to describe a new item of metadata, such as a combination of financial factors plus something extracted from an external source, e.g. a weather report. The user would then collect information on possible investments over time before deciding whether or not to purchase a particular stock.
Likewise, within an organization there is often the need to create new products and team structures to design, build, market and sell them. The types of data involved can change dramatically between projects, so it is important to be able to define them quickly while maintaining consistency with governance and other systems. Traditional databases need varying degrees of restructuring when the definitions of the data types are changed. Adding a new connection type may involve the creation of new data structures. It may also affect the structure of previously stored data.
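As an illustration of the difference, here is a minimal sketch of a dynamic metadata registry. It is an assumption-laden toy, not a description of any particular product: `MetadataRegistry`, `define_type`, `add_record` and the example type names are all hypothetical. The point it tries to show is that type definitions are held as data, so a new type can be defined at run time without touching records that are already stored.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MetadataRegistry:
    """Hypothetical in-memory registry: type definitions are data, not schema."""
    types: dict[str, dict[str, type]] = field(default_factory=dict)
    records: list[dict[str, Any]] = field(default_factory=list)

    def define_type(self, name: str, fields: dict[str, type]) -> None:
        # Defining a type only updates the registry. Stored records
        # are left exactly as they were.
        self.types[name] = dict(fields)

    def add_record(self, type_name: str, **values: Any) -> dict[str, Any]:
        spec = self.types[type_name]  # KeyError if the type is undefined
        for key, expected in spec.items():
            if not isinstance(values.get(key), expected):
                raise TypeError(f"{type_name}.{key} must be {expected.__name__}")
        record = {"type": type_name, **values}
        self.records.append(record)
        return record


registry = MetadataRegistry()

# A long-standing, rarely changing definition.
registry.define_type("stock_signal", {"ticker": str, "pe_ratio": float})
first = registry.add_record("stock_signal", ticker="ACME", pe_ratio=14.2)

# Later, a user combines financial factors with an external weather source.
registry.define_type(
    "weather_adjusted_signal",
    {"ticker": str, "pe_ratio": float, "rainfall_mm": float},
)
second = registry.add_record(
    "weather_adjusted_signal", ticker="ACME", pe_ratio=14.2, rainfall_mm=3.5
)
```

The first record is unchanged after the second type is defined, which is the property that a fixed table layout makes hard to provide.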
CONTROLLING YOUR METADATA
Establish a metadata control team, which should start by setting goals and priorities. There is no point in spending money on capturing and maintaining metadata (or data) that is of little value to your business. Have the team produce a cost-benefit analysis that helps set and adjust priorities.
The team should start high, with metadata that is widely applicable and with a measurable value. They can then group the various metadata types depending on sources, users, commonality and so on before tackling one group at a time.
Set up an independent Quality Assurance body that assesses the effectiveness, efficiency, accuracy, practicality and scalability of metadata collection and maintenance procedures and mechanisms. You have to be able to trust the metadata and it has to be cost or risk justified.
Metadata must be easy to define and change to keep up with changing business priorities. That sounds obvious, but too many organizations forget to do it, possibly because it’s hard. The underlying platform has to be able to handle new and updated definitions immediately. That isn’t easy with conventional database, governance and compliance systems.
The team also has to decide how to collect the metadata. It may be done automatically, manually, or by post-collection analytics. Remember that not all metadata is created equal. There may be a wide range of sources, collection and storage costs, and potential value to the business.
Finally, but importantly, the user community must be fully identified and understood. The location, numbers involved, IT literacy and user roles are all significant, as are the mechanisms available to them for analyzing and using the metadata. You should also never overlook the hidden constraints, often political, such as wanting to control ownership of data, or educating a workforce that may be in constant churn to deal with rapidly evolving situations.
The most common use of metadata is for cataloging data. The street index in a road atlas is an example of that kind of static metadata. Without it the user would have to scan every map until the destination is found.
Queries and applications may also generate new metadata, such as summaries of the results of a query. It is often useful to store this kind of dynamic metadata in order to compare the results of the same, or slightly varying, queries over a long period of time.
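A minimal sketch of that idea follows. Everything in it is hypothetical and chosen for illustration: the `summary_log` store, the `run_and_summarize` and `history` functions, and the sample order data. Each run of a named query appends a small summary record, so later runs can be compared with earlier ones.

```python
from datetime import datetime, timezone
from statistics import mean

# Hypothetical store for metadata derived from query results.
summary_log: list[dict] = []


def run_and_summarize(name, rows, predicate, value_key):
    """Run a simple filter query and record a summary of its result."""
    matches = [row for row in rows if predicate(row)]
    values = [row[value_key] for row in matches]
    summary_log.append({
        "query": name,
        "run_at": datetime.now(timezone.utc).isoformat(),
        "row_count": len(matches),
        "mean": mean(values) if values else None,
    })
    return matches


def history(name):
    """Return the stored summaries for one query, oldest first."""
    return [entry for entry in summary_log if entry["query"] == name]


orders = [
    {"region": "EU", "total": 120.0},
    {"region": "US", "total": 80.0},
    {"region": "EU", "total": 60.0},
]


def is_eu(row):
    return row["region"] == "EU"


run_and_summarize("eu_orders", orders, is_eu, "total")

# New data arrives and the same query is run again.
orders.append({"region": "EU", "total": 150.0})
run_and_summarize("eu_orders", orders, is_eu, "total")

eu_history = history("eu_orders")
```

Here `eu_history` holds two summaries, with the row count rising from 2 to 3 and the mean order total rising from 90.0 to 110.0, without the original result sets having to be kept.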
METADATA CONNECT - A POWERFUL, FLEXIBLE AND SCALABLE SaaS PLATFORM
Defining, Querying And Changing The Metadata
Objectivity’s Metadata Connect has an advanced declarative query language called DO. It can perform conventional (scan-like) queries, plus navigation and pathfinding queries, optionally in parallel. Metadata Connect will outperform other databases when navigation or pathfinding are involved. DO can also selectively update and delete metadata.
The DO query language is supported by a powerful metadata definition engine. New or derived metadata types can easily be created, along with the types of connection between them. This mechanism provides dynamic metadata for a system, something that traditional databases struggle with. Users with appropriate permissions can create new metadata, including connection types, at will. Adding a connection type does not require any database restructuring, making it easier and faster to cope with dynamic metadata changes.
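To make the ideas of connection types and pathfinding concrete, here is a generic sketch in Python. It is not DO and does not use or represent the Metadata Connect API. The `MetadataGraph` class, its methods and the node and connection names are all hypothetical. It only illustrates the general technique: connections are stored as typed edges, a new connection type is just a new label, and a breadth-first search finds a shortest path by number of hops.

```python
from collections import defaultdict, deque


class MetadataGraph:
    """Hypothetical toy graph of metadata items with typed connections."""

    def __init__(self):
        # Maps each node to a list of (connection_type, neighbor) pairs.
        self.edges = defaultdict(list)

    def connect(self, source, connection_type, target):
        # A previously unseen connection type needs no restructuring:
        # it is simply a new label on an edge.
        self.edges[source].append((connection_type, target))

    def find_path(self, start, goal, allowed_types=None):
        """Breadth-first search; returns a shortest path by hops, or None."""
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node == goal:
                return path
            for connection_type, neighbor in self.edges.get(node, []):
                if allowed_types is not None and connection_type not in allowed_types:
                    continue
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(path + [neighbor])
        return None


graph = MetadataGraph()
graph.connect("sales_report", "derived_from", "orders_table")
graph.connect("orders_table", "loaded_from", "crm_export")

path_before = graph.find_path("sales_report", "weather_feed")

# A user introduces a new connection type at run time.
graph.connect("sales_report", "enriched_by", "weather_feed")
path_after = graph.find_path("sales_report", "weather_feed")

# A lineage query that follows only provenance-style connections.
lineage = graph.find_path(
    "sales_report", "crm_export", allowed_types={"derived_from", "loaded_from"}
)
```

Before the new connection type is added there is no path to `weather_feed`. Afterwards the path exists, and the existing lineage path from `sales_report` to `crm_export` is unaffected.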
Powerful Conventional And Graph Metadata Management
As an added bonus, the database engine embedded within Metadata Connect can also store just about any kind of interconnected data. It has been used to build everything from engineering design and telecom equipment applications to the most powerful and wide ranging analytic systems for the Intelligence Community, managing and analyzing tens of trillions of objects and connections per day, around the clock.
SUMMARY - TAKE OWNERSHIP OF YOUR METADATA
You can’t control something if you don’t know where or what it is. If you still aren’t convinced that your metadata is important, set up a small multi-disciplinary study team. Have them look at priorities, requirements and benefits. Give them the authority to collect all of the information that they need, regardless of source, but make it clear that they must also identify the tangible benefits of corralling the metadata.
Control systems need feedback to measure how well they are performing, or to get to a set point. What is often forgotten is that feedforward, pushing aggressively towards a goal, is also necessary for optimal performance. If you put together a good study team and take action on their findings you will almost certainly discover that you can improve efficiency, increase revenue and decrease costs with very little additional effort or resources.
Just eliminating the time lost in finding the right data can reap immediate benefits. Making sure that everyone is using the correct data can avoid very costly mistakes, as the Mars Climate Orbiter team discovered.