“Data governance is the capability that enables an organization to ensure that high data quality exists throughout the complete lifecycle of the data.” - Wikipedia

 

Background

Many aspects of data governance fall outside the scope of a Digital Chain of Custody application. However, Objectivity’s latest product, D*ChoC, provides Digital Chain of Custody lineage and provenance mechanisms that make it easier to record, audit and enforce many of the procedures that could otherwise jeopardize the quality of data. In this article we will focus on the mechanisms that influence the items under control and on the workflows that need to be enforced to ensure that quality is not lost and that changes are correctly rippled through a system.

 

Standard D*ChoC Information and Connection Types

The main D*ChoC Information Types are shown in the diagram below. “Item” represents anything in the Digital Chain of Custody.

We will take a quick look at Sequences, as making sure that the right mechanisms are applied in the right order is essential to any data governance regime. Sequences are lists or networks of connected Mechanisms (Applications, Tools, Procedures and Workflows).

 

Mechanisms

D*ChoC Mechanisms are standard Information Item Types. There are four derived types: Applications, Tools, Procedures and Workflows (see the diagram above). Sequences link individual Items together. One kind of Mechanism, Workflow, is special: when it appears in a Sequence, it can “include” other Workflows as well as the other three kinds of Mechanism. Mechanisms operate on Items and are generally used in conjunction with one or more User items, for example:

[User => Person] “Author X” used [Mechanism => Application] “Office” to write [Item => Document] “Draft 1”.

 

Note: In this example, Document is a customized Information Type derived from Item, not one of the standard ones.
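
The type relationships described above can be sketched in a few lines of Python. This is only an illustrative model, not D*ChoC's actual API: the class names mirror the Information Types named in the text, while the `Usage` record, its `action` field and the `describe` method are hypothetical names introduced for the example. Treating User as a kind of Item is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """Anything in the Digital Chain of Custody."""
    name: str


@dataclass
class User(Item):
    """Whoever performs an action (assumed here to be a kind of Item)."""


@dataclass
class Person(User):
    """A kind of User, as in the "Author X" example."""


@dataclass
class Mechanism(Item):
    """Base type for the four derived Mechanism types."""


@dataclass
class Application(Mechanism):
    pass


@dataclass
class Tool(Mechanism):
    pass


@dataclass
class Procedure(Mechanism):
    pass


@dataclass
class Workflow(Mechanism):
    # A Workflow may "include" other Workflows and the other three Mechanisms.
    includes: List[Mechanism] = field(default_factory=list)


@dataclass
class Document(Item):
    """A customized type derived from Item, not a standard one."""


@dataclass
class Usage:
    """Hypothetical record: a User used a Mechanism to act on an Item."""
    user: User
    mechanism: Mechanism
    item: Item
    action: str

    def describe(self) -> str:
        def label(obj: Item) -> str:
            return f'{type(obj).__name__} "{obj.name}"'

        return (f"{label(self.user)} used {label(self.mechanism)} "
                f"to {self.action} {label(self.item)}")


usage = Usage(Person("Author X"), Application("Office"), Document("Draft 1"), "write")
print(usage.describe())
# prints: Person "Author X" used Application "Office" to write Document "Draft 1"
```

Deriving `Document` directly from `Item` mirrors the note above: customized types sit alongside the standard ones without changing them.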

Sequences

Now suppose that, after a draft document is submitted, it must be reviewed by an editor and, once it is cleared for publication, sent to a website technician for conversion to a particular format for publication. It must also be copied to a safe repository before it is made visible to the public. The sequence of events now looks like this:
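
As a rough sketch of the same idea in code, the publication steps can be held as an ordered list and checked against what has actually been recorded. The step names (`Write`, `Review`, `Convert`, `Archive`, `Publish`) and the `next_step` helper are hypothetical labels chosen for this example, not D*ChoC identifiers. The check only verifies the recorded order; it does not perform any step.

```python
from typing import List, Optional

# Hypothetical step names for the publication sequence described above.
PUBLICATION_SEQUENCE = ["Write", "Review", "Convert", "Archive", "Publish"]


def next_step(sequence: List[str], completed: List[str]) -> Optional[str]:
    """Return the next step that is due, or None when the sequence is finished.

    Raises ValueError if the completed steps do not match the start of the
    sequence, i.e. a step was skipped or performed out of order.
    """
    if completed != sequence[:len(completed)]:
        raise ValueError(f"steps out of order: {completed}")
    if len(completed) == len(sequence):
        return None
    return sequence[len(completed)]


print(next_step(PUBLICATION_SEQUENCE, ["Write", "Review"]))  # prints: Convert
```

Because `Archive` precedes `Publish` in the list, a record showing publication without a prior copy to the repository would be rejected as out of order.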

 

If one of the Details in the Publication Workflow information type is Status, it can be set to “Written”, “Reviewed”, etc. However, a better technique is to define an Information Type that can connect the “template” sequence to a particular item and define the Status as one of its details. Here’s how it would look in D*ChoC. The item layout has been adjusted slightly to make it clearer.

 

A further refinement of this technique would place the Status item between the appropriate Mechanism in the Sequence, e.g. “Review”, and the item under governance. That makes it easier to run a query that finds all items with a particular status in a workflow.
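
To illustrate why this arrangement simplifies queries, here is a minimal sketch that treats each Status item as a link between a Mechanism in a workflow and a governed item. The `StatusLink` type, its field names, the `items_with_status` function and the sample data are all assumptions made for the example. D*ChoC's own query facilities are not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StatusLink:
    """Hypothetical Status item sitting between a Mechanism and an item."""
    workflow: str   # the "template" sequence, e.g. "Publication"
    mechanism: str  # the step in that sequence, e.g. "Review"
    status: str     # the Status detail, e.g. "Reviewed"
    item: str       # the item under governance, e.g. "Draft 1"


def items_with_status(links: List[StatusLink], workflow: str, status: str) -> List[str]:
    """Return the names of all items with the given status in a workflow."""
    return [link.item for link in links
            if link.workflow == workflow and link.status == status]


# Invented sample data for illustration only.
links = [
    StatusLink("Publication", "Review", "Reviewed", "Draft 1"),
    StatusLink("Publication", "Review", "Submitted", "Draft 2"),
    StatusLink("Publication", "Review", "Reviewed", "Draft 3"),
]

print(items_with_status(links, "Publication", "Reviewed"))
# prints: ['Draft 1', 'Draft 3']
```

Keeping the status on the link rather than on the workflow template means one template can serve many items, each with its own status.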

 

Note: At this time, D*ChoC can be used to define a sequence and connect it to particular items via another item, but it doesn’t actively cause steps in the sequence to be performed. That capability, called “Active Workflows”, will be added later.

 

Nested Workflows

Part of the power of D*ChoC lies in its ability to easily and efficiently represent and manage complex groups of inter-related items. This is particularly important when it comes to the reuse of standard workflows. The example below represents the processes that a manufacturing organization might go through in order to bring a new product to market. All but the Review procedure in the workflow below are previously defined workflows.
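
One way to picture nesting is as a tree in which a Workflow contains Procedures and other, previously defined Workflows. The sketch below expands such a tree into the flat order of procedures it implies. The workflow and procedure names (`Design`, `Testing`, `Specify` and so on) are invented for illustration and are not taken from the original diagram; only `Review` comes from the text. The `flatten` helper is likewise hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Procedure:
    name: str


@dataclass
class Workflow:
    name: str
    # A Workflow may include Procedures and other Workflows.
    steps: List[Union["Workflow", Procedure]] = field(default_factory=list)


def flatten(mechanism: Union[Workflow, Procedure]) -> List[str]:
    """Expand nested Workflows, depth first, into an ordered list of procedure names."""
    if isinstance(mechanism, Workflow):
        names: List[str] = []
        for step in mechanism.steps:
            names.extend(flatten(step))
        return names
    return [mechanism.name]


# Previously defined workflows (hypothetical names), reused as building blocks.
design = Workflow("Design", [Procedure("Specify"), Procedure("Prototype")])
testing = Workflow("Testing", [Procedure("Bench test")])

# The top-level workflow combines them with a stand-alone Review procedure.
launch = Workflow("New Product Launch", [design, Procedure("Review"), testing])

print(flatten(launch))
# prints: ['Specify', 'Prototype', 'Review', 'Bench test']
```

Because `design` and `testing` are ordinary objects, the same definitions can be included in any number of other workflows without being copied.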

 

 

Controlling Capabilities

Besides making sure that the right sequences of actions occur and are recorded, it is important to ensure that no Users or Mechanisms change or delete items that they are not supposed to. This is partly a security issue, but it also requires knowledge of organizational structures, mechanisms and sequences. In the next blog post, we’ll start by looking at how to represent organizational structures, then combine them with capabilities and permissions.