In Part 2 of this blog series, we continue to look at the mechanisms that govern the items under control and at the workflows that an advanced data governance system must enforce, so that data quality is not lost and changes ripple correctly through the system.

 

Representing Organizational Structures in D*ChoC

D*ChoC has a set of pre-defined User information types (User, Person, Team, Organization and Role) plus an associated connection type called Org_Structure. Together they can be used to represent just about any kind of organization within D*ChoC. The diagram below shows a publishing organization.

 

Note how Arthur Grump is employed within the Editorial Team, but is also involved in another project that might involve whole teams in addition to other individuals. This kind of matrix organization is hard to represent in conventional databases, but D*ChoC is all about connections, so it is very easy to construct the model of the organization and adapt it over time.
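
To make the matrix idea concrete, here is a minimal in-memory sketch of User items joined by Org_Structure connections. It is not the D*ChoC API; the OrgGraph class, its methods, and the names "Special Project" and "Publishing Organization" are assumptions made purely for illustration.

```python
class OrgGraph:
    """Toy stand-in for User items joined by Org_Structure connections."""

    def __init__(self):
        self.kinds = {}    # item name -> information type (Person, Team, ...)
        self.parents = {}  # member name -> set of groups it is connected to

    def add(self, name, kind):
        self.kinds[name] = kind

    def connect(self, member, group):
        """Record one Org_Structure connection from member to group."""
        self.parents.setdefault(member, set()).add(group)

    def memberships(self, name):
        """Return every group reachable from name, directly or indirectly."""
        seen = set()
        stack = [name]
        while stack:
            for group in self.parents.get(stack.pop(), ()):
                if group not in seen:
                    seen.add(group)
                    stack.append(group)
        return seen


org = OrgGraph()
org.add("Arthur Grump", "Person")
org.add("Editorial Team", "Team")
org.add("Special Project", "Team")                  # hypothetical name
org.add("Publishing Organization", "Organization")  # hypothetical name

# A person can sit in a line team and a project at the same time.
org.connect("Arthur Grump", "Editorial Team")
org.connect("Arthur Grump", "Special Project")
org.connect("Editorial Team", "Publishing Organization")
org.connect("Special Project", "Publishing Organization")

print(sorted(org.memberships("Arthur Grump")))
```

Because membership is just a set of connections, adding Arthur to a second team is one more connect call rather than a schema change.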

 

Capabilities and Actions

A Capability can be represented by an information type derived from Item. A User must be granted a Capability in order to perform certain actions on all or selected information items. Typical Actions might include the ability to do such things as:

  • Create a new [ Item => Publication ].
  • Delete an abandoned [ User => Team => Project ].
  • View an unpublished [ Item => Audit_Report ].
  • Connect a [ User => Person ] to a [ User => Team ].
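
The bracket notation in the list above appears to name an information type by its derivation path, so [ User => Team => Project ] would read as "Project, derived from Team, derived from User". Under that assumption, a small sketch of the type hierarchy might look like the following; the PARENT table and both helper functions are hypothetical and not part of D*ChoC.

```python
# Assumed derivation table: child information type -> the type it derives from.
PARENT = {
    "Publication": "Item",
    "Audit_Report": "Item",
    "Capability": "Item",
    "Person": "User",
    "Team": "User",
    "Project": "Team",
}


def derivation_path(type_name):
    """Render a type in the bracket notation used in the list above."""
    chain = [type_name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return "[ " + " => ".join(reversed(chain)) + " ]"


def is_a(type_name, ancestor):
    """True if type_name is ancestor itself or derives from it."""
    current = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


print(derivation_path("Project"))
print(is_a("Project", "User"))
```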

 

Create, Delete, Edit, View, Connect and Disconnect are known as Actions. Most governance systems would also include actions such as Audit, Grant_Permission, Revoke_Permission, etc. An Action describes what can be done, but doesn’t in itself allow anybody or anything to do it. That is the purpose of Capabilities. Each Action can be defined as an information type derived from Item. This is what the part of D*ChoC that handles capabilities and permissions might look like before Capabilities are assigned.

The next step is to add connections between Users and the Actions that they are allowed to perform. By using an [ Item => Capability ] item we are able to put time constraints on the ability, e.g. to prevent edits after a particular step in a workflow. At this point, a portion of the Governance part of D*ChoC looks like this:

 

It would be tedious to have to grant everybody the right to run the Translation Tool, so note that connecting the Capability to the whole Organization and the Run Action will allow an application wrapper (using the D*ChoC REST API) to check that it is OK for anyone in the organization to run the tool.
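
The check that such an application wrapper performs can be sketched as follows. This is an in-memory illustration only: it does not call the D*ChoC REST API, and the Capability class, the MEMBER_OF table, the may_perform function, the organization name and the cut-off date are all assumptions. The sketch covers both ideas from this section: a Capability held by a whole Organization, and a Capability that expires at a given time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical Org_Structure connections: member -> groups it belongs to.
MEMBER_OF = {
    "Arthur Grump": {"Editorial Team"},
    "Editorial Team": {"Publishing Organization"},
}


@dataclass(frozen=True)
class Capability:
    holder: str   # a Person, Team or Organization
    action: str   # e.g. "Run", "Edit"
    target: str   # the item or item type the action applies to
    valid_until: Optional[datetime] = None  # None means no time limit

    def active(self, when: datetime) -> bool:
        return self.valid_until is None or when < self.valid_until


CAPABILITIES = [
    # Granted once to the whole organization instead of to each person.
    Capability("Publishing Organization", "Run", "Translation Tool"),
    # Time-limited: editing stops at an assumed workflow cut-off date.
    Capability("Arthur Grump", "Edit", "Publication",
               valid_until=datetime(2024, 6, 1)),
]


def holders_for(user):
    """The user plus every group reachable through MEMBER_OF."""
    seen = {user}
    stack = [user]
    while stack:
        for group in MEMBER_OF.get(stack.pop(), ()):
            if group not in seen:
                seen.add(group)
                stack.append(group)
    return seen


def may_perform(user, action, target, when):
    """True if the user, or any group containing the user, holds an active Capability."""
    holders = holders_for(user)
    return any(
        c.holder in holders and c.action == action
        and c.target == target and c.active(when)
        for c in CAPABILITIES
    )


print(may_perform("Arthur Grump", "Run", "Translation Tool", datetime(2024, 7, 1)))
print(may_perform("Arthur Grump", "Edit", "Publication", datetime(2024, 6, 2)))
```

Resolving membership at check time means a new employee can run the tool as soon as they are connected to a team, with no extra grant.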

 

Summary

We have shown how to use D*ChoC to represent complex organizational structures, nested workflows and capabilities that allow Users to perform authorized actions. Combined with the lineage and provenance features, these components go a long way toward providing a fully functional data governance environment.

 

We can easily extend any of the Information Types and define new Connection Types to enhance the capabilities of the system. We can also automatically update the Digital Chain of Custody via the D*ChoC REST API and use the workflows and capabilities to make sure that the right sequences of actions are taken, but only by authorized users or computer processes.

 

D*ChoC is based on Objectivity’s highly scalable ThingSpan product running on the Microsoft Azure Cloud platform. D*ChoC has the scalability to handle provenance problems of any known magnitude efficiently, and the performance and throughput to handle batch updates, interactive tasks and fast-flowing data feeds simultaneously with complex queries.