Object Oriented Databases: Object Oriented Database Learning Center

Object Oriented Database vs Relational Database

Information Model Issues


As the names imply, ODBMSs and RDBMSs differ in the models they allow the application programmer to use to represent information and the processing of it. RDBMSs support:

  • Tables (as data structures)
  • Select, Project, Join (as operators)

All application information and processing must be translated to these. ODBMSs support:

  • Any user-defined data structures
  • Any user-defined operations
  • Any user-defined relationships

These result in several differences.

First, consider relationships. Many applications must model relationships among pieces of information; e.g., among customers / products / account-representatives, or students / faculty / courses, or clients and portfolios of securities, or telecommunications networks / switches / components / users / data / voice, or manufacturing processes and products, or documents / cross-references / figures, or web sites / html objects / hyperlinks. In a relational system, to represent a relationship between two pieces of information (tuples, rows), the user must first create a secondary data structure (foreign key), and store the same value of that foreign key in each structure. Then, at runtime, to determine which item is connected to which, the RDBMS must search through the items in the tables, comparing foreign keys, until it discovers two that match. This search-and-compare, called a join, gets slower as tables grow in size. The join is quite flexible, but its cost makes it the weak point of relational systems.

Instead, in an ODBMS, to create a relationship, the user simply declares it. The ODBMS then automatically generates the relationship and all that it requires, including operations to dynamically add and remove instances of many-to-many relationships. Referential integrity, such a difficult proposition in RDBMSs, usually requiring users to write their own stored procedures, falls out transparently and automatically. Importantly, the traversal of the relationship from one object to the other is direct, without any need for join or search-and-compare. This can literally be orders of magnitude faster, and, unlike the RDBMS approach, scales with size. The more relationships, the more benefit is gained from an ODBMS.
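
The contrast can be sketched in plain Python. This is only an illustration under stated assumptions, not any ODBMS's actual API: the names Customer, Order, link, and orders_by_join are hypothetical, the "join" is a naive scan with no index, and the relationship shown is one-to-many rather than many-to-many.

```python
from dataclasses import dataclass, field

# Relational style: rows carry a foreign key, and finding related rows
# means comparing keys (here a naive scan of the whole table, no index).
orders_table = [
    {"id": 10, "customer_id": 1, "item": "disk"},
    {"id": 11, "customer_id": 2, "item": "tape"},
    {"id": 12, "customer_id": 1, "item": "cable"},
]

def orders_by_join(customer_id):
    """Search-and-compare: test every order row's foreign key."""
    return [row["item"] for row in orders_table if row["customer_id"] == customer_id]

# Object style: the relationship is stored as direct references in both
# directions, so traversal needs no key comparison at all.
@dataclass(eq=False)
class Customer:
    name: str
    orders: list = field(default_factory=list)

@dataclass(eq=False)
class Order:
    item: str
    customer: "Customer" = None

def link(customer, order):
    """Update both ends together so the two directions never disagree."""
    if order.customer is not None:
        order.customer.orders.remove(order)
    order.customer = customer
    customer.orders.append(order)

ada, grace = Customer("Ada"), Customer("Grace")
disk, cable = Order("disk"), Order("cable")
link(ada, disk)
link(ada, cable)
link(grace, cable)  # re-linking also removes cable from ada.orders
```

Because link is the only place either end is changed, a dangling or one-sided reference cannot arise in this sketch, which is the kind of referential integrity the paragraph above describes the ODBMS as providing automatically.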

Another typical problem area in RDBMSs is varying-sized structures, such as time-series data or history data. Since the RDBMS supports only fixed-size tables, extra structures need to be added, resulting in extra complexity and lower performance. In order to represent such varying-sized structures, the user must break them into some combination of fixed-size structures, and manually manage the links between them (see figure for an example of this). This requires extra work, creates extra opportunity for error or consistency loss, and makes access to these structures much slower.

Instead, in an ODBMS, there is a primitive to support varying-sized structures. This provides an easy, direct interface for the user, who no longer needs to break such structures down. Also, it is understood all the way to the storage manager level, for efficient access, allocation, recovery, concurrency, etc.
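
As a rough illustration of the difference, the sketch below stores a time series two ways. Everything here is an assumption made for the example: the chunk width CHUNK, the row layout, and the helper names are hypothetical, and a Python list merely stands in for a varying-sized array primitive.

```python
from dataclasses import dataclass, field

CHUNK = 3  # hypothetical fixed row width for the table-style layout

def to_chunks(series_id, samples):
    """Break a varying-sized series into fixed-width rows linked by (series_id, seq)."""
    rows = []
    for seq, start in enumerate(range(0, len(samples), CHUNK)):
        part = samples[start:start + CHUNK]
        padded = part + [None] * (CHUNK - len(part))  # pad the last row
        rows.append({"series_id": series_id, "seq": seq, "values": padded})
    return rows

def from_chunks(rows, series_id):
    """Reassemble: filter by key, order by sequence number, strip padding."""
    mine = sorted((r for r in rows if r["series_id"] == series_id),
                  key=lambda r: r["seq"])
    return [v for r in mine for v in r["values"] if v is not None]

@dataclass
class Sensor:
    """Object style: the varying-sized data is simply part of the object."""
    name: str
    readings: list = field(default_factory=list)  # stands in for a varray

rows = to_chunks(7, [1.0, 2.0, 3.0, 4.0])
sensor = Sensor("thermo")
sensor.readings.extend([1.0, 2.0, 3.0, 4.0])
```

The chunked layout needs two helpers and a padding convention just to round-trip the data, while the object version needs none; that bookkeeping is the "extra work" referred to above.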

A similar situation arises when a user wishes to alter a few rows in a table. Suppose, for example, that two new fields are to be added to 3 rows (see figure). The RDBMS user is left with two choices. He can enlarge all rows, wasting space (and hence time, also, because of increased disk I/O). Or, he can create a separate structure for those new columns, and add foreign keys to all rows to indicate which have these extra columns. This still adds some (though less) overhead to all rows of the table, but also now adds a new table, and slow joins between the two.

The ODBMS user has no such problem. Instead, the ODBMS manages the changes directly and efficiently. The user simply declares the changes as a subtype (certain instances of the original type are different), and the ODBMS manages the storage directly from that, allocating extra space just for those instances that need it, with no extra foreign key or join overhead.
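
A minimal sketch of the subtype idea, using Python dataclasses rather than any real ODBMS schema language. The class and field names (Part, InspectedPart, inspector, inspected_on) are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class Part:
    part_no: int
    name: str

@dataclass
class InspectedPart(Part):
    # The two extra fields exist only on instances of the subtype.
    inspector: str
    inspected_on: str

parts = [
    Part(1, "bolt"),
    InspectedPart(2, "valve", "Lee", "2007-01-15"),
    Part(3, "nut"),
]

def field_names(obj):
    """List the fields actually carried by this particular instance."""
    return [f.name for f in fields(obj)]
```

Plain Part instances carry no placeholder for the new fields, and code written against Part continues to accept InspectedPart instances unchanged.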

Flexibility can be critical to many applications, in order to vary structures for one use or another, or vary them over time as the system is extended. The RDBMS structures are static, fixed, with hard rectangular boundaries, providing little flexibility. Changes to structures or additions typically are quite difficult and require major changes. With an ODBMS, the user may freely define any data structure, any shape. Moreover, at any time, such structures may be modified into any other shape, including automatic migration of preexisting instances of the old shape. Any new structures can always be added.

Simple and Complex Objects

Such structures in the ODBMS may be very simple, with just a few fields, or may be very complex. Using the varray (varying-sized array) primitive, an object can have dynamically changing shape. In fact, by combining multiple such varrays into a single object, very complex objects can be created.

Composite Objects

ODBMS structures can also include composite objects, or objects that are composed of multiple other (component) objects. Any network of objects, connected together by relationships, can be treated as a single (composite) object. Not only may it be addressed structurally as a unit, but operations may propagate along the relationships to all components of the composite. In this way, objects may be created out of objects, and so on, to any number of levels of depth. Since the relationships are typed, a single component object may also participate in a different composite, allowing any number of dimensions of breadth. Thus, arbitrary complexity may be modeled. More importantly, additional capability may always be added. Users may always add new composites, which perhaps thread through some of the old composites' objects as well as adding some new ones, without limit. There is no complexity barrier or relational wall.
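
The propagation of an operation through a composite can be sketched as follows. This is an illustrative in-memory model only; the Component class, its parts relationship, and the cost figures are assumptions made for the example.

```python
class Component:
    def __init__(self, name, cost=0.0):
        self.name = name
        self.cost = cost
        self.parts = []  # "contains" relationship to component objects

    def add(self, part):
        self.parts.append(part)
        return self  # allow chained construction

    def total_cost(self):
        """Operation that propagates along the relationship to all components."""
        return self.cost + sum(p.total_cost() for p in self.parts)

    def walk(self):
        """Visit the composite as a unit, depth first."""
        yield self
        for p in self.parts:
            yield from p.walk()

bolt = Component("bolt", 1.0)
wheel = Component("wheel", 20.0).add(bolt)
frame = Component("frame", 50.0).add(bolt)  # same bolt object, second composite
bike = Component("bike").add(wheel).add(frame)
```

Here the single bolt object participates in two composites (wheel and frame), and calling bike.total_cost() reaches it through both without the caller knowing how deep the structure goes.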

In an attempt to allow storage of complex structures, RDBMSs have begun to add BLOBs, or Binary Large Objects. Unfortunately, to the DBMS, the BLOB is a black box; i.e., the DBMS does not know or understand what is inside the BLOB, as the ODBMSs do for objects. In fact, storing information in a BLOB is really no different from storing it in a flat file, and linking that to the DBMS. Flat files can be useful, but with a BLOB the DBMS cannot support any of its functionality internally to the BLOB, including concurrency, recovery, versioning, etc. It's all left to the application.

Similarly, in an attempt to support user-defined operations, some RDBMSs now support stored procedures. On certain DBMS events (e.g., update, delete), such procedures will be invoked and executed on the server. This is much like executing methods in an ODBMS, except that ODBMS methods may apply to any events (not just certain ones, as in RDBMS); may execute anywhere (not just on the server); and may be associated with any structures or abstractions. Also, many RDBMS features may not work with stored procedures, because they're not tables. In particular, the robustness and scalability of the table-based engine would not necessarily apply to them, since they're newly added outside that engine.

Very similar to stored procedures are Data Blades or Data Cartridges. These are pre-built procedures to go with BLOBs. They add the ability to do something useful with the BLOBs. In this way, they're similar to class libraries in an ODBMS, except that the ODBMS class libraries may be written by users (data blades typically require writing code that inserts into the RDBMS engine); and may have any associated structures and operations.

All these efforts (BLOBs, stored procedures, and data blades) stem from a recognition by RDBMS vendors that they lack what users need; viz., arbitrary structure, arbitrary operations, and arbitrary abstractions. Unfortunately, they take only a small step in this direction, providing only limited structures, predefined operations and classes. If the RDBMS were modified enough to generalize BLOBs to any object structure, and stored procedures to any object method, and data blades to any object class, it would require rebuilding the core DBMS engine to support all these, instead of just tables, and the result would be an ODBMS.

Data often changes over time, and tracking those changes can be an important role of the DBMS. The RDBMS user must create secondary structures and manually copy data, label it, and track it. The ODBMS user may simply turn on versioning, and the system will automatically keep history. This is possible for two reasons. First, the ODBMS understands the application-level objects, so it can version at the level of those objects, as desired. In the RDBMS, the data for an application entity is scattered through different tables and rows so there is no natural unit for versioning. Second, the ODBMS includes the concept of identity, which allows it to track the state of an object even as its values change. The history of the state of an object, of the value of all its attributes, is linear versioning. Branching versioning, in addition, allows users to create alternative states of objects, with the ODBMS automatically tracking the genealogy.
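
Linear and branching versioning under a stable identity can be sketched like this. It is a toy in-memory model, not how any particular ODBMS stores versions; the VersionedObject class and its method names are hypothetical.

```python
import itertools

class VersionedObject:
    """Keeps every state of one object under a single, stable identity."""

    def __init__(self, oid, **state):
        self.oid = oid                     # identity, never changes
        self._ids = itertools.count(1)
        self.versions = {}                 # version id -> (parent id, state)
        self.current = self._commit(None, state)

    def _commit(self, parent, state):
        vid = next(self._ids)
        self.versions[vid] = (parent, dict(state))
        return vid

    def update(self, base=None, **changes):
        """Create a new version from `base` (default: the current one).

        Basing the update on an older version starts a branch.
        """
        parent = self.current if base is None else base
        state = {**self.versions[parent][1], **changes}
        self.current = self._commit(parent, state)
        return self.current

    def state(self, vid=None):
        return dict(self.versions[self.current if vid is None else vid][1])

    def genealogy(self, vid):
        """Chain of ancestors from the first version down to `vid`."""
        chain = []
        while vid is not None:
            chain.append(vid)
            vid = self.versions[vid][0]
        return chain[::-1]

doc = VersionedObject("doc-1", title="Draft", pages=3)  # version 1
v2 = doc.update(pages=4)                                # linear: 1 -> 2
v3 = doc.update(base=1, title="Alt")                    # branch: 1 -> 3
```

The object's identity (oid) is what ties the versions together: the values change from version to version, but every state is still a state of the same object.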

In addition to supporting arbitrary structures and relationships, objects also support operations. While the RDBMS is limited to predefined operations (select, project, join), the ODBMS allows the user to define any operations he wishes at any and all levels of objects. This allows users to work at various levels. Some can work at the primitive level, as in an RDBMS, building the foundation objects, which others can then use without having to worry about how they're implemented internally. Progressively higher levels address different users, all the way to end users, who might deal with objects such as manufacturing processes, satellites, customers, products, and operations such as measure reject rate, adjust satellite orbit, profile customer's buying history, generate bill of materials, etc. As new primitives are added, the high level users immediately can access them without changing their interface.

Encapsulating objects with such operations provides several benefits, including integrity, ability to manage complexity, dramatically lower maintenance cost, and higher quality. Integrity results from embedding desired rules into these operations at each level, so that all users automatically benefit from them. As we saw in the section above, RDBMSs allow (indeed, force) users to work at the primitive level, so they might violate or break higher level structures, or change primitive values without making corresponding changes to other, related primitives. The encapsulated operations prevent such integrity violations. The same remains true with GUI tools, because they, too, get the benefit of the higher-level objects with built-in operations.
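
To make the integrity point concrete, here is a small sketch in which the rule lives inside the operation. The StockItem class, its ship operation, and the InsufficientStock error are hypothetical names chosen for the example.

```python
class InsufficientStock(Exception):
    pass

class StockItem:
    def __init__(self, sku, on_hand):
        self.sku = sku
        self._on_hand = on_hand
        self._shipped = 0

    @property
    def on_hand(self):
        return self._on_hand

    @property
    def shipped(self):
        return self._shipped

    def ship(self, qty):
        """The only way to reduce stock; the rule is checked for every caller."""
        if qty <= 0:
            raise ValueError("quantity must be positive")
        if qty > self._on_hand:
            raise InsufficientStock(f"{self.sku}: only {self._on_hand} on hand")
        # Both related values change together, never one without the other.
        self._on_hand -= qty
        self._shipped += qty

item = StockItem("A-100", on_hand=5)
item.ship(3)
try:
    item.ship(10)
    rejected = False
except InsufficientStock:
    rejected = True
```

Any application or GUI tool that goes through ship gets the same check, and the two counters cannot drift apart the way two independently updated primitive values could.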

Operations allow managing greater complexity by burying it inside lower structures, and presenting unified, more abstract, user-level interfaces. Encapsulation can dramatically reduce maintenance cost by hiding the internals of an object from the rest of the system. Typically, large systems become complexity-limited because any change to one part of the system affects all others. This spaghetti-code tangle grows until it becomes impossible, in practice, to make more changes without breaking something else somewhere. Instead, with an ODBMS, changing the internals of the object is guaranteed not to affect any users of the object, because they can only access the object through its well-defined external interface, its operations. In addition, new sub-types of objects can be added, extending functionality, without affecting higher level objects, applications, or users.

For example, a graphical drawing program might include a high-level function (and user interface menu or button) to redraw all objects. It would implement this by invoking the draw operation of all objects. Different objects might have very different implementations (e.g., circles vs. rectangles), but the high-level routine doesn't know or care; it just invokes draw. If a new sub-type of drawing object is added (e.g., trapezoid), the high-level redraw function simply continues to work, and the user gets the benefit of the new objects. Similarly, if a new, improved algorithm is implemented for one of the drawing object subtypes (e.g., a better circle drawing routine for certain display types), the high level redraw code continues to work as-is, unchanged.
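
The drawing example can be sketched directly. The class names and the strings returned by draw are illustrative assumptions; a real program would render to a display rather than return text.

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def draw(self):
        """Return a description of what would be rendered."""

class Circle(Shape):
    def __init__(self, r):
        self.r = r

    def draw(self):
        return f"circle r={self.r}"

class Rectangle(Shape):
    def __init__(self, w, h):
        self.w, self.h = w, h

    def draw(self):
        return f"rectangle {self.w}x{self.h}"

def redraw_all(shapes):
    """High-level routine: it only knows that every object has draw()."""
    return [s.draw() for s in shapes]

# Added later; redraw_all is not edited.
class Trapezoid(Shape):
    def __init__(self, a, b, h):
        self.a, self.b, self.h = a, b, h

    def draw(self):
        return f"trapezoid a={self.a} b={self.b} h={self.h}"

before = redraw_all([Circle(1), Rectangle(2, 3)])
after = redraw_all([Circle(1), Rectangle(2, 3), Trapezoid(2, 4, 3)])
```

redraw_all is identical before and after Trapezoid is introduced, which is the point of the example: the new subtype extends the system without touching the higher-level code.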

Finally, objects encapsulated with their operations can improve quality. The old approach to structuring software, as embodied in an RDBMS, creates a set of shared data structures (the RDBMS) which are operated upon by multiple algorithms (code, programs). If one algorithm desires a change to some data structure, all other algorithms must be examined and changed for the new structure. Similarly, if an algorithm from last year's project turns out to be the one you wish to use this year, you must still copy it over and go through and edit it for the new project's data structures. Code reuse requires changing it, creating a new piece of software with new bugs that must be eradicated, and creating a new maintenance problem. Instead of building on past work, software developers are continually restarting from scratch. With ODBMS objects, however, the code and the data it uses are combined, so changes can be made to them together, without breaking or affecting the other objects. If an object from last year's project does what we need for this year's project, we can simply use it as-is, without copying and changing. It's already tested and debugged and working, so we gain higher quality by building on the work of the past. Even if the old object isn't exactly what we want, we can define a new subtype for that object (inheritance), specifying only the difference (delta) between the new and old. This gives the flexibility to allow reuse in practice, reduces the required new programming (and debugging and potential quality problems) to the much smaller delta, and reduces the maintenance by keeping the same, main object for both the new and the old system.
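
Reuse by specifying only the delta might look like the following sketch. The Invoice and TaxedInvoice classes and the 25% rate are hypothetical, chosen only to keep the arithmetic easy to check.

```python
class Invoice:
    """Last year's object: already tested, reused as-is."""

    def __init__(self, lines):
        self.lines = list(lines)  # (description, amount) pairs

    def subtotal(self):
        return sum(amount for _, amount in self.lines)

    def tax_rate(self):
        return 0.0

    def total(self):
        return round(self.subtotal() * (1 + self.tax_rate()), 2)

class TaxedInvoice(Invoice):
    """This year's variant: only the difference is written."""

    def tax_rate(self):
        return 0.25

lines = [("license", 100.0), ("support", 20.0)]
old_total = Invoice(lines).total()
new_total = TaxedInvoice(lines).total()
```

The subtype adds one small method; subtotal and total are inherited unchanged, so both the old and the new system keep sharing the same main object.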

Last, we'll address complexity. Some might look at all this information modeling capability, including versioned objects, composites, subtypes, etc., and ask what price is paid for this increased complexity. In general, no price is paid. In fact, there is a reduction in complexity as seen by the user. The question is really backwards. The complexity lies not in the DBMS, but in the application, in the problem it's trying to solve. If it is simple, then the data structures and operations in the ODBMS will be simple. If the application's problem is inherently complex, the more sophisticated modeling capabilities of the ODBMS allow that complexity to be captured by the ODBMS itself, so it can help. Instead of forcing the application programmer himself to break that complexity down to primitives, the ODBMS allows him to use higher level modeling structures to directly capture the complexity, and then manages it for the user based on the structure and operations the user has defined. In short, the complexity is in the application, not the DBMS.

The same is true for queries. The RDBMS query system might look simpler, closed, and easier to predict, and in itself it is. However, the application's desired query is not. It must be translated from the natural, high-level application structures and operations down to the primitive RDBMS structures and operations, and the complexity is all in that translation. Instead, in an ODBMS, the high level structures and operations can be used directly in the query, and the ODBMS query engine executes the query itself, with no need for the user to translate to lower-level primitives. True, the resulting ODBMS query and query system are more complex, but that is because of the nature of the query. The same complexity was present in the RDBMS-based version of the complex application query. The only difference is who handles the complexity, the user or the DBMS.
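
As a loose illustration of using a high-level operation directly in a query, the sketch below filters objects by calling one of their own methods. It is ordinary in-memory Python, not an ODBMS query language; the Process class, the query helper, and the sample figures are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Process:
    name: str
    produced: int
    rejected: int

    def reject_rate(self):
        """Application-level operation, defined once on the object."""
        return self.rejected / self.produced if self.produced else 0.0

processes = [
    Process("stamping", 1000, 12),
    Process("welding", 400, 40),
    Process("painting", 800, 8),
]

def query(objects, predicate):
    """Stand-in for a query engine: evaluate the predicate against each object."""
    return [o for o in objects if predicate(o)]

# The query is phrased in the application's own terms.
high_reject = query(processes, lambda p: p.reject_rate() > 0.05)
names = [p.name for p in high_reject]
```

The person writing the query names the operation (reject_rate) rather than spelling out how it is computed from primitive fields; in this sketch that computation lives in one place, on the object.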

Copyright © Objectivity, Inc. 2000-2007. All Rights Reserved.