Welcome to Objectivity, Inc., makers of the industry-leading Objectivity/DB object-oriented database management platform, Grid Certified (Levels 1 through 6) and SOA compliant. We are the leader in scalable database management solutions for mission-critical, real-time and distributed applications.

Object Oriented Database Learning Center:


 


A Practical Example – A Message Storage and Analysis System


Description: To examine the practical application of Objectivity/DB’s features, let’s design a message system that lets an internet service provider (ISP) monitor very large quantities of e-mail and instant messages and detect undesirable advertisements (spam) and viruses.

The requirements of such a system might be as follows:

  • The system will receive an average of 10,000 messages per second, but at peak times may receive as many as 50,000 messages per second.
  • The average size of a message (with attachments) is 5KB.
  • The system must be able to store all incoming messages – none may be lost.
  • Each valid message must be made available within 2 minutes.
  • Two weeks’ worth of messages will be kept online.
  • All messages (even spam) will be stored in a long-term archive that will be queried.
  • If a virus is detected in a message, the system will:
    • Start an electronic investigation to attempt to determine the source of the virus.
    • Determine how the virus spreads itself and use that information to extend the virus detection capabilities.
  • Attachments will be saved in the database so that queries can be performed. Those attachments in a recognized format will be stored as structured data. Those in an unrecognized format will be stored as unstructured binary or text objects.
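To get a feel for the scale these requirements imply, we can turn the figures above into a rough back-of-envelope estimate. The sketch below assumes decimal units (1 KB = 1,000 bytes), assumes the average rate holds around the clock, and ignores any indexing or replication overhead. None of those assumptions come from the requirements themselves.

```python
# Back-of-envelope sizing from the stated requirements.
# Assumptions (not part of the requirements): 1 KB = 1,000 bytes, the
# average rate is sustained 24 hours a day, and no storage overhead.

AVG_MSGS_PER_SEC = 10_000       # average arrival rate
PEAK_MSGS_PER_SEC = 50_000      # peak arrival rate
AVG_MSG_BYTES = 5 * 1_000       # 5 KB average message, attachments included
ONLINE_DAYS = 14                # two weeks kept online
SECONDS_PER_DAY = 24 * 60 * 60

# Ingest bandwidth at average and peak load, in megabytes per second.
avg_ingest_mb_per_sec = AVG_MSGS_PER_SEC * AVG_MSG_BYTES / 1e6
peak_ingest_mb_per_sec = PEAK_MSGS_PER_SEC * AVG_MSG_BYTES / 1e6

# Number of messages and raw volume held online for two weeks.
online_messages = AVG_MSGS_PER_SEC * SECONDS_PER_DAY * ONLINE_DAYS
online_terabytes = online_messages * AVG_MSG_BYTES / 1e12

print(f"average ingest: {avg_ingest_mb_per_sec:.0f} MB/s")
print(f"peak ingest:    {peak_ingest_mb_per_sec:.0f} MB/s")
print(f"online window:  {online_messages:,} messages, "
      f"{online_terabytes:.2f} TB")
```

Under these assumptions the system must absorb about 50 MB/s on average and 250 MB/s at peak, and the two-week online window holds roughly 12 billion messages, or about 60 TB of raw message data.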

Logical Schema: Let’s propose a schema to define the structures of the data we want to manage. The schema will be composed of classes and their attributes. The most obvious classes that we’ll need are Message, Person, Address and Attachment. The schema proposed here is just a simple example.

Message Class: Basic Attributes. The Message class will have the following attributes:

[Figure: table of the Message class’s basic attributes; the image is not reproduced here.]

Message Format and References: Another attribute might be “format”. Rather than simply defining this as an integer and assigning each format a unique number, let’s create a new class, “Format”, and refer to the message’s format using a reference.

There are currently many different formats for messages, and many more will be created in the future. For e-mails, the standards are defined by “Request for Comments” documents, called RFCs. One of the original baselines was RFC 822, but many later RFCs have revised or extended the format.

By defining a class called Format, we can include the attributes that are common and stable (if any) across all the versions and use this as a base class. We can then define specific formats and add new attributes by deriving new classes from the base class.

As new formats are created, we can either create a new instance of an existing format class with different attribute values or, if no existing class fits, create a whole new class derived from Format.
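The two options just described can be sketched in plain Python. This is only an illustration of the class hierarchy, not Objectivity/DB schema code; the attribute names (`name`, `version`, `defining_rfc`, `protocol`) and the `subject` field on `Message` are hypothetical, chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Format:
    """Base class: attributes assumed to be common to every format."""
    name: str
    version: str


@dataclass
class EmailFormat(Format):
    """Derived class adding an attribute only e-mail formats need."""
    defining_rfc: str = ""


@dataclass
class InstantMessageFormat(Format):
    """A whole new derived class, for formats no existing class fits."""
    protocol: str = ""


@dataclass
class Message:
    subject: str
    format: Format  # a reference to a Format object, not an integer code


# Option 1: a new instance of an existing class with different values.
rfc822 = EmailFormat("Internet e-mail", "1", defining_rfc="RFC 822")
newer = EmailFormat("Internet e-mail", "2", defining_rfc="RFC 2822")

# Option 2: a new class derived from Format.
im = InstantMessageFormat("Instant message", "1", protocol="example-im")

msg = Message("hello", rfc822)
print(type(msg.format).__name__, msg.format.defining_rfc)
```

Because `Message.format` is typed as the base class, a message can refer to any current or future kind of format without a change to `Message` itself.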

Unlike SQL tables, the object-oriented approach gives us the flexibility to fit our data structures to the structures that already exist and to adapt to new structures as they are defined.

If we simply stored a number that matched an entry in a format table, every SQL query would have to know which table or tables to search for that entry (and name them in the FROM clause). By using a reference, we can go directly to the format object, regardless of what specific type of format it is. Changes to the schema are therefore less likely to require changes to the data.
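The difference can be illustrated with a small plain-Python contrast. It is a sketch only: the lookup dictionaries stand in for format tables, and all names here (`EMAIL_FORMATS`, `describe_by_code`, `describe_by_reference`, and so on) are hypothetical.

```python
class Format:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return self.name


class EmailFormat(Format):
    def __init__(self, name, defining_rfc):
        super().__init__(name)
        self.defining_rfc = defining_rfc

    def describe(self):
        return f"{self.name} ({self.defining_rfc})"


# Code-based approach: the caller must know which table holds the code.
EMAIL_FORMATS = {1: "Internet e-mail (RFC 822)"}
IM_FORMATS = {1: "Instant message"}


def describe_by_code(kind, code):
    # Adding a new kind of format means touching this function as well.
    table = EMAIL_FORMATS if kind == "email" else IM_FORMATS
    return table[code]


# Reference-based approach: follow the reference, whatever its class.
def describe_by_reference(fmt):
    return fmt.describe()


print(describe_by_code("email", 1))
print(describe_by_reference(EmailFormat("Internet e-mail", "RFC 822")))
print(describe_by_reference(Format("Instant message")))
```

With the reference, the calling code never names a table or checks a type; a new Format subclass works without any change to `describe_by_reference`.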

Attachments and Relationships: We also want to use references for Attachments, but since a message can have any number of attachments (as opposed to only one format), we’ll want to create either an array of references or a one-to-many relationship.

An attachment can be just about anything, so we’ll simply define the list of attachments as a list of ooObj objects; ooObj is Objectivity/DB’s base persistence-capable class. By doing this, any class we define can be used as an attachment, including other messages. We’ll still create a class called FileAttachment, from which we can derive specific classes for specific kinds of files. For now, we’ll just have two specific classes, TextData and BinaryData.
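As a rough illustration of this hierarchy, here is a plain-Python sketch. `PersistentObject` is a hypothetical stand-in for a persistence-capable base class such as ooObj, not the real Objectivity/DB class, and the constructor parameters (`subject`, `filename`, `text`, `payload`) are placeholders.

```python
class PersistentObject:
    """Hypothetical stand-in for a persistence-capable base class."""


class FileAttachment(PersistentObject):
    def __init__(self, filename):
        self.filename = filename


class TextData(FileAttachment):
    """Attachment stored as text."""
    def __init__(self, filename, text):
        super().__init__(filename)
        self.text = text


class BinaryData(FileAttachment):
    """Attachment stored as raw bytes."""
    def __init__(self, filename, payload):
        super().__init__(filename)
        self.payload = payload


class Message(PersistentObject):
    def __init__(self, subject):
        self.subject = subject
        # Typed only as "any persistent object", so any class we define
        # can appear here, including another Message.
        self.attachments = []


forwarded = Message("original note")
msg = Message("FW: original note")
msg.attachments.append(TextData("readme.txt", "plain text"))
msg.attachments.append(BinaryData("logo.bin", b"\x00\x01"))
msg.attachments.append(forwarded)  # a message attached to a message

for item in msg.attachments:
    print(type(item).__name__)
```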

Obviously, if we find a virus in an attachment, we’ll want to know quickly which message it came from. If we reach attachments by walking each message’s attachment list, the virus scanner can simply keep a reference to the “current message”. But if we search attachments independently, we’ll want each attachment to have a reference to the message it came from.

To do this, we’ll simply define an Objectivity/DB one-to-many bi-directional relationship between Message and Attachment. Each time we add an attachment reference to a message, Objectivity/DB will ensure that the attachment’s OID is added to the message and the message’s OID is set in the attachment.
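Objectivity/DB maintains both ends of a declared relationship for us. To show what “both ends are kept in sync” means, here is a plain-Python sketch that does the bookkeeping by hand. It is not the database API; the class layout and the `add_attachment` method are assumptions made for the example.

```python
class Attachment:
    def __init__(self, name):
        self.name = name
        self.message = None  # the "one" side: the owning message


class Message:
    def __init__(self, subject):
        self.subject = subject
        self.attachments = []  # the "many" side

    def add_attachment(self, attachment):
        """Link an attachment, updating both ends in one step."""
        if attachment.message is self:
            return  # already linked; nothing to do
        if attachment.message is not None:
            # An attachment belongs to one message: unlink the old owner.
            attachment.message.attachments.remove(attachment)
        self.attachments.append(attachment)
        attachment.message = self


msg = Message("invoice")
att = Attachment("invoice.pdf")
msg.add_attachment(att)

# A scanner that finds `att` on its own can go straight to its message.
print(att.message.subject)
```

A scanner that searches attachments independently can follow `att.message` to the source message without scanning every message’s attachment list.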

Sender, Recipients, CC Lists and Addresses: The sender, recipients and CC lists are all composed of Address objects. In a real system, we might make the Address class a base class for all the different kinds of addresses the system will use.

For now, we’ll assume that the Address class simply contains a String for the actual address (e-mail or instant-message ID), a many-to-one bi-directional relationship to a Person object, and a one-to-many bi-directional relationship to other Address objects.

The one-to-many relationship from Address to other Address objects is recursive: it defines a tree in which an address (like group@objectivity.com) can be an alias for further addresses. When we populate the relationship, we leave it empty for a simple address and fill it in for an alias that stands for multiple addresses.
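A small plain-Python sketch shows how such an alias tree could be resolved into simple addresses. It is an illustration only, not Objectivity/DB code: the attribute names, the `expand` method and the example.com addresses are hypothetical, and the guard against revisiting an address is an extra precaution in case the data is not strictly a tree.

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.addresses = []  # one person, many addresses


class Address:
    def __init__(self, value, person=None):
        self.value = value    # e-mail address or instant-message ID
        self.person = person  # many-to-one link to a Person
        self.members = []     # empty for a simple address
        if person is not None:
            person.addresses.append(self)

    def add_member(self, address):
        self.members.append(address)

    def expand(self, _seen=None):
        """Return the simple addresses this address resolves to."""
        seen = set() if _seen is None else _seen
        if id(self) in seen:
            return []  # already visited: skip repeats and cycles
        seen.add(id(self))
        if not self.members:
            return [self.value]
        resolved = []
        for member in self.members:
            resolved.extend(member.expand(seen))
        return resolved


alice = Address("alice@example.com", Person("Alice"))
bob = Address("bob@example.com", Person("Bob"))
group = Address("group@objectivity.com")  # an alias, so no Person
group.add_member(alice)
group.add_member(bob)

print(group.expand())
print(alice.expand())
```

A simple address expands to itself, while an alias expands to every simple address reachable beneath it, however deeply the aliases are nested.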
