What is IBM Domino / Notes ?

Domino is a powerful business application development tool.
It is far more than just a system to handle your email.

Building business applications in Domino and XPages will be easier if you understand the core principles behind the unusual Notes data model.

What was Notes designed to accomplish ?

The Domino product is now marketed by IBM as a "Social business application platform".
Interestingly, Domino has always provided the same core functionality. That functionality was once known as "groupware";
now the same services are called "Social Business". Notes was just 20 years ahead of its time. (Thank you, Ray Ozzie!)

Domino allows multiple, "sometimes connected" users to work cooperatively on semi-structured information.

In the earlier (Lotus) Notes product, a powerful engine served up information to the Notes desktop client.
Lotus realized that this server engine could be used to deliver the same information to other front ends.
They developed a task, running on the (then) Notes server, that delivered Notes database information to a web browser.
When this functionality was being beta tested the code name was Domino.
Beta testers liked the name so much that Lotus decided to rename the complete server product "Domino".

Domino... An application development platform.

Domino comes with pre-written functionality "out of the box".
The system contains a mail file to handle all your incoming and outgoing email, calendar/scheduling, task tracking, etc.
There is an address book to store individual, group and server information.
These are very powerful Domino databases.
In addition, the product delivers many database templates that can be used to create a host of useful applications:
Discussion databases, document tracking databases, simple workflow applications etc.

However, to get the best return on your investment, you should think of Domino as an integrated development platform.
With Domino you can create outstanding Collaboration / Groupware / Workflow applications (Social Business applications!)

Cooperatively processing semi-structured information.

What do I mean by "cooperatively process semi-structured information" ?
To clarify, let's draw a distinction with someone using a traditional (relational) database to process "structured information".
Imagine a teller at a bank processing a customer's account record.
In a traditional database, the account record might hold Account Number, Account balance, Last Name, First Name, Social Security Number, Date created, Date last accessed.
Every Account record will hold the same type of information, in the same format and will hold only that information.
When the teller needs to modify the information held for an account, the record is locked to prevent other users from updating it at the same time, the change is made, and the record is released.

In contrast, an example of "sometimes connected, multiple users cooperatively processing semi-structured information" might be a product development team collaborating on the design for a new product.
This is where Notes shows its strength...
The members of the design team will all want to add their own input. They will want to discuss and track each other's contributions. They will want to add their own sketches of the product. They will want to make simultaneous changes to the design. They will want to carry the plans with them during airline trips while they are "off-line" and offer feedback when inspiration hits. When they are back online, they will want their changes to be incorporated into the master database.

So how do you design a system to store and process semi-structured information ?

First, forget all you have learnt about traditional relational databases.

Let's go back to the beginning...
Imagine you want to store some pieces of information in the most general way possible.
You set up some containers to hold the information. Let's call these containers "buckets".
In front of you, imagine you can see a number of these buckets.
Into the first bucket you throw some information. Some more information goes into the second bucket and so on.
That's about as general as you can get.

Now you have a problem. To retrieve the information from one of your buckets you need to be able to identify the bucket.
You could use one of the pieces of information that you put in the bucket. For example, a person's name or phone number.
But the name might change, and you would lose track of the bucket. Instead we will generate a random number to identify the bucket.

Next problem... Having been supplied with the ID of the bucket, we need to find the right piece of information.
We need to be able to identify the items of information we threw in the bucket.
So, to each piece of information we will assign a label called the "item name". One item name may be "LastName"; another might be "PhoneNumber".
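
The bucket idea can be pictured with a few lines of Python. This is only a rough sketch of the concept, not how Notes is implemented; the names store and new_bucket are made up for the illustration.

```python
import uuid

# The whole store: bucket ID -> {item name -> value}
store = {}

def new_bucket(**items):
    """Create a bucket, file it under a random ID, and return that ID."""
    bucket_id = uuid.uuid4().hex   # random identifier, independent of the contents
    store[bucket_id] = dict(items)
    return bucket_id

# Two buckets holding completely different items
person_id = new_bucket(LastName="Smith", PhoneNumber="555-0100")
sketch_id = new_bucket(Title="Prototype sketch", Revision=3)

# The name can change without our losing track of the bucket
store[person_id]["LastName"] = "Jones"
print(store[person_id]["PhoneNumber"])
```

Because the identifier is not taken from the contents, any item in the bucket can be changed or removed and the bucket can still be found.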

Please Note: we are not talking about a traditional, relational database !

Specifically....

We now have a system to store our semi-structured information. We will call this "collection of buckets" a database.
And, since a "bucket" is not a particularly good image, let's call our storage area a "Note".
(These Notes will also be referred to as documents, a term more familiar to our users).
In each Note we have "items", and each item has its own "item name" (and an item may hold a list of values).
This nested structure of one element containing other elements has been referred to as the Notes container model.

We will call our product "Notes" and since our "database" is an object store for these Notes, we will call it a "Notes Storage Facility".
And the file that we create on the computer will be given an extension of NSF.

Notice that we have not yet said anything about the user interface.
The uniqueness of the Notes product lies in its storage paradigm.
The possibility exists to develop many front end interfaces: Notes clients, Web browsers, Java applets, VB programs etc.

How does the user access the information in these "Notes" ?

We want the user to be able to see the value of items stored in a document.
We will create a display "form". The form will be the mechanism we use to show the user what is in a document.
The areas on the form that display the values stored in the items on a document will be called fields.
One field on the form may be called "AccountNumber".
If the document has an item named "AccountNumber", then the value of the item AccountNumber on the document will be displayed in the field AccountNumber on the form.
If the document does not contain an item named AccountNumber then the field AccountNumber on the form will be left blank.
It can help to imagine the form as a template through which we are viewing the document. If a field on the form lines up with an item on the document, then the item value will be seen.

How do we handle the fact that we do not know what data type the item on a document will be (whether it will be a number, a date or a string)? We will always convert the item to text. What about the fact that there may be many items with the same name? We will display the value stored in the first item with that name that we come across on the note.
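
The rules above (a field shows the matching item, shows nothing if there is no such item, always shows text, and uses the first item of that name) can be sketched in Python. This is an illustration of the idea only; the function render and the sample data are hypothetical, and a note is modelled as a list of (item name, value) pairs so that names can repeat.

```python
def render(form_fields, note_items):
    """Return {field name: displayed text} for one document seen through one form.

    form_fields: list of field names on the form
    note_items:  list of (item name, value) pairs; names may repeat
    """
    shown = {}
    for field in form_fields:
        # Take the first item with a matching name, if there is one
        value = next((v for name, v in note_items if name == field), None)
        # Everything is displayed as text; a missing item leaves the field blank
        shown[field] = "" if value is None else str(value)
    return shown

note = [("AccountNumber", 1234), ("Balance", 50.25), ("AccountNumber", 9999)]
display = render(["AccountNumber", "LastName"], note)
print(display)
```

Note that the Balance item is never shown, simply because this form has no field that lines up with it.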

So now we have forms to display documents. But which form will we use to display a document? As we have said, two documents may contain completely different items. Which form do we use? As yet there is no connection between the form and the document.

OK. We will create a special item on a document with the name "Form". The value of that item will determine which form we will use to display the document. So if the value of the item "Form" is "Company", we will use the form called Company.
If the document does not have a "Form" item, we will use whichever form we have designated as the default form for the database.
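
As a small Python sketch of that lookup (an illustration only; form_for and the default form name "Memo" are assumptions made for the example):

```python
DEFAULT_FORM = "Memo"   # assumed: the form designated as the database default

def form_for(note):
    """Return the name of the form used to display this note."""
    return note.get("Form", DEFAULT_FORM)

acme = {"Form": "Company", "Name": "Acme Ltd"}
loose = {"Name": "No Form item here"}

print(form_for(acme))    # Company
print(form_for(loose))   # Memo

# Changing the item changes how the note is displayed, and nothing else
acme["Form"] = "Employee"
print(form_for(acme))    # Employee
```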

Just because the value of the item "Form" on a document has the value "Company" does not make this a "company" record.
The Notes data model contains no concept of a "record type". All documents in a database are equivalent as far as Notes is concerned.

Having created a form to help display the documents, where does Notes store this form design information?
It stores the design information in other Notes documents. In effect we now have "user documents", storing data and "design documents" storing design elements.
The "user documents" can be accessed by users, the "design documents" can be accessed only by database designers.

The layout information is stored in a rich text field on these design documents. When we look at our data, Notes reconstructs the form layout from the instructions it finds in this rich text field.
As it follows the instructions to build the form layout, it might find reference to another form: a subform. Notes will then locate the subform design document and include the instructions found there.
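
The subform step can be pictured as a recursive expansion. The Python below is a sketch of that idea only: real form layouts live in rich text, which looks nothing like these tuples, and the names design_notes and build_layout are invented for the example.

```python
# Hypothetical design notes: form name -> layout instructions
design_notes = {
    "Company": [("text", "Company details"), ("field", "Name"), ("subform", "Address")],
    "Address": [("field", "Street"), ("field", "City")],
}

def build_layout(form_name):
    """Expand a form's instructions, pulling in any subforms it references."""
    layout = []
    for kind, value in design_notes[form_name]:
        if kind == "subform":
            layout.extend(build_layout(value))   # include the subform's instructions
        else:
            layout.append((kind, value))
    return layout

print(build_layout("Company"))
```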

How do we locate a particular document ?

We have documents stored in a database and a mechanism (the form) to look at the items stored on each document.
Next we need some way to locate a specific document.

Imagine we have a database of documents, some of which store information about employees in the company.
We want to see a list of these documents.
First... We need to select only those documents that hold employee information. (Others might hold, say, department information.)
In a traditional database, employee information will be stored in "employee records" and we would select to see only those Employee records.
However... Notes has no concept of "record type". How do we identify employee documents? If we have been consistent, all our employee documents will have an item on them called "Form" with a value of "Employee". So we can create a condition:

SELECT form="Employee"

(Please note: this means that the value of the item "form" is "Employee". It DOES NOT mean that we are searching for the documents with a "form type" of Employee!)

Alternatively, assume that every employee document also has an item "salary". The salary item is always filled in and only exists on employee documents. We could then have a selection condition:

SELECT salary > 0

This condition would also select the employee documents.

It is important to realize that, as far as Notes is concerned, these two ways of selecting the documents are equivalent. In a selection condition there is nothing special about the item "Form". The Form item is just one more entry on the document. Its value does not determine the document's "record type". In fact, the item "Form" might not even be present on some of the "employee" documents.
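
To make the point concrete, here is a Python analogue of the two selection conditions. It is not formula language and not a Notes API; the sample documents and the helper select are made up. Both conditions are just tests on item values, and they give different results as soon as an "employee" document lacks a Form item.

```python
notes = [
    {"Form": "Employee", "Name": "Ann", "Salary": 52000},
    {"Form": "Employee", "Name": "Bob", "Salary": 48000},
    {"Form": "Department", "Name": "Sales"},
    {"Name": "Cy", "Salary": 61000},   # employee data, but no Form item
]

def select(condition):
    """Return the notes for which the condition is true."""
    return [n for n in notes if condition(n)]

by_form = select(lambda n: n.get("Form") == "Employee")
by_salary = select(lambda n: n.get("Salary", 0) > 0)

print([n["Name"] for n in by_form])     # ['Ann', 'Bob']
print([n["Name"] for n in by_salary])   # ['Ann', 'Bob', 'Cy']
```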

Having located the correct set of documents, how can we find the particular one we are interested in ? - The Notes View
Assume I want to find the document that contains my personal employee information.
We make a list of the employee documents using one of the selection criteria above. From each document in the set we list the information that the user might need to identify the employee. Let's pick the value of the EmployeeName item, the SocialSecurityNumber item and the Department item. The values appearing in the listing are referred to as column values.

This listing of item values from a set of documents in a column format is a Notes View.

A Notes view may list any set of documents. To Notes, all documents are equivalent (there being no record type).
A view can therefore contain any mixture of documents. We could show documents containing employee information, departmental information etc. in the same view. If an item called "Name" appeared in both documents, then the View would show "employee" and "department" information jumbled together.
We have seen that when a form has a field that does not correspond to an item on a document, the field is left blank.
Similarly, when a view has a column that does not correspond to an item on a document, the column value is left blank.
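
A view's rows can be sketched the same way in Python (again an illustration only; view_rows and the sample documents are hypothetical). Each column simply asks every document for an item of a given name:

```python
def view_rows(notes, columns):
    """One row per note; a column is blank when the note lacks that item."""
    return [[str(n.get(col, "")) for col in columns] for n in notes]

notes = [
    {"Name": "Ann", "Department": "Sales"},   # "employee" information
    {"Name": "Sales", "Manager": "Bob"},      # "department" information
]

rows = view_rows(notes, ["Name", "Department"])
for row in rows:
    print(row)
```

Both documents show up in the Name column, jumbled together, and the second row has an empty Department column.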

Where does Notes store the information used to construct a view ?
It is another Notes document (everything in Notes is stored in documents). This single document contains an index of all the documents that are to be included in the view.
(The elements in this index are the unique IDs of the documents.) The view document also contains column formatting information (e.g. whether a column is a category, or whether it is to be displayed as an icon).
Other view information includes whether the documents are to be displayed as a response hierarchy (i.e. in parent-child order) or as a flat record structure.

When a user requests to see a view displayed, Notes opens this "view" document, determines whether it must recalculate the set of documents that will appear in the view (i.e. it updates the index list) and then uses the column description items to display the view. This explains why a view can be "out of date". Some views are refreshed only when a user requests an update.
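
The "out of date" behaviour falls out naturally once the index is stored rather than computed on every display. A minimal Python sketch, assuming a view that is refreshed only on request (the View class here is invented for the illustration and is not a Notes API):

```python
class View:
    """A stored list of note IDs plus the rule used to rebuild it."""

    def __init__(self, database, selection):
        self.database = database      # note ID -> note (dict of items)
        self.selection = selection    # function(note) -> bool
        self.index = []               # the stored index: a list of note IDs

    def refresh(self):
        """Recalculate which notes belong in the view."""
        self.index = [nid for nid, n in self.database.items() if self.selection(n)]

    def rows(self):
        """Display the view from the stored index, without recalculating it."""
        return [self.database[nid]["Name"] for nid in self.index]

db = {1: {"Form": "Employee", "Name": "Ann"}}
view = View(db, lambda n: n.get("Form") == "Employee")
view.refresh()

db[2] = {"Form": "Employee", "Name": "Bob"}   # added after the last refresh
stale = view.rows()
view.refresh()
fresh = view.rows()
print(stale, fresh)
```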

Response Documents and "Response to Responses"
I said that Notes has no record types, but what about "Responses" and "Response to responses"?
When you design a form, you have three options: Documents created with the form can be main documents, response documents or response to a response documents.
When Iris developed Notes, I assume they decided that users should be offered some structure when creating documents.
There are many cases where one document is very obviously related to another.
For example: Having entered company information on one document, you might want employee information, entered on a second document, to be tied to the first.
As you recall, each document entered into Notes is assigned a unique ID. The method used to tie the response document to the main document is to add a single field to the response document called $REF. The $REF field holds the unique ID of its parent.
That's it !
The only thing that makes a response document a response document is the presence of the $REF field !
(i.e. There is no "response" document type).
This has some very powerful repercussions:
1) If you delete the $REF field from a "response" document, it is no longer a response document (@IsResponseDoc returns false).
2) If you add a $REF field to a document (with the correct value), it becomes a response to some other document.
3) If you change the value of the $REF field, it will become a response to a different document. (This is what happens if you "cut" a response document and "paste" it while pointing to a different main document.) If you have been wondering how to do this in code, look up the "MakeResponse" method in the NotesDocument class.
4) Parent documents do not know whether they have child documents.
5) Child documents do not know whether their parents still exist. (The parent documents may have been deleted, creating orphan responses).
6) The database has no knowledge of the child-parent structure of the documents contained in it, without examining the individual documents.
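
These consequences are easy to see in a small Python model. It is a sketch only: the helper names are made up, the IDs are shortened to two characters, and real code would use the Notes classes instead.

```python
database = {
    "A1": {"Form": "Company", "Name": "Acme"},
    "B2": {"Form": "Employee", "Name": "Ann", "$REF": "A1"},   # response to A1
}

def is_response(note):
    """A note is a response only because it carries a $REF item."""
    return "$REF" in note

def children_of(parent_id):
    """The parent holds no list of children, so every note must be examined."""
    return [nid for nid, n in database.items() if n.get("$REF") == parent_id]

def is_orphan(note):
    """True when $REF points at a note that is no longer in the database."""
    return is_response(note) and note["$REF"] not in database

kids_before = children_of("A1")
del database["A1"]                      # the child is not told
orphaned = is_orphan(database["B2"])
del database["B2"]["$REF"]              # now it is an ordinary main document
still_response = is_response(database["B2"])
print(kids_before, orphaned, still_response)
```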

How do "responses" differ from "response to responses" ?
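Imagine you have a document hierarchy: a main document with some associated response documents.
Suppose you select one of the responses and create a new document. A "response" is tied to the main document at the top of the hierarchy, while a "response to response" is tied to the document you had selected.

The only difference between these two options (response or response to response) is the unique ID that was used to populate the $REF field.
If you are creating a discussion database, you get a nested response tree by using "response to response" documents.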
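
Here is how that choice of ID might look in Python. This is a sketch under an assumption worth stating loudly: that a "response" always points at the main document at the top of the thread, while a "response to response" points at whichever document was selected when it was created. The function name ref_for_new_document is hypothetical.

```python
def ref_for_new_document(selected_id, database, form_type):
    """Return the ID to store in $REF on a newly created response."""
    if form_type == "response to response":
        return selected_id                 # tie it to the selected document
    # "response": climb from the selected document up to the main document
    current = selected_id
    while "$REF" in database[current]:
        current = database[current]["$REF"]
    return current

database = {
    "main": {"Subject": "New product"},
    "r1": {"Subject": "First reply", "$REF": "main"},
}

# The user has the first reply selected and creates a new document
flat = ref_for_new_document("r1", database, "response")
nested = ref_for_new_document("r1", database, "response to response")
print(flat, nested)
```

Apart from the value that ends up in $REF, the two new documents would be identical.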

Notice that although we have introduced a way of "relating" one document to another, this is very different from the associations in a relational database.
In a relational database two records are related because the first record contains a foreign key field that has the same value as a key field in the second record.
The Notes data model does not relate documents by field values. Notes documents are "related", if at all, by document ID: the child has a pointer ($REF) to its parent.

Replication - Resyncing the database after multiple users have modified their own copy

As you have seen, a Notes Database contains both User Data and Design Elements stored in individual Notes.
Notes developers and End users can be working in their own local copy of the database, modifying the items held on these Notes.
One of the most important questions faced by the early designers of Notes was, how can these modifications be merged together in a meaningful way ?
The name they chose for the merge process is Replication. It can be said that Lotus and IBM have spent 30 years perfecting replication.

When two copies of a database are merged, the system must determine what to do with each Note.
The simplest case is when a new note has been added. The new note can then be recreated in the other copy.
If a Note has been modified in one copy, but not the other, then the same modification can be applied to both.
These are the simplest cases to reconcile. However, things get very complicated when users have modified different items in their copy of the same Note.

Notes considers which copy of the Note has been modified most, which copy was modified last, whether a Note was deleted in one copy and modified in the other, and so on.
As you can see, the number of permutations can be daunting,
especially when one introduces time zones, access rights, deletion stubs, space-saver options and other replication settings.
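
To give a feel for the simplest cases only, here is a deliberately naive merge in Python. It is a toy under stated assumptions, not the real replication algorithm: each note carries an assumed edit counter "seq" and an assumed timestamp "modified", the copy with the most edits wins, a tie goes to the most recent edit, and deletions, item-level merging, access rights and everything else in the list above are ignored. The function replicate is hypothetical.

```python
def replicate(copy_a, copy_b, last_sync):
    """Bring two copies (note ID -> note) into agreement.

    Returns the IDs of notes that were edited in both copies since last_sync.
    """
    conflicts = []
    for note_id in sorted(set(copy_a) | set(copy_b)):
        a, b = copy_a.get(note_id), copy_b.get(note_id)
        if a is None or b is None:
            # New note in one copy: recreate it in the other
            winner = a if b is None else b
        else:
            if a["modified"] > last_sync and b["modified"] > last_sync:
                conflicts.append(note_id)      # edited on both sides
            # Most edits wins; a tie goes to the most recent edit
            winner = max(a, b, key=lambda n: (n["seq"], n["modified"]))
        copy_a[note_id] = dict(winner)
        copy_b[note_id] = dict(winner)
    return conflicts

office = {
    "n1": {"seq": 2, "modified": 10, "Subject": "Spec v2"},
    "n2": {"seq": 1, "modified": 9, "Subject": "Budget"},
}
laptop = {
    "n1": {"seq": 3, "modified": 12, "Subject": "Spec v3"},
    "n3": {"seq": 1, "modified": 11, "Subject": "Sketch"},
}

conflicts = replicate(office, laptop, last_sync=8)
print(conflicts, office == laptop)
```

Even this toy has to make a choice about what "wins"; handling the conflicting edits properly is where the real complexity starts.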

To find out more (than you ever wanted to know) about Replication take a look at: Replication Fundamentals ( 74 Pages - Have Fun !)

If you have read this far, and you have a question or you are unclear about some aspect of the Notes data model, then please email me: GStalley@BDSsys.com. I'll try to hunt down an answer and include it here.

You may also be interested in this paper from Lotus : Inside Notes - The Architecture of Lotus Notes

And, if for some reason you want to know how a Note is structured at the byte level, check this out: The Notes Storage Facility (NSF) database file format

Footnotes

Identifying a bucket...
In the case of Notes, this number, used to uniquely identify the document, is called the Notes unique ID. It contains two parts: the first identifies the database (in fact it is the replica ID of the database) and the second part is an ID for the specific document within the database.