In a content management system, the actual data that the system or the users manipulate comes from many kinds of sources. Content can come from a JCR repository, or from a relational database, or from an LDAP directory, or from a semantic storage engine like Jena, or from any other kind of open or proprietary storage engine.

But fundamentally all these kinds of content, which I'll call "records", aren't very different:

  • a record can be created, viewed, modified, deleted,

  • a record can often be copied or moved,

  • a record obeys a schema that can be known to the system, which means that its individual fields are strictly typed,

  • when being viewed or modified, a record has a user interface that is based on forms, labels, widgets, depending on the schema,

  • records can be searched and a result set returned,

  • records can be listed in a compact form (search results, folder contents, user dashboard, workflow workitems, RDB table listing, user information browsing, etc.),

  • records have an identity (like a unique id) or a location (like a path), sometimes both.
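The properties listed above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the names `RecordSketch`, `Schema` and `ContentRecord` are hypothetical and are not the NXCore or NXTypeManager API. The sketch shows a record that carries an identity, obeys a schema, and rejects values that do not match the declared field type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; this is not the NXCore or NXTypeManager API.
final class RecordSketch {

    /** A schema maps field names to the Java type each field must hold. */
    static final class Schema {
        private final String name;
        private final Map<String, Class<?>> fieldTypes = new LinkedHashMap<>();

        Schema(String name) {
            this.name = name;
        }

        /** Declares a typed field and returns this schema for chaining. */
        Schema field(String fieldName, Class<?> type) {
            fieldTypes.put(fieldName, type);
            return this;
        }

        String name() {
            return name;
        }

        Class<?> typeOf(String fieldName) {
            return fieldTypes.get(fieldName);
        }
    }

    /** A record has an identity and only accepts values allowed by its schema. */
    static final class ContentRecord {
        private final String id;
        private final Schema schema;
        private final Map<String, Object> values = new LinkedHashMap<>();

        ContentRecord(String id, Schema schema) {
            this.id = id;
            this.schema = schema;
        }

        String id() {
            return id;
        }

        Object get(String field) {
            return values.get(field);
        }

        /** Stores a value after checking it against the schema's field type. */
        void set(String field, Object value) {
            Class<?> type = schema.typeOf(field);
            if (type == null) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            if (value != null && !type.isInstance(value)) {
                throw new IllegalArgumentException(
                    field + " expects " + type.getSimpleName());
            }
            values.put(field, value);
        }
    }

    public static void main(String[] args) {
        Schema user = new Schema("user")
            .field("uid", String.class)
            .field("age", Integer.class);
        ContentRecord record = new ContentRecord("u1", user);
        record.set("uid", "alice");
        record.set("age", 40);
        System.out.println(record.id() + " -> " + record.get("uid"));
    }
}
```

Because the schema is plain data, the same description could in principle drive form generation, listings and search as well as validation, whatever storage engine holds the record.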

One of the strengths of CPS is that it uses a common abstraction for many of these concepts, embodied in the CPSSchemas component. In Nuxeo 5 we want to go further and integrate all of them even more tightly; the base components for these abstractions are NXCore and NXTypeManager.

The reasons to strive for convergence are numerous:

  1. this merges into single concepts things that had previously been kept separate because of different implementation choices. For instance, an LDAP schema is not fundamentally different from an SQL schema (or from an XML Schema if one is interested in the relevant subset).

  2. this gives the programmers a common API for all data-related operations, which means more reusability. For instance, changing an attribute in an LDAP entry doesn't have to be different from changing the title of a document or changing a value in an RDB row; processing a list of search results to display them in a table doesn't have to be different from processing the children of a folder to display the folder's contents.

  3. this gives the framework developers a way to optimize some operations because of commonalities in the underlying implementations. For instance, you don't need three kinds of event dispatching for "LDAP entry modified", "RDB row modified" and "document modified".

  4. this gives the users a unified way of manipulating different kinds of data when there's really no need to have a different UI for them. When users fill in a form, it's really the same process whether they are modifying their personal preferences, adding a keyword to a document, or changing a quantity in an RDB row.

  5. this allows very simple migrations between storage technologies, when these are felt necessary. A customer could start with an LDAP directory for its user base and later need to move those users to an RDB table. User entries in an RDB table may need to be versioned and moved to a JCR storage. An application should survive all these migrations with only configuration changes and no code to rewrite.
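Points 2, 3 and 5 can be illustrated with a small Java sketch. It is a sketch under stated assumptions, not Nuxeo code: `StoreSketch`, `RecordStore`, `InMemoryStore` and `migrate` are hypothetical names, and the in-memory store stands in for what would really be LDAP, RDB or JCR backends. It shows one storage-neutral contract, one "record modified" event path shared by every backend, and a migration written only against that contract.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical names throughout; the in-memory store replaces real backends.
final class StoreSketch {

    /** Storage-neutral contract; a real backend would wrap LDAP, SQL or JCR calls. */
    interface RecordStore {
        void put(String id, Map<String, Object> fields);
        Map<String, Object> get(String id);
        List<String> ids();
    }

    /** In-memory stand-in that notifies listeners with (store name, record id). */
    static final class InMemoryStore implements RecordStore {
        private final String name;
        private final List<BiConsumer<String, String>> listeners;
        private final Map<String, Map<String, Object>> rows = new LinkedHashMap<>();

        InMemoryStore(String name, List<BiConsumer<String, String>> listeners) {
            this.name = name;
            this.listeners = listeners;
        }

        public void put(String id, Map<String, Object> fields) {
            rows.put(id, new LinkedHashMap<>(fields));
            // A single "record modified" event, whatever the backend is.
            for (BiConsumer<String, String> listener : listeners) {
                listener.accept(name, id);
            }
        }

        public Map<String, Object> get(String id) {
            Map<String, Object> row = rows.get(id);
            return row == null ? null : new LinkedHashMap<>(row);
        }

        public List<String> ids() {
            return new ArrayList<>(rows.keySet());
        }
    }

    /** Copies every record between stores using only the shared contract. */
    static int migrate(RecordStore from, RecordStore to) {
        int copied = 0;
        for (String id : from.ids()) {
            to.put(id, from.get(id));
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<BiConsumer<String, String>> listeners = new ArrayList<>();
        listeners.add((store, id) -> log.add(store + ":" + id));

        RecordStore ldap = new InMemoryStore("ldap", listeners);
        RecordStore rdb = new InMemoryStore("rdb", listeners);

        Map<String, Object> alice = new LinkedHashMap<>();
        alice.put("cn", "Alice");
        ldap.put("alice", alice);

        int copied = migrate(ldap, rdb);
        System.out.println(copied + " record(s) copied; events: " + log);
    }
}
```

In this sketch the same listener sees the modification in both stores, and `migrate` never needs to know which storage technology sits behind either side. Choosing which store an application uses would then be a matter of configuration.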

Note that this means JCR is in no way the primary storage model for Nuxeo 5; it's only the first one to be implemented. In the future, it will be possible to store documents in LDAP or an RDB. When a suitable storage model is devised and implemented, you'll be able to apply workflow or versioning to RDB-based documents, for instance.

This convergence is quite exciting to us, and our goal is to allow people to build complex applications with Nuxeo 5 in a more straightforward manner.