Nuxeo is switching its ECM to Java, and we're using JCR for our document storage. JCR (Java Content Repository, standardized by JSR-170 and the upcoming JSR-283) is a young specification with a promising future. But what is its point, you may ask, when existing content management systems already store content very well without it? Its goal is interoperability between vendors: anyone writing an application that needs to store content gets a single, unified API for doing so. All major content repository vendors are active in the JSR-283 expert group, and all are working on JCR bindings for their various proprietary repositories.

Of course, a standardized and wildly successful way of manipulating content already existed before JCR: SQL. But SQL and JCR differ in focus:

  • SQL is a language; it manipulates rows and is geared toward generic relation manipulation,

  • JCR is a Java API; it manipulates nodes and is geared toward hierarchical manipulation (parent-children).

SQL and JCR have quite different underlying data models:

  • SQL's model is that of tables with fixed schemas, and relations between tables,

  • JCR's model is that of a tree of nodes with flexible schemas and with parent-children relations as the main focus — although other types of relations exist.
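To make the second model concrete, here is a minimal sketch in plain Java of a tree of nodes whose properties are not bound to a shared fixed schema. It is an illustration only and does not use the javax.jcr API: the class `ContentTreeSketch`, its `Node` type, the helper `sampleTree`, and the sample node names are all hypothetical, with method names chosen merely to echo the kind of operations a node-oriented API offers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: this is NOT the javax.jcr API, just a tree of
// nodes with free-form properties, addressed by path.
final class ContentTreeSketch {

    static final class Node {
        final String name;
        // Each node carries its own property set; there is no shared fixed schema.
        final Map<String, Object> properties = new LinkedHashMap<>();
        // Children are keyed by name and kept in insertion order.
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String name) {
            this.name = name;
        }

        // Create a child node under this one and return it (hypothetical helper).
        Node addNode(String childName) {
            Node child = new Node(childName);
            children.put(childName, child);
            return child;
        }

        // Set a property and return this node so calls can be chained.
        Node setProperty(String key, Object value) {
            properties.put(key, value);
            return this;
        }

        // Resolve a relative path such as "reports/q1";
        // returns null when any segment is missing.
        Node getNode(String relativePath) {
            Node current = this;
            for (String segment : relativePath.split("/")) {
                if (segment.isEmpty()) {
                    continue; // tolerate leading or doubled slashes
                }
                current = current.children.get(segment);
                if (current == null) {
                    return null;
                }
            }
            return current;
        }
    }

    // Build a small sample tree: one folder holding two differently shaped documents.
    static Node sampleTree() {
        Node root = new Node("");
        Node folder = root.addNode("reports");
        folder.addNode("q1").setProperty("title", "Q1 report").setProperty("pages", 12);
        folder.addNode("logo").setProperty("mimeType", "image/png");
        return root;
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(root.getNode("reports/q1").properties);
        System.out.println(root.getNode("reports/logo").properties);
    }
}
```

The two siblings under `reports` carry entirely different property sets, which a single fixed-schema table could only represent with nullable columns or extra tables. The parent-children relation is also built into the structure itself rather than expressed through foreign keys.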

JCR also offers higher level features than SQL, notably workspace and version management.

Many kinds of applications need to arrange documents in folder hierarchies and to support a wide variety of structures for those documents. In such cases, a storage model based on JCR is much better suited than one based on SQL.

It should be noted that many things in the computing world are already based on the notion of folder hierarchies storing arbitrary documents:

  • filesystems are a tree of (unstructured) documents,

  • revision control systems are based on filesystem concepts but add a lot of structure on top of them,

  • WebDAV (along with DAV and DELTAV) is a protocol that addresses documents using a path, and where documents have a flexible set of properties,

  • most proprietary content management systems are based on (or offer the notion of) organizing the content in a hierarchy,

  • Nuxeo's CPS is itself based on this classic model.

This is why JCR has emerged as a common and useful storage API for all these use cases.

For an ECM framework like Nuxeo 5, JCR interoperability can be seen from two different directions:

  • JCR provides an API that we can use to store our content, to make us vendor-independent and flexible regarding storage,

  • JCR provides an API through which we can expose our content, which makes our platform usable by any external system that understands it — we're ourselves a vendor providing JCR bindings.

For its initial release, Nuxeo 5 focuses on using JCR as its main storage implementation, with Jackrabbit storing most of our content. In the future, we'll also provide JCR bindings so that the high-level content we provide can be accessed directly from external applications that use JCR as well.