Metadata Agility Using Nuxeo Dynamic Facets and MongoDB
As data volumes and business processes change at an increasing rate, enterprise systems need to adapt more quickly. Big data, a catchphrase for a while now, is becoming more practical. In addition to increased transactional data, content in the document and digital asset world is also growing. Enterprise Content Management (ECM) and Digital Asset Management (DAM) systems need the same flexibility in metadata. Extending data agility into content metadata requires a system architecture that inherently supports it.
One particular aspect I want to address is schema flexibility, and specifically the concept of schema-on-read: the structure of the data is interpreted when it is read rather than enforced when it is written. This can be implemented at the data storage level with NoSQL databases such as MongoDB.
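As a rough illustration of schema-on-read, the sketch below stores records with no fixed shape and applies a schema only at read time. It uses plain Python dictionaries rather than a real database, and the field names and the `read_with_schema` helper are hypothetical, chosen only for this example.

```python
# Schema-on-read sketch: records are stored as-is; a schema is applied only when reading.
# Plain dicts stand in for a document database; all names here are illustrative.

stored = [
    {"title": "Contract A"},  # written before the schema grew
    {"title": "Photo B", "rights_holder": "ACME", "width": 1920},  # written after
]

# The "schema" lives in the application: field name -> default for records that lack it.
schema_v2 = {"title": "", "rights_holder": None}


def read_with_schema(record, schema):
    """Project a stored record onto a schema, filling defaults for missing fields."""
    return {name: record.get(name, default) for name, default in schema.items()}


views = [read_with_schema(r, schema_v2) for r in stored]
print(views)
```

The older record is never rewritten: it simply reads back with a default for the field it does not have, and fields outside the schema (here `width`) are ignored by this particular view.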
Of course, in the ECM world you will ultimately have an application layer for users to view this content, and users require some sort of schema view and UI. For some time now, the Nuxeo Platform has been able to dynamically change and add schemas to content metadata using facets. Combining application-level schema flexibility with a database capable of schema-on-read removes the performance cost of additional tables and joins, and lets you take full advantage of the Nuxeo dynamic facet feature.
A Nuxeo facet is simply a property that can be assigned to a document. It can be assigned dynamically in response to an event or set in the Document Type definition. A facet can be used to evaluate a condition in an application, and also to dynamically attach an additional schema to content. In an SQL storage back end, Nuxeo document metadata is held in a set of schemas, each corresponding to its own table. The data is therefore normalized, which keeps storage size efficient and allows schemas to be added at runtime without rebuilding the database. This is what allows metadata definitions to stay agile as business requirements change.
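To make the relationship between document types, facets, and schemas more concrete, here is a small conceptual model in Python. It is not Nuxeo code: the `FACET_SCHEMAS` registry, the class names, and the `expanded` schema name are assumptions made up for this sketch.

```python
# Conceptual model of facets contributing schemas to a document (not Nuxeo code).
from dataclasses import dataclass, field

# Hypothetical registry: facet name -> schemas it brings along.
FACET_SCHEMAS = {
    "Expanded Document": ["expanded"],  # "expanded" is a made-up schema name
    "Versionable": [],                  # a facet may act purely as a marker
}


@dataclass
class DocumentType:
    name: str
    schemas: list
    facets: list = field(default_factory=list)  # facets set at type definition


@dataclass
class Document:
    doc_type: DocumentType
    dynamic_facets: list = field(default_factory=list)  # facets added at runtime

    def add_facet(self, facet):
        if not self.has_facet(facet):
            self.dynamic_facets.append(facet)

    def has_facet(self, facet):
        return facet in self.doc_type.facets or facet in self.dynamic_facets

    def effective_schemas(self):
        """Schemas from the type, plus those contributed by every facet."""
        schemas = list(self.doc_type.schemas)
        for facet in self.doc_type.facets + self.dynamic_facets:
            for schema in FACET_SCHEMAS.get(facet, []):
                if schema not in schemas:
                    schemas.append(schema)
        return schemas


file_type = DocumentType("File", schemas=["dublincore", "file"], facets=["Versionable"])
doc = Document(file_type)
before = doc.effective_schemas()
doc.add_facet("Expanded Document")
after = doc.effective_schemas()
print(before, after)
```

In this model a facet serves both purposes described above: `has_facet` supports condition checks in application logic, and `effective_schemas` shows a facet extending the metadata of a single document at runtime.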
Of course, you can’t keep adding tables in SQL indefinitely without lowering performance, because of the increased number of joins. So while Nuxeo facets have supported dynamic schemas for some time now, in practice the limit has been set by your choice of database back end.
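The join growth is easy to see in a toy table-per-schema layout. The following sketch uses Python's built-in `sqlite3` module with invented table and column names. It is a generic illustration of the pattern, not a reproduction of how the Nuxeo SQL storage actually builds its queries.

```python
# Table-per-schema sketch with sqlite3: each extra schema adds one more join to a full read.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, doc_type TEXT)")
conn.execute("INSERT INTO doc VALUES (1, 'File')")

# Hypothetical schemas, each stored in its own table keyed by document id.
schemas = {"dublincore": "title", "expanded": "project_code", "rights": "holder"}
values = {"dublincore": "Contract A", "expanded": "P-42", "rights": "ACME"}
for table, column in schemas.items():
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, {column} TEXT)")
    conn.execute(f"INSERT INTO {table} VALUES (1, ?)", (values[table],))


def full_read_query(schema_columns):
    """Build the query that loads every schema of a document: one LEFT JOIN per schema."""
    cols = ", ".join(f"{t}.{c}" for t, c in schema_columns.items())
    joins = " ".join(f"LEFT JOIN {t} ON {t}.id = doc.id" for t in schema_columns)
    return f"SELECT doc.id, {cols} FROM doc {joins} WHERE doc.id = ?"


query = full_read_query(schemas)
row = conn.execute(query, (1,)).fetchone()
join_count = query.count("LEFT JOIN")
print(join_count, row)
```

With three schemas the full read needs three joins, and every schema added later lengthens the query by one more.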
The Nuxeo database back end is pluggable and supports metadata storage in several database types, now including MongoDB. This is completely transparent to your users: the Nuxeo Platform handles data retrieval from MongoDB. You can migrate your content into MongoDB without changing any of your UI or application logic, while gaining the flexibility to add as many facets and schemas as you want without the join-related performance limits of SQL.
With the Nuxeo Platform and MongoDB, you can take advantage of schema-on-read in the database because every document is stored in one collection and all of its fields are retrieved in one call, regardless of the metadata. The Nuxeo Platform and Nuxeo Studio make it very easy to add a facet and an additional schema, apply them to your application instance, and put them into use. No data rebuilding is required, and you can evolve your metadata seamlessly.
Here’s a specific example of an adaptation. Since the Nuxeo Platform is designed for users, we need a schema definition to display the data in a UI in a presentable and addressable way. We create a new schema (either deployed as an XSD in a plugin or simply defined in Nuxeo Studio) and associate it with a new facet:
<extension point="doctype" target="org.nuxeo.ecm.core.schema.TypeService">
  <facet name="Expanded Document"> <!-- reference(s) to the new schema go here -->
  </facet>
</extension>
In Nuxeo Studio, this facet is defined in the Registry section under Facets. It just needs a name (ID) so it can be referenced elsewhere in Studio. We can then add it to any document type where we want to track the new fields, simply by checking off the new facet.
We then reconfigure the UI to display the new or changed fields. We can remove old fields we no longer need, or use the new ones. Once the Studio configuration is deployed, your application can add and display the new metadata fields. You can bulk edit your documents to add values for these fields, and they are simply added to your documents in MongoDB as additional values. No database reconfiguration is needed, nothing needs to be rebuilt, and performance is not affected.
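The bulk edit step can be pictured with a small in-memory stand-in for a single document collection. This is only a sketch: the dictionary replaces the database, the `bulk_add_facet` helper is hypothetical, and the prefixed key names (`ecm:primaryType`, `ecm:mixinTypes`, `expanded:project_code`) are assumptions used for illustration rather than a statement of the exact storage layout.

```python
# Bulk-edit sketch on a document-style store (a dict standing in for one collection).
# Key names are illustrative; no real database or Nuxeo API is used.

collection = {
    "doc-1": {"ecm:primaryType": "File", "dc:title": "Contract A", "ecm:mixinTypes": []},
    "doc-2": {"ecm:primaryType": "File", "dc:title": "Contract B", "ecm:mixinTypes": []},
    "doc-3": {"ecm:primaryType": "Note", "dc:title": "Memo", "ecm:mixinTypes": []},
}


def bulk_add_facet(store, doc_type, facet, new_values):
    """Add a facet and its field values to every document of one type, in place."""
    changed = 0
    for doc in store.values():
        if doc["ecm:primaryType"] == doc_type:
            if facet not in doc["ecm:mixinTypes"]:
                doc["ecm:mixinTypes"].append(facet)
            doc.update(new_values)  # new fields simply appear on the document
            changed += 1
    return changed


changed = bulk_add_facet(
    collection, "File", "Expanded Document", {"expanded:project_code": "P-42"}
)

# Reading is still one lookup per document, whatever fields it carries.
print(changed, collection["doc-1"], collection["doc-3"])
```

Documents that were not part of the bulk edit keep their original shape, and nothing about the store itself has to be altered to accommodate the new field.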
This is a real, practical schema-on-read system and it’s already in the Nuxeo Platform. Just specify MongoDB as the document store to take full advantage.
In a future article, we will go deeper into the details of the Nuxeo Platform and MongoDB.