How Elasticsearch and Collections Improve the Nuxeo Platform

Wed 16 April 2014 By Barb Mosher Zinck

Today, we have officially released the latest Fast Track for the Nuxeo Platform - Nuxeo Platform 5.9.3. There are a few great new features in this release, and we want to draw attention to two that we know you’ll appreciate greatly: Elasticsearch integration and Collections.

To better understand the importance of these new features, we took a few minutes to talk with Nuxeo CEO, Eric Barroca.

Elasticsearch Provides Even More Flexibility

One of the core features of any content management platform is the ability to store content and query it. In Nuxeo’s case, structured content is stored within some fairly big data structures. Until the 5.9.3 release, it was queried using the database’s own indexing system. Eric said that this worked well: it was highly optimized and fast. However, he said that it was limited by how well SQL databases can scale horizontally while keeping good transactional integrity. It also limited some advanced search functions that are important to users today.

What they wanted, Eric said, was an index engine that could provide all the functionality the platform has today, and offer additional features such as enhanced faceted search and richer full-text search (e.g. stemming, similar entries, etc.).

Eric said that Elasticsearch was the perfect solution to their needs. It is distributed, so you can scale index capacity horizontally by adding nodes. But it’s much more than a pure index system.

Now, when you write content, it goes to the database, and when you read, the request goes to the Elasticsearch query engine. In addition to its natural ability to query, Elasticsearch has great features for building and filtering queries, and for extracting data from the results. It has hierarchical faceted search that lets you know how many results you have for each facet.
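To make that split concrete, here is a minimal Python sketch of the pattern: writes go to a primary store, reads are served from a separate index, and each search also returns a count per facet value. Everything in it (the `ContentRepository` class, the `type` facet, the in-memory dictionaries) is a hypothetical stand-in for illustration, not Nuxeo’s or Elasticsearch’s actual API.

```python
from collections import Counter


class ContentRepository:
    """Toy model: a 'database' for writes and a separate 'index' for reads."""

    def __init__(self):
        self.database = {}  # stands in for the SQL store (source of truth)
        self.index = {}     # stands in for the search index

    def save(self, doc_id, doc):
        # Writes go to the database first, then the index is updated.
        self.database[doc_id] = dict(doc)
        self.index[doc_id] = dict(doc)

    def search(self, text, facet_field="type"):
        # Reads are served from the index only, never from the database.
        hits = [
            doc_id
            for doc_id, doc in self.index.items()
            if text.lower() in doc["title"].lower()
        ]
        # Facet counts: how many matching results fall under each facet value.
        facets = Counter(self.index[doc_id][facet_field] for doc_id in hits)
        return hits, dict(facets)


repo = ContentRepository()
repo.save("1", {"title": "Annual report 2013", "type": "File"})
repo.save("2", {"title": "Report template", "type": "Note"})
repo.save("3", {"title": "Quarterly report", "type": "File"})
repo.save("4", {"title": "Team photo", "type": "Picture"})

hits, facets = repo.search("report")
print(hits)    # ['1', '2', '3']
print(facets)  # {'File': 2, 'Note': 1}
```

In a real deployment the index would live in a separate cluster, which is what allows it to be scaled independently of the database.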

With the integration of Elasticsearch, the Nuxeo Platform can now scale by partitioning data (storage) and scaling the index engine independently. This, Eric said, provides a very flexible architecture and widens the possibilities for applications depending on workloads and application types.

I asked Eric if they looked at other alternatives, but he said there was no real competition for Elasticsearch. It is essentially distributed Lucene, and the distribution is what it does very well. It is also a great aggregator, allowing you to do BI on your data (e.g. give me all the tasks sorted by their completion date).
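As a rough illustration of the “tasks sorted by completion date” example, the Python sketch below builds a search request body in Elasticsearch’s JSON query DSL: a `term` query, a `sort` clause, and a `terms` aggregation. The field names (`doc_type`, `completion_date`, `assignee`) are assumptions made up for this example, not the names Nuxeo uses in its index, and the snippet only builds and prints the request without sending it anywhere.

```python
import json


def tasks_by_completion_date(size=10):
    """Build a search request body: all tasks, sorted by completion date.

    Field names (doc_type, completion_date, assignee) are hypothetical.
    """
    return {
        "size": size,
        # Keep only documents whose type is "task".
        "query": {"term": {"doc_type": "task"}},
        # Order the hits by completion date, earliest first.
        "sort": [{"completion_date": {"order": "asc"}}],
        # BI-style summary: number of matching tasks per assignee.
        "aggs": {"tasks_per_assignee": {"terms": {"field": "assignee"}}},
    }


body = tasks_by_completion_date(size=5)
print(json.dumps(body, indent=2))
```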

Elasticsearch is a high-level index and computation engine for content in the Nuxeo Platform that can be used for many different use cases. And, Eric said, it’s extremely fast. The query response time was very impressive: in the baseline they tested, the improvement was 10x, and that’s a big improvement.

Collections Help You Work More Efficiently

Collections are a way to group content and perform actions on that group of content. You don’t physically move the content into a collection - it’s like a virtual folder. Eric told me that the idea isn’t to change the natural structure of how the content is stored, but to provide users with a way to organize content that may need to be brought together for a particular use case.

Collections are great for collaboration, or for simple things like sharing, printing, changing a property, or starting a workflow. You can apply any bulk action available in the Nuxeo Platform to the collection. It’s a lightweight folder that can be used in many ways.
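The “virtual folder” idea can be sketched in a few lines of Python. In this hypothetical model (the `Document` and `Collection` classes are invented for illustration and are not the Nuxeo API), a collection stores only references to documents, so adding a document never changes where it lives, and a bulk action is simply a function applied to every member.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    path: str  # where the document actually lives
    properties: dict = field(default_factory=dict)


@dataclass
class Collection:
    name: str
    member_ids: list = field(default_factory=list)  # references only

    def add(self, doc):
        # Adding stores a reference; the document's path is untouched.
        if doc.doc_id not in self.member_ids:
            self.member_ids.append(doc.doc_id)

    def apply(self, repository, action):
        # Run one bulk action over every document in the collection.
        for doc_id in self.member_ids:
            action(repository[doc_id])


repository = {
    "a": Document("a", "/workspaces/marketing/brochure.pdf"),
    "b": Document("b", "/workspaces/legal/contract.docx"),
}

review = Collection("Q2 review")
review.add(repository["a"])
review.add(repository["b"])

# Bulk action: change a property on every member of the collection.
review.apply(repository, lambda doc: doc.properties.update(status="in review"))

print(repository["a"].path)        # /workspaces/marketing/brochure.pdf
print(repository["b"].properties)  # {'status': 'in review'}
```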

[Image: Collections - Personal Workspace]

I asked Eric why they decided to build this functionality into the platform. He said they had a number of use cases requesting it. One common use is for a digital asset management (DAM) project (a collection is called a lightbox in DAM). Collections enable you to pull together a number of pieces of content and share them with others. He said tags are one way you could do that, but you can’t apply access rights or processes to tags.

Working on revisions of documents that are spread across several workgroups is another use case for Collections. Case Management fits well here. Favorites is another great example of how to use Collections. If you are working on a set of documents, you can pull them into a collection and not have to go and get each document individually. You can easily synchronize Collections to your iPad Drive app.

Collections, Eric said, are simply a new view on your content. And they can be managed programmatically. Say you have a group of overdue cases in your case management application. You can have a collection created and the documents automatically added, then perform some action, such as sending a notification to a caseworker.
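Here is a minimal Python sketch of that overdue-cases scenario. The `Case` class and the `collect_overdue` and `notify_caseworkers` helpers are hypothetical names invented for this example, not part of the Nuxeo Platform, and the “notification” is just a message string, since no mail is actually sent.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Case:
    case_id: str
    caseworker: str
    due: date
    closed: bool = False


def collect_overdue(cases, today):
    """Return the IDs of open cases whose due date has passed."""
    return [c.case_id for c in cases if not c.closed and c.due < today]


def notify_caseworkers(cases, collection_ids):
    """Build one notification message per caseworker (no mail is sent)."""
    by_id = {c.case_id: c for c in cases}
    grouped = {}
    for case_id in collection_ids:
        worker = by_id[case_id].caseworker
        grouped.setdefault(worker, []).append(case_id)
    return {
        worker: f"Overdue cases: {', '.join(ids)}"
        for worker, ids in grouped.items()
    }


cases = [
    Case("C-1", "alice", date(2014, 4, 1)),
    Case("C-2", "bob", date(2014, 5, 1)),
    Case("C-3", "alice", date(2014, 3, 15)),
    Case("C-4", "bob", date(2014, 4, 10), closed=True),
]

overdue = collect_overdue(cases, today=date(2014, 4, 16))
print(overdue)                             # ['C-1', 'C-3']
print(notify_caseworkers(cases, overdue))  # {'alice': 'Overdue cases: C-1, C-3'}
```

The list of IDs plays the role of the automatically built collection, and the notification step is one example of an action run against it.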

I think any organization can think of a number of ways to leverage the Collections functionality to improve the collaboration process.

These are just two of the new features that come with Nuxeo Platform 5.9.3. You can download this latest version on GitHub now, and read about all the new features in the release notes.

Also, if you are wondering what’s next, check out the Nuxeo roadmap - we have lots in the works for Nuxeo Platform 5.9.4.

Category: Nuxeo Updates
Tagged: Digital Asset Management, Elasticsearch, Features, Nuxeo Release