When a file is uploaded to your Nuxeo application, it often requires some processing, such as extracting and indexing the full text of a PDF or Microsoft Office document, transcoding a video, etc. This work can take time and consume resources (CPU, memory). Transcoding a video is a perfect example of time- and resource-consuming work.

The Nuxeo Content Services Platform handles this work in asynchronous jobs. This way, the caller (for example, the end user) does not have to wait for completion: when a Nuxeo document is created with a huge PDF, or a Video document is created, the system returns as soon as the file has been uploaded. These asynchronous jobs are managed by a queuing system and split into categories (Fulltext Updater, Video Conversion, PictureViews Generation, etc.). Dedicated workers in each category pick (consume) their next work item and remove it from the queue.
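
To make the queuing model above concrete, here is a minimal in-memory sketch in plain Java. It is not Nuxeo's actual implementation or API: the class name CategoryQueues, its methods, and the category names used with it are hypothetical and serve only to illustrate one queue per category, dedicated workers consuming it, and a caller that returns immediately.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical illustration (not a Nuxeo class): one in-memory queue per work category. */
public class CategoryQueues {
    private final Map<String, BlockingQueue<Runnable>> queues = new ConcurrentHashMap<>();

    /** Registers a category and starts the given number of daemon workers dedicated to it. */
    public void addCategory(String category, int workers) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        queues.put(category, queue);
        for (int i = 0; i < workers; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        Runnable work = queue.take(); // blocks until work exists, then removes it
                        try {
                            work.run();
                        } catch (RuntimeException e) {
                            // A failing work item must not kill the worker thread.
                            System.err.println("Work failed: " + e);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // stop consuming when interrupted
                }
            }, category + "-worker-" + i);
            worker.setDaemon(true);
            worker.start();
        }
    }

    /** Queues the work and returns immediately; the caller never waits for completion. */
    public void schedule(String category, Runnable work) {
        BlockingQueue<Runnable> queue = queues.get(category);
        if (queue == null) {
            throw new IllegalArgumentException("Unknown category: " + category);
        }
        queue.add(work);
    }
}
```

In this sketch, a call such as schedule("videoConversion", work) returns as soon as the work is queued, which mirrors the behavior described above: the upload completes while the heavy processing happens later on a worker thread.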

Distributed Jobs

Out of the box, these background jobs live in memory on the same machine as the Nuxeo Platform. For production and scaling purposes, you will want to optimize the system and use the Nuxeo Distributed Jobs System, which lets you distribute the jobs among as many machines (virtual or not) as you want and persist them. For example, you can have a cluster of Nuxeo servers where:

  • n servers respond to client requests and never process the heavy work
  • n other servers handle the asynchronous jobs and never answer client requests

You can even scale more specifically and dedicate several servers to a single category. A typical example is, again, video processing: you can declare, say, 3 servers that only handle video conversions, using all the available CPUs and memory of their machines:

[Figure: Distributed Jobs]
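
The deployment rules described above can be sketched as a small data model. This is only an illustration of the idea, not Nuxeo configuration syntax: the NodeRole class, its factory methods, and the category identifiers are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration (not a Nuxeo class): what a single server in the cluster does. */
public class NodeRole {
    private final boolean servesClients;
    private final Set<String> categories;

    private NodeRole(boolean servesClients, Set<String> categories) {
        this.servesClients = servesClients;
        this.categories = categories;
    }

    /** A server that answers client requests and never processes heavy work. */
    public static NodeRole frontEnd() {
        return new NodeRole(true, Collections.emptySet());
    }

    /** A server that only consumes the given job categories and never answers clients. */
    public static NodeRole worker(String... categories) {
        return new NodeRole(false, new HashSet<>(Arrays.asList(categories)));
    }

    public boolean servesClients() {
        return servesClients;
    }

    /** True if this server is allowed to consume jobs of the given category. */
    public boolean processes(String category) {
        return categories.contains(category);
    }
}
```

Under these assumptions, the video example would amount to three servers created with NodeRole.worker("videoConversion"), alongside front-end servers created with NodeRole.frontEnd() and other workers for the remaining categories.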

Setting up such a system is purely a matter of configuration, as explained in our documentation. Everything is available out of the box, awaiting your deployment rules, and ready to scale without limits!