In this blog post, we will discuss how to import documents into the Nuxeo Platform using Node.js. In the last Nuxeo Tour workshops, we created two command line importers based on the Node.js version of the Nuxeo JavaScript client. Here, we will discuss them in detail and show how they work.

Goal


The main goal is to import a hierarchy of documents into a Nuxeo Platform instance without direct access to the server, using the Automation and REST APIs exposed by the Nuxeo Platform.

Overview of the Importers


The first one, nuxeo-node-importer, imports a local folder to a Nuxeo Platform instance using the Automation and REST API.
Some behaviors can be configured when running the import, such as:


  • the document type to use when creating a Folder on Nuxeo

  • the Automation chain to use when importing files

  • the maximum number of concurrent requests


The second one, nuxeo-node-custom-importer, is a fork of the first one that shows how to customize it to import a custom hierarchy with custom rules.
The README file of nuxeo-node-importer explains the logic we follow when creating documents on the remote instance.

Both importers are good examples of how the Nuxeo JavaScript client works, as they use most of the client features (Automation API calls, REST API calls, the batch uploader, ...).

Note that each call to the Nuxeo Platform instance is transactional, but the whole import of a document is not. This means that if an error occurs in a function called after the document creation, the document won't be rolled back or deleted.

How It Works


We mainly rely on the Nuxeo JavaScript client to make all the API calls to our Nuxeo Platform instance and on the async module to easily handle asynchronous calls.

The main execution flow, for both importers, is:


  • Recursively walk the folder given as parameter, asynchronously

  • For each file or folder, add a new task to the queue

  • Each task processes the file or folder and calls a set of functions in series (to avoid overloading the server)
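The first two steps can be sketched with Node's built-in fs module alone. This is a minimal illustration, not the importers' actual code: the function name walk and the task shape { path, isFolder } are assumptions made for the example.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively walk `dir` and call `onEntry` once per file or folder found.
// `walk` and the { path, isFolder } task shape are hypothetical names.
async function walk(dir, onEntry) {
  const entries = await fs.promises.readdir(dir, { withFileTypes: true });
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    const isFolder = entry.isDirectory();
    // In an importer, this is where a task would be added to the queue.
    onEntry({ path: fullPath, isFolder });
    if (isFolder) {
      // Descend only after the folder itself has been reported.
      await walk(fullPath, onEntry);
    }
  }
}
```

In this sketch a folder is always reported before its content, so a caller could hand each entry to a queue as it arrives, for example with walk(folder, function (task) { queue.push(task); }).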


We have a queue that holds one task per file or folder. We give the queue a worker function when creating it, and that worker processes each task we put in the queue. The queue's concurrency setting lets us limit the number of concurrent calls to the Nuxeo Platform instance.

var queue = async.queue(worker, concurrency);
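To make the effect of the concurrency limit concrete, here is a dependency-free sketch of what a worker queue with such a limit does. It is a simplified stand-in written for this post, not the async module's implementation: createQueue, the fake worker, and the sample paths are all assumptions.

```javascript
// Minimal stand-in for a worker queue with a concurrency limit.
// `createQueue` is a hypothetical helper, not the async module's code.
function createQueue(worker, concurrency) {
  const pending = [];
  let running = 0;

  function next() {
    // Start tasks until the limit is reached or nothing is waiting.
    while (running < concurrency && pending.length > 0) {
      const { task, done } = pending.shift();
      running++;
      worker(task, (err, result) => {
        running--;
        done(err, result);
        next(); // a slot is free again, start the next waiting task
      });
    }
  }

  return {
    push(task, done = () => {}) {
      pending.push({ task, done });
      next();
    },
  };
}

// Example worker: pretends to import one file or folder.
function worker(task, callback) {
  setTimeout(() => callback(null, 'imported ' + task.path), 10);
}

const queue = createQueue(worker, 2); // at most 2 tasks in flight
['/a', '/a/b.txt', '/c.txt'].forEach((p) => {
  queue.push({ path: p }, (err, result) => console.log(err || result));
});
```

With a limit of 2, the third task only starts once one of the first two has called its callback.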

Within each task in the queue, we use async.waterfall to execute a set of functions in series, each one passing its result on to the next. As a result, each running task makes only one call at a time on the server.

async.waterfall(funcs, callback);
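The series behavior can be illustrated without any dependency. The sketch below is a simplified stand-in for this pattern, not the async module's code: runInSeries and the two step functions are hypothetical, and the steps only manipulate a plain object where the importers would call the server.

```javascript
// Minimal stand-in for running callback-style functions in series,
// passing each result on to the next function.
// `runInSeries` is a hypothetical helper, not the async module's code.
function runInSeries(funcs, callback) {
  let index = 0;
  function step(err, ...results) {
    if (err) {
      return callback(err); // stop at the first error, skip remaining steps
    }
    if (index === funcs.length) {
      return callback(null, ...results);
    }
    const fn = funcs[index++];
    fn(...results, step);
  }
  step(null);
}

// Hypothetical steps; a real importer would call the Nuxeo server here.
function createDocument(next) {
  next(null, { title: 'report.pdf', log: ['created'] });
}
function attachBlob(doc, next) {
  doc.log.push('blob attached');
  next(null, doc);
}

runInSeries([createDocument, attachBlob], (err, doc) => {
  if (err) {
    return console.error('import failed:', err.message);
  }
  console.log(doc.title, doc.log.join(', '));
});
```

Because a failing step short-circuits the rest, anything already created by an earlier step stays in place, which matches the note above about the import not being transactional as a whole.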

On nuxeo-node-importer, the set of functions is fairly simple:


  • createFolder or createDocumentFromFile

  • documentCreatedCallback


On nuxeo-node-custom-importer, which is a sample of what we can do, the set of functions is more advanced (for creating a file):


  • createDocumentFromFile

  • uploadBlob

  • followTransition

  • setACE

Extend the Importers


The easiest way to extend the importers is to clone or fork one of them as a base, and modify the set of functions executed when a file or folder is imported.

To implement custom behavior when importing documents, add new functions or modify the existing ones, and add them to the funcs array (see the async.waterfall() call).
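As a rough sketch of what such a customization could look like, the snippet below builds a funcs array per task and appends one custom step for files. Everything in it is hypothetical: buildFuncs, addSourceProperty, the source property, and the simplified step bodies are made up for the example and do not come from the importers' code.

```javascript
// Hypothetical step functions in the callback style used with async.waterfall.
// The bodies are placeholders; real steps would call the Nuxeo server.
function createFolder(task, next) {
  next(null, { path: task.path, type: 'Folder' });
}
function createDocumentFromFile(task, next) {
  next(null, { path: task.path, type: 'File' });
}

// Custom step added for this example: stamp a made-up property on the document.
function addSourceProperty(doc, next) {
  doc.source = 'node-import';
  next(null, doc);
}

// Build the list of functions to run in series for one task.
function buildFuncs(task) {
  const create = task.isFolder ? createFolder : createDocumentFromFile;
  const funcs = [create.bind(null, task)];
  if (!task.isFolder) {
    funcs.push(addSourceProperty); // the custom step only applies to files
  }
  return funcs;
}
```

The resulting array would then be passed to the async.waterfall() call in place of the original set of functions.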