Document capture is often brought up by our customers and partners because it fits particularly well with document management. There are many solutions available on the market, but today we'll talk about our partner Ephesoft. You can bridge Nuxeo and Ephesoft using CMIS, and this tutorial, written by Vincent Eclancher from AKKA Technologies, explains how to do that. Here's what he has to say.

Digitizing a document and automatically uploading it to Nuxeo with pre-filled metadata is a reality. To do that, we use OCR software in which we specify the metadata we want to extract, then connect to Nuxeo through CMIS and upload the document as a PDF with its metadata. AKKA has chosen Ephesoft, which has a free open source version and an Enterprise version providing more functionality. Today we'll be using the Enterprise version: in our tests, its OCR engine gave much better results than the Community version. Go to the end of this post to learn more about the Ephesoft Enterprise version.

In this tutorial we are using the default Nuxeo document types, but you can of course use any custom type defined in Nuxeo Studio.
Installing Ephesoft is a simple, classic process: run the MSI installer. No specific configuration is required during the installation. Then connect to the Ephesoft administration interface and create a new batch by copying an existing one.

Click on Copy

Fill in the properties of the form and save it.

Batch Properties

Once the batch is created, open it by double-clicking it.
Then create a document type for each kind of document you have by clicking on « Document Type » and « Add ».

Document Types tab and Add button

Then create the metadata fields. To do that, double-click on the « Document Type » you just created and click on « Add ».

Metadata list of our document

Using a TIFF image uploaded to Ephesoft, we choose a key pattern (« Date de la fiche », the label printed on the form) that will be used to locate the text to extract. Then we define where the text to extract, the value of our metadata, sits relative to that key. We also define a value pattern (a date regex) that will be used to validate the extracted text.

Key and Value Pattern Setup
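
To make the key pattern / value pattern idea concrete, here is a minimal Python sketch of the same logic. It is not Ephesoft's implementation: it assumes the value sits after the key on the same OCR line and that dates are written dd/mm/yyyy, and the function name `extract_value` is made up for the illustration.

```python
import re

# Key pattern from the tutorial: the label printed on the form.
KEY_PATTERN = re.compile(r"Date de la fiche")
# Assumed value pattern: a dd/mm/yyyy date. The real regex depends on your documents.
VALUE_PATTERN = re.compile(r"\b(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/(\d{4})\b")


def extract_value(ocr_line):
    """Return the first valid date found after the key on this line, or None."""
    key = KEY_PATTERN.search(ocr_line)
    if key is None:
        return None
    # Only search the text located after the key (assumption: value follows the key).
    candidate = VALUE_PATTERN.search(ocr_line, key.end())
    return candidate.group(0) if candidate else None


if __name__ == "__main__":
    print(extract_value("Date de la fiche : 12/03/2013"))
```

A line without the key, or with a value that fails the date pattern, yields `None`, which mirrors how the value pattern acts as a validation step rather than as the way the field is found.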

Once the two document types have been created, we need to configure Ephesoft's CMIS plugin. To do that, click on the « Back » button until you're back to the main page of the batch. Then click on « Module ».

Back button, upper-right corner of the table

Module Tab

Then double-click on the « Export » module at the end of the list of batch modules.

Export Module

Then select « CMIS_EXPORT » and click on the « Edit » button.

Ephesoft CMIS Export Plugin

Give the name (and the path if necessary) of the folder where the digitized documents will be uploaded, the address of the CMIS server, the ID of the CMIS repository, and the login and password of the user doing the import.

Ephesoft CMIS Export Plugin Configuration
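
Before saving these settings, it can help to check that the server address, repository ID and credentials are accepted by Nuxeo. The Python sketch below requests the CMIS AtomPub service document and lists the repository IDs it advertises. The URL, repository ID and credentials shown are assumptions matching a default local Nuxeo install, and the helper names are made up for the illustration; replace them with your own values.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

# Assumed values for a default local Nuxeo install; adjust to your server.
CMIS_SERVER_URL = "http://localhost:8080/nuxeo/atom/cmis"
CMIS_REPOSITORY_ID = "default"
CMIS_USER = "Administrator"
CMIS_PASSWORD = "Administrator"

# XML namespace of the CMIS 1.0 core schema.
CMIS_CORE_NS = "http://docs.oasis-open.org/ns/cmis/core/200908/"


def build_service_document_request(url, user, password):
    """Build (without sending) a GET request using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def repository_ids(service_document_xml):
    """Return every cmis:repositoryId found in an AtomPub service document."""
    root = ET.fromstring(service_document_xml)
    return [element.text for element in root.iter(f"{{{CMIS_CORE_NS}}}repositoryId")]


def fetch_repository_ids(url, user, password):
    """Send the request and parse the repository IDs from the response."""
    request = build_service_document_request(url, user, password)
    with urllib.request.urlopen(request, timeout=10) as response:
        return repository_ids(response.read())


if __name__ == "__main__":
    ids = fetch_repository_ids(CMIS_SERVER_URL, CMIS_USER, CMIS_PASSWORD)
    print("Repository found" if CMIS_REPOSITORY_ID in ids else f"Available: {ids}")
```

If the call fails with an HTTP 401 error, the login or password is wrong; if the expected ID is not in the list, the repository ID entered in the plugin configuration needs to be corrected.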

We're almost there :) We still have to map the Ephesoft metadata to the Nuxeo document metadata. Edit the mapping file (in {EphesoftPath}/SharedFolders/{batch}/cmis-plugin-mapping). The first thing to do is link each Ephesoft document type to the corresponding Nuxeo one, then link their metadata names.
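
The mapping works on two levels: document type to document type, then field to property. The Python sketch below only illustrates that idea and is not the syntax of the Ephesoft mapping file. Every name in it (`Fiche`, `DateFiche`, `Titre`, the Nuxeo `File` type and the `dc:` properties chosen as targets) is a hypothetical example to be replaced with your own.

```python
# Hypothetical names throughout: replace them with your own types and fields.
TYPE_MAPPING = {
    "Fiche": "File",  # Ephesoft document type -> Nuxeo document type
}
FIELD_MAPPING = {
    "Fiche": {
        "DateFiche": "dc:issued",  # Ephesoft field -> Nuxeo property
        "Titre": "dc:title",
    },
}


def to_nuxeo(ephesoft_type, extracted_fields):
    """Translate an Ephesoft document type and its fields into Nuxeo names."""
    nuxeo_type = TYPE_MAPPING[ephesoft_type]
    field_names = FIELD_MAPPING[ephesoft_type]
    # Fields without a mapping entry are skipped rather than exported under a guessed name.
    properties = {
        field_names[name]: value
        for name, value in extracted_fields.items()
        if name in field_names
    }
    return nuxeo_type, properties


if __name__ == "__main__":
    print(to_nuxeo("Fiche", {"DateFiche": "12/03/2013", "Titre": "Fiche 42"}))
```

Keeping the type mapping separate from the field mapping reflects the order described above: a field can only be mapped once its document type has a Nuxeo counterpart.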

That's it, good to go!

About Ephesoft

As stated above, we are using the Enterprise Edition. We asked Ian Pope, a friendly Welshman who likes rugby and who is Vice President, Sales and Marketing for Europe, Middle East & Africa, how to test the Enterprise version. Here's what he sent us:

Click here to download Enterprise Edition.

About Licensing:

Once you install it, you need to email us at [email protected] & [email protected] and attach your license file (located in the "C:\Ephesoft\Dependencies\licensing" folder).

The install comes with preconfigured batches and sample images for testing.

About Vincent
Research Engineer Specializing in Document Management and Paperless Systems at AKKA TECHNOLOGIES