ILIKE Nuxeo and Elasticsearch


Fri 05 June 2015 By Thibaud Arguillere

We have already written quite a few blog posts about the Nuxeo Platform and Elasticsearch, and for good reason: it’s like the perfect wedding! Because their integration is so tight, you (as a Nuxeo user/architect/developer) can do quite amazing things. We have already shown how easy it is to build powerful search forms with Nuxeo Studio in just a few minutes thanks to aggregate widgets, explained how to improve the user experience with relevant content suggestions, and written about the different kinds of searches you can now perform.

Today’s blog is all about performing case-insensitive, diacritic-insensitive queries that accept wildcards, using Elasticsearch (in a content view, search widget, page provider, etc.). This kind of query uses the ILIKE operator and, by default (as of today, and as explained here), this operator does not work when we use Nuxeo + Elasticsearch (while ILIKE to use it(1)).

So, for example, say you want to allow your users to query on dc:title and get a list of documents whose title contains “nuxeo”, irrespective of case and of where the word appears in the title. To do that:


  • You build a Content-View with Studio, where:


    • The “Use Elasticsearch index” box is checked

    • The “Search content view” flag is checked



  • You drag-drop the dublincore-title field and set the “operator” property to ILIKE


Here it is:

[Screenshot: 01-Studio-config]

If you deploy this configuration on your server and test this search with an expression such as “%nuxeo%” (which translates to any document whose dc:title field contains “nuxeo”, whatever the case), you will find nothing, even if you do have documents with titles such as “The Nuxeo Platform”, “How Amazing Is My Platform, and Guess What? It Is Nuxeo” or just “NUXEO”. This can be frustrating when you know the data is there!
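For reference, the behaviour we expect from ILIKE can be sketched in a few lines of Python. This is only an illustration of the matching semantics (case-insensitive, diacritic-insensitive, `%` as a wildcard), not how Nuxeo or Elasticsearch implement it. `fold` and `ilike` are hypothetical helper names, and only the `%` wildcard is handled.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Lowercase the text and strip diacritics (for example 'É' becomes 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def ilike(value: str, pattern: str) -> bool:
    """Hypothetical helper: True if value matches pattern, where % means any run of characters."""
    # Escape each literal chunk, then join the chunks with '.*' in place of '%'.
    parts = [re.escape(fold(p)) for p in pattern.split("%")]
    regex = ".*".join(parts)
    return re.fullmatch(regex, fold(value), flags=re.DOTALL) is not None


titles = [
    "The Nuxeo Platform",
    "How Amazing Is My Platform, and Guess What? It Is Nuxeo",
    "NUXEO",
    "Unrelated",
]
# Keep only the titles that contain "nuxeo", whatever the case or position.
matches = [t for t in titles if ilike(t, "%nuxeo%")]
```

With the sample titles from this post, `matches` holds the first three entries and leaves out “Unrelated”, which is the result the content view should return once the mapping is fixed.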

To allow the use of the ILIKE operator with Elasticsearch, you must add a mapping to the Elasticsearch-Nuxeo configuration, and add the lowercase_analyzer to your field, dc:title in this case. To achieve this, you need to:


  • Modify the default mapping in order to add your customization

  • Use a template to deploy this configuration.


The first part is explained in Configuring Elasticsearch Mapping and the second part in Elasticsearch Setup, but I will summarize everything here to group these in one place. It’s going to be fast - I think a few minutes, including the time to restart the Nuxeo server.

So, to add dc:title to the Elasticsearch mapping and allow ILIKE search on that field you must:


  1. Create a custom template

  2. Reference this template in your nuxeo.conf file

  3. Rebuild the Elasticsearch index


Let’s start with creating a new template. In {yourserver}/templates create a my-custom-mapping folder. Then, in this folder:


  • Create a nuxeo.defaults file


    • Its content is just: nuxeo.template.includes=common



  • Create a “config” folder

  • In this folder, duplicate templates/common-base/nxserver/config/elasticsearch-config.xml.nxftl and keep the same name

  • Now, open this new templates/my-custom-mapping/config/elasticsearch-config.xml.nxftl file

  • Change the component name part and give it a custom name. For example:
    <component name="org.nuxeo.elasticsearch.myCustomMapping">

  • Right after the component name:


    • Delete the <require>org.nuxeo.elasticsearch.ElasticSearchComponent</require> line

    • Replace it with:
      <require>org.nuxeo.elasticsearch.defaultConfig</require>



  • Scroll this configuration file and find the Elasticsearch mapping (between <mapping>...</mapping>)

  • Add your field (be careful with the commas: these configuration files are very sensitive when loaded and parsed at runtime)

  • I’m adding the mapping right after the last field (dc:modified in this example):

. . .
"dc:modified": {
    "format": "dateOptionalTime",
    "type": "date"
},
"dc:title": {
    "type": "multi_field",
    "fields": {
        "dc:title": {
            "include_in_all": true,
            "type": "string"
        },
        "lowercase": {
            "type": "string",
            "analyzer": "lowercase_analyzer"
        }
    }
}

. . .
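Since a misplaced comma is the most likely mistake here, one option is to check the JSON before restarting the server. The sketch below is a minimal example, assuming you paste the text found between <mapping> and </mapping> into a string and that it contains no template placeholders. `check_mapping` is a hypothetical helper, and the two sample strings are shortened stand-ins, not the real mapping.

```python
import json


def check_mapping(mapping_text: str) -> dict:
    """Parse the mapping JSON and report the position of the first syntax error."""
    try:
        return json.loads(mapping_text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"Invalid mapping JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        ) from err


# Shortened stand-ins for the mapping, wrapped in braces so they parse on their own.
good = '{"dc:modified": {"type": "date"}, "dc:title": {"type": "multi_field"}}'
bad = '{"dc:modified": {"type": "date"} "dc:title": {"type": "multi_field"}}'  # comma missing

parsed = check_mapping(good)

try:
    check_mapping(bad)
    error_message = None
except ValueError as err:
    error_message = str(err)  # points at the line and column of the missing comma
```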

Our template is now ready. Let’s tell the Nuxeo Platform to deploy it when it starts. Open “nuxeo.conf” and add the name of the template. For example:
nuxeo.templates=postgresql,drive,my-custom-mapping
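If you repeat this setup on several servers, the file handling can be scripted. The following is a rough sketch, not an official Nuxeo tool: `create_template` and `register_template` are hypothetical helpers, the paths are the ones used in this post, and the edits inside the copied elasticsearch-config.xml.nxftl (component name, <require>, mapping) still have to be made by hand.

```python
import shutil
from pathlib import Path

TEMPLATE_NAME = "my-custom-mapping"  # the folder name used in this post


def create_template(server_home: Path, name: str = TEMPLATE_NAME) -> Path:
    """Create templates/<name> with nuxeo.defaults and a copy of the default Elasticsearch config."""
    templates = server_home / "templates"
    source = templates / "common-base" / "nxserver" / "config" / "elasticsearch-config.xml.nxftl"
    config_dir = templates / name / "config"
    config_dir.mkdir(parents=True, exist_ok=True)
    (templates / name / "nuxeo.defaults").write_text("nuxeo.template.includes=common\n")
    target = config_dir / source.name
    if not target.exists():  # do not overwrite a file you have already customized
        shutil.copyfile(source, target)
    return target


def register_template(nuxeo_conf: Path, name: str = TEMPLATE_NAME) -> None:
    """Add name to the existing nuxeo.templates line of nuxeo.conf, if it is not listed yet."""
    lines = nuxeo_conf.read_text().splitlines()
    for i, line in enumerate(lines):
        if line.startswith("nuxeo.templates="):
            current = [t for t in line.split("=", 1)[1].split(",") if t]
            if name not in current:
                current.append(name)
            lines[i] = "nuxeo.templates=" + ",".join(current)
            break
    else:
        # Assumption: we refuse to guess which other templates the server needs.
        raise ValueError("no uncommented nuxeo.templates line found in " + str(nuxeo_conf))
    nuxeo_conf.write_text("\n".join(lines) + "\n")
```

Calling `create_template(Path("/path/to/yourserver"))` and then `register_template(...)` with the path to your nuxeo.conf would cover the first two steps; both paths are placeholders to adapt to your installation.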
You can now:


  • Start your server

  • Rebuild the Elasticsearch index: go to ADMIN > Elasticsearch > Admin and click the very, very convenient “Re-index repository” button

  • Once the index is built, search for %nuxeo% and find your content!


I am so happy, IreallyLIKE it!

(1) Probably the easiest word play I’ve ever done. Possibly in the top 10 of my worst word plays. Still a play though.


Category: Product & Development
Tagged: Elasticsearch, How to, Nuxeo Platform