[Q&A Friday] Remotely Searching for a Document Using Tags


Fri 05 April 2013 By Laurent Doguin



Today dedacosta asks if it's possible to search documents remotely using tags. First let's talk about tags.

The tag service uses two important concepts: a tag object, and a tagging action. Both are represented as Nuxeo documents.

A tag is a document type representing the tag itself (but not its association to specific documents). It contains the usual dublincore schema, and in addition has a specific tag schema containing a tag:label string field.

A tagging is a relation type representing the action of tagging a given document with a tag. (A relation type is a document type extending the default Relation document type; it works like a normal document type except that it's not found by NXQL queries on Document.) The important fields of a tagging document are relation:source, which is the document ID; relation:target, which is the tag ID; and dc:creator, which is the user doing the tagging action.

Both tag and tagging documents managed by the tag service are unfiled, which means that they don't have a parent folder. They are therefore not visible in the normal tree of documents; only queries can find them. In addition, they don't have any ACLs set on them, which means that only a superuser (and the tag service internal code) can access them.


If you want to test this, tag a document and run the following query:

curl -H 'Content-Type:application/json+nxrequest' -H "X-NXDocumentProperties: *" -X POST -d '{"params":{"query":"SELECT * FROM Tag WHERE tag:label = \"tag3\" AND ecm:isProxy = 0"}}' -u Administrator:Administrator http://localhost:8080/nuxeo/site/automation/Document.Query

This returns the Tag document matching the queried label as JSON output. The sample below was captured for a tag labeled 'tag':

{ "entity-type" : "documents",
"entries" : [ { "changeToken" : "1365178419352",
"contextParameters" : { },
"entity-type" : "document",
"facets" : [ "HiddenInNavigation" ],
"lastModified" : "2013-04-05T16:13:39.35Z",
"path" : "tag",
"properties" : { "dc:contributors" : [ "Administrator" ],
"dc:coverage" : null,
"dc:created" : "2013-04-05T16:13:39.35Z",
"dc:creator" : "Administrator",
"dc:description" : null,
"dc:expired" : null,
"dc:format" : null,
"dc:issued" : null,
"dc:language" : null,
"dc:lastContributor" : "Administrator",
"dc:modified" : "2013-04-05T16:13:39.35Z",
"dc:nature" : null,
"dc:publisher" : null,
"dc:rights" : null,
"dc:source" : null,
"dc:subjects" : [ ],
"dc:title" : null,
"dc:valid" : null,
"tag:label" : "tag"
},
"repository" : "default",
"state" : "undefined",
"title" : "tag",
"type" : "Tag",
"uid" : "5063b2fe-6abe-44c5-ac20-2b19e2922833"
} ]
}

This is the Tag document. Now if you want to retrieve the tagging document, you can do it with the following query:

curl -H 'Content-Type:application/json+nxrequest' -H "X-NXDocumentProperties: *" -X POST -d '{"params":{"query":"SELECT * FROM Tagging WHERE tag:label = \"tag3\" AND ecm:isProxy = 0"}}' -u Administrator:Administrator http://localhost:8080/nuxeo/site/automation/Document.Query

Here's the associated JSON output:

{ "entity-type" : "documents",
"entries" : [ { "changeToken" : "1365178419354",
"contextParameters" : { },
"entity-type" : "document",
"facets" : [ "HiddenInNavigation" ],
"lastModified" : "2013-04-05T16:13:39.35Z",
"path" : "tag",
"properties" : { "dc:contributors" : [ "Administrator" ],
"dc:coverage" : null,
"dc:created" : "2013-04-05T16:13:39.35Z",
"dc:creator" : "Administrator",
"dc:description" : null,
"dc:expired" : null,
"dc:format" : null,
"dc:issued" : null,
"dc:language" : null,
"dc:lastContributor" : "Administrator",
"dc:modified" : "2013-04-05T16:13:39.35Z",
"dc:nature" : null,
"dc:publisher" : null,
"dc:rights" : null,
"dc:source" : null,
"dc:subjects" : [ ],
"dc:title" : null,
"dc:valid" : null,
"relation:predicate" : null,
"relation:source" : "0cbb4117-1e86-4092-9278-cd04ecdea35c",
"relation:sourceUri" : null,
"relation:target" : "5063b2fe-6abe-44c5-ac20-2b19e2922833",
"relation:targetString" : null,
"relation:targetUri" : null
},
"repository" : "default",
"state" : "undefined",
"title" : "tag",
"type" : "Tagging",
"uid" : "4979c4cb-d253-41b9-8696-014eb9e44e1e"
} ]
}

As you can see, this document contains two interesting metadata fields: relation:source and relation:target. The first is the ID of the tagged document and the second is the ID of the Tag document.
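To make the link between the two documents concrete, here is a minimal sketch in plain Java that mimics the two lookups with in-memory collections. It does not use any Nuxeo API: the class name TaggingLookupSketch and the method documentsTaggedWith are made up for illustration, and the IDs are the ones from the sample outputs above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: an in-memory stand-in for Tag and Tagging documents.
// None of these names come from the Nuxeo API.
final class TaggingLookupSketch {

    // tag:label -> ID of the Tag document
    static final Map<String, String> TAG_ID_BY_LABEL = new LinkedHashMap<>();

    // One entry per Tagging: { relation:source, relation:target }
    static final List<String[]> TAGGINGS = new ArrayList<>();

    static {
        TAG_ID_BY_LABEL.put("tag", "5063b2fe-6abe-44c5-ac20-2b19e2922833");
        TAGGINGS.add(new String[] {
                "0cbb4117-1e86-4092-9278-cd04ecdea35c",    // tagged document
                "5063b2fe-6abe-44c5-ac20-2b19e2922833" }); // Tag document
    }

    // Step 1: resolve the label to a Tag ID.
    // Step 2: collect every relation:source whose relation:target is that ID.
    static List<String> documentsTaggedWith(String label) {
        List<String> documentIds = new ArrayList<>();
        String tagId = TAG_ID_BY_LABEL.get(label);
        if (tagId == null) {
            return documentIds; // unknown label: nothing is tagged with it
        }
        for (String[] tagging : TAGGINGS) {
            if (tagId.equals(tagging[1])) {
                documentIds.add(tagging[0]);
            }
        }
        return documentIds;
    }

    public static void main(String[] args) {
        System.out.println(documentsTaggedWith("tag"));
    }
}
```

The real documents also carry dc:creator, so the same walk could be narrowed to the taggings made by a single user.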

Now that you know more about tags, let's go back to the original question: how to search documents remotely using tags. It's really easy to do, especially since Florent added tag support to NXQL. Starting from 5.7, you can run queries like these:

SELECT * FROM Document WHERE ecm:tag = 'tag1'
SELECT * FROM Document WHERE ecm:tag IN ('tag1','tag4')
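If you build such queries from user input, the quoting needs some care, both in NXQL and in the JSON body sent to Document.Query. The following is a minimal sketch in plain Java, not code from Nuxeo: the class TagQuerySketch and its methods are hypothetical, the backslash escaping of single quotes is an assumption about NXQL string literals, and the JSON escaping only covers backslashes and double quotes (a real client would rather use a JSON library).

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: builds an ecm:tag query and the Document.Query JSON body.
final class TagQuerySketch {

    // Assumption: NXQL string literals escape ' and \ with a backslash.
    static String nxqlLiteral(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // One label gives "ecm:tag = '...'", several give "ecm:tag IN (...)".
    static String nxqlForTags(List<String> labels) {
        if (labels.isEmpty()) {
            throw new IllegalArgumentException("at least one tag label is required");
        }
        if (labels.size() == 1) {
            return "SELECT * FROM Document WHERE ecm:tag = " + nxqlLiteral(labels.get(0));
        }
        String list = labels.stream()
                .map(TagQuerySketch::nxqlLiteral)
                .collect(Collectors.joining(", "));
        return "SELECT * FROM Document WHERE ecm:tag IN (" + list + ")";
    }

    // Wraps the query in the {"params":{"query":...}} body used by Document.Query.
    // Only backslashes and double quotes are escaped here.
    static String jsonBody(String nxql) {
        String escaped = nxql.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"params\":{\"query\":\"" + escaped + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(jsonBody(nxqlForTags(List.of("tag1"))));
        System.out.println(jsonBody(nxqlForTags(List.of("tag1", "tag4"))));
    }
}
```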

Note that ecm:tag is not available in CMISQL, and that NXQL queries currently cannot be run through the SOAP web services. Unfortunately, 5.7 hasn't been released yet, so we need another approach. A custom operation handles this easily: from Nuxeo IDE, create a new operation with the wizard and implement it.
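import java.util.List;

import org.nuxeo.ecm.automation.core.Constants;
import org.nuxeo.ecm.automation.core.annotations.Context;
import org.nuxeo.ecm.automation.core.annotations.Operation;
import org.nuxeo.ecm.automation.core.annotations.OperationMethod;
import org.nuxeo.ecm.automation.core.annotations.Param;
import org.nuxeo.ecm.core.api.ClientException;
import org.nuxeo.ecm.core.api.CoreSession;
import org.nuxeo.ecm.core.api.DocumentModelList;
import org.nuxeo.ecm.core.api.DocumentRef;
import org.nuxeo.ecm.core.api.IdRef;
import org.nuxeo.ecm.platform.tag.TagService;

@Operation(id = TagSearch.ID, category = Constants.CAT_DOCUMENT, label = "Tag Search", description = "Search documents using a tag.")
public class TagSearch {

    public static final String ID = "Document.TagSearch";

    @Context
    protected TagService tagService;

    @Context
    protected CoreSession coreSession;

    // Label of the tag to search for, passed as the "tagLabel" parameter.
    @Param(name = "tagLabel", required = true)
    protected String label;

    @OperationMethod
    public DocumentModelList run() throws Exception {
        if (!tagService.isEnabled()) {
            throw new ClientException("The tag service is not enabled.");
        }
        // IDs of the documents tagged with this label by the current user.
        List<String> docIds = tagService.getTagDocumentIds(coreSession, label, coreSession.getPrincipal().getName());
        DocumentRef[] documentRefs = new DocumentRef[docIds.size()];
        for (int i = 0; i < docIds.size(); i++) {
            documentRefs[i] = new IdRef(docIds.get(i));
        }
        // Load the documents themselves from their references.
        return coreSession.getDocuments(documentRefs);
    }
}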


Here I am using the tag service directly. Under the hood, it runs NXQL queries on tag and tagging documents, just like we saw at the beginning. Because this is a content automation operation, you can call it remotely thanks to the REST binding. It does not solve our issue with the SOAP web service, though. I actually tried a CMIS query with a join like this:

SELECT Doc.cmis:objectId FROM cmis:document Doc JOIN Tagging Tgg ON Tgg.relation:source = Doc.cmis:objectId JOIN Tag Tg ON Tgg.relation:target = Tg.cmis:objectId WHERE Tg.tag:label = 'tag

But this cannot work because the Tag document type is not exposed in CMIS. We did this because it's a 'system' document: end users have nothing to do with it, so it is not available through CMIS. So unfortunately there is no out-of-the-box or easy solution if you want to use SOAP. You have to define your own SOAP-based web service, which is not as simple as using content automation IMHO.
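For comparison, here is roughly what a remote call to the custom operation could look like from a Java client. This is a sketch under assumptions, not a tested client: it assumes the operation is deployed under the ID Document.TagSearch and reachable at the same automation URL pattern as the Document.Query calls above, and the class TagSearchRequestSketch is hypothetical. It uses the JDK's java.net.http API (Java 11+) and only builds the request; the actual send is left commented out because it needs a running server.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustration only: builds (but does not send) a POST for Document.TagSearch.
final class TagSearchRequestSketch {

    // Assumption: same host and automation path as the curl examples.
    static final String ENDPOINT =
            "http://localhost:8080/nuxeo/site/automation/Document.TagSearch";

    // JSON body carrying the operation's "tagLabel" parameter.
    // Only backslashes and double quotes are escaped here.
    static String body(String tagLabel) {
        String escaped = tagLabel.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"params\":{\"tagLabel\":\"" + escaped + "\"}}";
    }

    // Basic authentication, the equivalent of curl's -u user:password.
    static HttpRequest build(String tagLabel, String user, String password) {
        String credentials = Base64.getEncoder().encodeToString(
                (user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json+nxrequest")
                .header("Authorization", "Basic " + credentials)
                .POST(HttpRequest.BodyPublishers.ofString(body(tagLabel)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("tag1", "Administrator", "Administrator");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body("tag1"));
        // To send it against a running server:
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```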


Category: Product & Development
Tagged: How to, Java, Q&A