When a Picture is Worth a Thousand Words...Or a Lawsuit


Thu 18 August 2016 By Lisa McIntyre

Recently, we were asked whether we could prevent restricted words from being used without approval. This could apply in several ways: there may be inappropriate words to watch for; your brand guidelines may rule out certain words in your marketing; or, as in this case, someone may have used a trademarked term that requires specific permission.

To elaborate on this idea, marketers and advertisers spend a lot of money to be official sponsors of events (like the Super Bowl). Both the event and its sponsors want to protect their investment and brand. Likewise, companies don’t want to get sued for inadvertently using something they shouldn’t.

The question for Nuxeo isn’t whether we can do it (of course we can). Rather, we needed to decide how to handle the situation. For the demo, we kept it relatively simple:

  • Have a list of restricted words
  • Create a list of the words found in a file
  • Compare the lists
    • If the lists share words, then mark the file as restricted
    • Watermark the file
  • Start approval workflow.

Easy.
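To make the steps above concrete, here is a minimal sketch in plain JavaScript, independent of Nuxeo. The names `findRestrictedWord` and `checkImage`, and the `watermark` and `startApproval` callbacks, are hypothetical and only stand in for the real operations used in the script later in this post.

```javascript
// Hypothetical helper (not a Nuxeo API): returns the first restricted word
// found in the text, or null when nothing matches.
function findRestrictedWord(restrictedWords, text) {
  var haystack, i, word;

  if (!text) {
    return null;
  }
  haystack = String(text).toLowerCase();
  for (i = 0; i < restrictedWords.length; i += 1) {
    word = restrictedWords[i];
    // Skip empty entries, which would otherwise match any text
    if (word && haystack.indexOf(word.toLowerCase()) > -1) {
      return word;
    }
  }
  return null;
}

// Hypothetical driver for the steps in the list. The two callbacks are
// placeholders for watermarking the file and starting the approval workflow.
function checkImage(restrictedWords, text, actions) {
  var hit = findRestrictedWord(restrictedWords, text);

  if (hit !== null) {
    actions.watermark();
    actions.startApproval();
  }
  return { restricted: hit !== null, word: hit };
}
```

Calling `checkImage(["Super Bowl"], "SUPER BOWL XLVI", actions)` would flag the file, run both callbacks, and report "Super Bowl" as the matching word. The match is a simple case-insensitive substring test, the same approach the demo takes.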

First, we need something to act as our “restricted” list. For this, we created a simple vocabulary within Nuxeo Studio (this could also be some external directory source).

List of Restricted Words

Next, we need a list of the terms identified in our image. (The image is titled “image.png”: there are no restricted names in the title, because we want to show you that the detection relies on the addons, not on the file name.)

Super Bowl XLVI

To do this, we take advantage of an addon, Nuxeo Vision, that my colleague Michaël Vachette wrote. It uses the Google Vision API to identify text in images (specifically, we use the OCR capabilities Thibaud Arguillere describes in his blog post). We take the information and add it to the metadata for the image. Super cool!

Now it’s time for the fun to begin.

Within Studio, we created an event handler that fires when Nuxeo Vision has returned our list of terms relevant to the image. We tied the event handler to an automation script (created in Studio with Automation Scripting) that completes a couple of tasks.

The script looks something like this (I’ve left in the logging notes so you can see where the work is happening in the log screenshot following):

function run(input, params) {

  var valuesAsJavaStringBlob, valuesAsString, valuesJson, dcSource,
      i, max, oneEntry, tasks, restricted, wasRestricted;

  // Get all entries
  // The operation returns a Java StringBlob
  valuesAsJavaStringBlob = Directory.Entries(null, {
    'directoryName': "RestrictedWords"
  });
  // This Java StringBlob has a getString() function that is cool :-)
  valuesAsString = valuesAsJavaStringBlob.getString();
  // Look, we have all the entries
  Console.log("valuesAsString: \n" + valuesAsString);

  // Get JSON so we can work easily, loop, etc.
  valuesJson = JSON.parse(valuesAsString);
  max = valuesJson.length;

  restricted = false;
  // Same "mytype" prefix as the fields written below (property names are case-sensitive)
  wasRestricted = input["mytype:restricted"];
  if (wasRestricted === null) {
    wasRestricted = false;
  }

  // The result of the OCR is in dc:source, filled by nuxeo-vision
  dcSource = input["dc:source"];
  if (dcSource === null || dcSource === "") {
    Console.log("Nothing to check.");
    return input;
  }
  dcSource = dcSource.toLowerCase();
  // Also, replace linefeeds with spaces. We could have Super\nBowl
  dcSource = dcSource.split("\n").join(" ");
  Console.log("dcSource: \n" + dcSource);

  // Loop
  for (i = 0; i < max; ++i) {
    oneEntry = valuesJson[i];

    // We now have oneEntry.id, oneEntry.label, ... (see the log)
    // label is the one we want in the vocabulary (it is not translated in this example)
    if (dcSource.indexOf(oneEntry.label.toLowerCase()) > -1) {
      Console.log("This kw is forbidden: " + oneEntry.label);
      input["mytype:restricted"] = true;
      input["mytype:restricted_str"] = "YES";
      input = Document.Save(input, {});

      restricted = true;

      Console.log("Watermarking the thing...");
      // In this chain, we use Blob.RunConverter and a converter/commandline
      // XML with parameters
      input = javascript.Picture_AddWatermarks(input, {});

      // Start the workflow (only if it was not already started by another event)
      tasks = Workflow.GetOpenTasks(input, {});
      if (tasks.length === 0) {
        Console.log("Starting the WF...");
        input = Context.StartWorkflow(input, {
          'id': "CheckRestrictedWord",
          'start': true
        });
      } else {
        Console.log("WF already started");
      }

      // Stop the loop
      break;
    }
  }

  if (wasRestricted && !restricted) {
    Console.log("Back to non restricted");
    input["mytype:restricted"] = false;
    input["mytype:restricted_str"] = "NO";
    input = Document.Save(input, {});
  }

  return input;
}
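One limitation of the script above is that it only replaces "\n" with a space. If the OCR text contained "\r\n", tabs, or doubled spaces between words, a term such as "super bowl" would not be found. A possible hardening, assuming the OCR output can contain that kind of whitespace, is to collapse every run of whitespace before comparing. The helper name `normalizeOcrText` is hypothetical and not part of Nuxeo or nuxeo-vision.

```javascript
// Hypothetical helper: lowercases the OCR text and collapses any run of
// whitespace (spaces, tabs, \n, \r\n) into a single space.
function normalizeOcrText(text) {
  if (text === null || text === undefined) {
    return "";
  }
  return String(text).toLowerCase().replace(/\s+/g, " ").trim();
}
```

In the script, this could replace the `toLowerCase()` call and the `split("\n").join(" ")` line, for example `dcSource = normalizeOcrText(dcSource);`.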

The automation script first reads our restricted terms vocabulary and the results of the image scan, then identifies the offending term and watermarks the image. Lastly, it starts a workflow to approve or reject the image.

The logs show the action taking place behind the scenes:

Logs

And now, the image in the system looks like this:

Workflow - Image with Restricted Stamp for review

Notice the watermark, along with the workflow that was started to validate the usage.

As you can imagine, you can expand on this idea in many ways. For instance, as part of the workflow, you could automatically send a notification to someone alerting them to this sort of content. It really depends on your business needs.


Tagged: Nuxeo Studio, Nuxeo Plugin, How to