Hot Folder Uploader


Mon 31 March 2014 By Thibaud Arguillere

What is a hot folder? Well. You sure know what a folder is: A container, a directory on your drive. Now, what does it mean when a folder is hot?

Let’s start with what it does not mean:


  • The folder is not at a “high degree of heat, or high temperature”

  • It is not particularly sexy, nor attractive(1)


No. A hot folder is actually a folder whose content is constantly watched: When it is empty, nothing happens. When it contains at least one file, something happens.

In this article, that "something" is an upload of the files in the hot folder to a Nuxeo server. Everything is based on a shell script using bash(2). You can find the final script here; in this article we are going to talk about the main principles. Here they are:


  • Having a script (an executable .sh file) which:


    • Loops on every file in the folder

    • Sends each one to Nuxeo, using curl



  • Adding it to your cron system. This is what actually makes the folder hot.
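
The script half of these principles (loop on every file, hand each one to an uploader) can be sketched as follows. This is only an illustration: process_hot_folder and its handler argument are hypothetical names, not taken from the final script.

```shell
# Hypothetical helper: call a handler command once per regular file
# found directly inside the hot folder. Sub-folders are skipped.
process_hot_folder() {
  local hot_folder="$1"
  local handler="$2"
  local file
  for file in "$hot_folder"/*; do
    # When the folder is empty the glob stays literal; -f filters it out.
    [ -f "$file" ] || continue
    "$handler" "$file"
  done
}
```

With an empty folder nothing is called, which matches the "nothing happens" half of the definition. A call could look like process_hot_folder "/path/to/hot-folder" echo, which only prints each file path.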


Let’s start with the script.

It needs the following information:


  • The Nuxeo server URL,

  • The destination container in this server (where the documents will be created), and

  • The credentials to access the server.


You will find them at the top of the script. Just make sure the credentials you are using allow creating documents in the destination container:

 SERVER_BASE_URL=http://localhost:8080/nuxeo
 USER_LOGIN=Administrator
 USER_PWD=Administrator
 # This folder, or workspace, or container *must*:
 # - Exist
 # - Have an ACL that allows USER_LOGIN to create documents
 NUXEO_DESTINATION_URL=$SERVER_BASE_URL/api/v1/path/default-domain/workspaces/hot-folder-import
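
Before the first upload, it can help to check that the destination answers at all. The sketch below is not part of the original script: destination_exists is a hypothetical helper, and it assumes that a GET on the destination URL returns HTTP 200 when the container exists and the user can read it. It does not prove that the user is allowed to create documents there.

```shell
# Values copied from the settings above; adjust them to your server.
SERVER_BASE_URL=http://localhost:8080/nuxeo
USER_LOGIN=Administrator
USER_PWD=Administrator
NUXEO_DESTINATION_URL=$SERVER_BASE_URL/api/v1/path/default-domain/workspaces/hot-folder-import

# Hypothetical helper: succeeds when a GET on the destination returns HTTP 200.
destination_exists() {
  local status
  # -s: silent, -o /dev/null: drop the body, -w: print only the status code
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -u "${USER_LOGIN}:${USER_PWD}" "${NUXEO_DESTINATION_URL}")
  [ "$status" = "200" ]
}
```

A script could then start with something like: destination_exists || exit 1.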

Next, we need to create the correct type of document in Nuxeo. We don’t want to always create File documents if, for example, we are sending images or videos: We would lose the preview, the Storyboard, the IPTC extraction, etc. To meet this requirement, the script gets the MIME type of the file and decides what kind of Nuxeo document to create.

In my first tests, it did not always work as I wanted. I found that .jpg, .png, .doc, .mp4, .avi, … files were correctly handled, but some of my raw pictures were not recognized as an “image/*” MIME type. So I added a fallback test on the file extension:
 . . .
 nuxeo_doc_type="File"
 mime_type=$(file --mime-type -b "$file")
 case "$mime_type" in
   image/*) nuxeo_doc_type="Picture";;
   video/*) nuxeo_doc_type="Video";;
   *)
     # MIME type not recognized: fall back to the file extension
     case "$file" in
       *.mov) nuxeo_doc_type="Video";;
       *.ORF) nuxeo_doc_type="Picture";;
       *.xmp) nuxeo_doc_type="Picture";;
       *) nuxeo_doc_type="File";;
     esac
     ;;
 esac
 . . .
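
The same decision can be wrapped in a small function, which makes it easy to try on a few names without touching a server. This is a sketch, not the author's code: doc_type_for is a hypothetical name, and lowercasing the file name (so that .ORF and .orf both match) is an addition of this example.

```shell
# Hypothetical helper: print the Nuxeo document type for a MIME type
# and a file name. The extension list mirrors the snippet above.
doc_type_for() {
  local mime_type="$1"
  local file="$2"
  local lower
  case "$mime_type" in
    image/*) echo "Picture" ;;
    video/*) echo "Video" ;;
    *)
      # Fall back to the extension, compared in lowercase.
      lower=$(printf '%s' "$file" | tr '[:upper:]' '[:lower:]')
      case "$lower" in
        *.mov) echo "Video" ;;
        *.orf|*.xmp) echo "Picture" ;;
        *) echo "File" ;;
      esac
      ;;
  esac
}
```

In the script, it would be called as doc_type_for "$(file --mime-type -b "$file")" "$file". A custom document type defined in Studio would simply be one more line in the inner case.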

Something quite interesting is that you could also ask Nuxeo to create your own custom document type, the one that you defined in Studio. Look at the “Room for enhancement” part of the README on GitHub; there already are some ideas.

Creating the document, with its file, in Nuxeo is pretty easy with curl. Well, I found it very easy because I found YABFL(3) here. So, in a matter of minutes, the script was ready, with just enough time left to make typos and mistakes in passing the correct dynamic parameters to curl. Here is the send_to_nuxeo function (I removed the tests and logging here, so it’s easier to read the main parts):

function send_to_nuxeo {
  file_full_path=$1
  doc_type=$2
  filename="${file_full_path##*/}"
  # Just cleanup spaces here
  filename_clean=${filename// /_}

  # Send the binary to nuxeo, using the file name as batch id
  curl -H "X-Batch-Id: $filename_clean" -H "X-File-Idx:0" -H "X-File-Name:$filename" -F file=@"$file_full_path" -u "${USER_LOGIN}":"${USER_PWD}" "${SERVER_BASE_URL}/api/v1/automation/batch/upload"

  # Create the document, asking nuxeo to use the file in this batch-id.
  # The inner double quotes are escaped so the JSON reaches curl intact.
  curl -X POST -H "Content-Type: application/json" -u "${USER_LOGIN}":"${USER_PWD}" -d "{ \"entity-type\": \"document\", \"name\":\"${filename_clean}\", \"type\": \"${doc_type}\", \"properties\" : { \"dc:title\":\"${filename}\",\"file:content\": {\"upload-batch\":\"${filename_clean}\",\"upload-fileId\":\"0\"}}}" "${NUXEO_DESTINATION_URL}"

  # Move or delete the file (using the full path, so it works from any directory)
  if [ -n "$COPY_DEST_FOLDER" ]; then
    mv "$file_full_path" "$COPY_DEST_FOLDER/$filename"
  else
    rm "$file_full_path"
  fi
}
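
One fragile spot in a hand-written JSON body is a file name that itself contains a double quote or a backslash. A minimal sketch of one way to guard against that is shown here; json_escape and build_payload are hypothetical helpers, not part of the original script, and control characters such as newlines are deliberately left out of scope.

```shell
# Hypothetical helper: escape backslashes and double quotes so the
# value can sit inside a JSON string. Control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build the JSON body used to create the document.
# Arguments: document name, document type, title, batch id.
build_payload() {
  local name="$1"
  local doc_type="$2"
  local title="$3"
  local batch_id="$4"
  printf '{"entity-type":"document","name":"%s","type":"%s","properties":{"dc:title":"%s","file:content":{"upload-batch":"%s","upload-fileId":"0"}}}' \
    "$(json_escape "$name")" "$(json_escape "$doc_type")" \
    "$(json_escape "$title")" "$(json_escape "$batch_id")"
}
```

The second curl call could then pass -d "$(build_payload "$filename_clean" "$doc_type" "$filename" "$filename_clean")" instead of the inline string.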

You can quick-test this script. Just make sure the user running the script has enough rights on the hot folder, and on the backup folder if you use one.

We now find ourselves with this wonderful script. But something is missing: So far, we don’t have a hot folder, do we? We ran this script manually. Our folder is just a folder. A regular, basic, nothing-special folder. We want to make it hot! We want to make it useful!

We need to make the system run the script on a regular basis. One way to do it is to add the script (and its parameters) to our cron system. For example, to run the script every minute, you can use the crontab command: run crontab -e, which opens your crontab in your default editor (often vi), where you can add this line:
* * * * * "/path/to/script" "/path/to/hot-folder"

(Optionally, you can also pass the 2nd parameter, to move the files instead of deleting them.)
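
Because cron starts the script every minute, a run may begin while the previous one is still busy uploading large files. One possible guard, which is an addition of this example and not something the original script does, is a lock directory: mkdir either creates it or fails, so only one run at a time gets through. The names LOCK_DIR, acquire_lock and release_lock are hypothetical.

```shell
# Hypothetical lock location; any path writable by the cron user works.
LOCK_DIR="${TMPDIR:-/tmp}/hot-folder-uploader.lock"

# Succeeds only for the first caller: mkdir fails if the directory exists.
acquire_lock() {
  mkdir "$LOCK_DIR" 2>/dev/null
}

# Removes the lock so the next cron run can proceed.
release_lock() {
  rmdir "$LOCK_DIR" 2>/dev/null
}

# Typical use at the top of the script:
#   acquire_lock || exit 0
#   trap release_lock EXIT
```

If the script is killed before the trap runs, the lock directory stays behind and has to be removed by hand.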

Isn’t this folder pretty hot?

(1) If you truly find yourself thinking a folder on your screen is attractive, then please, stop reading right now. Drop your pizza, take a shower, open the window, then open the door and take a walk outside. It's the doctor speaking.


(2) So it is mainly for Linux servers. Or Mac OS when doing quick tests. Sorry, not Windows. But I’ll be very happy to add your script for Windows to this GitHub repository.


(3) Yet Another Blog From Laurent
