CMIS Meeting, Day Two

Thu 06 August 2009 By Florent Guillaume

(The first part of this series described the first day of the meeting.)

On the second day of the CMIS face-to-face meeting we again spent some quality time reading the spec nearly line by line, making sure everything was coherent, and discussing a few points that people felt were important for their use cases.

Below I'll outline some important changes made to the spec on the first and second days of this meeting. There's more, of course; you may want to follow everything in the CMIS JIRA.

The XML and XHTML property types are gone. No vendor was in support of them, and it was actually quite hard to standardize on exactly what kind of XML would be stored in such a property (well-formed? fragment? etc.). We kept the HTML property type, as many repositories still want to distinguish between "basic text" and "rich text", especially for presentation purposes. If a repository has XML or XHTML properties, it can easily expose them as Strings.

The ability to use paths to get to folders was extended to documents as well (getFolderByPath turns into getObjectByPath). For folders (where paths are well-defined), paths are retrieved through an explicit property "cmis:path", but for documents (which may be multi-filed) we have to be more careful. Whenever a document is retrieved in the context of a folder (getChildren, getDescendants, getObjectParents), its last path segment inside that folder will be available, so that clients can determine a full path for the document. This segment is not a real property of the document, though, as it may change depending on context. Finally, the "cmis:name" property will be only a hint for the repository to choose a path segment for new objects; the only way to be sure of an object's path is through the folder's cmis:path and the aforementioned document path segment.
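As a rough illustration of the path handling described above, here is a minimal Python sketch of how a client might build a document's full path from a folder's cmis:path and the document's path segment in that folder. The helper name document_path is hypothetical, and the sketch assumes "/"-separated paths with "/" as the root folder, which the text above does not state.

```python
def document_path(folder_path: str, path_segment: str) -> str:
    """Join a folder's cmis:path with a child's path segment.

    Assumption: paths are absolute and "/"-separated, with "/" as the root.
    """
    if not folder_path.startswith("/"):
        raise ValueError("folder path must be absolute")
    if not path_segment or "/" in path_segment:
        raise ValueError("path segment must be a single non-empty name")
    if folder_path == "/":
        return "/" + path_segment
    return folder_path.rstrip("/") + "/" + path_segment


# A multi-filed document has one path per parent folder, and its
# segment may differ from one folder to the next.
paths = [
    document_path("/reports/2009", "q2.pdf"),
    document_path("/archive", "q2-final.pdf"),
]
```

Because the segment depends on the folder through which the document was reached, the client has to keep the (folder path, segment) pair together rather than cache a single path per document.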

ACLs have been available since 0.62, but the exact set of basic permissions that they can expose is hard to pin down. We had cmis:read, cmis:write, cmis:delete and cmis:all; however, some vendors have a hard time mapping their native permissions (or pseudo-roles) to such a basic model. cmis:delete is especially troublesome, as it is ambiguous in itself: in a given repository, deleting an object may require some permission on the parent and some other permission on the child. To further simplify the model, it's been decided that cmis:delete would go. But fear not: the ACL model lets each vendor expose its native permissions, and expose which of them are required for each of the CMIS operations, so clients will still be able to make good use of ACLs even if not everything about them is standardized.

With ACLs come principals, and some special principals are sufficiently common that it's worthwhile for a client to know their ids. Therefore we added a way for a repository to tell a client what's the principal id for "anonymous", what's the one for "everybody", and we added a way to specify "me" when setting ACLs.

The need for Policies has been discussed as well, as there are no actual uses of them in the spec; they're an abstract placeholder for vendor extensions. Should they go? We now have ACLs after all... But there are already vendors making use of them to expose features of their repositories, so keeping them is good for them, and costs little to others (they're optional after all).

We now have a way to do copies! For the longest time, this wasn't the case. There was strong opposition to adding a copy method, as copy semantics vary widely among repositories (do you copy document relations? acls? versions? renditions? streams? folder children? what about multi-filing? etc.). Nevertheless, others and I persistently asked for a way to do copies. The deciding argument this time was that even though in most cases the clients can do the copy themselves, by just creating a new object with the same properties as the one to copy, there is a problem with content streams, as they may be multi-gigabyte objects. At a minimum we need a way to copy content streams. After lots of discussions, we decided to introduce a createDocumentFromSource method, which works just like a normal creation except that a source document is also provided. The repository will then use whatever it feels is best from this source document to fill in the created document. Note that we don't specify a way to do folder copies, as these vary too much between implementations.
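To make the idea concrete, here is a toy in-memory sketch in Python of what creation from a source document could look like. Everything in it (the ToyRepository class, the method signatures, the choice to copy all source properties and then apply the caller's overrides) is a hypothetical illustration, not the spec's definition; a real repository is free to pick whatever it feels is best from the source.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Document:
    object_id: str
    properties: dict
    content: bytes = b""


class ToyRepository:
    """Hypothetical in-memory stand-in for a repository."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.documents = {}

    def create_document(self, properties, content=b""):
        doc = Document(f"doc-{next(self._ids)}", dict(properties), content)
        self.documents[doc.object_id] = doc
        return doc.object_id

    def create_document_from_source(self, source_id, properties=None):
        # Assumption: this toy copies every source property, lets the
        # caller override some of them, and reuses the content stream.
        source = self.documents[source_id]
        merged = dict(source.properties)
        merged.update(properties or {})
        # The stream never travels through the client, which is the
        # point when it is a multi-gigabyte object.
        return self.create_document(merged, source.content)


repo = ToyRepository()
original_id = repo.create_document({"cmis:name": "report.pdf"}, b"large stream")
copy_id = repo.create_document_from_source(
    original_id, {"cmis:name": "report-copy.pdf"}
)
```

The client only sends the source id and the properties it wants to change; the content stream is handled entirely on the repository side.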

In AtomPub, if you want to create a document with a content stream, you have several standard ways available. However, the only way to do creation in just one call is to embed the content stream in the message, and AtomPub has strong constraints on how you can do that: for XML- or text-related content types, AtomPub mandates that the stream be inlined in clear text (presumably for the benefit of AtomPub readers). But this is problematic as soon as you want to transfer content that is slightly invalid (but nevertheless stored in your repository!), or whose text content encoding is unknown, or is XML where you want to keep exact formatting, comments, prefixes, namespaces and all. Therefore, we added an extension to AtomPub (cmisra:content) that allows base64 transfer of content in all cases.
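The following Python sketch shows why base64 sidesteps the inlining problem: bytes that are not even well-formed XML survive the round trip unchanged. Only the cmisra:content element name comes from the text above; the namespace URI is a placeholder, and the child element names (mediatype, base64) are assumptions made for the example.

```python
import base64
import xml.etree.ElementTree as ET

# Placeholder namespace for illustration only, not the real cmisra URI.
CMISRA_NS = "urn:example:cmisra"


def build_content_element(data: bytes, media_type: str) -> ET.Element:
    """Wrap raw bytes in a content element, base64-encoded."""
    content = ET.Element(f"{{{CMISRA_NS}}}content")
    # Child element names are assumptions for this sketch.
    ET.SubElement(content, f"{{{CMISRA_NS}}}mediatype").text = media_type
    encoded = base64.b64encode(data).decode("ascii")
    ET.SubElement(content, f"{{{CMISRA_NS}}}base64").text = encoded
    return content


def extract_content(content: ET.Element) -> bytes:
    """Recover the original bytes from a content element."""
    encoded = content.find(f"{{{CMISRA_NS}}}base64").text
    return base64.b64decode(encoded)


# Slightly invalid XML with a comment, a prefix, an unescaped "&"
# and a byte that is not valid UTF-8.
original = b"<?xml version='1.0'?>\n<!-- kept -->\n<a:doc xmlns:a='urn:x'>oops & \xff"

wire = ET.tostring(build_content_element(original, "application/xml"))
round_tripped = extract_content(ET.fromstring(wire))
```

Since the payload is opaque to the XML parser, comments, prefixes and broken markup all come back byte for byte, at the cost of roughly a third more data on the wire.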

URI templates had been added to the AtomPub bindings in order to have a non-REST but very fast way for a client to access a document by ID, by path, or to make a query without a POST. URI templates, however, are still a draft, and it's problematic to include them in a standard. Furthermore, the URI templates draft specifies many different ways in which variable replacement can be done, including tests, defaults, list delimiters, escaping, etc. We thus decided that a simplified subset would be used: just simple {variable} replacement, with percent-escaping. This solves most of the problems, and is still better than nothing.
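The simplified subset is small enough to sketch in a few lines of Python. The function name expand, the example template URLs and the variable names are hypothetical, and so are two choices the text above does not cover: every reserved character is escaped (including "/"), and an unknown variable expands to an empty string.

```python
import re
from urllib.parse import quote

# A variable is a name in braces; no operators, defaults or lists.
_VARIABLE = re.compile(r"\{([A-Za-z0-9_]+)\}")


def expand(template: str, values: dict) -> str:
    """Replace each {variable} with its percent-escaped value."""

    def replace(match):
        name = match.group(1)
        # Assumption: a missing variable becomes an empty string.
        value = str(values.get(name, ""))
        # safe="" escapes everything outside the unreserved set.
        return quote(value, safe="")

    return _VARIABLE.sub(replace, template)


by_path = expand(
    "http://example.com/cmis/object?path={path}",
    {"path": "/a b/c.txt"},
)
by_query = expand(
    "http://example.com/cmis/query?q={q}",
    {"q": "SELECT * FROM doc"},
)
```

With this subset a client needs nothing beyond a regular expression and a percent-encoder, which is much of its appeal compared with the full draft.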

Today is the third and last day of the meeting, mostly filled with interoperability tests and still more discussions about the spec. Stay tuned for more!

Category: Product & Development
Tagged: CMIS