Hi everyone, and welcome to this new tech report. Today we have some news about Nuxeo Drive and Content Automation.
The first release of Drive is out and we did a quick review of the main tasks and main limitations that should be addressed in the next few weeks.
Identified issues / tasks
Adapter system: documentation and convergence
The Drive server side model is built on top of a pluggable adapter system so that mapping between file system items and document types can be configured.
The Adapter implementation:
- holds the logic for getBlob, delete, etc.
- hides the logic of differences between real documents and roots
The Adapter mapping system provides:
- per-instance mapping, since binding is not limited to doc types
- a complex adapter class hierarchy
This model is not simple and not easy to contribute to, but we have already had similar issues with similar modules (WSS / WebDav, Importer, Publisher).
We cannot easily reduce the complexity needed to handle all use cases, but we should manage convergence between the Drive Adapter system and the other file system-based adapters we have for WSS and WebDav.
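To make the idea of a pluggable adapter mapping more concrete, here is a minimal sketch. It is not the Drive implementation (which is server-side Java): the class names Document, FileSystemItem, SyncRootItem and AdapterRegistry are hypothetical, and the predicate-based lookup is only one possible way to bind adapters per instance rather than per doc type.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Stand-in for a server-side document (hypothetical, for illustration)."""
    id: str
    doc_type: str
    is_sync_root: bool = False
    content: bytes = b""


class FileSystemItem:
    """Base adapter: exposes file-system style operations over a document."""

    def __init__(self, doc):
        self.doc = doc

    def get_blob(self):
        return self.doc.content

    def can_delete(self):
        return True


class SyncRootItem(FileSystemItem):
    """Adapter for a root: hides the differences with regular documents."""

    def can_delete(self):
        # Assumption: removing a root locally must not delete the document.
        return False


class AdapterRegistry:
    """Binds adapters through predicates, so mapping is not tied to doc types."""

    def __init__(self):
        self._bindings = []  # (predicate, adapter class); first match wins

    def register(self, predicate, adapter_cls):
        self._bindings.append((predicate, adapter_cls))

    def adapt(self, doc):
        for predicate, adapter_cls in self._bindings:
            if predicate(doc):
                return adapter_cls(doc)
        return None  # no adapter: the document is not exposed as a file


def build_registry():
    registry = AdapterRegistry()
    # Per-instance rule first, then the generic per-type rule.
    registry.register(lambda d: d.is_sync_root, SyncRootItem)
    registry.register(lambda d: d.doc_type == "File", FileSystemItem)
    return registry
```

With this kind of registry, callers only see get_blob or can_delete and never need to know whether they are dealing with a root or a regular document.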
Automation Operations: better JSON serialization
Drive uses a set of dedicated Operations. These operations do not return DocumentModel but the JSON serialization of the Adapter objects.
For now, the Operation:
- handles JSON serialization inside the code
- returns a Blob type
This makes sense for 5.6, but it does not fit the ongoing work on Automation in 5.7:
- serialization should be part of the JAX-RS/Automation marshaling infrastructure
- Operation signature should return the Adapter object and not a Blob.
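The separation described above can be sketched as follows. This is an illustration only, not the Automation API: FileItem, the MARSHALLERS registry and the marshaller decorator are hypothetical names, and the real marshaling infrastructure is Java/JAX-RS based.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FileItem:
    """Hypothetical adapter object returned by an operation."""
    id: str
    name: str
    folder: bool


# Marshaller registry: result type -> function producing a JSON-ready dict.
MARSHALLERS = {}


def marshaller(cls):
    """Register a serialization function for a given result type."""
    def decorator(func):
        MARSHALLERS[cls] = func
        return func
    return decorator


@marshaller(FileItem)
def file_item_to_dict(item):
    return asdict(item)


def get_file_item(item_id):
    """The 'operation': returns the adapter object, with no JSON code inside."""
    return FileItem(id=item_id, name="report.pdf", folder=False)


def marshal(result):
    """Framework side: pick the marshaller from the result type."""
    to_dict = MARSHALLERS[type(result)]
    return json.dumps(to_dict(result), sort_keys=True)
```

The operation stays free of serialization code, and a new output format would only require touching the marshaller.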
File system local mapping: handling multi-filing
The local database identifies resources using a compound id 'adapterFactory|repo|id'. This system can handle both DocumentModel-backed items and virtual resources. However, the multi-filing use case (the same file in several places) is not yet handled correctly.
This may seem like a low-priority use case, but it will actually be required to handle:
- virtual Folders defined by queries
- LiveEdit working directory integration (see later)
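As an illustration of the compound id and of what multi-filing implies for the local mapping, here is a small sketch. The 'adapterFactory|repo|id' format comes from the text above; ItemId, LocalIndex and the sample paths are hypothetical, and the sketch assumes that none of the three parts contains a '|' character.

```python
from collections import defaultdict
from typing import NamedTuple


class ItemId(NamedTuple):
    """Parsed form of the compound id 'adapterFactory|repo|id'."""
    adapter_factory: str
    repo: str
    doc_id: str

    @classmethod
    def parse(cls, raw):
        parts = raw.split("|")
        if len(parts) != 3:
            raise ValueError("expected 'adapterFactory|repo|id', got %r" % raw)
        return cls(*parts)

    def __str__(self):
        return "|".join(self)


class LocalIndex:
    """Maps one remote item to every local path where it appears."""

    def __init__(self):
        self._paths = defaultdict(set)

    def bind(self, raw_id, local_path):
        self._paths[ItemId.parse(raw_id)].add(local_path)

    def paths_for(self, raw_id):
        # .get avoids creating empty entries for unknown ids
        return sorted(self._paths.get(ItemId.parse(raw_id), ()))
```

The key point is that the index is one-to-many: a remote id resolves to a set of local paths instead of a single one, which is what virtual folders defined by queries would need.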
Transaction issue: NXP-10964
Drive revealed a transaction issue with Automation: a client may receive the result of an operation and start a new one before the first one is really committed. This problem is not specific to Drive and can show up for other Automation clients that make extensive use of the API. This means it should not be addressed in Drive, but directly in Automation, at least in 5.7 as part of the big changes in Automation.
Packaging and release issue: move clients archive to Marketplace package
In the current build system, the Drive JSF modules depend on the client installer builds to be able to embed them. This is an issue because it makes the release process complicated and forces Drive to stay out of the set of standard addons. We must move the client installer packaging inside the Marketplace Package so that we can break this dependency.
The file upload currently consumes a lot of memory because of the multipart format. File upload should be moved to the Automation Batch API: this should fix the memory issue without having to change the HTTPlib.
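The memory argument can be illustrated with a small sketch: instead of building a complete multipart body in memory, the file is read and pushed in fixed-size chunks. This does not show the Batch API itself (its endpoints and headers are deliberately left out); iter_chunks, upload and the send_chunk callback are hypothetical names standing in for whatever writes to the HTTP connection.

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive chunks so the whole file is never held in memory."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def upload(stream, send_chunk, chunk_size=64 * 1024):
    """Push the stream through send_chunk piece by piece; return bytes sent."""
    total = 0
    for chunk in iter_chunks(stream, chunk_size):
        send_chunk(chunk)  # assumption: writes the chunk to the open connection
        total += len(chunk)
    return total
```

Memory use is then bounded by chunk_size rather than by the size of the file.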
Client and update policy
For now the client setups are hosted directly inside the Nuxeo server. In the future, we may want to have the client setups hosted on something like drive.nuxeo.com or updates.nuxeo.com, pretty much as we used to do for the Firefox plugins.
The ideal update process should:
- be driven from the client side
- in most cases the Nuxeo server won't have access to the internet, whereas the client will
- fetch a descriptor about possible updates (XML or JSON file)
- take into account the Nuxeo Drive server-side version to determine the most up-to-date compatible client version
- download the update if needed
- we don't need a real auto-update, downloading and restarting the installer is ok
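The update check described above could look like the following sketch. The descriptor format is an assumption made for the example (a JSON list of entries with "client", "min_server" and "url" keys), as are the function names; nothing here reflects an existing update service.

```python
import json


def parse_version(text):
    """Turn '5.6' or '1.0.1' into a comparable tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def pick_update(descriptor_json, current_client, server_version):
    """Return the descriptor entry to download, or None if up to date.

    Assumed descriptor format: a JSON list of objects such as
    {"client": "1.0.1", "min_server": "5.6", "url": "..."}.
    """
    entries = json.loads(descriptor_json)
    server = parse_version(server_version)
    # Keep only the client versions that the server can work with.
    compatible = [e for e in entries
                  if parse_version(e["min_server"]) <= server]
    if not compatible:
        return None
    best = max(compatible, key=lambda e: parse_version(e["client"]))
    if parse_version(best["client"]) > parse_version(current_client):
        return best
    return None
```

The client would fetch the descriptor, call pick_update with its own version and the server version, then download and launch the installer found in the returned entry.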
The current client-side implementation does a full file system scan every 5 seconds to look for updates:
- this is not efficient when there are a lot of files
- this consumes CPU and IO (and battery) for nothing
This means that the Drive client quickly becomes a pain, so we must fix this soon. To optimize this, we should simply subscribe to file system events:
- wait for events in the local Drive subdirectories
- start the scan only after an update
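The event-driven approach can be sketched as follows. The source of events is left out on purpose: it is assumed that some watcher (an OS-level notification API or a third-party library) calls notify for each change. EventDrivenScanner and its methods are hypothetical names.

```python
import queue


class EventDrivenScanner:
    """Runs a scan only after file system events, instead of every N seconds."""

    def __init__(self, scan):
        self._scan = scan            # callable receiving the set of changed paths
        self._events = queue.Queue()

    def notify(self, path):
        """Called by the file system watcher for each change event."""
        self._events.put(path)

    def run_once(self, timeout=None):
        """Wait for an event, drain the burst, then scan once.

        Returns False if no event arrived before the timeout.
        """
        try:
            first = self._events.get(timeout=timeout)
        except queue.Empty:
            return False
        changed = {first}
        while True:
            try:
                changed.add(self._events.get_nowait())
            except queue.Empty:
                break
        self._scan(changed)
        return True
```

While nothing changes, the client just blocks on the queue and uses no CPU or IO; a burst of events (for example when a folder is copied) results in a single scan limited to the changed paths.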
Ben can probably give some help on that (NXP-9583).
In most companies, Nuxeo Drive will need to go through proxies to be able to reach the Nuxeo server. While simple proxy configuration is trivial, this may be more complicated for people using .PAC files.
We had such issues with LiveEdit. Possible leads may be:
- use native Win32 API?
- leverage some QT Win32 bindings?
The work is moving forward and will be restarted early next week:
- with Damien on REST
- with Vlad on Marshaling
Florent committed initial support for Solaris in 5.7 this week.
For now, the fix is "just about" the Process Manager. If we really want to support Solaris, we should:
- check other external dependencies under Solaris
- OpenOffice + JOD, pdf2html, ImageMagick, ffmpeg...
- add Solaris slaves on CI
Basically, this is doable, but there is a cost. Before going further we should wait for input from the community side: is Solaris a real requirement?
Ben restarted the work on VCS Caching to realign the alternative cache implementation on 5.7. It now works with EhCache with several configuration options:
- Single cache (no MVCC like before)
- XA mode (MVCC done by EhCache)
- XA mode + Ram Disk cache
We ran some benchmarks, but so far the performance with EhCache is not really better:
- EhCache has by itself an overhead (about 10x slower than pure SoftRef)
- Disk storage is 100x slower than SoftRef (Serialization cost)
- XA mode is about 100x slower than SoftRef
- XA mode and Disk storage is 120x slower than SoftRef
In addition, for now we have worked at the RowMapper level, the pristine cache that is managed by the
- a kind of cache
- something that uses lot of memory and slows down the GC
As long as we have this pristine cache using weak references, we won't be able to see the real benefit of the EhCache integration. It means that we should also move the pristine cache to EhCache, which also means that we must review the invalidation system. This could become very handy with clusters.
Pluggable Login Page
This is a work in progress that was started in the context of the OpenId contribution. In addition, Lise has support requests asking how the Login Page could be better skinned. This shows that we should finish the work and do the required Studio integration.
In the context of the OpenId work, the Login Page is now configurable via an Extension point. (cf NXP-10918)
We must still:
- make the new login.jsp the default in Nuxeo
- align Studio configuration on that
Login Page and Studio config
As already identified, we should align Studio on the new Login.jsf screen:
- remove the studio specific login.jsp
- create a new Builder
In addition, we may want to
- allow configuring the iFrame URL/source in Studio
- make CSS configurable: add a custom CSS just for the login page
JSF and JS Widget: the select2 example
The tricky part is of course to align state management so that the widgets behave correctly:
- in case of validation errors
- in case of ajax requests
- inside lists
Thanks to some wizardry of Anahide on nxu:ValueHolder (NXP-11533), we now have a simple integration sample for building hybrid JSF/JS widgets without having to write JSF components.
For now the result is visible in nuxeo-select2-integration. This should probably end up merged inside default Nuxeo, maybe along with other JS-based widgets!
Connect / Studio
Connect 1.16 / Studio 2.11
This next release will include some changes:
- Nuxeo Online Services front end UI
- AngularJS + Automation to provide better screens and more features for our clients
- New Studio Look (more CSS wizardry by Lise)
- DAM features.