Here's the July tech report. Let me warn you that this one is quite dense!


Infrastructure Improvements


Nuxeo Runtime


IDE and Deployment Model


The plan (NXROADMAP-129) was to implement a whitelist/blacklist system. However, this no longer seems to be needed: we have found other ways to achieve the same goal.

This work includes:

  • IDE bundles always override server bundles

  • Make deployer deploy bundles defined as user lib in the IDE

  • Make user libs/bundles part of the static class loader

  • Will fix the initial deployment


Stephane will update NXROADMAP-129 accordingly. As a side note, for now we still have issues with the reload of some resources like JSF resources.

Even if we know there are solutions, it is not worth spending too much time on this: if a server restart fixes the issue, this is something that can be done when adding a new navigation case for example.

deployment-fragments


The cleanup of deployment fragments is part of the Tomcat 7 work: NXROADMAP-37.

This issue will be addressed after the Tomcat 7 alignment, hopefully in September.

Optional Contribution/Require


We sometimes need to make dependencies optional. Typically, nuxeo-poll depends on preview (which is in DM) in order to disable the preview, but it should also work on CAP (with the needed Nuxeo jars, but without the preview).

To solve this, we would need to make some requirements optional so that unresolved dependencies may not always trigger errors and break the bundle deployment.
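
As a rough illustration of the idea, here is a minimal sketch of a resolver that treats optional requirements differently from mandatory ones. It is not the Nuxeo Runtime code: the `Requirement`, `Bundle` and `resolve` names, the `optional` flag and the bundle names used in the example are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """A dependency on another bundle; `optional` is the hypothetical new flag."""
    bundle: str
    optional: bool = False


@dataclass
class Bundle:
    name: str
    requires: list = field(default_factory=list)


def resolve(bundle, available):
    """Return (deployable, missing_optional) for `bundle`.

    A missing mandatory requirement blocks deployment; a missing optional
    one is only reported, so the bundle can adapt at runtime.
    """
    missing_required = [r.bundle for r in bundle.requires
                        if r.bundle not in available and not r.optional]
    missing_optional = [r.bundle for r in bundle.requires
                        if r.bundle not in available and r.optional]
    return (not missing_required, missing_optional)


# nuxeo-poll on a CAP-like setup: core jars are present, preview is not.
# Bundle names are illustrative labels, not real symbolic names.
poll = Bundle("nuxeo-poll", [Requirement("nuxeo-core"),
                             Requirement("nuxeo-preview", optional=True)])
deployable, skipped = resolve(poll, available={"nuxeo-core"})
```

With this model, nuxeo-poll would still deploy when preview is absent, and the list of skipped requirements could be logged as a warning instead of raising an error.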

Seam/JSF Improvements


RichFaces


Guillaume rebuilt a clean version of RichFaces including:

  • Our patches

  • All official patches we found


Seam/View States


Guillaume worked on NXP-11967 and NXP-11331 and thanks to that we now have:

  • An automatic Conversation creation when the user opens a new Tab/Window

    • The 'previous' link has been removed



  • A state manager that correctly handles using Nuxeo in multiple tabs, without running into the "can not restore view" issue


This work has been merged and validated by tests: this is a major improvement!

More Improvements to Come


We would like to improve our Seam layer to have:

  • Better error handling when conversation locking issues occur

    • At least know what are the concurrent requests

    • If we need to recreate a temporary conversation: create one that can work with Nuxeo!



  • Provide Nuxeo Runtime Service injection

    • This is easy and nice



  • Review existing Seam beans to make them lighter

    • Use Page rather than Conversation scopes

    • Reduce invalidation requirements

    • Reduce memory footprint




Another improvement we may want to make is to provide a simple way to integrate JS/HTML widgets with the existing Widget/Layout system. Most people find it complicated to build JSF components and prefer spending time debugging HTML/JS. We should help them do that if we have a clean solution.

The global approach is:

  • Use a simple h:inputText to create the component in the JSF tree

  • Use JSON serialization to init/reset the JS widget from the model

  • Use Automation RPC to do the processing

  • Use valueHolder to manage transient state in JSF tree


A first attempt was made with nuxeo-select2-integration and the associated task https://jira.nuxeo.com/browse/NXP-12018

Tomcat 7


The task for aligning on Tomcat 7 is still open: NXP-10071. It was first reverted because of some broken tests, and since then some overlapping work has been done on the datasources system (to support non-XA mode). Julien (probably with some help from Stephane) will try to realign and merge the branch.

The goal is to have this ready for 5.7.2!

Monitoring Integration


Since 5.7.1 we have an initial integration for Metrics and Graphite. However we still have some remaining work to do.

Cleanup and Finish the Work


There are still a lot of small tasks to be completed:

  • Upgrade Metrics version NXP-11995

    • Looks like we are using the version that was not properly tagged in GitHub!



  • Better expose the Geronimo pool

  • Rewrap application level probes NXP-11162

  • Cleanup metrics NXP-11161

  • Update documentation NXP-11164


JMX Integration


The data collected by Metrics is available via JMX. However, JMX:

  • Is not user friendly

  • Is not easy to tunnel/map as a protocol


As a result, it is a pain to explain how to get this information. This means we need an additional layer to make it easy to collect the stats. Stephane started work on integrating a helper that includes:

  • A web server to give access to JMX via web pages

  • Protocol adapters for JMX over http/https


We should (NXP-11997):

  • Deploy it by default

  • Only enable it via a nuxeo.conf property

  • Reuse server key for authentication


Ideally, we should have all counters/probes exposed via JAX-RS.

Global Monitoring Solution


Mathieu is working on providing a global monitoring solution. So, in addition to Metrics + Graphite + Diamond, the current solution includes:

  • LogStash

    • Fetch the logs

    • Clone the Metrics flow



  • Riemann

    • Manage alerts (based on rules written in Clojure)



  • Elastic Search

    • Index the logs



  • Kibana

    • Front end for Elastic Search




Optimizations and Misc Issues


Storage and Files


File Upload


The current file upload widget relies on RichFaces. This implementation has some drawbacks:

  • It relies on JSF/Seam

    • It maintains access to Seam conversation: this can create lock timeouts in Seam

    • Upload is done in the context of a Tx



  • Upload progress is managed server side

    • Several client/server round trips

    • Progress is wrong when Nuxeo is behind a buffering reverse proxy like nginx




This is an opportunity to start a test implementation for a replacement of the fileupload action:

  • Just replacing the file import popup dialog is simpler than addressing the global Widget use case


The idea is to align with the way the import is currently done via Drag&Drop:

  • Use Automation Batch API for upload

  • Run an Operation to do the import


However, the upload part is currently done using a tweaked version of the jquery-upload plugin so that we don't have a hard dependency on the JS File API. A prototype was done for 5.6 and is available here: nuxeo-jqueryfileupload-integration
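
To make the two steps concrete (upload through the batch API, then run an operation on the batch), here is a small sketch that only builds the HTTP request descriptions and sends nothing. The endpoint paths, the `X-Batch-Id` / `X-File-Idx` / `X-File-Name` headers, the `application/json+nxrequest` content type and the `FileManager.Import` operation id reflect our understanding of the 5.x batch upload protocol and should be treated as assumptions to check against the server version in use. The helper names and the example URL are hypothetical.

```python
import json


def batch_upload_request(base_url, batch_id, file_idx, file_name, data):
    """Describe the request that pushes one file into a batch (step 1)."""
    return {
        "method": "POST",
        "url": base_url + "/site/automation/batch/upload",
        "headers": {
            "X-Batch-Id": batch_id,          # groups the files of one import
            "X-File-Idx": str(file_idx),     # position of the file in the batch
            "X-File-Name": file_name,
        },
        "body": data,
    }


def batch_execute_request(base_url, batch_id, operation_id, context=None):
    """Describe the request that runs an operation on the batch (step 2)."""
    payload = {
        "params": {"operationId": operation_id, "batchId": batch_id},
        "context": dict(context or {}),
    }
    return {
        "method": "POST",
        "url": base_url + "/site/automation/batch/execute",
        "headers": {"Content-Type": "application/json+nxrequest"},
        "body": json.dumps(payload),
    }


base = "http://localhost:8080/nuxeo"         # hypothetical server URL
upload = batch_upload_request(base, "batch-1", 0, "report.pdf", b"%PDF-...")
execute = batch_execute_request(base, "batch-1", "FileManager.Import",
                                context={"currentDocument": "/default-domain"})
```

Because the upload happens outside JSF/Seam, no conversation lock is held while the bytes are transferred; only the final operation call touches the repository.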

File Copies


We have improved the way big blobs are managed to relieve the server I/O. Florent did some optimizations to avoid local copies as far as possible: see NXP-11689

S3 Binary Manager


The S3 Binary Manager can easily become the bottleneck when dealing with a lot of file accesses and/or big files. We saw in a recent benchmark that S3 can indeed become a limiting factor, especially when the network configuration is not optimal (Nuxeo and S3 being in separate networks). To make it really scale, we need some kind of asynchronous S3 Binary Manager:

  • Manage staging on a local FS

  • Send to S3 asynchronously


The implementation is not as simple as it may seem (Tiry did a very naive implementation that does not work!), since it does require some infrastructure:

  • Distributed locking system

  • Persistent Job system


For this we decided to use Redis (that Ben and Florent already tested): this may end up being the first Nuxeo component with a direct dependency on Redis. See NXP-11731
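
The staging idea can be sketched as follows. This is a deliberately simplified, single-process illustration and not the planned implementation: the `StagedBinaryStore` class is hypothetical, a plain dict stands in for S3, and the in-memory queue is neither persistent nor distributed, which is exactly the part that the Redis-based locking and job system would have to provide.

```python
import hashlib
import os
import queue
import tempfile
import threading


class StagedBinaryStore:
    """Write binaries to a local staging directory, push them remotely later."""

    def __init__(self, staging_dir, remote):
        self.staging_dir = staging_dir
        self.remote = remote              # stand-in for S3: any dict-like object
        self.pending = queue.Queue()      # in-memory only: lost on a crash
        threading.Thread(target=self._upload_loop, daemon=True).start()

    def put(self, data):
        """Stage `data` locally and return its digest without waiting for S3."""
        digest = hashlib.sha256(data).hexdigest()
        with open(os.path.join(self.staging_dir, digest), "wb") as f:
            f.write(data)
        self.pending.put(digest)
        return digest

    def get(self, digest):
        """Serve from the staging area first, then from the remote store."""
        try:
            with open(os.path.join(self.staging_dir, digest), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return self.remote[digest]

    def _upload_loop(self):
        while True:
            digest = self.pending.get()
            path = os.path.join(self.staging_dir, digest)
            with open(path, "rb") as f:
                self.remote[digest] = f.read()   # "upload" before removing
            os.remove(path)                      # staged copy no longer needed
            self.pending.task_done()

    def flush(self):
        """Block until every staged binary has been pushed."""
        self.pending.join()


remote = {}
store = StagedBinaryStore(tempfile.mkdtemp(), remote)
key = store.put(b"hello")
content = store.get(key)      # available immediately, uploaded or not
store.flush()
```

The sketch shows why the naive version is not enough: if the process dies before the queue is drained, or if two nodes stage the same binary, nothing guarantees that the remote copy ever exists.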

Async Processing


Worker and Blocking Queue


We know we need to improve the infrastructure of the WorkManager, especially in the context of a cluster infrastructure:

  • Better manage synchronization between nodes

    • Manage exclusion



  • Avoid running concurrent jobs hitting the same document

    • Avoid dirty updates and deadlock



  • Make jobs persistent
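
To illustrate the "no concurrent jobs on the same document" requirement, here is a minimal single-JVM-style sketch using one lock per document id. The `DocumentLocks` class and the document ids are hypothetical; in a cluster, the in-process locks below would have to be replaced by a distributed locking system.

```python
import threading
from collections import defaultdict


class DocumentLocks:
    """One lock per document id, so jobs on the same document run one at a time."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    def for_document(self, doc_id):
        with self._guard:                 # protect the dictionary itself
            return self._locks[doc_id]


locks = DocumentLocks()
versions = {"doc-1": 0}


def job(doc_id):
    """Read-modify-write on a document, serialized per document id."""
    with locks.for_document(doc_id):
        current = versions[doc_id]
        versions[doc_id] = current + 1


threads = [threading.Thread(target=job, args=("doc-1",)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Jobs working on different documents get different locks and can still run in parallel; only jobs targeting the same document are serialized.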


The work on S3 Async Binary Manager may be the opportunity to bootstrap the infrastructure for that. Once we have that, Redis may become a requirement for:

  • Cluster architecture

  • Scalability of jobs

  • Safety of jobs


This may end up being a 5.8 feature!

Long Running Processing


The Problem

We (should) all know that long running transactions are bad:

  • Because they create more contention on resources like the database

  • Because they end up generating nasty issues like

    • Deadlock

    • Dirty updates

    • Transactions timeouts




The problem is all the more important when:

  • Database sucks

    • Here MS SQL Server does a very good job at being sucky



  • Processes have bad granularity

    • Typically processes that directly grow with number of users or number of documents



  • Processes use slow web services

  • Processes manage big files


The typical examples are:

  • Async listeners that manage big file processing

  • Async listeners that call WebServices

  • Big Automation Chains that include a lot of processing in a big loop


Solving the Issue

Sadly, there is no magic solution. The only way to avoid deadlocks is to shorten the transactions. So when shortening the process itself is not an option, we have to split the process into smaller transactions.

So far, we have 2 patterns:

  • Read/Process/Save: split the global process in 3 steps

    • Read required data inside a Tx

    • Run the long processing on disconnected objects (outside of TX)

    • Save back result in the repository inside a TX



  • Create a new TX for each iteration inside a big loop
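
The Read/Process/Save pattern can be sketched as follows. This is a toy model, not Nuxeo code: the `Repository` class is an in-memory stand-in with a fake transaction marker, and `slow_scan` and `scan_document` are hypothetical names. The point is only that the slow step runs while no transaction is open.

```python
from contextlib import contextmanager


class Repository:
    """In-memory stand-in for a transactional document store."""

    def __init__(self, docs):
        self.docs = dict(docs)
        self.open_tx = False
        self.tx_count = 0

    @contextmanager
    def transaction(self):
        self.open_tx = True
        self.tx_count += 1
        try:
            yield self
        finally:
            self.open_tx = False


def slow_scan(content, repo):
    """Long processing on detached data; must run with no transaction open."""
    assert not repo.open_tx
    return "infected" if "virus" in content else "clean"


def scan_document(repo, doc_id):
    # 1. Read: copy what is needed inside a short transaction.
    with repo.transaction():
        content = repo.docs[doc_id]["content"]
    # 2. Process: work on the detached copy, outside any transaction.
    status = slow_scan(content, repo)
    # 3. Save: write the result back inside a second short transaction.
    with repo.transaction():
        repo.docs[doc_id]["scan_status"] = status
    return status


repo = Repository({"doc-1": {"content": "plain text"}})
result = scan_document(repo, "doc-1")
```

One thing the sketch leaves out: the document may change between the read and the save, so a real implementation would probably need to check that the result still applies before writing it back.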


The tricky part is that we still have a low level problem where Session can not be correctly re-associated to the TX in some cases. See NXP-11681.

Asynchronous Listeners

A new base class for async listener was started using the read/process/save pattern to provide an example for a VirusScanner.

See nuxeo-virusscaner-sample and more specifically the AbstractLongRunningListener.java

The associated Jira task is NXP-11721, and as soon as NXP-11681 is fixed we will add this new base class inside Nuxeo.

Worker

Worker could also provide a base class that handles the read/process/save pattern. We could extract a base class from it and at least we should verify that the picture conversion in 5.7 uses a similar pattern to avoid long running transactions.

See nuxeo-imaging-async and more specifically PictureViewsComputerWorker.java

Task NXP-11975 was created for that.

Automation

In the context of Automation, we have an operation that runs another operation in a loop. This is a good candidate for generating long running transactions, so it is also a good candidate for testing the multi-tx solution. For that we created an initial implementation of RunOperationOnListInNewTransaction, see NXP-11885.
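
The "one transaction per iteration" idea can be illustrated with the toy model below. It is not the code of RunOperationOnListInNewTransaction: the `Transaction` class, which buffers writes and applies them on commit, and the `run_on_list_in_new_tx` and `tag` functions are hypothetical.

```python
class Transaction:
    """Buffer writes and apply them to the store only on commit."""

    def __init__(self, store):
        self.store = store
        self.writes = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.store.update(self.writes)   # commit
        return False                         # on error: buffered writes are dropped


def run_on_list_in_new_tx(store, items, operation):
    """Run `operation` on each item in its own short transaction."""
    failed = []
    for item in items:
        try:
            with Transaction(store) as tx:
                operation(tx, item)
        except Exception:
            failed.append(item)              # only this iteration is rolled back
    return failed


def tag(tx, doc_id):
    tx.writes[doc_id] = "tagged"
    if doc_id == "doc-2":
        raise ValueError("simulated failure")


store = {}
failed = run_on_list_in_new_tx(store, ["doc-1", "doc-2", "doc-3"], tag)
```

Compared with a single big transaction around the loop, a failure on one item no longer rolls back the work done on the others, and each transaction stays short.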

Misc Issues Reported via Support


Metrics Bad Impact on VCS Caching


It looks like the VCS cache is slowed down by Metrics timers. Stephane and Ben are investigating this point (NXP-11925).

Scalability Issue on FullText


As reported in NXP-11897, full text does not seem to scale well because of some migration issues on TS_VECTOR.

VCS and Dates


Most databases, even SQL Server starting with version 2005, allow us to store the time zone with datetime. However, for now, Nuxeo does not properly store the time zone with the date we save in the Database.

This means that, thanks to hacks, everything goes relatively well as long as we have the Nuxeo nodes and the database servers inside the same time zone. Fixing this won't be very complex, but will need some tooling to manage migration.
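
Here is a simplified model of why this only works when everything lives in the same time zone. It does not reproduce the VCS code: fixed UTC offsets stand in for real time zones, and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

paris = timezone(timedelta(hours=2))       # fixed offsets, for illustration only
new_york = timezone(timedelta(hours=-4))

saved_at = datetime(2013, 7, 15, 10, 0, tzinfo=paris)

# Storing only the wall-clock value loses the offset.
stored_naive = saved_at.replace(tzinfo=None)

# A node in another time zone reading it back assumes its own offset.
read_elsewhere = stored_naive.replace(tzinfo=new_york)
drift = read_elsewhere - saved_at

# Storing the offset (or normalizing to UTC) keeps the instant intact.
stored_utc = saved_at.astimezone(timezone.utc)
read_back = stored_utc.astimezone(new_york)
```

In this model the naive round trip shifts the instant by the difference between the two offsets, while the UTC round trip returns the same instant, simply displayed in the reader's zone. A migration tool would have to decide which offset the existing naive values were written with.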

ACL Missing Prefixes


From the very beginning, we have neglected to use prefixes inside the ACL to differentiate users and groups. This has led to several problems:


  • Having users and groups with the same name creates issues!

  • We have some services that use prefix and some that don't


This can end up generating very strange issues:

  • The Tasks API expects prefixed users/groups for assignees

  • But it can be called with a simple string extracted from an ACL

  • Task assignments are not correct!


A good fix would be:

  • Upgrade ACL

  • Provide a migration script

  • Force to use prefix everywhere
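
As a sketch of what the migration step might look like, the code below prefixes plain ACL entries using the known user and group names and sets aside the entries it cannot decide on. The `user:` and `group:` prefix values, the `(name, permission)` ACL representation and the function names are assumptions made for the example.

```python
USER_PREFIX = "user:"      # assumed prefix values
GROUP_PREFIX = "group:"


def prefix_entry(name, users, groups):
    """Return `name` with an explicit user/group prefix, or None if unclear."""
    if name.startswith((USER_PREFIX, GROUP_PREFIX)):
        return name                        # already migrated
    if name in users and name not in groups:
        return USER_PREFIX + name
    if name in groups and name not in users:
        return GROUP_PREFIX + name
    return None                            # unknown or ambiguous: needs a human


def migrate_acl(acl, users, groups):
    """Split an ACL into migrated entries and names that could not be resolved."""
    migrated, unresolved = [], []
    for name, permission in acl:
        prefixed = prefix_entry(name, users, groups)
        if prefixed is None:
            unresolved.append(name)
        else:
            migrated.append((prefixed, permission))
    return migrated, unresolved


users = {"alice", "support"}
groups = {"members", "support"}            # "support" exists as both
acl = [("alice", "Read"), ("members", "Write"),
       ("support", "Read"), ("user:bob", "Read")]
migrated, unresolved = migrate_acl(acl, users, groups)
```

The "support" entry shows the first problem listed above: when a user and a group share a name, the existing ACL cannot be migrated automatically and has to be reviewed.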


Automation


Client Side


Pseudo TCK


We have started a new documentation page that aims at defining some kind of TCK and registering all known Automation clients and their level of compliance. For now the TCK is nothing more than a suite of simple tests described in English, JS and Java.

This page is just a draft for now, but we'll be working on this.

Vlad will externalize the tests resources that are, for now, in automation-test so that we can share a simple Nuxeo plugin that only contributes more complex schemas than what we have in default Nuxeo: this is used in the TCK.

Java Automation Client and Code Sharing


The Java Automation Client is evolving at the same time as the server side. But it becomes more and more obvious that there are some classes that should be shared by the server and the client:

  • Blob classes

  • Properties classes

  • DocumentRef classes


We also have some "almost duplicated" classes for the marshaling code. Moreover, the fact that we now support the concept of business adapter will push users to use the same class on the client and server side.

This task will not be easy to do without breaking anything on the client side API.

NXP-12010 was created to track this for now.

JS Automation Lib


Inside Nuxeo we have an automation.js that contains primitives for doing automation calls. This script is mainly used by the Drag&Drop and some extensions. However, the code has never been cleaned up or documented. To address that, a new sandbox nuxeo-automation-jsshell was created.

The goal is to prepare in one module:

  • A better automation.js lib

    • Improved namespace

    • Improved API



  • The tests for the TCK

  • A JS based Shell that could replace the applet based shell


The current code base is a mix of:

  • The original automation.js

  • The uploader from the Drag and Drop script

  • Some new code


The first version includes tests done using qunitjs. Thomas forked the repository in troger/nuxeo-automation-jsshell to continue the work:

  • Improved namespacing and API cleanup

  • Switch tests to mocha and chai


For now, we have decided to keep the callback model (rather than switching to promises). Umbrella task is NXP-11860.

The target is to have everything working with Node.js, but this will require some additional work since some important APIs (Blob, File, XHR) are not completely available in Node.js. We'll also work on providing Bower and Node descriptors.

Python Automation Client


We will need to work on extracting it from the Drive client.

Dart Automation Lib


Nelson is moving forward on the Dart Automation Client. Hopefully, we will also have a playground in Dart + WebUI.

See NXP-11867

.Net Automation Client


Astone Solutions is moving forward on implementing the .Net client, and they are switching the Outlook extension to use it.

Server side


Marshaling


We have a lot of code that handles JSON marshaling for Nuxeo objects:

  • Inside Automation Server

  • Inside Automation Client


These classes are useful everywhere: so we will factorize the code in nuxeo-core-io.

See NXP-11949

REST


Damien has started the work on making the Automation API HATEOAS-compliant. This new module will be based on WebEngine:

  • Using WebAdapter model

  • Leveraging the shared JSON marshaling


Business Adapters


Vlad (with the help of Stephane) added support for business adapters in the Automation API. Damien tested it and made some adjustments to be able to reuse the same adapter system inside the REST bindings.

Vlad added some documentation with an example in Confluence.

For now, the system handles only one type of Business Adapter at a time. This means you can not have a heterogeneous list as input/output, unless you create a dedicated Business Adapter to wrap the list.

We'll see later if we have real use cases where multi-part support would help...

Operation versus Chain Alignment


Vlad (with the help of Stephane) worked on aligning the models for Operations and Chains.

The idea is to:

  • Be able to call Chains or Operations the same way

  • Define documentation and parameters for Chains too


The final target will be to be able to bind forms to Automation Chains (ask Alain about this!).

Still to be added to the Chain:

  • Add input/output definition (infer from first and last operations)

  • Add label + description in the XMap descriptor

  • Add MVEL parsing at the right place (i.e. not in the descriptor)
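
Inferring a chain's input/output from its first and last operations could look like the sketch below. The `Operation` dataclass, the type names and the strict "output must equal next input" check are simplifications made for the example; the real Automation type handling is richer than this.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    id: str
    input: str
    output: str


def chain_signature(operations):
    """Infer (input, output) of a chain from its first and last operations."""
    if not operations:
        raise ValueError("a chain needs at least one operation")
    # Simplified consistency check between consecutive operations.
    for prev, nxt in zip(operations, operations[1:]):
        if prev.output != nxt.input:
            raise ValueError(prev.id + " output does not match " + nxt.id + " input")
    return operations[0].input, operations[-1].output


# Operation ids and types below are illustrative.
chain = [
    Operation("Document.Fetch", input="void", output="document"),
    Operation("Document.GetChildren", input="document", output="documents"),
]
signature = chain_signature(chain)
```

Once a chain exposes a signature like this, it can be documented and called the same way as a single operation.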


Impact on existing API:

  • Update the Flow operations (like RunChains...): done

  • Update the DnD import chains to leverage new features: to be done


Studio impacts to plan:

  • Add Chain Category

  • Add parameter definition in chain builder

  • Add doc edit in chain builder


Automation Documentation


The default site/automation/doc built-in documentation system has been updated to include the chains.

We still need to see with Lise if we can quickly make it nicer (it should not be a challenge to make it look better):

  • Write a CSS that does not look like it is from the 90's

  • Provide nicer navigation (now we have a lot of operations/chains)

  • Provide better display of JSON definitions


See NXP-12015.

Exception Management


Exception management in Automation will be addressed. As a first step, we'll focus on being able to get a proper exception call stack; we'll look at the control flow next.

UI Framework


Action System


We've added a type notion and a properties notion to Actions. This work is ongoing but won't be completed before 5.7.2.

Documentation


Action types and related properties are not documented yet, but this would be easier by using widget types to render actions of a given type.

Related JIRA task: NXDOC-188

Action Widget Types


Making actions use widget types for rendering would:

  • Improve pluggability: no need to override a nuxeo default template to add a custom action type

  • Ease up documentation: just need to integrate action widgets types to a layout showcase-like website

  • Ease up integration in Studio: action types configuration screens would be generated from widget types configuration

  • Allow defining complex filterable UI elements like the widget type displaying tabs (as it needs a notion of mode to handle the content of the tab, and this notion is already handled by widget types)


Work has already been started in a branch.

Involved JIRA tasks are: NXP-11142, NXP-9673, NXP-9384

Action Filters Evaluation Context


Action filters currently rely on a restricted context. The plan is to:

  • Make them rely on a context provided by the action. This will make actions more generic and reusable in UI contexts other than Seam/JSF.

  • Make the current document used in action filters pluggable so that widget types displaying actions do not have to be specific to it (e.g. the "currentDocumentActions" widget type)


This will ease up usability. Users will be able to handle standard JSF expressions in action filters. This will also make it possible to define generic widget types.

Other features that this would unblock for the future (but not for now):

  • Possibility to add action widget types in layout listing rows.

  • Possibility to use actions displaying tabs outside of a current document context (e.g. admin center)

  • Possibility to extend the action context, to provide current selected elements for instance (useful when adding action selection widget types)


Target task is NXP-10566.

Widgets


Documentation


A notion of controls has been added to widgets. For now, these controls mostly handle:

  • handlingLabels

  • addSurroundingForm

  • useAjaxForm


These controls are usually handled by default layout templates and default widget types accepting subwidgets.

NXDOC-185

Widget Type Definitions


The widget type definition has already been improved, but we need to continue to extend it so that we can handle more use cases in a cleaner way. The requirements come from the fact that we need to better handle:

  • What field types can be bound to a widget

  • If widget has a hard coded title or not

  • What are the default widget properties


We also need to be able to filter/present them better in Studio and improve their management.

Validation "properties"


For DAM we have requirements for adding some validation, especially on the file attributes. See NXP-11940

However, we should not try to handle this kind of validation server side: as much as possible, everything should be managed on the client side. At least on recent browsers this won't be an issue, and for file uploads, doing server side validation is really a pain.

Products


Nuxeo Drive


Drive, Extensions and Profiles


We added some pluggable Java factories so that users can have a different layout for the roots. For now, pluggability is only available at the Java factory level.

We should add the concept of profiles:

  • So that we can activate/deactivate profiles

  • So that tests can cover several layouts


Antoine will update Jira accordingly.

PyQt vs PySide


Testers have experienced stability issues with the Qt bindings. After doing some digging, we found out that:

  • There are known bugs in PySide, the Python Qt binding we use

  • The PySide project is almost dead


For testing purposes we created a branch using PyQt, the original Python Qt binding. This works and seems to fix the stability issue that testers reported.

Drive and Automation


For now the Python Automation is directly part of Nuxeo Drive. With the license switch, ideally, we would like to extract the Automation Client so that it can remain LGPL.

Upload and Download


Upload and Download of big files should now be fixed. Drive client now uses the 'automation batch upload api'.

CI Issues


Antoine took some time to fix the build and tests:

  • fixed some encoding on Windows

  • fixed 5.7 issues with thumbs (double events in audit)

  • scalability issues on recursive adapter (may explain timeout on windows servers)


The scalability issue is probably just a side effect of the test that creates a 20-level-deep hierarchy.

Bench


Ben is currently doing performance tests on Drive.

Next Steps



  • Add support for http proxies

  • Improve scan system

  • Improve Drive hosting/deployment policy

  • Liveedit integration


DAM


Studio Integration


We have integrated some new widgets (player, picture view, video info). Complete configuration will require finishing the work on incremental widgets.

The integration of BoxListing in DM is also in progress.

Misc issues


ImportRoot

ImportRoot doc type was renamed AssetLibrary for consistency, but we should keep ImportRoot type for compatibility.

  • The new doctype is AssetLibrary to make it consistent and understandable in Studio

  • We keep backward compat on ImportRoot, but there are no subtypes for ImportRoot (for now)


After having the issue on the Intranet, we created NXP-11853. For now, we'll keep it like that, at least for the next FastTrack, even if this means adding the same subtypes to ImportRoot as AssetLibrary.

However, in the scope of 5.8, we should be able to handle automatic migration at SQL level.

Allowed Asset Types

Inside DAM import dialog, we need to filter available types based on the container. We cannot rely on default allowedTypes since there may be DM or SC types that we don't want to be available from DAM UI.

For now, we use a dedicated DAM allowedTypes:

  • This works

  • This is already integrated inside Studio


This is not very clean, and it probably means we should have better flexibility in the allowedTypes definitions, but we'll keep it like that for now...

NB: the real solution would be to extend the configuration of allowedTypes in the TypeManager

Import and DnD

The import system has been aligned on the Automation/DnD system of DM. This is still a transitional solution since:

  • We'll need to align everything on the automation.js changes

  • We want to add more features to the import/dnd (like picture previews via Canvas)


Bulk Tagging

We'll try to add support for:

  • Bulk tagging

  • Defining tags at import time (integrated with DnD)


Modernizr and HTML5 Features

For now, we are inferring browser features from the browser user agent using a server side helper. This works in most cases, but in some cases:

  • We need the info on the client side without having to call the server

  • We need very precise feature detection, such as checking for the history.push support that is used in REST linking


Modernizr does provide:

  • Clean html5 feature detection

  • Help to manage browser compatibility issues


Imaging Improvements


We know that we need to do some cleanup in Imaging components. High level list of tasks is:

  • Rework PictureBook type

    • Align on ContentView

    • Align on BoxListing

    • Rethink UI and slideshow



  • Cleanup adapters

    • Adapters should hold less automatic logic

    • Logic should be forwarded to service



  • Adapt the picture view generation to multi-tx mode

  • Fix meta-data extraction:

    • Need to fix field size overflow

    • Dependency issues between meta-data/views (error on meta => rollback!)

    • Remove mistral for real

    • Integrate Laurent's work



  • Merge Nelson Silva's pull request (NXP-11582)

  • Make Picture views configurable.


Social Collaboration


We know we need to improve the Social Collab feature, but we must be careful not to break everything.

DM and Social Collab integration


We want to add some of the Social Collab features to a 'standard workspace':

  • Activity stream

  • Wall


This work could also include adding the rich profile in DM. Thomas initially started this work in a branch: at some point we should resume this work and do the merge. This should end up creating two separate Marketplace packages:

  • One that adds basic social features to a Nuxeo DM

  • One that includes the collaboration model of the current Social Collab module


Local Groups


Local Groups are one of the cool features that come with Social Collab, and Presales often use them inside DM. Ideally, Benjamin should package this work in an addon and submit it.

Ideas on Product Evolutions


About Gadgets


Alain would like to replace/improve the OpenSocial Gadget system. However, this is not really the time to go back to plain server side rendering for dashboard: this would be anachronistic!

What we could do is:

  • Improve the JS container to also support Non OpenSocial Gadgets

  • Provide a way to make Non OpenSocial Gadgets using Automation + Angular


UI/UX


JS/Html Widgets

We are likely to have more and more Html/JS Widgets. We have already started doing work on jQueryUpload and Select2. This is very likely to be continued.

CI and Build


Maven 3 upgrade


We need to start this upgrade ASAP, and for this we must fix the distribution issue. Julien checked whether there are quick fixes to align our existing nuxeo-distribution Maven plugin: it looks like there are none.

Before starting the work on building a brand new Maven 3 plugin, we should be sure that there is no good alternative, such as:

  • Use gradle to mix maven dependencies with procedural commands

  • Fix an ant/maven plugin that does work


If we really have no other valid option than a rewrite, we should:

  • Create the Git Repo

  • Update the ReadMe to explain what we want to do

  • Start some initial code (at least some tests)

  • Push on Maven3 Mailing list to see if we can have more players