If you follow this blog, you’ll remember I’ve been writing a lot about file import recently, and I have a new trick for you on this matter. When I created the import factory, I took an existing Nuxeo XML export as an example. But this could also work with single files. In this case, the importer uses a FileManagerService plugin. The role of such a plugin is to create documents in Nuxeo from a given file.
When a file is imported, Nuxeo goes through the plugins one by one, following their order attribute. If a plugin’s filter matches the file’s mime type, Nuxeo asks that plugin to create the document. If the plugin returns null instead of a DocumentModel, the next plugin with a matching mime type is tried, and so on. Here is a simple plugin contribution as an example:
<extension target="org.nuxeo.ecm.platform.filemanager.service.FileManagerService" point="plugins">
  <plugin name="ExportedArchivePlugin"
      class="org.nuxeo.ecm.platform.filemanager.service.extension.ExportedZipImporter"
      order="10">
    <filter>application/zip</filter>
  </plugin>
  <plugin name="CSVArchivePlugin"
      class="org.nuxeo.ecm.platform.filemanager.service.extension.CSVZipImporter"
      order="11">
    <filter>application/zip</filter>
  </plugin>
  <plugin name="DefaultFileImporter"
      class="org.nuxeo.ecm.platform.filemanager.service.extension.DefaultFileImporter"
      order="100">
    <filter>.*</filter>
  </plugin>
</extension>
Take a simple zip file as an example. The first plugin used will be the ExportedArchivePlugin. It will return null because the zip file is not a Nuxeo export; it doesn’t contain a .nuxeo-archive file. The next plugin will be the CSVArchivePlugin. Since the zip doesn’t contain a meta-data.csv file, the plugin returns null. The next one to be used will be the DefaultFileImporter, which will create a simple File document with the zip file as main attachment.
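To make the walk-through concrete, here is a minimal plain-Java sketch of that dispatch logic, written without any Nuxeo dependency. The class names (PluginChainSketch, Upload, Importer) and the strings returned are made up for illustration; only the plugin names, orders and filters come from the contribution above. It is a model of the behaviour described here, not the actual FileManagerService code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

final class PluginChainSketch {

    // Hypothetical stand-in for an uploaded file: a name, a mime type and,
    // for archives, the names of the entries it contains.
    static final class Upload {
        final String name;
        final String mimeType;
        final List<String> entries;

        Upload(String name, String mimeType, String... entries) {
            this.name = name;
            this.mimeType = mimeType;
            this.entries = Arrays.asList(entries);
        }
    }

    // Hypothetical stand-in for a plugin: an order, a mime type filter (a regular
    // expression) and an import function that returns null when it declines.
    static final class Importer {
        final String name;
        final int order;
        final Pattern filter;
        final Function<Upload, String> importer;

        Importer(String name, int order, String filter, Function<Upload, String> importer) {
            this.name = name;
            this.order = order;
            this.filter = Pattern.compile(filter);
            this.importer = importer;
        }
    }

    // Declared out of order on purpose: the order value decides, not the position.
    static List<Importer> plugins() {
        return Arrays.asList(
                new Importer("DefaultFileImporter", 100, ".*",
                        f -> "File document for " + f.name),
                new Importer("CSVArchivePlugin", 11, "application/zip",
                        f -> f.entries.contains("meta-data.csv") ? "documents from CSV archive" : null),
                new Importer("ExportedArchivePlugin", 10, "application/zip",
                        f -> f.entries.contains(".nuxeo-archive") ? "documents from Nuxeo export" : null));
    }

    // Try each matching plugin in ascending order; the first non-null result wins.
    static String importFile(Upload file) {
        List<Importer> sorted = new ArrayList<>(plugins());
        sorted.sort(Comparator.comparingInt((Importer p) -> p.order));
        for (Importer plugin : sorted) {
            if (!plugin.filter.matcher(file.mimeType).matches()) {
                continue; // mime type does not match this plugin's filter
            }
            String created = plugin.importer.apply(file);
            if (created != null) {
                return plugin.name + ": " + created;
            }
        }
        return null; // no plugin accepted the file
    }

    public static void main(String[] args) {
        System.out.println(importFile(new Upload("photos.zip", "application/zip", "a.jpg")));
        System.out.println(importFile(new Upload("export.zip", "application/zip", ".nuxeo-archive")));
    }
}
```

With a plain zip, both archive plugins decline and the default importer handles the file; with a zip containing a .nuxeo-archive entry, the first plugin accepts it and the others are never called.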
When should you use plugins?
Plugins are a good fit for two things. The first one is creating several documents from a single file. With the ExportedArchivePlugin, you can create as many documents as you want using the content of the archive. The same goes for a big XML file or a GEDCOM file.
The other good fit is choosing the right DocType for a specific file. Maybe you have defined an OfficeFile document type that would be better than the usual File for all your office files.
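As a rough illustration of that second use case, the decision a plugin would make can be reduced to a small function. This is a sketch only: the class name DocTypeChooserSketch is hypothetical, the list of mime types is an assumption you would adapt to your own needs, and OfficeFile is the custom document type imagined above. In a real plugin the returned type name would be passed to session.createDocumentModel.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class DocTypeChooserSketch {

    // Assumed, non-exhaustive list of office mime types.
    private static final Set<String> OFFICE_MIME_TYPES = new HashSet<>(Arrays.asList(
            "application/msword",
            "application/vnd.ms-excel",
            "application/vnd.ms-powerpoint",
            "application/vnd.oasis.opendocument.text",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document"));

    // Returns the document type to create for a file with the given mime type.
    static String chooseDocType(String mimeType) {
        return OFFICE_MIME_TYPES.contains(mimeType) ? "OfficeFile" : "File";
    }

    public static void main(String[] args) {
        System.out.println(chooseDocType("application/msword"));
        System.out.println(chooseDocType("image/png"));
    }
}
```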
Sure, you could use a plugin to add business code like metadata extraction, specific property updates, content transformation, etc., but this would be better in a listener. This way, your code is called whether the document is created from a creation form or from the import. And you can choose to do this synchronously or asynchronously, which is always good for big imports or file conversions.
Creating a new plugin
So now I’m going to write a plugin that creates people and links them together from a GEDCOM file. A GEDCOM file represents a genealogical tree. It holds information about individuals and the metadata linking them together. That’s a perfect example for a FileImporter plugin. Let’s get to it!
Here is my XML contribution. It’s really simple – you need to indicate the Java class implementing your import logic, give it an order, a name and a list of mime types. GEDCOM files are text, so you just need the ‘text/plain’ mime type. I’ve chosen 9 as the order to be sure this plugin is the first to be used. If my GEDCOM parser does not recognize the file as a GEDCOM, my plugin will simply return null and the FileManagerService will give the file to the next plugin.
<extension target="org.nuxeo.ecm.platform.filemanager.service.FileManagerService" point="plugins">
  <plugin name="GEDCOMImporter"
      class="org.nuxeo.genealogy.core.plugin.GEDCOMFileManagerPlugin"
      order="9">
    <filter>text/plain</filter>
  </plugin>
</extension>
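Because the text/plain filter matches every text file, the plugin has to decide quickly whether a file is really a GEDCOM. My plugin leaves that decision to the parser, but a cheaper pre-check is possible, since a GEDCOM file starts with a 0 HEAD record. The sketch below is not part of the plugin: the class GedcomSniffSketch and its methods are hypothetical, and it works on a String rather than a Nuxeo Blob to stay self-contained.

```java
final class GedcomSniffSketch {

    // Returns true when the first line of the text is the GEDCOM header record.
    static boolean looksLikeGedcom(String text) {
        if (text == null) {
            return false;
        }
        // Ignore a leading byte order mark, if any.
        String t = text.startsWith("\uFEFF") ? text.substring(1) : text;
        int end = 0;
        while (end < t.length() && t.charAt(end) != '\n' && t.charAt(end) != '\r') {
            end++;
        }
        return t.substring(0, end).trim().equals("0 HEAD");
    }

    // Mimics the plugin contract: a result when the file is accepted,
    // null so that the next plugin gets a chance otherwise.
    static String importOrNull(String filename, String text) {
        return looksLikeGedcom(text) ? "Folder for " + filename : null;
    }

    public static void main(String[] args) {
        System.out.println(importOrNull("family.ged", "0 HEAD\n1 SOUR demo\n0 TRLR\n"));
        System.out.println(importOrNull("notes.txt", "just some notes"));
    }
}
```

A check like this only looks at the first line, so a file that passes it can still be rejected later by the parser.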
This is my Java class, which extends AbstractFileImporter (which itself implements the FileImporter interface). I’ve found a nice library to do the GEDCOM parsing. It’s called gedcom4j and is available on Google Code. It’s open source and the project is still active.
This is a really simple and basic example. It creates a folder named after the original file, then creates documents of type Individual, adds some metadata to them and finally links those documents together with relations.
public class GEDCOMFileManagerPlugin extends AbstractFileImporter {

    public static final String SPOUSE_OF_PREDICATE = "http://purl.org/vocab/relationship/spouseOf";

    public static final String CHILD_OF_PREDICATE = "http://purl.org/vocab/relationship/childOf";

    private static final long serialVersionUID = 1876876876L;

    private static final Log log = LogFactory.getLog(GEDCOMFileManagerPlugin.class);

    protected RelationManager relationManager;

    protected DocumentModel treeContainerDoc;

    public DocumentModel create(CoreSession session, Blob content, String path,
            boolean overwrite, String filename, TypeManager typeService)
            throws ClientException, IOException {

        // parse the file; returning null hands it over to the next plugin
        GedcomParser parser = new GedcomParser();
        try {
            parser.load(content.getStream());
        } catch (GedcomParserException e) {
            log.warn("Not a gedcom file", e);
            return null;
        }
        Gedcom gedcom = parser.gedcom;

        // create tree container
        treeContainerDoc = session.createDocumentModel(path, filename, "Folder");
        treeContainerDoc.setPropertyValue("dc:title", filename);
        // keep the document returned by the session: it is the persisted one,
        // so its ref and path can be used safely for the lookups below
        treeContainerDoc = session.createDocument(treeContainerDoc);

        // create every individual in the tree
        Collection<Individual> everybody = gedcom.individuals.values();
        for (Individual individual : everybody) {
            DocumentModel docInd = session.createDocumentModel(
                    treeContainerDoc.getPathAsString(), individual.xref, "Individual");
            docInd.setPropertyValue("dc:title", individual.formattedName());
            docInd.setPropertyValue("idv:xref", individual.xref);
            docInd.setPropertyValue("idv:name", individual.formattedName());
            docInd.setPropertyValue("idv:gender", individual.sex);
            session.createDocument(docInd);
        }

        for (Individual individual : everybody) {
            // add spouseOf relations
            for (FamilySpouse spouse : individual.familiesWhereSpouse) {
                if (spouse.family.husband == individual) {
                    if (spouse.family.wife != null) {
                        addIndividualsRelation(individual.xref, SPOUSE_OF_PREDICATE,
                                spouse.family.wife.xref, session);
                    }
                } else {
                    if (spouse.family.husband != null) {
                        addIndividualsRelation(individual.xref, SPOUSE_OF_PREDICATE,
                                spouse.family.husband.xref, session);
                    }
                }
            }
            // add childOf relations
            for (FamilyChild f : individual.familiesWhereChild) {
                if (f.family.wife != null) {
                    addIndividualsRelation(individual.xref, CHILD_OF_PREDICATE,
                            f.family.wife.xref, session);
                }
                if (f.family.husband != null) {
                    addIndividualsRelation(individual.xref, CHILD_OF_PREDICATE,
                            f.family.husband.xref, session);
                }
            }
        }

        return treeContainerDoc;
    }

    private RelationManager getRelationManager() {
        if (relationManager == null) {
            try {
                relationManager = Framework.getService(RelationManager.class);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
        return relationManager;
    }

    private void addIndividualsRelation(String subjectXRef, String predicat,
            String objectXRef, CoreSession session) throws ClientException {
        Resource documentResource = getDocumentResource(
                getIndDoc(subjectXRef, treeContainerDoc.getRef(), session));
        Resource predicate = new ResourceImpl(predicat);
        Resource objectResource = getDocumentResource(
                getIndDoc(objectXRef, treeContainerDoc.getRef(), session));
        Statement stmt = new StatementImpl(documentResource, predicate, objectResource);
        Graph graph = getRelationManager().getGraphByName(RelationConstants.GRAPH_NAME);
        graph.add(stmt);
    }

    public DocumentModel getIndDoc(String xref, DocumentRef parentRef,
            CoreSession session) throws ClientException {
        return session.getChild(parentRef, xref);
    }

    public QNameResource getDocumentResource(DocumentModel document)
            throws ClientException {
        QNameResource documentResource = null;
        if (document != null) {
            documentResource = (QNameResource) getRelationManager().getResource(
                    RelationConstants.DOCUMENT_NAMESPACE, document, null);
        }
        return documentResource;
    }
}
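If you want to check which relations the two loops produce without starting a Nuxeo server, the linking logic can be modeled with plain strings. The sketch below is an assumption-laden simplification: RelationSketch and its nested Family class are hypothetical, statements are represented as "subject predicate object" strings instead of Nuxeo Statement objects, and it iterates over families rather than individuals, which should yield the same set of statements as the plugin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RelationSketch {

    static final String SPOUSE_OF = "http://purl.org/vocab/relationship/spouseOf";
    static final String CHILD_OF = "http://purl.org/vocab/relationship/childOf";

    // Hypothetical stand-in for a GEDCOM family: xrefs only, husband or wife may be null.
    static final class Family {
        final String husband;
        final String wife;
        final List<String> children;

        Family(String husband, String wife, String... children) {
            this.husband = husband;
            this.wife = wife;
            this.children = Arrays.asList(children);
        }
    }

    static String triple(String subject, String predicate, String object) {
        return subject + " " + predicate + " " + object;
    }

    // Builds the spouseOf and childOf statements for a list of families.
    static List<String> buildTriples(List<Family> families) {
        List<String> triples = new ArrayList<>();
        for (Family f : families) {
            // spouseOf is added in both directions, as in the plugin
            if (f.husband != null && f.wife != null) {
                triples.add(triple(f.husband, SPOUSE_OF, f.wife));
                triples.add(triple(f.wife, SPOUSE_OF, f.husband));
            }
            // each child is linked to every known parent
            for (String child : f.children) {
                if (f.wife != null) {
                    triples.add(triple(child, CHILD_OF, f.wife));
                }
                if (f.husband != null) {
                    triples.add(triple(child, CHILD_OF, f.husband));
                }
            }
        }
        return triples;
    }

    // Convenience wrapper for a single family.
    static List<String> triplesFor(String husband, String wife, String... children) {
        return buildTriples(Arrays.asList(new Family(husband, wife, children)));
    }

    public static void main(String[] args) {
        for (String t : triplesFor("@I1@", "@I2@", "@I3@")) {
            System.out.println(t);
        }
    }
}
```

For one couple with one child this gives four statements: two spouseOf (one in each direction) and two childOf.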
That’s it for today – see you on Friday!