Artificial intelligence (AI) creates abundant opportunities for a wide range of intelligent, automated business operations. Two vital capabilities—metadata extraction and data enrichment—rank among the most valuable, commonly used functions for businesses seeking to harness immediate value from organizational data and content. AI-driven techniques for rapidly sorting, filtering, categorizing, and adding context to massive volumes of data can help deliver a distinct business advantage. By combining accessible, cloud-based AI services and customizable, specialized AI tools and training, businesses can shape data and content services to better meet their objectives.
The Content Conundrum
Despite the accelerating, never-ending accumulation of content, most businesses are neither gaining the insights they need nor seeing visible operational benefits, as asserted in a Software Development Times article. Based on survey results obtained from IDG, the article stated, “Data volumes are growing at an average of 63% per month, with 12% of organizations reporting over 100% percent growth every month. According to a survey by IDC, in 2018 alone, storage suppliers added more than 700 exabytes of storage capacity to keep up with growing data volumes.”
The ongoing challenge is that the insights and value in these huge volumes of data and content are typically submerged, rendered inaccessible by disjointed information management systems, non-uniform and inconsistent metadata values, restricted search capabilities across diverse business systems, and slow, error-prone, manual-entry processes. This is where AI and machine learning offer tremendous value: they perform metadata extraction for both image-based and text-based content, and they automate business operations that require transforming unstructured content or source data from multiple systems into coherent business intelligence. While there are many futuristic, blue-sky imaginings for AI technology, the ability to enrich documents and add context to data is readily available today and is a practical, workhorse application for AI. By making information more easily searchable, widely accessible across complex network systems, and rich with context, businesses amplify the value of their content resources.
Writing for Forbes, Bernard Marr commented:
“In the grand scheme of things, artificial intelligence (AI) is still in the very early stages of adoption by most organizations. However, most leaders are quite excited to implement AI into the company’s business functions to start realizing its extraordinary benefits.”
In his article, 10 Business Functions That Are Ready to Use Artificial Intelligence, Marr goes on to list the kinds of business functions that can be improved with AI, including many that capitalize on the inherent value of data and content in novel, time-saving ways.
Harnessing AI to Enrich Data
An expanding slate of commodity AI services offers an entry point for performing data enrichment and can be integrated into a modern Content Services Platform (CSP) to enhance the value of documents, images, videos, and more. AI services available through Google, Microsoft, Amazon, and others include:
- Natural language processing, sentiment analysis, and entity extraction (to categorize and classify elements from text).
- Speech-to-text, text-to-speech, computer vision, forms recognition, and text analytics.
- Translation of various languages using neural-machine techniques.
- Automated building, training, and tuning of models with comprehensive management features.
Because these AI services are generally equivalent among the leading providers in this sector, users are free to select whichever AI solutions they favor.
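One way to keep that freedom of choice in practice is to hide each provider behind a small common interface. The following is a minimal sketch under stated assumptions: `EnrichmentService`, `Enrichment`, `KeywordEnricher`, and `enrich_document` are hypothetical names invented for illustration, not part of any vendor's API, and the keyword matcher is only a toy stand-in for a real cloud NLP service.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Enrichment:
    """Metadata produced for one document by an enrichment service."""
    entities: list[str] = field(default_factory=list)
    sentiment: str = "neutral"


class EnrichmentService(Protocol):
    """Hypothetical provider-neutral interface; not any vendor's actual API."""

    def enrich(self, text: str) -> Enrichment: ...


class KeywordEnricher:
    """Toy stand-in for a cloud NLP service, based on simple word lists."""

    POSITIVE = {"great", "excellent", "happy"}
    NEGATIVE = {"late", "broken", "unhappy"}

    def __init__(self, known_entities: set[str]):
        self.known_entities = known_entities

    def enrich(self, text: str) -> Enrichment:
        # Lowercase each word and strip trailing punctuation before matching.
        words = [w.strip(".,!?").lower() for w in text.split()]
        entities = sorted({w for w in words if w in self.known_entities})
        score = sum(w in self.POSITIVE for w in words) - sum(
            w in self.NEGATIVE for w in words
        )
        sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        return Enrichment(entities=entities, sentiment=sentiment)


def enrich_document(doc: dict, service: EnrichmentService) -> dict:
    """Return a copy of the document with enrichment metadata attached."""
    result = service.enrich(doc["text"])
    return {**doc, "entities": result.entities, "sentiment": result.sentiment}
```

Because `enrich_document` depends only on the interface, a wrapper around a different provider could be substituted without changing the code that stores the resulting metadata.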
A classic instance of how an organization might apply AI to an everyday challenge is automating the conversion of a large volume of TIFF images into PDF format. The end result is an enriched, more searchable document, offering full-text indexing, as well as enhanced in-document search processes. For example, searches for specific terms within the PDF—such as a customer name, a city, or a product code—can quickly be run, identifying each instance of the search term.
The conversion includes these steps:
- Use an AI-based optical character recognition (OCR) service to create a text file from the TIFF image, for indexing and manipulation. Amazon Textract is one service that can accomplish this.
- Transform the text content into PDF files, mapping the content to the original image.
- Ingest the PDF files into a CSP to gain full access to the full-text indexing and expanded search capabilities.
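The first two conversion steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a complete pipeline: it assumes the `boto3` SDK with valid AWS credentials for the OCR call, and `PlacedLine`, `ocr_tiff`, and `place_lines` are hypothetical helper names. Textract's format and size limits also apply and are not handled here.

```python
from dataclasses import dataclass


@dataclass
class PlacedLine:
    text: str
    x: float  # left edge, in PDF points
    y: float  # bottom edge, in PDF points (PDF origin is bottom-left)


def ocr_tiff(image_bytes: bytes) -> dict:
    """Step 1: send the image to Amazon Textract (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without AWS

    client = boto3.client("textract")
    return client.detect_document_text(Document={"Bytes": image_bytes})


def place_lines(response: dict, page_width: float, page_height: float) -> list[PlacedLine]:
    """Step 2 (partial): map each recognized line to a position on the PDF page.

    Textract reports bounding boxes as ratios of the page size, measured from
    the top-left corner, so the vertical axis is flipped for PDF coordinates.
    """
    placed = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks
        box = block["Geometry"]["BoundingBox"]
        x = box["Left"] * page_width
        y = (1.0 - box["Top"] - box["Height"]) * page_height
        placed.append(PlacedLine(block["Text"], x, y))
    return placed
```

A PDF library could then draw each `PlacedLine` as an invisible text layer over the original image, which is what makes the resulting file searchable. That drawing step and the ingest into the CSP are omitted because they depend on the tools in use.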
Data enrichment is used in many different ways in various industries. Insurance companies can combine customer records, telematics data captured during vehicle operation, theft or crash-site images, and video, enhanced by AI, to validate claims and speed processing.
Using AI processes, call center staff members can apply sentiment analysis to gauge customer responses during a conversation and automate record retrieval so customer data is instantly available onscreen. Banks can use AI to detect suspicious credit card activity—such as card charges taking place in a country far from the cardholder’s home of record—and query the cardholder to see if the charges are valid. Gaming companies that want to provide captured video clips of gameplay highlights can sort and categorize the video content by game, player, or other criteria for online access. These kinds of tasks, powered by AI and machine-learning models, can eliminate many tedious human activities and improve operational efficiency.
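To make the credit card example concrete, the sketch below flags charges made outside a cardholder's home country. It is a deliberately simple rule standing in for a trained model, which would weigh many more signals; `Charge` and `flag_foreign_charges` are hypothetical names and the data is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Charge:
    card_id: str
    country: str  # country code where the charge occurred
    amount: float


def flag_foreign_charges(
    charges: list[Charge], home_country: dict[str, str]
) -> list[Charge]:
    """Return charges made outside the cardholder's home country of record.

    Cards with no home country on file are never flagged by this rule.
    """
    return [
        c for c in charges
        if c.country != home_country.get(c.card_id, c.country)
    ]
```

Each flagged charge would then trigger a query to the cardholder, as described above, rather than an automatic decline.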
Nuxeo Insight for AI Services and Specialized Training
“Artificial intelligence is having an increasing impact on content services. Nuxeo developed its own AI service, Nuxeo Insight, which allows customers to easily train machine-learning models with their own content and data. The service can provide significant advantages in the market by generating specific results for things like claims processing, product identification, and customer service.”
Jim Lundy, CEO and Lead Analyst, Aragon Research
Nuxeo Insight enables organizations just getting started with AI to tap into public cloud-based services (including Amazon AI Services, Google Cloud AI, and Microsoft Cognitive Services) and apply machine-learning (ML) models to their business operations. Taking these generic capabilities a step further, Nuxeo can develop specialized machine-learning models that align with an organization’s brand information, company practices, and document formats, as well as support critical operations. Insight is grounded in the popular, flexible AI framework TensorFlow, using this standard to fine-tune and extend training models for specific industry requirements.
Models based on TensorFlow with generic capabilities can perform a wide range of functions, but for more specific AI operations, custom models deliver more precise results. For example, a basic TensorFlow model using a deep neural network can perform image recognition to distinguish a dog from a cat, as shown in the following figure. A more robust training model can recognize that the dog is a border collie, golden retriever, basset hound, or other breed.
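The relationship between coarse and fine-grained labels can be illustrated without any framework. The sketch below is not how TensorFlow or Nuxeo Insight is implemented; it assumes a hypothetical classifier that emits one raw score per breed, converts those scores to probabilities with a softmax, and sums them into the broader dog and cat classes. The label set and all function names are invented for illustration.

```python
import math

# Hypothetical fine-grained labels and the coarse class each belongs to.
COARSE_CLASS = {
    "border collie": "dog",
    "golden retriever": "dog",
    "basset hound": "dog",
    "siamese": "cat",
    "maine coon": "cat",
}


def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def coarse_probabilities(fine_probs: dict[str, float]) -> dict[str, float]:
    """Sum fine-grained probabilities into their coarse classes."""
    coarse: dict[str, float] = {}
    for label, p in fine_probs.items():
        cls = COARSE_CLASS[label]
        coarse[cls] = coarse.get(cls, 0.0) + p
    return coarse


def top_label(probs: dict[str, float]) -> str:
    """Return the label with the highest probability."""
    return max(probs, key=probs.get)
```

A model trained only on the two coarse classes can answer "dog or cat," while a model trained on breed-level labels can answer both questions, at the cost of needing labeled examples for every breed.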
A recently released, low-code user interface (UI) makes it easy for IT staff to set up and administer Nuxeo Insight, providing a dashboard for managing cloud-based AI services, as well as Nuxeo Platform resources. The UI also serves as a streamlined way to configure and train the machine learning models, providing visual cues to gauge the ongoing training effectiveness. Learn more about this visual UI in Nuxeo Insight: Raising the Bar Again. As has been the objective of Nuxeo Platform since its inception, the method for accessing services can range from menu-driven to code-based, as granular as necessary to configure tasks and control resources.
This episode is the first in a series of three, highlighting the ways in which customers can effectively power content with AI. For more information, we recommend reading Powering your Content with AI: A 10-minute guide to artificial intelligence in content services platforms.