Content Enrichment Industry Insights
Number 1 (2016)
Welcome to this first issue of Content Enrichment Industry Insights. Short but informative, our insights delve into how content enrichment is playing a major part in publishers' workflows and what publishing organisations, vendors and suppliers are doing in this space. We also share our expertise and experience on the technology and software and how these can help meet the evolving needs of modern publishers.
Content enrichment is the application of modern content processing techniques like machine learning, AI and natural language processing to add structure, context and metadata to content to make it more useful to humans and computers.
Content enrichment draws on capabilities such as entity identification, classification of content against a taxonomy, and creation of a semantic fingerprint for each content item. Each of these capabilities can drive multiple product features, such as the display of related content items, improved content notifications, taxonomic browse, faceted search, content collections and semantic search.
Making research more discoverable
Content is at the centre of everything a publisher does, and enriching that content delivers significant value across the whole content life cycle. One area where content enrichment adds particular value is in helping researchers find and discover the content most relevant to their workflow. Features that can be enhanced using enrichment techniques include related articles, subject and context navigation, categorisation of content, and identification of entities to provide links to other relevant content.
Improving discovery through relatedness
Publishers are trying to increase the usage of their content while ensuring that researchers are shown the most relevant content when they search. One area where content enrichment can assist is relatedness. Rather than using just the words in the documents to find other related content, publishers can use techniques that give more meaning and context to the articles and to the relationships between them. One approach uses software tools to extract the meaning from content to create a content fingerprint, and then uses this fingerprint to find related pieces of content.
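The fingerprint approach can be illustrated with a minimal sketch. It is not the method of any particular vendor or tool: it assumes a fingerprint can be approximated by a TF-IDF weighted term vector and that relatedness can be scored by cosine similarity. A production fingerprint would typically draw on richer signals such as extracted entities or taxonomy concepts. The function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())


def build_fingerprints(docs):
    # Assumption: a "fingerprint" is approximated by a TF-IDF weighted term vector.
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for c in counts.values() for term in c)
    return {
        doc_id: {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in c.items()
        }
        for doc_id, c in counts.items()
    }


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def related(doc_id, fingerprints, top_n=3):
    # Rank every other document by similarity to the given one.
    scores = [
        (other, cosine(fingerprints[doc_id], fp))
        for other, fp in fingerprints.items()
        if other != doc_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical sample articles.
docs = {
    "zika-1": "Zika virus transmission by mosquito in Brazil",
    "zika-2": "Mosquito control to limit Zika virus outbreaks",
    "physics-1": "Graphene conductivity at low temperature",
}
fingerprints = build_fingerprints(docs)
ranking = related("zika-1", fingerprints)
```

Here `ranking` lists the other articles in order of similarity to `zika-1`, so the second Zika article ranks ahead of the unrelated physics article. Computing fingerprints once and comparing them afterwards keeps the relatedness lookup cheap as the corpus grows.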
Creating content collections
Using machine learning techniques, publishers can start to create collection microsites that promote specific subject areas or bring together all content related to a specific theme (e.g. Zika). An example of the type of collection that can be built is http://www.thelancet.com/campaigns/zika.
In most cases, collections are assembled manually by carrying out searches and then selecting appropriate content to include. Adding machine learning techniques allows many content collections to be created automatically and efficiently, without human input.
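As a rough illustration of automated collection building, the sketch below assumes a handful of seed texts describe the theme. It averages them into a centroid term vector and admits any candidate article whose cosine similarity to that centroid meets a threshold. This nearest-centroid approach is a simple stand-in for whatever model a publisher would actually train. The function names, the threshold value and the sample texts are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    # Raw term counts; no stop-word removal, to keep the sketch short.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def build_collection(seed_texts, candidates, threshold=0.2):
    # Assumption: the theme is represented by the summed term counts of the seeds.
    centroid = Counter()
    for text in seed_texts:
        centroid.update(term_vector(text))
    # Keep every candidate that is similar enough to the theme centroid.
    return sorted(
        doc_id
        for doc_id, text in candidates.items()
        if cosine(centroid, term_vector(text)) >= threshold
    )


# Hypothetical seed texts describing the theme, and candidate articles.
seeds = [
    "Zika virus infection and microcephaly",
    "Zika virus spread by Aedes mosquito",
]
candidates = {
    "a1": "Aedes mosquito surveillance during the Zika virus epidemic",
    "a2": "Graphene conductivity at low temperature",
    "a3": "Microcephaly cases linked to Zika infection",
}
collection = build_collection(seeds, candidates)
```

With these inputs the two Zika-related articles are selected and the physics article is left out. The threshold trades coverage against precision and would need tuning on real content.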
Serendipity discovery in academic research
SAGE Publishing have published a paper on the importance of serendipity in the scholarly discovery process. The research combined a global survey of students and faculty with user experience research and interviews with publishing experts and technology suppliers. It concluded that there is room for this type of discovery and that it is often overlooked. History offers a number of stories of scientific discovery that involved lucky coincidences, and good fortune is often cited as a key factor. The research found that researchers prefer stumbling on interesting, relevant content in the course of their research to having materials recommended to them by peers or on the basis of popularity.
The paper is available at https://us.sagepub.com/sites/default/files/serrdiscovery.pdf.
Semantic Scholar: improving discovery using artificial intelligence and semantic enrichment
There are a number of examples of artificial intelligence and semantic enrichment being used to improve the discoverability of scholarly content. One example is Semantic Scholar, a new discovery platform currently in beta (https://www.semanticscholar.org/).
The European Molecular Biology Organization (EMBO) and Wiley have launched the SmartFigures Lab, developed in partnership with 67 Bricks.
SmartFigures are interactive figures that display data in the context of related results published in other papers and link figures to major biological databases. They enable users to intuitively navigate across papers through interconnected figures, facilitating literature browsing and accelerating data discovery.