How can ‘document atomisation’ bring benefits to research?

28th January 2020

Clear and relevant research is fundamental to financial decision-making. Yet many market participants are swamped with information, making it difficult to navigate the mass of material and gain a comprehensive understanding of a subject. When material can be accessed in a detailed and practical way, large research documents can be transformed into targeted insights. This will power financial activity and enhance business progress, say Rowland Park and Simon Gregory, co-founders of financial research technology company Limeglass.

Comprehensive research should be the foundation of any financial decision, yet often, accessing relevant information requires a time-consuming trawl through a range of long documents. As indicated in a recent Limeglass blog, an overload of information, coupled with outdated methods of research consumption, can lead to a loss of opportunity.

Traditional search processes, such as looking through an email inbox or using a ‘Control+F’ function in a text document, make it difficult to discover real insights. The inability to quickly gauge relevant material may lead to market participants missing out on valuable investment opportunities in financial markets.

Furthermore, searching for information across multiple documents may result in changing contexts, which can be confusing. Ineffective research hampers the gathering of useful intelligence and hinders the ability to make confident trading decisions.

Technological innovation that assesses research at paragraph level and generates a 360-degree view of a subject is paramount, especially in challenging markets where margins are tight.

The need for granularity

Despite the vital importance of research to financial market participants, research budgets at buy-side firms continue to be scrutinised, and sell-side research teams find themselves having to fight harder to justify their roles.

One of the big drivers of this trend is a change in the business model of sell-side research teams, which lack the right tools to adapt to changing business conditions. The current model of distribution does not enable them or their audiences to extract maximum value from their research.

The fundamental problem is that research is generally being distributed in an inherently non-digital, impersonalised and unresponsive manner.

We need to completely rethink the way we access a research document. Instead of thinking about it as a whole, the challenge is to consider a document as a series of interrelated insights and details, some of which will intersect with other potential areas of interest or additional articles. Such a philosophy can lead to tailored insights at a granular level.

The ability to ‘atomise’ documents

Yet how do we achieve this granularity? To maximise the value of research, every paragraph of each document should be tagged in context in real-time; this transforms a body of research from a series of unstructured documents sitting in a digital library, into a huge matrix of tagged material.

Tagging covers not only particular words but also their synonyms and associated phrases. For example, if you were looking for information on ‘the US-China trade war’, you would have to search for a series of combinations involving a multitude of synonyms, from ‘tariffs’ to ‘US-Sino trade tensions’ (Limeglass has more than 40 synonyms on this particular topic), and then you would need to surface only those specific paragraphs relating to the relevant countries.
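To make the idea of synonym-based paragraph tagging concrete, here is a minimal sketch in Python. It is not Limeglass's method: the taxonomy, the topic names and the functions `tag_paragraph` and `atomise` are all hypothetical, and plain phrase matching is a deliberate simplification of the context-aware tagging described above.

```python
import re

# Hypothetical mini-taxonomy (topic -> synonyms). Purely illustrative;
# a real taxonomy would be far larger and context-aware.
TAXONOMY = {
    "us-china trade war": ["trade war", "tariffs", "us-sino trade tensions"],
    "german bunds": ["bund", "bunds", "german government bonds"],
}


def tag_paragraph(paragraph: str) -> set:
    """Return the topics whose synonyms appear in the paragraph.

    Matching is case-insensitive and on whole words or phrases only.
    """
    text = paragraph.lower()
    tags = set()
    for topic, synonyms in TAXONOMY.items():
        for phrase in synonyms:
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                tags.add(topic)
                break  # one matching synonym is enough for this topic
    return tags


def atomise(document: str) -> list:
    """Split a document into paragraphs on blank lines and tag each one."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    return [
        {"index": i, "text": p, "tags": tag_paragraph(p)}
        for i, p in enumerate(paragraphs)
    ]
```

Under these assumptions, a paragraph that mentions only ‘tariffs’ is still tagged with the trade-war topic, which is the point of tagging synonyms rather than literal search terms.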

At Limeglass, we define this ability as ‘Document Atomisation’. Essentially this means unlocking the value buried deep within the research without requiring the analyst to change how they write or publish their articles.

How does this help market participants?

Once the insights in the research have been atomised, they can be reassembled in any number of different combinations to perfectly suit the needs of the individual market participant at any given moment.

For example, if a bond trader is planning to buy German Bunds, their search would bring up the relevant paragraphs in a document on German GDP as well as useful sections of other articles on 10-year Bund yields. Other relevant paragraphs in documents dealing with the Euro, or the latest German Purchasing Managers Index numbers, would also be surfaced.
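The Bund example can be sketched as a lookup over already-tagged paragraphs. This is an illustration under stated assumptions only: the sample paragraphs, the `RELATED` mapping from an interest to related topics, and the functions `build_index` and `surface` are all hypothetical, not part of any Limeglass product.

```python
from collections import defaultdict

# Hypothetical pre-tagged paragraphs: (document title, paragraph text, tags).
PARAGRAPHS = [
    ("German GDP outlook", "Growth slowed in the fourth quarter.", {"german gdp"}),
    ("German GDP outlook", "Unrelated note on UK housing.", {"uk housing"}),
    ("Rates weekly", "10-year Bund yields edged lower.", {"bund yields"}),
    ("FX monthly", "The euro weakened against the dollar.", {"euro"}),
]

# Hypothetical mapping from a reader's interest to related topics.
RELATED = {"german bunds": ["bund yields", "german gdp", "euro", "german pmi"]}


def build_index(paragraphs):
    """Build an inverted index: tag -> list of (title, paragraph text)."""
    index = defaultdict(list)
    for title, text, tags in paragraphs:
        for tag in tags:
            index[tag].append((title, text))
    return index


def surface(interest, index):
    """Collect paragraphs for an interest and its related topics, without duplicates."""
    results = []
    for topic in RELATED.get(interest, [interest]):
        for hit in index.get(topic, []):
            if hit not in results:
                results.append(hit)
    return results


if __name__ == "__main__":
    for title, text in surface("german bunds", build_index(PARAGRAPHS)):
        print(f"{title}: {text}")
```

Only the paragraphs tagged with a related topic come back, so the unrelated paragraph in the GDP document is never shown to the trader.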

Personalised atomisation of the information enables the trader to quickly read all the relevant paragraphs within their research library without having to sift through entire documents one-by-one, saving a considerable amount of time. In addition, because the reader is only seeing the relevant paragraphs, detailed and accurate metrics are supplied to the research report writer on what information within their output is most useful; this enables a refinement of research production in future.
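Paragraph-level readership metrics of this kind could be gathered with something as simple as a counter keyed by document and paragraph. The sketch below is an assumption about how such tracking might look, with hypothetical document identifiers and function names; it does not describe Limeglass's analytics.

```python
from collections import Counter


def record_view(views: Counter, document_id: str, paragraph_index: int) -> None:
    """Register that a reader was shown one specific paragraph."""
    views[(document_id, paragraph_index)] += 1


def most_read(views: Counter, n: int = 3) -> list:
    """Return the n most-viewed (document, paragraph) pairs with their counts."""
    return views.most_common(n)


if __name__ == "__main__":
    views = Counter()
    # Hypothetical reading activity across two reports.
    record_view(views, "gdp-report", 2)
    record_view(views, "gdp-report", 2)
    record_view(views, "rates-weekly", 0)
    print(most_read(views, n=1))
```

Because views are counted per paragraph rather than per document, the author can see which parts of a report readers actually used.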

The start of something special

This kind of technological innovation within the financial research market is only the beginning. Once research is atomised in this manner on a regular basis and at scale, it provides opportunities for use in other industries and for further innovation.

The atomisation process enables unstructured information to be transformed into structured intelligence, capable of being analysed by both humans and computers, and communicated via APIs. Facilitating access to particular paragraphs via APIs can both enhance a publisher’s research offering and provide a better reading experience for report consumers.

Creating context-aware paragraph tagging and research atomisation technology has involved developing bespoke solutions. Leveraging proprietary rich Natural Language Processing (NLP), artificial intelligence (AI) and machine learning has led to the construction of a comprehensive cross-asset and macro taxonomy. As well as providing unparalleled access to research paragraphs, Limeglass’s mission is to deliver metrics that allow for smarter analytics which empower financial institutions to quantify and qualify exactly how market participants utilise and invest in their research.

Document atomisation is a fundamental building block in providing personalised research to users whilst delivering a trackable and traceable model for how research is generated and consumed.

With developments in AI and rich NLP, now is the opportune time to harness smart technology and transform the information overload of financial research from a liability into the valuable asset it is intended to be.