Refinitiv Labs has developed a COVID-19 Company News Tracker to help customers quickly and effectively identify signals of risk and opportunity in news data.
- Refinitiv Labs’ COVID-19 Company News Tracker helps customers cut through the large volume of information about the pandemic, and spot risks and opportunities in news data.
- This new machine learning (ML) and natural language processing (NLP) model can provide insights when unforeseen, high-impact events occur and create uncertainty in financial markets.
- To learn more about the news tracker, and how you can access these new data insights, tune in to our virtual Learn-it-all Lab.
The fallout from the COVID-19 pandemic has generated an overwhelming volume of news on companies, industries and supply chains, leaving Refinitiv customers with the challenge of finding signals that will help them manage risks and build resilience.
Aggregating available data from sources such as the World Health Organization (WHO) and delivering news headlines via data vendor interfaces are not, on their own, enough to generate the insights required to make more informed business decisions.
We have built the COVID-19 Company News Tracker to provide time series signals related to the coronavirus, so that customers can focus on companies and industries of interest, and identify both opportunities and risks at pace.
The team behind this latest Refinitiv Labs prototype includes a number of specialists who together have delivered the application at speed, with ideation to deployment taking just 10 weeks.
Incorporating customer feedback
The project was developed in collaboration with customers to ensure the creation of a relevant, high-quality solution. We have engaged strategists, economists, traders, wealth managers, company analysts and data specialists, and shared developments with customers.
Customer feedback informed both the functionality and the user interface of the application, and provided additional use cases for the prototype’s underlying ML model.
Annotating Refinitiv news data
Key to the tracker’s effectiveness is accurate, high-quality and timely news data. This was sourced from the Refinitiv News Archive, and hosted in a database dedicated to the application. The data was filtered down to 150,000 news articles published since November 2019 and related to COVID-19.
Consistent annotation of the data was critical to ensuring that the ML model provided meaningful outcomes. Annotation was a team effort. We initially worked iteratively on 100 example articles to agree on how to decide whether a COVID-19 article mentioned a ‘risk’, an ‘opportunity’, or neither (‘none’).
Once a consistent method was put in place, 7,500 example articles were annotated by the team.
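The article does not say how annotation consistency was measured. One common way to check it, shown here purely as an illustrative sketch, is Cohen's kappa, which compares how often two annotators agree against how often they would agree by chance. The function name `cohen_kappa` and the sample labels below are hypothetical and are not taken from the Refinitiv Labs project.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same articles."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of articles given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: overlap expected from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    all_labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in all_labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from two annotators for the same six articles.
annotator_1 = ["risk", "risk", "opportunity", "none", "none", "risk"]
annotator_2 = ["risk", "none", "opportunity", "none", "none", "risk"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

A kappa close to 1 suggests the labelling guidelines are being applied consistently, while a low value suggests they need another round of refinement before annotation is scaled up.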
Additional Refinitiv datasets and services used in the development of the COVID-19 Company News Tracker include:
- Thomson Reuters Business Classification (TRBC), a comprehensive sector and industry classification.
- Refinitiv Knowledge Graph, a Proof of Concept (PoC) available in the company’s Data Exploration Tool.
- Refinitiv company fundamentals.
- Refinitiv Intelligent Tagging, which uses NLP, text analytics and data mining to derive meaning from vast amounts of unstructured content.
Combining ML with NLP
The ML model behind the COVID-19 Company News Tracker builds on an existing risk tracking model designed by Refinitiv Labs. The original risk tracker, trained using four million Refinitiv news articles, focused on highlighting financial, environmental, and operational risks.
We have narrowed down the remit of the tracker to COVID-19, and the material issues of risk and opportunity, and produced a new model.
It is based on Google’s open source NLP model, BERT (Bidirectional Encoder Representations from Transformers), a neural language model that generates language representations. BERT is pre-trained on 3.3 billion words from a general domain corpus including Wikipedia and the BookCorpus dataset.
Training the ML model
While a generic version of BERT is readily available, Refinitiv Labs used 2.7 million relevant words from the Refinitiv News Archive to pre-train a new BERT model (BERT-RNA), and make it more attuned to the language of news stories. BERT-RNA was then fine-tuned with the 7,500 annotated article examples in order to focus its scope on COVID-19.
With this training, the COVID-19 Company News Tracker automatically reads news to determine whether an article contains a COVID-19 risk or opportunity for the companies or industries mentioned.
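As a rough illustration of the classification step, a fine-tuned three-class model typically outputs one raw score (logit) per class, which is then converted into probabilities with a softmax. The sketch below assumes that setup. The `classify` helper, the confidence `threshold` and the example logits are hypothetical choices for illustration, not details of the tracker itself.

```python
import math

LABELS = ("risk", "opportunity", "none")


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Return (label, probability); fall back to 'none' below the threshold.

    The threshold is an assumed design choice, not part of the source model.
    """
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "none", probs[LABELS.index("none")]
    return LABELS[best], probs[best]


# Hypothetical logits for one article, given in LABELS order.
label, prob = classify([2.0, 0.5, 0.1])
print(label, round(prob, 3))
```

Falling back to ‘none’ when the model is unsure is one way to trade recall for precision; whether the tracker does anything similar is not stated in the article.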
The tracker is available in the MacroVitals app within Refinitiv’s Eikon and Workspace solutions, and enables users to perform company and industry searches to:
- Track risk and opportunity trends over time, and compare company results against those of their peers.
- Analyze risks and opportunities mentioned in news articles about supply chains.
- Read the latest news headlines and stories relevant to their interests.
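Tracking trends over time implies rolling per-article labels up into a time series for each company. A minimal sketch of one way to do this is shown below. It assumes each article has already been reduced to a (company, date, label) tuple. The `daily_signal` function, the company names and the "net" score (opportunities minus risks) are illustrative assumptions rather than the tracker's actual method.

```python
from collections import defaultdict
from datetime import date


def daily_signal(articles):
    """Count risk and opportunity mentions per company per day.

    Returns {company: [(day, risk_count, opportunity_count, net), ...]},
    with each company's rows sorted by day.
    """
    counts = defaultdict(lambda: {"risk": 0, "opportunity": 0})
    for company, day, label in articles:
        if label in ("risk", "opportunity"):  # 'none' articles add no signal
            counts[(company, day)][label] += 1
    series = defaultdict(list)
    for (company, day), c in sorted(counts.items()):
        net = c["opportunity"] - c["risk"]  # assumed summary score
        series[company].append((day, c["risk"], c["opportunity"], net))
    return dict(series)


# Hypothetical classified articles: (company, publication date, label).
articles = [
    ("ACME", date(2020, 3, 2), "risk"),
    ("ACME", date(2020, 3, 2), "risk"),
    ("ACME", date(2020, 3, 2), "opportunity"),
    ("ACME", date(2020, 3, 3), "none"),
    ("ACME", date(2020, 3, 3), "opportunity"),
    ("Globex", date(2020, 3, 2), "risk"),
]
signals = daily_signal(articles)
for company, rows in signals.items():
    print(company, rows)
```

Comparing a company against its peers then amounts to lining up these per-company series over the same date range.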
Creating ML models at speed
Investing time in training and fine-tuning the COVID-19 tracker has strengthened Refinitiv Labs’ capability to further enhance the model, and develop new models in the future at ever greater speed.
We are currently investigating a number of potential improvements to the tracker, including the ability to determine the temporal nature of news stories, so that customers can see whether the data relates to past events, ongoing events or developments expected in the future.
In the long run, unforeseen, high-impact events will continue to create volatility in financial markets. Refinitiv customers using the Labs’ ML models will have the advantage of being able to read the situation quickly, and respond at speed.