Project SentiMine

Surfacing equity performance themes in unstructured content

An advanced discoverability tool for unstructured content to identify key drivers of equity performance, and changes in outlook over time.

Problem and opportunity

The buy-side spends millions of staff hours combing through unstructured text to drive asset management and investment decisions.

The problem can be illustrated by a buy-side equity analyst covering Amazon.

In a 90-day period, the analyst typically receives more than 200 research reports about Amazon, each ranging from two to 60 pages in length. On top of this, the analyst will come across more than 50 company transcripts and filings, as well as hundreds of news stories and emails about Amazon.

And Amazon is just one of more than 50 equities the analyst covers: a clear case of information overload.


There is a distinct need to make it easier and quicker to understand and extract insights from text-heavy unstructured content.


Refinitiv Labs created SentiMine to do this time-intensive research and analysis for our customers. The application helps users gain more value from unstructured content by reducing the time and associated cost of consuming research.

SentiMine combines natural language processing (NLP), sentiment analysis and deep learning to provide insights from thousands of unstructured research reports and company transcripts quickly and efficiently.

Insights include:

  • Potential drivers of equity performance
  • Changes in analysts’ outlooks (or sentiment) over time and across themes in equity research reports
  • Changes in analysts’ and companies’ outlooks over time across themes in multiple transcripts
  • Contrarian analyst views in a sea of consensus views

Refinitiv Labs created an ontology of potential drivers of equity performance that analysts look for when consuming equity research reports and transcripts. These potential drivers are referred to as themes. A supervised machine learning model identifies key themes in unstructured text. 

The SentiMine engine currently accounts for 110 themes that affect all equities. Each theme falls into one of seven categories: accounting, business drivers, valuation, economics, management change, key risks, and ESG issues. 

It also covers 40 themes across business sectors including finance, consumer retail, telecommunications, and technology. 
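
As an illustration only, a theme ontology of this shape could be represented as a small lookup structure. The sketch below is not the SentiMine implementation: the seven category names come from the text above, while the `Theme` class, the `themes_for` helper, and every theme name are hypothetical. It also assumes that sector-specific themes are assigned to one of the same seven categories, which the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional

# The seven category names come from the text; everything else is illustrative.
CATEGORIES = (
    "accounting",
    "business drivers",
    "valuation",
    "economics",
    "management change",
    "key risks",
    "ESG issues",
)


@dataclass(frozen=True)
class Theme:
    name: str                     # hypothetical theme label
    category: str                 # must be one of CATEGORIES
    sector: Optional[str] = None  # None means the theme applies to all equities

    def __post_init__(self):
        # Reject themes that do not fit the seven-category scheme.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


# Hypothetical entries; the real theme names are not listed in this document.
ONTOLOGY = [
    Theme("revenue recognition", "accounting"),
    Theme("price target", "valuation"),
    Theme("CEO succession", "management change"),
    Theme("net interest margin", "business drivers", sector="finance"),
    Theme("subscriber churn", "business drivers", sector="telecommunications"),
]


def themes_for(sector: str) -> List[Theme]:
    """Return the cross-equity themes plus those specific to the given sector."""
    return [t for t in ONTOLOGY if t.sector is None or t.sector == sector]
```

Under these assumptions, `themes_for("finance")` returns the three cross-equity themes followed by the finance-specific one, which mirrors the split between themes that affect all equities and themes tied to a business sector.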

“This [SentiMine] is incredible. You have built this so fast. Just a few weeks ago we were talking with a PowerPoint. The interface is incredible!”

SentiMine in action

The SentiMine prototype covers more than 907 tickers (478 equities) and three years of research reports and transcripts, from 2017 to today. New research reports and transcripts are added daily.

The prototype includes: 

  • A thematic overview that breaks down hundreds of reports and transcripts by key themes and provides an analyst’s or company’s outlook (sentiment) for each theme
  • A deep dive capability that surfaces every sentence from a report or transcript that contains the theme being investigated, helping users understand the analyst’s outlook
  • A change-in-outlook view that tracks an analyst’s or company’s outlook on a specific theme over time and compares it to the stock market price and mean target price
  • The ability to move between content classes, in this case equity research reports and transcripts
  • An equity overview that provides a summary of the equity from both research reports and transcripts, as well as a peer analysis
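
To make the first three views more concrete, here is a minimal sketch of how they might be computed once sentences have already been tagged with a theme and a sentiment score by an upstream model (not shown). Everything in it is an assumption made for illustration: the `TaggedSentence` record, the function names, the -1.0 to +1.0 sentiment scale, the use of a simple mean, and the quarterly bucketing. The comparison with the stock market price and mean target price is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TaggedSentence:
    text: str
    theme: str        # theme assigned by an upstream model (not shown)
    sentiment: float  # assumed scale: -1.0 (negative) to +1.0 (positive)
    published: date
    source: str       # content class: "research report" or "transcript"


def thematic_overview(sentences: List[TaggedSentence]) -> Dict[str, float]:
    """Mean sentiment per theme across all supplied sentences."""
    by_theme = defaultdict(list)
    for s in sentences:
        by_theme[s.theme].append(s.sentiment)
    return {theme: mean(scores) for theme, scores in by_theme.items()}


def deep_dive(
    sentences: List[TaggedSentence], theme: str, source: Optional[str] = None
) -> List[TaggedSentence]:
    """Every sentence tagged with the theme, optionally limited to one content class."""
    return [
        s for s in sentences
        if s.theme == theme and (source is None or s.source == source)
    ]


def outlook_over_time(
    sentences: List[TaggedSentence], theme: str
) -> List[Tuple[Tuple[int, int], float]]:
    """Mean sentiment for a theme per (year, quarter), in chronological order."""
    by_quarter = defaultdict(list)
    for s in deep_dive(sentences, theme):
        quarter = (s.published.year, (s.published.month - 1) // 3 + 1)
        by_quarter[quarter].append(s.sentiment)
    return [(q, mean(scores)) for q, scores in sorted(by_quarter.items())]


# Made-up sample data, for illustration only.
SAMPLE = [
    TaggedSentence("Margins should expand next year.", "margins", 0.5,
                   date(2019, 2, 1), "research report"),
    TaggedSentence("We expect margin pressure to persist.", "margins", -0.5,
                   date(2019, 5, 10), "transcript"),
    TaggedSentence("Margin headwinds are easing.", "margins", 0.25,
                   date(2019, 6, 3), "research report"),
    TaggedSentence("Our price target is unchanged.", "valuation", 0.0,
                   date(2019, 5, 12), "research report"),
]
```

Calling `outlook_over_time(SAMPLE, "margins")` yields one averaged score per quarter, and passing `source="transcript"` to `deep_dive` restricts the results to a single content class, in the spirit of the switch between research reports and transcripts.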

White paper

Discovering the sentiment in finance's unstructured data

In response to the increased demand for more advanced and scalable ways of consuming unstructured content, Refinitiv Labs created SentiMine, a novel discoverability tool specialized in highly complex financial documents such as equity research reports (ERR) and transcripts.

A collaborative approach

Refinitiv Labs takes a collaborative, customer-focused approach to building solutions to real problems in financial markets by combining customer feedback, extensive data capability and exceptional partner technologies.

Collaborating with our customers:

  • Sharing goals to ensure SentiMine will be a valuable customer solution
  • Incorporating customer feedback into every stage of the development process 
  • Validating the concept with Refinitiv customers and stakeholders
  • Holding ongoing conversations with Refinitiv users interested in using SentiMine

Refinitiv partner and open source technologies:

  • Amazon Simple Storage Service (S3)
  • Amazon Athena query service
  • PyTorch machine learning library
  • TensorFlow machine learning platform
  • Apache Spark analytics engine
  • MLflow machine learning lifecycle platform
  • PostgreSQL relational database
  • React JavaScript library for user interfaces
  • Node.js JavaScript runtime environment
  • Python