At a recent webinar hosted by Refinitiv, industry experts from Refinitiv and MathWorks spoke about how point-in-time data can help you avoid common biases in backtesting, train machine learning models more effectively and safeguard your investment process.
- Data biases, such as survivorship and look-ahead bias, can lead to backtests of investment strategies overstating achievable results and machine learning models being incorrectly trained.
- These biases can be adjusted for by using point-in-time data, which provides a more accurate representation of what was actually known during the sample periods in question.
- Insights into the effects of data biases in backtests can be conveniently and intuitively visualised and shared using MATLAB.
Data scientists and quantitative analysts rely on historical financial data to run backtests on investment strategies and train machine learning (ML) models. The accuracy of these backtests and the effectiveness of their models depend on the data being clean and free of bias.
However, building and testing models using standard data sets, which are prone to bias, could mean jeopardising the entire investment process.
According to insights shared by experts speaking at a webinar hosted by Refinitiv in June 2021, the use of point-in-time data can help avoid two of the most common biases: survivorship bias and look-ahead bias.
“Survivorship bias involves ignoring companies that no longer exist, either because they were delisted, went bankrupt or because of some other corporate action, such as M&A activity,” said Richard Goldman, Global Director, Sales Strategy & Execution – Quantitative Analytics at Refinitiv.
Meanwhile, look-ahead bias involves “using data that was not known at the point in time, which is analogous to investing with a ‘crystal ball’.”
Survivorship bias: lessons from history
To illustrate the effect of survivorship bias in backtesting, Goldman offered an example from World War II, when the U.S. military examined the damage sustained by aircraft returning from missions and concluded that armour should be added to the areas that were most hit (see Figure 1).
Fortunately, statistician Abraham Wald stepped in and flagged the bias inherent in this assessment by pointing out that armour should instead be added to the areas showing the least damage, because any bombers that had been shot down would be excluded from the assessment. Therefore, the bullet holes in the returning aircraft likely represented areas where bombers could take damage and still manage to return to home base.
Figure 1: survivorship bias can lead to the wrong conclusions
Survivorship bias can be significant when dealing with financial data because the total number of companies is rarely constant, with old companies shutting down and new ones being launched all the time.
For instance, of the 3,000 constituents of the Russell 3000 index as of October 1986, only 565 have survived. But to have an accurate historical database and avoid survivorship bias, information must be retained on the 2,435 companies that are no longer actively traded. This will enable them to be included in backtests and in training ML models.
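The effect can be illustrated with a minimal Python sketch. It is not taken from the webinar: the tickers, returns and listing flags below are invented purely for illustration, and the helper name `average_return` is hypothetical. It compares an equal-weighted average return computed on survivors only with the same average computed on the full historical universe.

```python
from statistics import mean

# Hypothetical universe: (ticker, return over the sample period, still listed today?)
# All tickers and numbers are made up for illustration.
universe = [
    ("AAA", 0.12, True),
    ("BBB", 0.08, True),
    ("CCC", -0.60, False),  # delisted, e.g. after bankruptcy
    ("DDD", -0.35, False),  # delisted, e.g. after a corporate action
    ("EEE", 0.05, True),
]


def average_return(records, survivors_only):
    """Equal-weighted mean return, optionally dropping delisted names."""
    selected = [ret for _, ret, listed in records if listed or not survivors_only]
    return mean(selected)


# A database that only keeps current constituents sees the survivors.
biased = average_return(universe, survivors_only=True)
# A database that retains delisted companies sees the whole universe.
unbiased = average_return(universe, survivors_only=False)

print(f"survivors only: {biased:+.3f}")
print(f"full universe:  {unbiased:+.3f}")
```

In this toy data set the survivors-only average is positive while the full-universe average is negative, which is the same direction of error as in the aircraft example: the sample that remains visible looks healthier than the population it came from.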
Look-ahead bias is also common when assessing financial data because of the way companies report information over time.
“Initially, a company will report their results through a press release, which usually happens four or five weeks after a quarter ends,” noted Goldman. “A month or two later, they will issue official regulatory filings and company reports, which may include some restated data, but often the restated data happens years later.”
Data is generally restated because the company has changed its accounting standards, has been audited, or faces new regulatory requirements.
Companies are also constantly restating their results. Since 2010, there have been over a million restatements, and that number is likely to keep growing in the years ahead.
Goldman explained further, with an example from Yum Brands, which owns KFC, Pizza Hut and Taco Bell. He noted Yum Brands reported 2014 revenue of just over USD13 billion about five weeks after the end of the year.
Two years later, the company reported a revised number closer to USD6 billion, at which point the USD13 billion figure would be overwritten in non-PIT (point-in-time) databases.
“But a true point-in-time database would retain that original USD13 billion figure, and when the USD6 billion figure comes out, we’d have another field which says it was restated and the date of the restatement,” Goldman pointed out.
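A minimal Python sketch of that idea follows. It is an illustration only and does not reflect the schema of any actual Refinitiv product: the `Observation` class, its field names and the `as_of` and `latest` helpers are hypothetical, the revenue values are the rounded figures from the example above, and the exact publication dates are assumptions chosen to match "about five weeks after year end" and "two years later".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One reported value for a fiscal period, stamped with the date it became known."""
    period_end: date
    value_usd_bn: float
    known_on: date
    restated: bool


# Rounded values from the article; the known_on dates are assumptions.
history = [
    Observation(date(2014, 12, 31), 13.0, date(2015, 2, 4), restated=False),
    Observation(date(2014, 12, 31), 6.0, date(2017, 2, 9), restated=True),
]


def as_of(observations, period_end, when) -> Optional[Observation]:
    """Point-in-time view: the latest value for the period already known on `when`."""
    known = [o for o in observations
             if o.period_end == period_end and o.known_on <= when]
    return max(known, key=lambda o: o.known_on) if known else None


def latest(observations, period_end) -> Optional[Observation]:
    """Non-point-in-time view: only the most recent value survives."""
    matching = [o for o in observations if o.period_end == period_end]
    return max(matching, key=lambda o: o.known_on) if matching else None


fy2014 = date(2014, 12, 31)
print(as_of(history, fy2014, date(2015, 1, 15)))  # nothing published yet
print(as_of(history, fy2014, date(2016, 6, 30)))  # original figure
print(latest(history, fy2014))                    # restated figure
```

A backtest that rebalances in mid-2016 should call something like `as_of`, which returns the original figure. Calling `latest` instead would feed the 2017 restatement into a 2016 decision, which is exactly the look-ahead bias described above.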
The impact of biases
In a nutshell, failure to appropriately adjust for survivorship, look-ahead and other biases leads to inaccurate backtests, explained Goldman.
Take the period following the internet bubble crash of the early 2000s.
According to research carried out by Thomson Reuters in 2010, in the period right after the crash, a backtest using the non-point-in-time database significantly outperformed a backtest of the same factor using the point-in-time database.
The point-in-time data performed in line with the S&P 500, giving a more accurate representation of how the factor would perform in live market conditions.
Sharing insights into data bias
Lawrence Johny, Senior Application Engineer at MathWorks, demonstrated how the effect of using non-point-in-time data could be visualised and presented.
“Imagine you’re a systematic manager specialising in factor investing and you’ve quantified how using non-point-in-time data overstates performance,” said Johny. “Using MATLAB, you can easily share this insight with your colleagues by visually representing point-in-time, non-point-in-time and overstated performance.”
There are numerous challenges with backtesting, particularly when it comes to data. These include connecting to a database, converting output from the database to a calculation-friendly format, and cleaning and collating trading signals to be fed into a backtesting engine.
Figure 5: the four steps of quantifying bias in backtesting
These challenges can be met using MATLAB and accessing Refinitiv’s rich database, according to Johny. MATLAB’s Database Explorer app helps navigate the database visually and intuitively, its grouping functions summarise very large datasets quickly and robustly, and live tasks allow data to be cleaned visually without writing code.
The logistics of a backtesting exercise are complex. Multiple strategies must be coded, the rebalancing frequency needs to be aligned across all strategies and trading costs should be uniformly applied across the strategies.
“You can see how easy it is to introduce human error while working with lots of these strategies,” noted Johny. These challenges, too, can be addressed by using MATLAB. Its vectorised logic, along with its timetable format, makes working with large datasets quick and relatively easy, allowing a synchronised signal dataset to be constructed and independent strategies to be implemented at scale.
By inputting a single line of code, multiple defined strategies can be backtested at once, and another single line of code summarises and visualises the backtest.
“The built-in visualisation allows each of the strategies to be plotted next to one another, so it’s easy to compare,” added Johny.
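To make the logistics concrete, here is a rough Python sketch of the same idea. It is not the MATLAB workflow shown in the webinar; the function `run_backtest`, the two toy strategies and the return series are all hypothetical. The point it illustrates is that the rebalancing frequency and trading cost are passed in once and applied identically to every strategy, which removes one common source of human error.

```python
def run_backtest(returns, weight_fn, rebalance_every, cost_rate):
    """Compound one strategy's equity curve from per-period asset returns.

    returns:         list of dicts {asset: simple return}, one per period
    weight_fn:       maps a period index to target weights {asset: weight};
                     it should only use information known before that period
    rebalance_every: rebalance on every n-th period
    cost_rate:       proportional cost charged on turnover at each rebalance

    Simplification: weights are held fixed between rebalances (no drift).
    """
    equity = 1.0
    weights = {}
    curve = []
    for t, period in enumerate(returns):
        if t % rebalance_every == 0:
            target = weight_fn(t)
            assets = set(weights) | set(target)
            turnover = sum(abs(target.get(a, 0.0) - weights.get(a, 0.0))
                           for a in assets)
            equity *= 1.0 - cost_rate * turnover
            weights = target
        equity *= 1.0 + sum(w * period[a] for a, w in weights.items())
        curve.append(equity)
    return curve


# Invented return series for two hypothetical assets.
returns = [
    {"A": 0.02, "B": -0.01},
    {"A": -0.01, "B": 0.03},
    {"A": 0.01, "B": 0.00},
    {"A": 0.00, "B": 0.02},
]

# Toy strategies with static weights.
strategies = {
    "equal_weight": lambda t: {"A": 0.5, "B": 0.5},
    "all_in_A": lambda t: {"A": 1.0},
}

# Shared settings, applied uniformly to every strategy.
results = {
    name: run_backtest(returns, fn, rebalance_every=2, cost_rate=0.001)
    for name, fn in strategies.items()
}

for name, curve in results.items():
    print(f"{name}: final equity {curve[-1]:.4f}")
```

Running the same loop once with point-in-time signals and once with non-point-in-time signals would give two sets of equity curves whose difference is the overstated performance.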
Figure 6 provides visual confirmation of the prevalence of look-ahead bias based on value and growth strategies.
Using non-point-in-time data in the example, a value strategy would overstate performance by 14 percent in long-only and by 24 percent in long-short strategies.
Figure 6: visual confirmation of look-ahead bias in value and growth strategies
Why bias must be identified and addressed
Using non-point-in-time historical data to backtest strategies and train machine learning models could lead to some fatal errors.
“It would assume, for example, that you’ll never be able to invest in a company that went bankrupt because it won’t be in your database, or that you have perfect foresight for a month on what the reported earnings are, even though, in reality, that won’t be available to you in a real live trading situation,” stressed Goldman.
In short, using non-point-in-time data leads to biases, which, if not addressed, will lead to “machine learning models being incorrectly trained and backtests that overstate achievable results and can’t be replicated in real markets.”