Methods to Measure Forecast Error
Updated: Feb 7, 2022
The first rule of forecasting is that all forecasts are either wrong or lucky. However, failing to learn when your forecasting is wrong or lucky makes it a lot less likely that forecasting accuracy will improve over time. So the first and most beneficial purpose of accuracy analysis is to learn from your mistakes. After all, you can't manage what you don't measure.
As in so many other areas of forecasting, there is no single 'best' measure of forecasting accuracy. For typical operational forecasting, a combination of MAPE, FVA, and Exception Analysis is the dream ticket, and I'll explain why later in this article.
Weighted Mean Absolute Percent Error (WMAPE)
Mean Percentage Error (MPE)
Error Total (ET)
Each of these measures has advantages and disadvantages. Context is everything: which measure suits which situation depends heavily on what you are forecasting. FVA and Exceptions Analysis are methods that differ slightly from the first six listed above, and I will go into more detail about when and how to use them later in this article.
The overall accuracy of any forecasting method, regardless of which method is used, is determined by comparing forecasted values to actual values. However, to determine which of the first six methods is best for your situation, you must first understand the two major types of error measurement:
Error Sign - In layman's terms, this error type determines whether you want to treat Positive and Negative Forecast Error the same or differently, i.e., does it matter if Forecast is greater than Actual or less than Actual? In most operational planning settings, both types of error can be equally damaging; however, if you were forecasting for perishable products, you would always prefer the Forecast to be less than the Actual because surplus production is as good as loss.
Error Spread - This error type determines whether it matters that the forecast error is concentrated in a few points or spread evenly across many. For example, do you mind if the Forecast is horribly wrong on a few points as long as it is accurate over the entire horizon? Again, the item's shelf life is meaningful here. A live channel customer contact has a very short shelf life (you have until the customer abandons their attempt), whereas a customer email has a longer one: an email left unanswered in one period can still be handled in subsequent periods. It is therefore less damaging if the Forecast goes wrong in one period, as long as accuracy holds up over the broader horizon.
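A minimal Python sketch of the two error types, using made-up error values. The sign convention (error = forecast minus actual) and the function names are assumptions for illustration, not taken from any library. A signed average lets positive and negative errors cancel, an absolute average does not, and a squared average penalises error that is concentrated in a few points.

```python
def mean_error(errors):
    """Signed average: positive and negative errors cancel out."""
    return sum(errors) / len(errors)


def mean_absolute_error(errors):
    """Absolute average: the sign of each error is ignored."""
    return sum(abs(e) for e in errors) / len(errors)


def mean_squared_error(errors):
    """Squared average: large individual errors dominate the result."""
    return sum(e * e for e in errors) / len(errors)


# Error Sign: over- and under-forecasts of equal size (illustrative values).
alternating = [10, -10, 10, -10]
print(mean_error(alternating))           # 0.0, the errors cancel
print(mean_absolute_error(alternating))  # 10.0, the errors do not cancel

# Error Spread: the same total error, spread evenly or concentrated.
even = [5, 5, 5, 5]
concentrated = [20, 0, 0, 0]
print(mean_absolute_error(even), mean_absolute_error(concentrated))  # 5.0 5.0
print(mean_squared_error(even), mean_squared_error(concentrated))    # 25.0 100.0
```

A measure that ignores sign but not spread (the squared average here) suits short shelf-life items, where one badly missed period cannot be recovered later.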
Here's a quick reference table showing which Forecasting Accuracy measurement weights Error Sign and Error Spread:
As previously stated, for operational workload forecasting, a positive or negative "Error Sign" is usually equally damaging, because either one results in under- or overstaffing. As a result, the three most commonly used accuracy methods are Mean Absolute Deviation (MAD), Mean Squared Error (MSE), and Mean Absolute Percent Error (MAPE).
However, a common issue for both MAD and MSE is that their values depend on the magnitude of the forecasted item. If the forecast item is measured in thousands, for example, the MAD and MSE results can be very large, which is not ideal for typical operational planning workload forecasting unless your organization is massive. So we're left with MAPE.
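The scale effect can be seen in a short Python sketch. The volumes below are invented for illustration, and the helper names `mad` and `mse` are assumptions. The second series is the first multiplied by 100, so the forecast is equally good in relative terms, yet both measures grow with the scale.

```python
def mad(actuals, forecasts):
    """Mean Absolute Deviation: average of |actual - forecast|."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def mse(actuals, forecasts):
    """Mean Squared Error: average of (actual - forecast) squared."""
    return sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals)


# Illustrative volumes for a small queue.
actuals_small = [50, 60, 40, 50]
forecasts_small = [55, 54, 44, 45]

# The same pattern for a queue 100 times larger.
actuals_large = [a * 100 for a in actuals_small]
forecasts_large = [f * 100 for f in forecasts_small]

print(mad(actuals_small, forecasts_small))  # 5.0
print(mad(actuals_large, forecasts_large))  # 500.0
print(mse(actuals_small, forecasts_small))  # 25.5
print(mse(actuals_large, forecasts_large))  # 255000.0
```

MAD scales linearly with volume and MSE with its square, so neither number can be compared across items of different size without extra context.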
MAPE or WMAPE
The limitations of MAD and MSE lead us logically to MAPE. In its traditional form, MAPE is calculated by expressing the absolute difference between the forecasted and actual value in each period as a percentage of the actual value, and then averaging those percentages. MAPE is also one of the simplest measures to interpret: because it is a percentage, the scale of the forecasted item does not skew it.
However, one significant problem with MAPE is that the result cannot be calculated if the base in any individual percent error calculation is zero. This is commonly known as the divide by zero problem. Various workarounds have been implemented to address this issue, but none of them are mathematically correct. The most severe issue arises when MAPE is used to evaluate the historical errors of various forecasting models in order to select the best model. As a result, MAPE is entirely unsuitable for assessing any item with an intermittent demand pattern in this manner.
Also, when calculating the average MAPE across a number of time series, you may run into a problem: a few series with very high MAPEs can distort a comparison between the average MAPE of the series fitted with one method and the average MAPE of the same series fitted with another method.
The disadvantage of MAPE is obvious from the preceding example: a large percentage error for a small actual can result in a large MAPE. In this case, the last day's result explains more than half of the MAPE.
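The effect can be illustrated with a small Python sketch. The numbers below are invented for illustration and are not the article's own example, and the function name `mape` is an assumption. One day with a very small actual produces a huge percent error and dominates the average.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percent Error, in percent. Fails if any actual is zero."""
    percent_errors = [abs(a - f) / a for a, f in zip(actuals, forecasts)]
    return 100 * sum(percent_errors) / len(percent_errors)


# Four ordinary days and one day with a very small actual (illustrative).
actuals = [100, 100, 100, 100, 5]
forecasts = [110, 90, 105, 95, 10]

overall = mape(actuals, forecasts)
without_last_day = mape(actuals[:-1], forecasts[:-1])

# Share of the summed percent errors contributed by the last day alone.
last_day_ape = abs(actuals[-1] - forecasts[-1]) / actuals[-1]
total_ape = sum(abs(a - f) / a for a, f in zip(actuals, forecasts))
last_day_share = last_day_ape / total_ape

print(round(overall, 2))           # about 26
print(round(without_last_day, 2))  # about 7.5
print(round(last_day_share, 2))    # about 0.77
```

The forecast missed the last day by only five contacts, yet that single day contributes roughly three quarters of the MAPE.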
Other measures, such as the symmetrical MAPE (SMAPE), weighted absolute percentage error (WAPE), real aggregated percentage error, and relative measure of accuracy (ROMA), have been developed to address this issue.
WAPE, also commonly referred to as WMAPE, is my personal favorite due to its simplicity and ease of calculation. You add up the absolute errors at the detailed level and then express the total error as a percentage of total volume. This calculation method has the added benefit of being resistant to individual instances when the base is zero, thereby avoiding the divide by zero problem that frequently occurs with MAPE.
WMAPE is a handy metric that is becoming more popular in corporate KPIs and operational use. It is simple to calculate and provides a concise forecast accuracy measurement that can summarize performance across any grouping of products and/or time periods. If an accuracy figure is required instead, it is calculated as 100% minus WMAPE.
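A minimal Python sketch of the WMAPE calculation as described above: total absolute error divided by total volume. The data and function names are illustrative assumptions. The first period has an actual of zero, which breaks plain MAPE but not WMAPE.

```python
def mape(actuals, forecasts):
    """Plain MAPE in percent; raises ZeroDivisionError if any actual is zero."""
    return 100 * sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)


def wmape(actuals, forecasts):
    """Total absolute error as a percentage of total actual volume."""
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        # WMAPE is still undefined when the whole horizon has zero volume.
        raise ValueError("total actual volume is zero")
    total_abs_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return 100 * total_abs_error / total_actual


# Illustrative intermittent series: the first period has no volume at all.
actuals = [0, 40, 60, 100]
forecasts = [5, 35, 70, 90]

try:
    mape(actuals, forecasts)
    mape_failed = False
except ZeroDivisionError:
    mape_failed = True

error = wmape(actuals, forecasts)
accuracy = 100 - error

print(mape_failed)  # True
print(error)        # 15.0
print(accuracy)     # 85.0
```

Because the division happens once, on the totals, a zero in any single period only matters if every period in the grouping is zero.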
Forecast Value Added (FVA)
WMAPE will tell you the size of your forecast error, which is important. Still, it will not tell you how efficiently you are forecasting. Nor will it help you understand the drivers or the underlying true variability, the minimum achievable error rate, or even whether the different methods and models you are using are improving or worsening the Forecast.
I recommend using a straightforward Forecast Value Added (FVA) process to determine this. It requires a little extra effort upfront, but in the long run it can significantly improve forecasting accuracy and reduce forecasting man-hour costs by helping you avoid pointless forecasting processes. It essentially uses the simplest, least labor-intensive method of forecasting (namely, a "naive" forecast) to benchmark the forecasting accuracy at each stage of your current process. For example, how much accuracy is added by causal factors, and is the leadership review adding value or introducing biased perspectives?
The above diagram is from a typical forecasting process. By running FVA, you can answer the following questions:
Are all the stages in your forecasting process actually adding accuracy?
Is the effort expended at each stage actually worth it?
Where in the process are your weak points?
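A rough Python sketch of how an FVA comparison might be set up. Everything here is an assumption for illustration: the stage names, the volumes, the choice of "last period's actual" as the naive forecast, and the use of WMAPE as the error measure. Each stage is scored with the same measure, and its value added is the reduction in error compared with the stage before it.

```python
def wmape(actuals, forecasts):
    """Total absolute error as a percentage of total actual volume."""
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        raise ValueError("total actual volume is zero")
    return 100 * sum(abs(a - f) for a, f in zip(actuals, forecasts)) / total_actual


# Illustrative history; the first period only seeds the naive forecast.
history = [100, 120, 110, 130, 120]
actuals = history[1:]

# Hypothetical forecasts produced at each stage of the process.
stages = {
    "naive": history[:-1],                      # previous period's actual
    "statistical": [115, 115, 120, 125],        # model output
    "leadership_review": [125, 120, 135, 130],  # after manual overrides
}

errors = {name: wmape(actuals, forecast) for name, forecast in stages.items()}

# Positive FVA means the stage reduced error; negative means it added error.
fva_statistical = errors["naive"] - errors["statistical"]
fva_review = errors["statistical"] - errors["leadership_review"]

for name, error in errors.items():
    print(f"{name}: WMAPE {error:.2f}%")
print(f"FVA of statistical model vs naive: {fva_statistical:+.2f} points")
print(f"FVA of leadership review vs model: {fva_review:+.2f} points")
```

In this made-up case the statistical model beats the naive forecast, while the review stage makes the forecast slightly worse, which is exactly the kind of weak point FVA is meant to expose.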
Exception Analysis
WMAPE and other summary measurements are useful for tracking accuracy over time. Exception analysis, on the other hand, seeks to identify and explain the causes of the largest and most costly forecast errors, allowing for the opportunity to learn from mistakes and potentially apply lessons of experience to future forecasts.
The whole point of measuring your Forecast's accuracy is to improve it, and the only way I know of accomplishing this is to figure out why you have a gap.
As a result, you must include a process in your method for quickly identifying exceptions – those large deviations that caused the most problems – and ask yourself the simple question, could the causes have been anticipated? If this is the case, you have now clearly identified that improved information or interpretation in this area will improve future forecasts. The following is a straightforward high-level process to follow for exception analysis:
Preparation - Define the rules used to identify and classify exceptions.
Mining - Apply algorithms to the data to identify exceptions based on the pre-defined rules.
Research - Look for supporting information on the cause of each exception.
Submit changes to forecast - Do this if the research changes the Forecast and/or resolves the exception.
For example, if you use any historical method, an influx of customer contacts in the morning throughout the month will most likely be factored into the next Forecast. On the other hand, knowing why this happened tells you whether you should include/exclude/smooth this data in future forecasts. For example, would you want to include a one-time TV ad only shown in the mornings that month in future forecasts if it is not to occur again for the foreseeable future?
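The preparation and mining steps of the process described above could be sketched in Python as follows. The rule used here (flag a period when the absolute error and the percent error both exceed a threshold) and all names, thresholds, and volumes are hypothetical. Real rules would come out of your own preparation step.

```python
from dataclasses import dataclass


@dataclass
class ForecastException:
    period: str
    actual: float
    forecast: float
    abs_error: float
    pct_error: float  # signed, as a percent of actual


def find_exceptions(periods, actuals, forecasts,
                    pct_threshold=20.0, min_abs_error=10.0):
    """Return periods breaching both thresholds, largest error first."""
    exceptions = []
    for period, actual, forecast in zip(periods, actuals, forecasts):
        abs_error = abs(forecast - actual)
        if abs_error < min_abs_error:
            continue  # too small in volume terms to be worth researching
        # A zero actual with a meaningful error is always treated as a breach.
        if actual != 0:
            pct_error = 100 * (forecast - actual) / actual
        else:
            pct_error = float("inf")
        if abs(pct_error) >= pct_threshold:
            exceptions.append(
                ForecastException(period, actual, forecast, abs_error, pct_error)
            )
    return sorted(exceptions, key=lambda e: e.abs_error, reverse=True)


# Illustrative week of contact volumes.
periods = ["Mon", "Tue", "Wed", "Thu", "Fri"]
actuals = [200, 210, 320, 150, 8]
forecasts = [205, 200, 210, 195, 12]

exceptions = find_exceptions(periods, actuals, forecasts)
for e in exceptions:
    print(f"{e.period}: actual {e.actual}, forecast {e.forecast}, "
          f"error {e.pct_error:+.1f}%")
```

Requiring both thresholds keeps low-volume periods such as Friday (a 50% miss of only four contacts) out of the research queue, so attention goes to the deviations that caused the most problems.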