What Is Time Series Forecasting?
How do weather forecasters predict tomorrow’s weather, or stock market analysts analyze future market trends? It all comes down to a powerful statistical technique known as time series forecasting.
By analyzing past observations in a time series, this method predicts future values. It has found wide-ranging applications in fields such as finance, economics, medicine, weather forecasting, earthquake prediction and more.
Implementing time series forecasting techniques can empower businesses to make informed decisions, anticipate customer demands well in advance, and gain a competitive edge in their respective markets.
Time series forecasting can provide insights into a wide range of questions, such as:
- What will be the demand for a product in the next month?
- What will be the stock market price of a company in the next quarter?
- What will be the trend of a disease outbreak in the next few weeks?
- What will be the temperature of a city in the next week?
Here, we’ll dive deeper into the fascinating world of time series forecasting, learning the steps taken to make forecasts with time series data, and which methods are most commonly used.
Preparing the Data for Time Series Forecasting
Preparing the data for time series forecasting is a critical step in the modeling process. Properly preparing the data can involve different people or teams, depending on the complexity of the data and the organization.
Data scientists or analysts may collect, clean, and prepare the data in some cases. Domain or subject matter experts may also assist in identifying relevant variables and formatting the data appropriately. Data engineers may help set up data pipelines and automate data collection and storage. A cross-functional team, including individuals with expertise in data science, domain knowledge and engineering, may be responsible for preparing data for time series forecasting.
Careful preparation can help to improve the accuracy of the model and ensure that it can make useful predictions on new data.
Data should be collected at regular intervals; you will need to select an appropriate time interval to ensure accuracy. Imputing values or interpolating missing data can be used to handle incomplete data.
To prepare the data for time series forecasting, you need to follow these steps:
- Collect the data. This can be done from various sources such as databases, APIs or data files.
- Select the time interval. Once you have collected the data, you need to decide on the time interval for the forecast, such as daily, weekly or monthly. This depends on the nature of the data and the problem you are trying to solve.
- Handle missing data. The data may have missing values, which can be handled by imputing the missing values or interpolating between existing values.
- Detect seasonality and trends. Seasonality refers to the repeating patterns in the data over time, while trends refer to the overall direction of the data over time. Autocorrelation and moving averages can be used to detect seasonality and trends.
- Apply transformations. Transformations can be applied to the data to make it more amenable to time series forecasting. Log transformation can help to reduce the impact of extreme values while differencing and detrending can be used to remove trends or seasonality.
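The preparation steps above can be sketched in a few lines of pandas. This is a minimal illustration, not a full pipeline: the eight daily readings, the dates and the variable names are all made up for the example, and linear interpolation is only one of several reasonable ways to fill gaps.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with two gaps (NaN). Real data would come
# from a database, an API or a data file.
index = pd.date_range("2024-01-01", periods=8, freq="D")
raw = pd.Series([10.0, 12.0, np.nan, 16.0, 18.0, np.nan, 22.0, 24.0], index=index)

# Handle missing data: interpolate linearly between existing values.
filled = raw.interpolate(method="linear")

# Select the time interval: aggregate the daily values to weekly means.
weekly = filled.resample("W").mean()

# Detect structure: autocorrelation at a candidate lag (here, lag 1).
lag1_autocorr = filled.autocorr(lag=1)

# Apply transformations: a log transform dampens extreme values,
# and a first difference removes a linear trend.
logged = np.log(filled)
differenced = filled.diff().dropna()

print(filled.tolist())
print(differenced.tolist())
```

In this toy series the gaps sit on a straight line, so interpolation recovers the missing points and the first difference is constant. Real data is rarely this tidy, which is why each step usually needs checking against domain knowledge.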
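It is also important to split the data into training and testing sets. The training set is used to fit the model, while the testing set is used to evaluate the performance of the model on new data. Because the observations are ordered in time, the split should be chronological, with the testing set drawn from the most recent period, so that the model is never evaluated on data that precedes its training data. This can help to ensure that the model is not overfitting to the training data and can generalize well to new data.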
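A minimal sketch of such a split, assuming the observations are already sorted by time and that the most recent slice is held out for testing (a common convention for time series). The function name and the 20% default are illustrative choices, not fixed rules.

```python
def chronological_split(values, test_fraction=0.2):
    """Split an ordered sequence so the test set is the most recent slice."""
    n_test = max(1, int(round(len(values) * test_fraction)))
    return values[:-n_test], values[-n_test:]

series = list(range(10))  # stand-in for ten ordered observations
train, test = chronological_split(series, test_fraction=0.2)
print(train, test)
```

Shuffling before splitting, as is common with non-temporal data, would let the model see the future during training and make the test score overly optimistic.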
Choosing a Time Series Forecasting Method
Once the data is prepared, it’s time to choose a forecasting method. The method you use to analyze the data and make forecasts depends on the problem you are trying to solve and the nature of the data.
Time series forecasting uses a variety of statistical and machine learning-based methods. Statistical methods typically involve modeling the underlying patterns and trends in the data, while machine learning methods use algorithms to learn patterns and make predictions.
Some popular statistical methods for time series forecasting include:
- ARIMA (autoregressive integrated moving average).
- Exponential smoothing.
- Seasonal decomposition.
Machine learning methods for time series forecasting include:
- Neural networks.
- Deep learning models.
When selecting a time-series forecasting method, it is important to consider the balance between accuracy and interpretability. While some methods may offer greater accuracy, they may be more intricate and challenging to interpret, while simpler techniques may be more straightforward to understand but could compromise accuracy.
Each method carries advantages and disadvantages. Method selection is based on factors such as data characteristics, forecasting horizon (i.e., the length of time into the future being forecast), and computational complexity. Here’s a look at some of the most commonly used methods.
ARIMA is a popular statistical method for time series forecasting that models the autocorrelation of the data using three components: autoregression (AR), differencing (I), and moving average (MA).
The AR component captures the dependence of the current value on previous values, while the MA component captures the dependence on the previous error terms. The I component differences the series to remove trends and make it stationary. Seasonal patterns call for an additional seasonal differencing step.
ARIMA is a flexible method that can handle a wide range of time-series patterns, making it popular in fields such as finance, economics and marketing.
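To make the AR and I components concrete, here is a deliberately simplified sketch of an ARIMA(1,1,0) model: one difference, one autoregressive term, no moving average term and no constant. The function names are hypothetical and the least-squares fit is an assumption made for brevity. In practice a statistics library would estimate all three components together.

```python
import numpy as np

def fit_arima_110(y):
    """Estimate phi in dy[t] = phi * dy[t-1] + e[t], where dy is the first difference of y."""
    dy = np.diff(y)                             # I: difference once to remove a trend
    prev, curr = dy[:-1], dy[1:]
    return float(prev @ curr / (prev @ prev))   # AR: least-squares slope through the origin

def forecast_arima_110(y, phi, steps):
    """Forecast the differences recursively, then add them back to undo the differencing."""
    level, step = y[-1], y[-1] - y[-2]
    forecasts = []
    for _ in range(steps):
        step = phi * step        # next predicted change
        level = level + step     # integrate the change back into the level
        forecasts.append(float(level))
    return forecasts

# Toy series whose increments halve each period: 8, 4, 2, 1
y = np.array([0.0, 8.0, 12.0, 14.0, 15.0])
phi = fit_arima_110(y)
forecasts = forecast_arima_110(y, phi, steps=2)
print(phi, forecasts)
```

The toy increments follow an exact AR(1) pattern, so the fitted coefficient recovers it and the forecasts continue the halving.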
SARIMA (seasonal autoregressive integrated moving average) is a variation of the ARIMA model that is specifically designed to handle time series data with seasonality. It includes the same three components as the ARIMA model (AR, I, and MA) but also includes additional seasonal components.
The seasonal autoregressive component captures the dependence of the current value on previous values from the same season (such as the same month of the previous year). The seasonal differencing component removes the seasonal patterns from the data, and the seasonal moving average component captures the dependence on the previous error terms for the seasonal component.
However, SARIMA models can be more complex than standard ARIMA models and may require more data and computational resources to train.
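The seasonal differencing step can be illustrated on its own. The sketch below assumes a known period of 4 and a made-up series built from a linear trend plus a repeating pattern. It shows only the differencing, not a full SARIMA fit.

```python
import numpy as np

def seasonal_difference(y, period):
    """Subtract from each value the value one full season earlier."""
    y = np.asarray(y, dtype=float)
    return y[period:] - y[:-period]

t = np.arange(12)
pattern = np.array([3.0, -1.0, -2.0, 0.0])   # hypothetical seasonal effect, period 4
y = 2.0 * t + pattern[t % 4]                 # linear trend plus seasonality
diffed = seasonal_difference(y, period=4)
print(diffed)
```

Each value is compared with the same phase of the previous season, so the repeating pattern cancels and only the trend's contribution over one season remains.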
Advantages of an ARIMA Model
- Captures linear autocorrelation structure in the data with a small number of parameters. (ARIMA is a linear model, so strongly non-linear patterns call for other methods.)
- Can be applied to non-stationary time series data by differencing, and to data with seasonal patterns through the SARIMA extension.
- Well-suited for short-term forecasting, especially when the time series data exhibit a high degree of autocorrelation.
- Provides diagnostic statistics that can be used to evaluate how well the model fits the task at hand, and the quality of the forecasts.
Disadvantages of an ARIMA Model
- Sensitive to outliers in the data, which can lead to inaccurate estimates and forecasts.
- The coefficients of an ARIMA model can be difficult to interpret, especially for non-experts.
- Typically used with univariate time series data, and may not apply to multivariate or panel data.
- Not well-suited for long-term forecasting, as its accuracy tends to degrade over time.
Exponential smoothing is a simple time series forecasting method that assigns more weight to recent data points while gradually decreasing the weight for older data points. It is a popular method for short-term forecasting, as it can quickly adapt to changes in the data.
Several different techniques fall under the umbrella of exponential smoothing, including simple exponential smoothing, double exponential smoothing, and triple exponential smoothing.
Simple exponential smoothing is the most basic form of this method, and it is used to forecast a time series that does not exhibit any trend or seasonality. Simple exponential smoothing uses a single smoothing parameter, alpha, which controls the weight given to the past observations. The forecast for the next period is a weighted average of the past observations, with more weight given to the most recent observation.
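As a minimal sketch, simple exponential smoothing reduces to a one-line update of a running level. Initializing the level with the first observation is an assumption made here for brevity (other initializations are common), and the function name is illustrative.

```python
def simple_exponential_smoothing(y, alpha):
    """Return the one-step-ahead forecast after smoothing all observations in y."""
    level = y[0]                                   # assumed initialization
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

forecast = simple_exponential_smoothing([10.0, 20.0, 30.0], alpha=0.5)
print(forecast)
```

With alpha close to 1 the forecast tracks the latest observation closely, while values near 0 produce a smoother, slower-moving forecast.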
Double exponential smoothing, also known as Holt’s method, is used to forecast a time series that exhibits a trend but no seasonality. Double exponential smoothing uses two smoothing parameters, alpha and beta, which control the weights given to the past observations and the past trends, respectively. The forecast for the next period is a weighted average of past observations and past trends.
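Holt's method can be sketched by adding a second recursion for the trend. The initialization below (first value as the level, first difference as the trend) is one simple assumption among several in use, and the function name is hypothetical.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Double exponential smoothing: return forecasts for 1..horizon steps ahead."""
    level, trend = y[0], y[1] - y[0]               # assumed initialization
    for value in y[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

forecasts = holt_forecast([1.0, 3.0, 5.0, 7.0], alpha=0.5, beta=0.5, horizon=2)
print(forecasts)
```

On a perfectly linear series the level and trend are recovered exactly, so the forecasts simply extend the line.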
Triple exponential smoothing, also known as the Holt-Winters method, is used to forecast a time series that exhibits both trend and seasonality. Triple exponential smoothing uses three smoothing parameters — alpha, beta and gamma — that control the weights given to previous observations, trends and seasonal variations, respectively. The forecast for the next period is a weighted average of past observations, trends and seasonal variations.
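A compact sketch of the additive Holt-Winters recursions follows. It assumes at least two full seasons of data and uses a crude initialization (first-season mean as the level, average change between the first two seasons as the trend). Both the initialization and the function name are illustrative assumptions rather than a reference implementation.

```python
def holt_winters_additive(y, period, alpha, beta, gamma, horizon):
    """Triple exponential smoothing with additive trend and seasonality."""
    # Crude initialization from the first two seasons (an assumption).
    level = sum(y[:period]) / period
    trend = (sum(y[period:2 * period]) - sum(y[:period])) / period ** 2
    seasonal = [y[i] - level for i in range(period)]

    for t in range(period, len(y)):
        s = seasonal[t % period]
        previous_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y[t] - level) + (1 - gamma) * s

    n = len(y)
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Three identical seasons around a base of 100, with no trend
y = [110.0, 90.0, 105.0, 95.0] * 3
forecasts = holt_winters_additive(y, period=4, alpha=0.5, beta=0.5, gamma=0.5, horizon=4)
print(forecasts)
```

The toy series repeats exactly, so the forecasts reproduce one more season. With real data the three parameters are usually chosen by minimizing forecast error on the training set.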
The TBATS Model
TBATS (trigonometric seasonality, Box-Cox transformation, ARMA errors, trend, and seasonal components) is a state-of-the-art time series forecasting model that extends the basic exponential smoothing framework. It’s a hybrid model that combines the strengths of exponential smoothing and other time series forecasting techniques, such as ARIMA and Fourier analysis, to capture a wide range of temporal patterns in the data.
The TBATS model is based on exponential smoothing, in the sense that it uses a similar framework of exponentially weighted moving averages to model the level, trend, and seasonality of the time series. However, TBATS also includes additional components to model more complex temporal patterns, such as multiple seasonal periods (through trigonometric terms), autocorrelated errors (through ARMA terms), and non-linearity in the series (through the Box-Cox transformation).
Advantages of Exponential Smoothing
- A simple and easy-to-understand technique that requires only one or a few parameters.
- Computationally efficient and can quickly generate forecasts for large datasets.
- Adapts quickly to recent changes in the data by giving more weight to recent observations and less weight to older ones, which also limits the lasting influence of old outliers.
Disadvantages of Exponential Smoothing
- Simple exponential smoothing assumes that the time series has no trend or seasonality. If the series exhibits either, the double or triple variants are needed, and simple exponential smoothing may not be appropriate.
- May not perform well for time series with long-term trends or complex seasonal patterns. In these cases, more advanced forecasting techniques may be more appropriate.
- Assumes that the errors in the forecast are normally distributed with a mean of zero and a constant variance. If the errors exhibit non-normal behavior, such as skewness or kurtosis, then the accuracy of the forecasts may be compromised.
- In its basic form, produces point forecasts only. Prediction intervals require a statistical formulation of the method (such as a state space model). Without one, it is difficult to assess the uncertainty of the forecasts and make informed decisions based on them.
- Requires some judgment in selecting the smoothing parameters, which can be subjective and may vary depending on the characteristics of the time series and the specific application. Incorrect parameter selection can lead to poor forecasts and unreliable results.
Seasonal decomposition is a method that separates the time series data into its trend, seasonal and residual components. The trend component represents the long-term pattern in the data, while the seasonal component represents the repeating patterns over time. The residual component represents the noise or irregular variation in the data.
Seasonal decomposition can provide insights into the underlying patterns and trends in the data, making it useful for understanding the seasonality and trend of a time series.
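A classical additive decomposition can be sketched with a centered moving average. The version below assumes a known, fixed period and an additive structure, and leaves the trend undefined (NaN) at the edges where the moving average window does not fit. The function name and the toy series are illustrative.

```python
import numpy as np

def decompose_additive(y, period):
    """Split y into trend, seasonal and residual parts (y = trend + seasonal + residual)."""
    y = np.asarray(y, dtype=float)
    # Centered moving average; for an even period the two end weights are halved.
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(len(y), np.nan)
    trend[half:len(y) - half] = np.convolve(y, weights, mode="valid")

    # Average the detrended values at each position in the cycle.
    detrended = y - trend
    means = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    means -= means.mean()                       # seasonal effects sum to zero
    seasonal = np.tile(means, len(y) // period + 1)[:len(y)]

    residual = y - trend - seasonal
    return trend, seasonal, residual

t = np.arange(16)
pattern = np.array([3.0, -1.0, -2.0, 0.0])     # hypothetical seasonal effect
y = 2.0 * t + pattern[t % 4]
trend, seasonal, residual = decompose_additive(y, period=4)
print(seasonal[:4])
```

The toy series is exactly trend plus season, so the residual is zero wherever the trend is defined. On real data, the size and shape of the residual show how much variation the two structured components leave unexplained.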
Facebook Prophet is an open source, time series forecasting library published by Facebook that is based on decomposable models, specifically trends, seasonality, and holidays. Prophet is designed to be flexible, scalable and easy to use, and it can be applied to a wide range of time series forecasting problems.
Prophet uses a generalized additive model (GAM) framework, which allows for non-linear relationships between the predictors and the response variable. The model includes several components:
- Trend. Prophet fits a piecewise linear trend to the data, with optional user-specified changepoints that can capture abrupt changes in the trend.
- Seasonality. Prophet models both weekly and yearly seasonality, as well as user-defined seasonality patterns.
- Holidays. Prophet can include holiday effects, which capture the impact of known events, such as public holidays, on the time series.
- Regressors. Prophet allows for the inclusion of additional regressors, such as economic data, to improve the accuracy of the forecasts.
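The first three components listed above combine additively. In the notation commonly used to describe Prophet's model, with the symbols below introduced here for illustration:

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```

Here g(t) is the trend, s(t) the seasonal effects, h(t) the holiday effects and \varepsilon_t the remaining error. Additional regressors, where used, can be thought of as further terms added to the same sum.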
Prophet also provides several features that can help with model selection and tuning, including automatic selection of changepoints, hyperparameter optimization, and cross-validation. Additionally, Prophet provides uncertainty estimates for the forecasts, which can help with decision-making and risk management.
One of the key advantages of Prophet is its ease of use and its ability to handle complex time series data without requiring extensive domain knowledge or data preprocessing.
The vision for the future of Facebook Prophet is outlined in this article published in February on Medium, by Cuong Duong, a staff data scientist at Canva who is also a Prophet maintainer.
Advantages of Seasonal Decomposition
- Separates the different components of a time series, helping to understand underlying patterns and trends.
- Can remove the seasonal component from a time series to analyze the trend and residual components separately.
- Can be used to forecast future values based on identified components.
- Easy to implement and requires minimal domain knowledge or specialized software.
Disadvantages of Seasonal Decomposition
- Assumes the seasonal component is strictly periodic, which may not be true for all data.
- Based on linear relationships and may not capture non-linear patterns or relationships.
- Requires choosing between an additive and a multiplicative form up front, and neither may be appropriate for all data types.
- Only applicable to univariate time series data.
In traditional time series forecasting, a univariate time series is used to make predictions, which means that only one variable or feature is considered. However, in many real-world applications, there are often multiple variables that are interdependent and can affect each other.
Neural networks are a type of machine learning model that consists of interconnected layers of nodes, each of which performs a specific mathematical operation on the input data. They are popular in time-series forecasting as they can capture non-linear patterns and interactions between variables. Neural networks can be trained to learn the patterns in the data and then use those patterns to make forecasts.
Long short-term memory (LSTM) and gated recurrent unit (GRU) are both types of recurrent neural networks (RNNs) that are capable of processing sequential data. They are designed to handle the vanishing gradient problem that can occur in traditional RNNs, where gradients shrink as they are propagated back through many time steps, making it hard for the network to learn long-range dependencies.
LSTM and GRU networks use gated units to selectively remember or forget information from previous time steps. This allows them to effectively capture long-term dependencies and patterns in the time series.
Convolutional neural networks (CNNs) are a type of neural network architecture commonly used in image processing. Their one-dimensional variant, the 1D CNN, can be applied to time series forecasting by treating the time series as a 1D image. The convolutional layers in the network can learn to extract features from the time series, such as trends and patterns, which can then be used to make predictions.
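To show what a single convolutional filter does to a time series, the sketch below slides a two-element kernel along the data and applies a ReLU activation. The kernel weights are set by hand for illustration. In a trained network they would be learned, and many filters would be stacked in layers.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide the kernel along x without flipping it (cross-correlation)."""
    return np.correlate(x, kernel, mode="valid")

def relu(z):
    """Keep positive responses and zero out the rest."""
    return np.maximum(z, 0.0)

x = np.array([1.0, 2.0, 4.0, 7.0, 6.0])
kernel = np.array([-1.0, 1.0])   # hand-set filter that responds to increases
features = relu(conv1d_valid(x, kernel))
print(features)
```

This particular filter produces a feature that is large where the series rises steeply and zero where it falls, which is the kind of local pattern a convolutional layer can pick up.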
To use these neural network architectures for multivariate time series forecasting, the input data must be structured appropriately. Each observation in the time series should be represented as a vector of features, and the target variable should be included as one of these features. The input data can then be split into training and testing sets, and the neural network can be trained on the training data using a suitable loss function and optimization algorithm.
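One common way to structure the input is a sliding window: each training sample holds the last few feature vectors, and the target is the next value of the variable being forecast. The sketch below assumes a one-step-ahead target, and the function and parameter names are hypothetical.

```python
import numpy as np

def make_windows(data, lookback, target_col):
    """Turn a (time_steps, n_features) array into supervised samples.

    Returns X with shape (samples, lookback, n_features) and y with shape (samples,).
    """
    X, y = [], []
    for start in range(len(data) - lookback):
        X.append(data[start:start + lookback])          # the past `lookback` steps
        y.append(data[start + lookback, target_col])    # the next value of the target
    return np.array(X), np.array(y)

# Six time steps, two features; column 0 is treated as the target variable
data = np.arange(12, dtype=float).reshape(6, 2)
X, y = make_windows(data, lookback=3, target_col=0)
print(X.shape, y)
```

The resulting three-dimensional array matches the (samples, time steps, features) layout that recurrent and convolutional layers in most deep learning libraries expect.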
Once the neural network is trained, it can be used to make predictions on the test data. The predicted values can then be compared to the actual values to evaluate the performance of the model. In multivariate time series forecasting, it is important to use appropriate evaluation metrics that take into account the interdependence of the variables.
Additionally, 1D CNNs are often faster to train than recurrent networks such as LSTMs, because convolutions over a sequence can be computed in parallel, which helps when dealing with large datasets.
Advantages of Neural Networks
- Neural networks excel at capturing complex non-linear patterns in data that might be difficult to capture with other models. This makes them suitable for a wide range of applications where complex patterns need to be identified.
- With enough training data and regularization, neural networks can generalize well to new data, which makes them useful for forecasting and predictive modeling applications. They typically need retraining when the data distribution changes over time.
- Neural networks can tolerate a degree of noise in the inputs, which can be an advantage when working with real-world datasets, although missing values usually still need to be imputed before training.
- Neural networks can be trained on large datasets and can handle a large number of input variables, making them useful in applications where a large amount of data needs to be processed.
Disadvantages of Neural Networks
- Neural networks can suffer from overfitting, where the model becomes too complex and starts fitting to noise in the data. This can lead to poor generalization performance and is a major challenge when training neural networks.
- Neural networks are complex models that can be difficult to interpret and understand, particularly for non-experts. This can make it challenging to diagnose issues with the model or understand why it is making certain predictions.
- Training neural networks can be computationally expensive, especially on large datasets. This can limit their practical use in applications where real-time predictions are needed.
- Neural networks require a significant amount of data to learn the underlying patterns in the data. In some applications, this data may not be readily available, which can limit the effectiveness of neural networks in those domains.
Deep Learning Models
Deep learning models are a subset of neural networks that consist of multiple layers of interconnected nodes, with each layer learning more abstract representations of the input data. They are popular in time-series forecasting as they can capture very complex patterns and relationships in the data.
Deep learning models require a large amount of training data and computational resources, but they can provide highly accurate forecasts.
Besides RNNs and CNNs, several techniques are commonly used in deep learning models, including:
Autoencoders
Autoencoders are a type of deep learning model that can be used for unsupervised learning and dimensionality reduction. They learn to reconstruct the input data from a compressed representation, allowing them to capture the most important features of the data.
In time series forecasting, autoencoders can be used for tasks such as anomaly detection and noise reduction.
Deep Belief Networks (DBNs)
DBNs are a type of deep learning model that consists of multiple layers of Restricted Boltzmann Machines (RBMs). DBNs can be used for unsupervised learning and feature extraction, and they have been applied to time series forecasting tasks.
Transformers
Transformers are a type of deep learning model originally developed for natural language processing tasks that has also been successfully applied to time series forecasting. Transformer models use self-attention mechanisms to capture dependencies between different time steps, making them well-suited for forecasting tasks with long-term dependencies.
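The self-attention idea can be sketched in a few lines of NumPy. This simplified version uses the inputs directly as queries, keys and values, which is an assumption made for brevity. Real Transformer layers apply learned projections, use several attention heads and add positional information.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over x with shape (time_steps, features)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                          # similarity of every pair of time steps
    scores = scores - scores.max(axis=-1, keepdims=True)   # subtract row max for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ x, weights

# Three time steps with two features each (made-up values)
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
output, weights = self_attention(x)
print(weights.round(3))
```

Each row of the weight matrix says how much one time step draws on every other time step, regardless of how far apart they are. This is what lets the model relate distant points in the series directly.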
Advantages of Deep Learning Models
- Capture highly complex patterns in the data, making them suitable for time-series forecasting applications where the relationships between variables are not well understood.
- Handle a wide range of input data types and formats, making them useful across many applications.
- Can generalize well to unseen data when trained on enough representative data, although performance can still degrade if the data distribution shifts over time.
- Automatically extract useful features from the input data, which can save time and effort compared to manually selecting and extracting features.
Disadvantages of Deep Learning Models
- Require a large amount of computational resources, such as processing power and memory, which can limit their practical use in some applications.
- Can suffer from overfitting if they are not properly regularized, which can lead to poor generalization performance.
- Can be difficult to interpret and understand, especially for non-experts, due to their complex structure.
- Require a large amount of training data to learn the underlying patterns in the data, which may not be available in some applications.
Implementing a Time Series Forecasting Method
Now that you’ve selected your forecasting method, here are the crucial steps to take to generate predictions for future periods.
- Select the appropriate forecasting horizon. The forecasting horizon is the time for which you want to generate predictions. The horizon can range from a few periods into the future to several years, depending on the application, and it should align with the goals of your forecasting task.
- Prepare the input data. Provide the model with input data for the period you want to forecast. This data should be formatted in the same way as the training data, with the same variables and time intervals. Make sure the input data is clean and free of missing values or outliers.
- Fit the model to the input data. Once the input data is prepared, you can use the selected model to generate forecasts. This involves using the training data to estimate the model parameters and then applying the model to the input data to generate predictions.
- Evaluate the model’s performance. After generating forecasts, it’s important to evaluate the model’s performance. This involves comparing the model’s predictions to actual values for the forecast period, using metrics such as mean absolute error (MAE), root mean squared error (RMSE), or mean absolute percentage error (MAPE).
- Refine the model. If the model’s performance is not satisfactory, you may need to refine the model by adjusting the model parameters or trying a different model altogether. This may involve retraining the model on additional data or tweaking the input variables.
- Generate final forecasts. Once the model’s performance is satisfactory, you can generate final forecasts for the chosen forecasting horizon. These forecasts can be used to inform business decisions or support other applications, such as resource planning or demand forecasting. It’s important to monitor the performance of the forecasts over time and refine the model as needed, to ensure ongoing accuracy.
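The three error metrics named in the evaluation step can be computed directly. The sketch below uses made-up actual and forecast values, and the function names are illustrative.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error, in the units of the data."""
    return float(np.mean(np.abs(actual - forecast)))

def rmse(actual, forecast):
    """Root mean squared error; penalizes large errors more heavily than MAE."""
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

actual = np.array([100.0, 200.0, 300.0])      # hypothetical observed values
forecast = np.array([110.0, 190.0, 330.0])    # hypothetical model output
print(mae(actual, forecast), rmse(actual, forecast), mape(actual, forecast))
```

Because RMSE squares the errors, the single 30-unit miss pulls it above MAE on the same data.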
Evaluating the Forecast Results
In addition to MAE, RMSE and MAPE, other metrics can be used to evaluate the accuracy of time-series forecasts. For example, mean absolute scaled error (MASE) divides the forecast's mean absolute error by that of a naive forecast on the training data, such as one that repeats the previous observation. This provides a measure of how well the forecast model is performing compared to a simple baseline: values below 1 indicate that the model outperforms the naive forecast.
Another metric is the symmetric mean absolute percentage error (SMAPE), which divides the absolute error by the average of the absolute actual and forecast values. This keeps the metric bounded and is intended to treat over- and under-forecasting more evenly than MAPE does.
This makes it useful when the cost of over- and under-forecasting is similar. However, SMAPE has some limitations: it is undefined when the actual and forecast values are both zero, and it becomes unstable when both are close to zero. (MAPE, by comparison, is undefined whenever the actual value is zero.) Therefore, it is important to use multiple metrics to evaluate the accuracy of time series forecasts.
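MASE and SMAPE can be sketched in the same way as the simpler metrics. The version of MASE below assumes the naive baseline is the previous-observation forecast evaluated on the training data, and SMAPE uses the variant that divides by the average of the absolute actual and forecast values. Other definitions exist, and the function names are illustrative.

```python
import numpy as np

def mase(actual, forecast, train):
    """Forecast MAE divided by the in-sample MAE of a previous-observation forecast."""
    naive_mae = np.mean(np.abs(np.diff(train)))
    return float(np.mean(np.abs(actual - forecast)) / naive_mae)

def smape(actual, forecast):
    """Symmetric MAPE in percent; undefined where actual and forecast are both zero."""
    denominator = (np.abs(actual) + np.abs(forecast)) / 2
    return float(np.mean(np.abs(actual - forecast) / denominator) * 100)

train = np.array([1.0, 3.0, 5.0, 7.0])     # hypothetical training history
actual = np.array([9.0, 11.0])
forecast = np.array([10.0, 10.0])

mase_value = mase(actual, forecast, train)
smape_under = smape(np.array([100.0]), np.array([50.0]))
smape_over = smape(np.array([50.0]), np.array([100.0]))
print(mase_value, smape_under, smape_over)
```

A MASE below 1 means the forecast beat the naive baseline on average. The two SMAPE calls show that swapping the actual and forecast values leaves the score unchanged.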
Visualizing the forecast results can also provide additional insights into the data. Time series plots can show any trends or patterns in the data, while residual plots can reveal any systematic errors or biases in the forecast model. Quantile-quantile plots can also be used to check whether the forecast residuals follow a normal distribution.
Interpreting the forecast results involves considering the context and purpose of the forecast. For example, a short-horizon forecast may need to be accurate and precise, while a longer-term forecast may be more concerned with identifying general trends and patterns.
It is also important to consider any external factors or events that may impact the forecast, such as changes in market conditions, new competitors or unexpected events.
Several additional considerations should be taken into account when performing time series forecasting:
- Data quality. The accuracy and completeness of the data can have a significant impact on the accuracy of the forecast. It is important to ensure that the data is clean, consistent and representative of the underlying phenomena being modeled.
- Domain knowledge. Understanding the underlying domain and the factors that may influence the variable being forecast can help to inform the choice of method and parameter selection. For example, in demand forecasting for a seasonal product, knowledge of seasonal patterns and trends may inform the choice of a forecasting method.
- Uncertainty and risk. Forecasting inherently involves some degree of uncertainty, and it is important to consider the potential risks associated with inaccurate forecasts. Sensitivity analysis and scenario planning can help to identify potential risks and mitigate their impact.
- Updating the model. As new data becomes available, the forecasting model should be updated to incorporate the latest information. This may involve re-training the model with the updated data or using adaptive methods that can update the model in real-time.
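One way to picture updating a model as data arrives is a walk-forward loop: re-fit on everything seen so far, forecast the next point, then reveal the actual value and repeat. The sketch below uses a naive last-value model as a stand-in. Both function names are hypothetical, and any forecasting routine with the same shape could be passed in.

```python
def walk_forward_forecasts(series, initial_train_size, fit_and_forecast):
    """Produce one-step-ahead forecasts, re-fitting on all data available at each step."""
    forecasts = []
    for t in range(initial_train_size, len(series)):
        history = series[:t]                     # only data observed before time t
        forecasts.append(fit_and_forecast(history))
    return forecasts

def last_value_model(history):
    """Naive placeholder: forecast the most recent observation."""
    return history[-1]

series = [1.0, 2.0, 3.0, 4.0, 5.0]
forecasts = walk_forward_forecasts(series, initial_train_size=3,
                                   fit_and_forecast=last_value_model)
print(forecasts)
```

Comparing these forecasts with the values that were later observed gives a running picture of accuracy, which can signal when the model needs to be revisited.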