Train Network for Time Series Forecasting Using Deep Network Designer

This example shows how to forecast time series data by training a long short-term memory (LSTM) network in Deep Network Designer.

Deep Network Designer allows you to interactively create and train deep neural networks for sequence classification and regression tasks.

To forecast the values of future time steps of a sequence, you can train a sequence-to-sequence regression LSTM network, where the responses are the training sequences with values shifted by one time step. That is, at each time step of the input sequence, the LSTM network learns to predict the value of the next time step.
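To make the one-step shift concrete, the following minimal sketch builds predictors and responses from a made-up five-element sequence. The values and the variable names seq, X, and Y are illustrative assumptions and are not part of the chickenpox data.

```matlab
% Toy sequence (hypothetical values, for illustration only)
seq = [10 20 30 40 50];

% Predictors: every time step except the last
X = seq(1:end-1);

% Responses: the same sequence shifted by one time step
Y = seq(2:end);

% Each column pairs an input value with the value the network should predict
disp([X; Y])
```

Each column of the displayed matrix pairs a value with its successor, for example 10 with 20.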

This example uses the data set chickenpox_dataset. The example creates and trains an LSTM network to forecast the number of chickenpox cases given the number of cases in previous months.

Load Sequence Data

Load the example data. chickenpox_dataset contains a single time series, with time steps corresponding to months and values corresponding to the number of cases. The function returns a cell array in which each element is a single time step. Concatenate the elements into a row vector.

data = chickenpox_dataset;
data = [data{:}];

figure
plot(data)
xlabel("Month")
ylabel("Cases")
title("Monthly Cases of Chickenpox")

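Partition the data into training and test sets. Train on the first 90% of the sequence and test on the last 10%. The training partition includes one extra time step so that, after the responses are shifted by one time step, the predictors and responses each contain numTimeStepsTrain values.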

numTimeStepsTrain = floor(0.9*numel(data))
numTimeStepsTrain = 448
dataTrain = data(1:numTimeStepsTrain+1);
dataTest = data(numTimeStepsTrain+1:end);
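As a quick check of the partition sizes, the sketch below repeats the split on a stand-in series. The placeholder data of 498 values is an assumption, chosen only because floor(0.9*498) matches the 448 reported above. It is not the real data set.

```matlab
% Stand-in series (assumption: 498 placeholder values instead of the real data)
data = 1:498;

numTimeStepsTrain = floor(0.9*numel(data));
dataTrain = data(1:numTimeStepsTrain+1);
dataTest = data(numTimeStepsTrain+1:end);

% Report the number of samples in each partition
fprintf("Train: %d, Test: %d\n", numel(dataTrain), numel(dataTest))

% The two partitions share the sample at index numTimeStepsTrain+1
assert(dataTrain(end) == dataTest(1))
```

The shared boundary sample is the last training response and the first test predictor, so no time step is skipped between the two partitions.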

Standardize Data

For a better fit and to prevent the training from diverging, standardize the training data to have zero mean and unit variance. For prediction, you must standardize the test data using the same parameters as the training data.

mu = mean(dataTrain);
sig = std(dataTrain);

dataTrainStandardized = (dataTrain - mu) / sig;
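The following sketch checks that the transform behaves as described and that it can be inverted. It uses random placeholder values (an assumption) in place of the real training data, and the variable recovered is introduced here only for illustration.

```matlab
% Stand-in training data (assumption: random placeholder values)
rng(0)
dataTrain = 100 + 50*randn(1,449);

mu = mean(dataTrain);
sig = std(dataTrain);
dataTrainStandardized = (dataTrain - mu) / sig;

% The standardized data has (numerically) zero mean and unit standard deviation
fprintf("mean = %.2g, std = %.2g\n", ...
    mean(dataTrainStandardized), std(dataTrainStandardized))

% Inverting the transform recovers the original values
recovered = sig*dataTrainStandardized + mu;
assert(max(abs(recovered - dataTrain)) < 1e-10)
```

The same inverse transform, sig*x + mu, is what converts standardized predictions back to case counts later in the example.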

Prepare Predictors and Responses

Specify the responses as the training sequences with values shifted by one time step, so that at each time step of the input sequence the LSTM network learns to predict the value of the next time step. The predictors are the training sequences without the final time step.

XTrain = dataTrainStandardized(1:end-1);
YTrain = dataTrainStandardized(2:end);

To train the network using Deep Network Designer, convert the training data to a datastore object. Use arrayDatastore to convert the training data predictors and responses into ArrayDatastore objects. Use combine to combine the two datastores.

adsXTrain = arrayDatastore(XTrain);
adsYTrain = arrayDatastore(YTrain);

cdsTrain = combine(adsXTrain,adsYTrain);
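To see what the combined datastore hands to the training step, you can read one observation from it. The sketch below uses placeholder predictors and responses (an assumption) and relies on the default arrayDatastore settings, under which a row vector is read as a single observation.

```matlab
% Stand-in predictors and responses (assumption: placeholder values)
XTrain = randn(1,448);
YTrain = randn(1,448);

adsXTrain = arrayDatastore(XTrain);
adsYTrain = arrayDatastore(YTrain);
cdsTrain = combine(adsXTrain,adsYTrain);

% Read the first observation, then rewind the datastore
sample = read(cdsTrain);
reset(cdsTrain)

% The observation is a cell array holding the predictor sequence in the
% first column and the response sequence in the second column
disp(size(sample))
disp(size(sample{1}))
```

With these defaults, the datastore contains one observation, and each cell holds the whole 448-step sequence.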

Define LSTM Network Architecture

To create the LSTM network architecture, use Deep Network Designer. The Deep Network Designer app lets you build, visualize, edit, and train deep learning networks.

deepNetworkDesigner

On the Deep Network Designer start page, pause on Sequence-to-Sequence and click Open. Doing so opens a prebuilt network suitable for sequence-to-sequence classification tasks. You can convert the classification network into a regression network by replacing the final layers.

Delete the softmax layer and the classification layer and replace them with a regression layer.

Adjust the properties of the layers so that they are suitable for the chickenpox data set. This data has a single input feature and a single output feature. Select sequenceInputLayer and set the InputSize to 1. Select fullyConnectedLayer and set the OutputSize to 1.

Check your network by clicking Analyze. The network is ready for training if Deep Learning Network Analyzer reports zero errors.
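For reference, a comparable architecture can be sketched at the command line. This is not the exact network that the app creates: the number of hidden units is an assumption, and the prebuilt network may contain additional layers.

```matlab
numHiddenUnits = 128;   % assumption: set this to match the LSTM layer in the app

layers = [
    sequenceInputLayer(1)          % one input feature
    lstmLayer(numHiddenUnits)      % outputs the full sequence by default
    fullyConnectedLayer(1)         % one output feature
    regressionLayer];              % regression output in place of softmax and classification
```

You can inspect a layer array like this one with analyzeNetwork(layers), which performs checks similar to those of the Analyze button.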

Import Data

To import the training datastore, select the Data tab and click Import Data > Import Custom Data. Select cdsTrain as the training data and None as the validation data. Click Import.

The data preview shows a single input time series and a single response time series, each with 448 time steps.

Specify Training Options

On the Training tab, click Training Options. Set Solver to adam, InitialLearnRate to 0.005, and MaxEpochs to 500. To prevent the gradients from exploding, set the GradientThreshold to 1.

For more information about setting the training options, see trainingOptions.
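As a rough command-line counterpart to the settings above, the following sketch builds a matching options object with trainingOptions. The 'Plots' setting is an assumption added for convenience and is not one of the values specified in the app.

```matlab
options = trainingOptions('adam', ...
    'InitialLearnRate',0.005, ...
    'MaxEpochs',500, ...
    'GradientThreshold',1, ...          % clip gradients to limit exploding gradients
    'Plots','training-progress');       % assumption: show the progress plot
```

An options object like this can be passed to trainNetwork together with a training datastore and a layer array if you prefer to train outside the app.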

Train Network

Click Train.

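Deep Network Designer displays an animated plot showing the training progress. Because this is a regression network, the plot shows the mini-batch root-mean-square error (RMSE) and loss, along with additional information on the training progress. No validation curves appear because no validation data was imported.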

Once training is complete, export the trained network by clicking Export on the Training tab. The trained network is exported to the workspace as the variable trainedNetwork_1.

Forecast Future Time Steps

Test the trained network by forecasting multiple time steps in the future. Use the predictAndUpdateState function to predict time steps one at a time and update the network state at each prediction. For each prediction, use the previous prediction as input to the function.

Standardize the test data using the same parameters as the training data.

dataTestStandardized = (dataTest - mu) / sig;

XTest = dataTestStandardized(1:end-1);
YTest = dataTest(2:end);

To initialize the network state, first predict on the training data XTrain. Next, make the first prediction using the last time step of the training response YTrain(end). Loop over the remaining predictions and input the previous prediction to predictAndUpdateState.

For large collections of data, long sequences, or large networks, predictions on the GPU are usually faster to compute than predictions on the CPU. Otherwise, predictions on the CPU are usually faster to compute. For single time step predictions, use the CPU. To use the CPU for prediction, set the 'ExecutionEnvironment' option of predictAndUpdateState to 'cpu'.

net = predictAndUpdateState(trainedNetwork_1,XTrain);

[net,YPred] = predictAndUpdateState(net,YTrain(end),'ExecutionEnvironment','cpu');

numTimeStepsTest = numel(XTest);
for i = 2:numTimeStepsTest
    [net,YPred(:,i)] = predictAndUpdateState(net,YPred(:,i-1),'ExecutionEnvironment','cpu');
end

Unstandardize the predictions using the parameters calculated earlier.

YPred = sig*YPred + mu;

The training progress plot reports the root-mean-square error (RMSE) calculated from the standardized data. Calculate the RMSE from the unstandardized predictions.

rmse = sqrt(mean((YPred-YTest).^2))
rmse = single
    175.9693
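The RMSE on the original scale differs from the RMSE on the standardized scale by the factor sig, because subtracting two unstandardized values cancels mu and leaves sig times the standardized difference. The sketch below checks this on made-up numbers. All values and the names ending in Std and Raw are illustrative assumptions, and the sketch does not reproduce how the app aggregates the error in the training plot.

```matlab
% Hypothetical standardization parameters and standardized values
mu = 200;
sig = 150;
YPredStd = [0.1 -0.3 0.5];
YTestStd = [0.0 -0.1 0.9];

% RMSE on the standardized scale
rmseStd = sqrt(mean((YPredStd - YTestStd).^2));

% RMSE after converting both sequences back to the original scale
YPredRaw = sig*YPredStd + mu;
YTestRaw = sig*YTestStd + mu;
rmseRaw = sqrt(mean((YPredRaw - YTestRaw).^2));

% The two values differ exactly by the factor sig
assert(abs(rmseRaw - sig*rmseStd) < 1e-9)
```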

Plot the training time series with the forecasted values.

figure
plot(dataTrain(1:end-1))
hold on
idx = numTimeStepsTrain:(numTimeStepsTrain+numTimeStepsTest);
plot(idx,[data(numTimeStepsTrain) YPred],'.-')
hold off
xlabel("Month")
ylabel("Cases")
title("Forecast")
legend(["Observed" "Forecast"])

Compare the forecasted values with the test data.

figure
subplot(2,1,1)
plot(YTest)
hold on
plot(YPred,'.-')
hold off
legend(["Observed" "Forecast"])
ylabel("Cases")
title("Forecast")

subplot(2,1,2)
stem(YPred - YTest)
xlabel("Month")
ylabel("Error")
title("RMSE = " + rmse)

Update Network State with Observed Values

If you have access to the actual values of time steps between predictions, then you can update the network state with the observed values instead of the predicted values.

First, reset the network state using resetState so that the previous predictions do not affect the predictions on the new data. Then initialize the network state by predicting on the training data.

net = resetState(net);
net = predictAndUpdateState(net,XTrain);

Predict on each time step. For each prediction, predict the next time step using the observed value of the previous time step. Set the 'ExecutionEnvironment' option of predictAndUpdateState to 'cpu'.

YPred = [];
numTimeStepsTest = numel(XTest);
for i = 1:numTimeStepsTest
    [net,YPred(:,i)] = predictAndUpdateState(net,XTest(:,i),'ExecutionEnvironment','cpu');
end

Unstandardize the predictions using the parameters calculated earlier.

YPred = sig*YPred + mu;

Calculate the root-mean-square error (RMSE).

rmse = sqrt(mean((YPred-YTest).^2))
rmse = 119.5968

Compare the forecasted values with the test data.

figure
subplot(2,1,1)
plot(YTest)
hold on
plot(YPred,'.-')
hold off
legend(["Observed" "Predicted"])
ylabel("Cases")
title("Forecast with Updates")

subplot(2,1,2)
stem(YPred - YTest)
xlabel("Month")
ylabel("Error")
title("RMSE = " + rmse)

Here, the predictions are more accurate when the network state is updated with the observed values instead of the predicted values: the RMSE falls from about 176 to about 120 cases.
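To put the two results side by side, the following sketch computes the relative reduction in RMSE from the values reported in this example. The variable names are introduced here for illustration only, and the exact figures will vary between training runs.

```matlab
rmseFromPredictions = 175.9693;   % state updated with the network's own predictions
rmseFromObserved = 119.5968;      % state updated with observed values

% Relative reduction in RMSE, in percent
reduction = 100*(rmseFromPredictions - rmseFromObserved)/rmseFromPredictions;
fprintf("RMSE reduction: %.1f%%\n", reduction)
```

For these two values, the reduction is roughly 32%.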
