How can I solve the following drift problem when using a long short-term memory (LSTM) network to predict time series data?
I adapted the official chickenpox example (chickenpox.m, which uses chickenpox_dataset to forecast the number of future chickenpox cases) to forecast my own one-dimensional time series. The initial results were very good, but as time passes and new data arrives, the prediction error grows larger and larger. I tried adjusting various hyperparameters of the model without much improvement, so I increased the length of the training data instead. The RMSE improved noticeably, but the forecast now appears to "drift", as shown in the figures for data lengths of 1000, 2000, and 3000. Is there any other way to improve this?
data length is 1000:
data length is 2000:
data length is 3000:
David Willingham on 25 Oct 2021
For time series forecasting, as you've seen, if the training window is too short the model's forecast will drift over time. If it is too long, the model may be more accurate over the long term but less accurate in the short term. So yes, the "window" length of the data you train on is important.
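One way to make this trade-off concrete is to score several candidate window lengths against the same held-out tail of the series, reporting a short-horizon and a full-horizon RMSE for each. The following is only a rough sketch in Python, not the chickenpox.m code: `naive_forecast` is a hypothetical placeholder that repeats the window mean and stands in for training and forecasting with the LSTM, and the names `score_window_lengths`, `horizon`, and `short_horizon` are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def naive_forecast(train, steps):
    # Hypothetical stand-in for "train the LSTM on `train`, then forecast".
    # It simply repeats the mean of the training window.
    return [mean(train)] * steps


def rmse(pred, actual):
    # Root-mean-square error between two equally long sequences.
    return sqrt(mean((p - a) ** 2 for p, a in zip(pred, actual)))


def score_window_lengths(series, window_lengths, horizon, short_horizon):
    """Score each candidate window length on the last `horizon` points.

    Returns {window_length: (short_horizon_rmse, full_horizon_rmse)}.
    """
    actual = series[-horizon:]    # held-out tail, never used for training
    history = series[:-horizon]   # everything before the held-out tail
    scores = {}
    for w in window_lengths:
        pred = naive_forecast(history[-w:], horizon)
        scores[w] = (
            rmse(pred[:short_horizon], actual[:short_horizon]),
            rmse(pred, actual),
        )
    return scores
```

Calling `score_window_lengths(series, [1000, 2000, 3000], horizon, short_horizon)` on your own data would give one pair of errors per window length, which makes it easier to see whether a longer window is trading short-term accuracy for long-term accuracy.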
However, it isn't the only thing to consider. On top of the "window" that you train on, you also need to decide how often to retrain the model. For real-time data, this is a must. That is, at some regular interval you retrain the model on the new data, usually with the same window length, and then compare its performance to that of the model you've already trained. If the new model is better, replace the existing one; if it isn't, keep the existing one.
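The retrain-and-compare step can be sketched roughly as follows. This is a minimal illustration in Python under stated assumptions, not the chickenpox.m workflow: `fit_mean_model` is a hypothetical placeholder for training the LSTM (it just returns a forecaster that repeats the window mean), and `retrain_and_compare`, `window_len`, and `holdout_len` are names invented for this example. Both models are scored on the same most recent points, which the new model does not see during training.

```python
from math import sqrt
from statistics import mean


def fit_mean_model(window):
    # Hypothetical stand-in for training the LSTM on `window`.
    # The returned "model" forecasts the window mean for every step.
    m = mean(window)
    return lambda steps: [m] * steps


def rmse(pred, actual):
    # Root-mean-square error between two equally long sequences.
    return sqrt(mean((p - a) ** 2 for p, a in zip(pred, actual)))


def retrain_and_compare(series, window_len, holdout_len, current_model):
    """Train a new model on the latest window and keep whichever is better.

    The last `holdout_len` points are held out for scoring; the new model
    is trained on the `window_len` points immediately before them.
    Returns (model_to_keep, its_holdout_rmse, replaced_flag).
    """
    holdout = series[-holdout_len:]
    train = series[-(window_len + holdout_len):-holdout_len]
    new_model = fit_mean_model(train)

    err_current = rmse(current_model(holdout_len), holdout)
    err_new = rmse(new_model(holdout_len), holdout)

    if err_new < err_current:
        return new_model, err_new, True      # replace the existing model
    return current_model, err_current, False  # keep the existing model
```

In a real-time setting this function would be called on a schedule (for example, after every batch of new observations), passing in the model returned by the previous call as `current_model`.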