Optimize Model Hyperparameters using directforecaster
Dear all,
Is it possible to optimize the hyperparameters when using the directforecaster function to predict time series?
clear
close
clc
Tbl = importAndPreprocessPortData;
Tbl.Year = year(Tbl.Time);
Tbl.Quarter = quarter(Tbl.Time);
slidingWindowPartition = tspartition(height(Tbl), "SlidingWindow", 4, "TestSize", 24)
Mdl = directforecaster(Tbl, "TEU", "Horizon", 1:12, "Learner", "lsboost", "ResponseLags", 1:12, ...
    "LeadingPredictors", "all", "LeadingPredictorLags", {0:12, 0:12}, ...
    "Partition", slidingWindowPartition, "CategoricalPredictors", "Quarter") % I would like to optimize the lsboost hyperparameters
predY = cvpredict(Mdl)
Answers (1)
Taylor
on 20 Nov 2025 at 18:53
Yes. Instead of passing Learner="lsboost", first tune an LSBoost ensemble with fitrensemble, then pass a template built from the tuned settings into directforecaster.
Step 1: Optimize LSBoost with fitrensemble
Use a representative regression problem (same predictors and response as your forecasting setup, but no time lags/partition yet) and let fitrensemble do Bayesian HPO on LSBoost.
% Example: X, Y built to mimic your forecasting design
X = Tbl(:, setdiff(Tbl.Properties.VariableNames, ["TEU" "Time"])); % drop the response and the datetime column
Y = Tbl.TEU;
treeTmpl = templateTree("Surrogate","on"); % or your preferred base tree
MdlBoostOpt = fitrensemble(X, Y, ...
    "Method","LSBoost", ...
    "Learners",treeTmpl, ...
    "CategoricalPredictors","Quarter", ...
    "OptimizeHyperparameters",{"NumLearningCycles","LearnRate","MaxNumSplits"}, ...
    "HyperparameterOptimizationOptions",struct( ...
        "AcquisitionFunctionName","expected-improvement-plus", ...
        "MaxObjectiveEvaluations",30)); % tuning budget
MdlBoostOpt now contains the tuned LSBoost settings (number of trees, learning rate, tree depth, etc.).
Step 2: Build an equivalent LSBoost template
Pull the chosen hyperparameters from the optimization results and recreate them as a templateEnsemble to feed into directforecaster.
% Extract the tuned hyperparameters from the optimization results
bestParams = MdlBoostOpt.HyperparameterOptimizationResults.XAtMinObjective;
numTrees  = bestParams.NumLearningCycles;
learnRate = bestParams.LearnRate;
maxSplits = bestParams.MaxNumSplits;
% Base tree with the tuned depth
treeTuned = templateTree( ...
    "MaxNumSplits", maxSplits, ...
    "Surrogate", "on"); % match the base-tree settings used during optimization
% LSBoost ensemble template with tuned settings
lsboostTuned = templateEnsemble( ...
    "LSBoost", numTrees, treeTuned, ...
    "LearnRate", learnRate);
(XAtMinObjective is the best point observed during the search; bestPoint(MdlBoostOpt.HyperparameterOptimizationResults) instead returns the point the optimization's surrogate model recommends, which can differ slightly.)
Step 3: Use tuned LSBoost in directforecaster
Now replace "lsboost" in your original code with the tuned template:
Mdl = directforecaster(Tbl, "TEU", ...
"Horizon", 1:12, ...
"Learner", lsboostTuned, ... % << tuned LSBoost template
"ResponseLags", 1:12, ...
"LeadingPredictors", "all", ...
"LeadingPredictorLags", {0:12, 0:12}, ...
"Partition", slidingWindowPartition, ...
"CategoricalPredictors", "Quarter");
directforecaster then trains one tuned LSBoost model per horizon step, using your sliding-window cross-validation partition, without doing any further hyperparameter search of its own.
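To confirm that the tuning actually helps, you can compare cross-validation error before and after; a quick check on the partitioned model above (cvloss, like cvpredict, is an object function of the partitioned forecaster):
predY = cvpredict(Mdl);  % per-partition predictions, as in your original script
cvMSE = cvloss(Mdl)      % cross-validation mean squared error of the tuned forecaster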
2 Comments
Taylor
about an hour ago
Short answers:
- Yes, there is potential leakage in how you are doing the fitrensemble tuning if you then evaluate with a different, time‑aware partition in directforecaster.
- Yes, optimizing without the lagged design can give you a sub‑optimal or even misleading hyperparameter choice for the lagged forecaster you actually use.
Data leakage in your current HPO
What fitrensemble is doing now:
- It treats X = [Month, Year] and Y = TEUs as an i.i.d. regression problem and, when you turn on "OptimizeHyperparameters", performs k-fold CV with random folds by default.
- For a time series, random k‑fold means some “future” time points end up in the training fold while their “past” counterparts are in the validation fold, which is a classic leakage pattern for forecasting.
So:
- There is no leakage inside the final directforecaster evaluation, because that uses your tspartition(...,"SlidingWindow",...), which is time‑ordered and valid.
- But there is inconsistent model selection: the hyperparameters were chosen to look good under a leaky, random‑CV objective, not under a forecasting‑style, time‑ordered objective. That can bias the chosen NumLearningCycles, LearnRate, MaxNumSplits toward values that would not be optimal for true forecasting performance.
In other words, you are not leaking into the final test in directforecaster, but you are likely over‑optimistic during HPO and may choose the wrong hyperparameters.
A better approach:
- Either: feed a time-series partition into fitrensemble's tuning via the CVPartition field of "HyperparameterOptimizationOptions" (with tsp a tspartition using holdout or a sliding window) so that its optimization respects time ordering.
- Or: skip fitrensemble's internal CV altogether and run an outer optimization loop where each objective evaluation builds a templateEnsemble and calls directforecaster(...,"Partition",slidingWindowPartition,...), then uses cvloss on the resulting PartitionedDirectForecaster as the objective.
The second option is the “cleanest” from a leakage standpoint, because the exact pipeline used at tuning time matches what you use in evaluation.
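A minimal sketch of that second option, assuming the Tbl and slidingWindowPartition from your original script (the search ranges are illustrative, not recommendations):
% Outer Bayesian optimization over the exact forecasting pipeline
vars = [optimizableVariable("NumLearningCycles",[10 500],"Type","integer")
        optimizableVariable("LearnRate",[1e-2 1],"Transform","log")
        optimizableVariable("MaxNumSplits",[1 50],"Type","integer")];
objFcn = @(p) forecastLoss(p, Tbl, slidingWindowPartition);
results = bayesopt(objFcn, vars, "MaxObjectiveEvaluations", 30);
bestHP = bestPoint(results) % hyperparameters for the final templateEnsemble

function L = forecastLoss(p, Tbl, part)
% Build the candidate template and evaluate it with the same time-aware partition
tree = templateTree("MaxNumSplits", p.MaxNumSplits);
lsb  = templateEnsemble("LSBoost", p.NumLearningCycles, tree, "LearnRate", p.LearnRate);
CVMdl = directforecaster(Tbl, "TEU", "Horizon", 1:12, "Learner", lsb, ...
    "ResponseLags", 1:12, "LeadingPredictors", "all", ...
    "LeadingPredictorLags", {0:12, 0:12}, "Partition", part, ...
    "CategoricalPredictors", "Quarter");
L = mean(cvloss(CVMdl), "all"); % scalar objective: average CV error across horizons
end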
Optimizing with no lags vs using lags in directforecaster
Your current HPO design matrix:
X = [X.Month, X.Year]; % only current month/year
Y = Tbl.TEUs;
Your directforecaster design matrix (conceptually):
- Includes response lags 1:12, so each row carries the past 12 TEU values as features.
- Includes "leading" predictors (e.g., Month, Year) with lags {0:12, 0:12}, i.e., multiple shifted versions of those calendar variables relative to the horizon.
So the LSBoost model in directforecaster is learning on a much higher‑dimensional and more structured feature space than the 2‑column [Month, Year] you used during HPO.
Consequences:
- Hyperparameters like MaxNumSplits, MinLeafSize, NumLearningCycles, LearnRate are sensitive to feature dimension, redundancy, and signal‑to‑noise profile.
- Tuning them on a 2‑feature problem and then applying them to a 20‑plus‑feature, heavily lagged design can easily under‑ or over‑regularize the model compared to what would be optimal for the lagged problem.
So your concern is valid: you are optimizing a different problem.
Better approximations if you want to keep using fitrensemble HPO:
1. Mimic the lagged design in X before HPO
- Manually construct a "static" regression table that includes the same lagged columns that ResponseLags and LeadingPredictorLags would generate, then tune LSBoost on that table with a time-series-aware partition (see the sketch after this list).
- The preparedPredictors documentation shows how directforecaster derives its lagged features, so you can use it as a template.
2. Use preparedPredictors from directforecaster
- Fit a rough directforecaster once with default LSBoost, then call preparedPredictors to extract the fully constructed predictor matrix and time indices.
- Use that matrix as X for fitrensemble HPO (with a proper time‑series partition), so HPO happens in the exact feature space your forecasts will use.
3. Outer optimization directly on PartitionedDirectForecaster (most principled)
- Define an objective function that, given a hyperparameter vector θ, builds a templateEnsemble("LSBoost",...), calls directforecaster with your lag specs and slidingWindowPartition, and returns cvloss on the resulting PartitionedDirectForecaster (or a function of the per-horizon losses).
- Run bayesopt or a custom search over θ. This guarantees no mismatch between the model you tune and the one you deploy.
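For option 1, here is a minimal sketch of building the response lags by hand, assuming the rows of Tbl are consecutive, regularly spaced time steps (leading-predictor lags can be added the same way):
% Build response lags 1:12 as explicit columns
maxLag = 12;
n = height(Tbl);
TblLagged = Tbl(maxLag+1:n, :); % keep only rows with a full lag history
for k = 1:maxLag
    TblLagged.("TEU_lag" + k) = Tbl.TEU(maxLag+1-k : n-k);
end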
Given your current script, a minimal but improved workflow would be:
- Construct a lagged table TblLagged that mimics your ResponseLags and LeadingPredictorLags.
- Pass tspartition(height(TblLagged), "SlidingWindow", ...) as the CVPartition field of "HyperparameterOptimizationOptions" in fitrensemble so the tuning respects time.
- Optimize LSBoost on TblLagged(:, predictors) vs TblLagged.TEU.
- Extract tuned hyperparameters and build lsboostTuned as you already do.
- Use that lsboostTuned in directforecaster with the same lag specs and slidingWindowPartition.
That removes the two inconsistencies you called out: the leakage‑prone random CV and the mismatch between non‑lagged and lagged designs.
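Putting those pieces together, a sketch of the tuning call on the lagged table (the documented type for the CVPartition field is cvpartition, so verify that your release accepts a tspartition there; if not, fall back to the outer bayesopt loop sketched above):
% Time-aware hyperparameter optimization on the lagged design
tspLag = tspartition(height(TblLagged), "SlidingWindow", 4, "TestSize", 24);
predVars = setdiff(TblLagged.Properties.VariableNames, ["TEU" "Time"]);
MdlBoostOpt = fitrensemble(TblLagged(:, predVars), TblLagged.TEU, ...
    "Method", "LSBoost", ...
    "OptimizeHyperparameters", {"NumLearningCycles","LearnRate","MaxNumSplits"}, ...
    "HyperparameterOptimizationOptions", struct( ...
        "CVPartition", tspLag, ... % time-ordered folds instead of random k-fold
        "MaxObjectiveEvaluations", 30));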