How to split a dataset into 2/3 for training and 1/3 for testing, and plot the graph?

This is my code, but I get an error and cannot get the correct answer.
Can you help me, please?
clear all, close all, clc
load hald; % Load Portland Cement dataset
A = ingredients;
b = heat;
N=13; % number of rows
idx=1:13;
PD=2/3;
%split data for training and testing
Ptrain=idx(1:round(PD*N));Ttrain=idx(1:round(PD*N));
Ptest=idx(round(PD*N)+1:end,:);Ttest=idx(round(PD*N)+1:end,:);
dataPTrain=hald(Ptrain);
dataPTest=hald(Ptest);
[U,S,V] = svd(A,'econ');
x = V*inv(S)*U'*b; % Solve Ax=b using the SVD
plot(dataPTrain,'k','LineWidth',2); hold on % Plot data
plot(dataPTest,'r-o','LineWidth',1.,'MarkerSize',2); % Plot regression
l1 = legend('Heat data','Regression')
%% Alternative 1 (regress)
x = regress(b,A);
%% Alternative 2 (pinv)
x = pinv(A)*b;

Answers (2)

Sulaymon Eshkabilov on 15 Jan 2023
You should use a random partition of your whole data set, e.g.:
rng("default"); % For reproducibility
n = size(X,1); % Number of observations (rows of X)
C = cvpartition(n, "HoldOut", 0.35); % 0.35 is the test fraction: about 65% for training, the remaining 35% for testing
INDEXtrain = training(C); % Logical index of the training rows
INDEXtest = ~ INDEXtrain; % Logical index of the test rows
X_test = X(INDEXtest,:);
Y_test = Y(INDEXtest,:);
X_train = X(INDEXtrain,:);
Y_train = Y(INDEXtrain,:);
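As a rough sketch, the partition above could be applied to the hald data from the question as follows. It assumes X = ingredients and Y = heat (the answer leaves X and Y undefined) and uses a holdout fraction of 1/3 to match the 2/3 : 1/3 split asked for. The least-squares fit with pinv follows the question's "Alternative 2"; the names idxTrain, idxTest, coef, Ypred and rmseTest are illustrative only.

```matlab
load hald;                          % provides ingredients (13x4) and heat (13x1)
X = ingredients;                    % assumption: predictors
Y = heat;                           % assumption: response
rng("default");                     % for reproducibility
C = cvpartition(size(X,1), "HoldOut", 1/3);  % about 1/3 of the rows held out for testing
idxTrain = training(C);             % logical index of the training rows
idxTest  = test(C);                 % logical index of the test rows
coef  = pinv(X(idxTrain,:)) * Y(idxTrain);   % least-squares fit on the training rows only
Ypred = X(idxTest,:) * coef;        % predictions for the held-out rows
rmseTest = sqrt(mean((Y(idxTest) - Ypred).^2));  % root-mean-square error on the test rows
fprintf('Test RMSE: %.3f\n', rmseTest);
```

Fitting on the training rows only, and scoring on the held-out rows, is what makes the test error an honest estimate; fitting on all 13 rows first would defeat the purpose of the split.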

Voss on 15 Jan 2023
This:
Ptest=idx(round(PD*N)+1:end,:);Ttest=idx(round(PD*N)+1:end,:)
should be this:
Ptest=idx(round(PD*N)+1:end);Ttest=idx(round(PD*N)+1:end)
because idx is a row vector, and the original code tried to index rows beyond row 1 (the only row idx has).
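Putting this fix together with the rest of the question's code, a minimal sketch might look like the following. It assumes the intent was to fit the regression on the first 2/3 of the rows of ingredients, predict the remaining rows, and plot both against the measured heat. The names nTrain, idxTrain, idxTest, heatHatTrain and heatHatTest are illustrative and not from the original post.

```matlab
load hald;                               % provides ingredients (13x4) and heat (13x1)
A = ingredients;  b = heat;
N = size(A,1);                           % number of rows (13)
nTrain   = round(2/3*N);                 % first 2/3 of the rows for training
idxTrain = 1:nTrain;
idxTest  = nTrain+1:N;                   % remaining rows for testing
x = pinv(A(idxTrain,:)) * b(idxTrain);   % fit on the training rows only
heatHatTrain = A(idxTrain,:) * x;        % fitted values on the training rows
heatHatTest  = A(idxTest,:)  * x;        % predictions on the test rows
plot(1:N, b, 'k', 'LineWidth', 2); hold on             % measured heat
plot(idxTrain, heatHatTrain, 'b-o', 'LineWidth', 1, 'MarkerSize', 4);
plot(idxTest,  heatHatTest,  'r-o', 'LineWidth', 1, 'MarkerSize', 4);
legend('Heat data', 'Regression (training)', 'Regression (testing)');
xlabel('Observation'); ylabel('Heat');
```

Note that the indices are used to select rows of A and b, not elements of hald; hald(Ptrain) with linear indices would only pick entries from the first column of the 13x5 matrix.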
  1 Comment
NOR AZIERA on 15 Jan 2023
Sir, what is the right code for the plot?
plot(dataTrain,'k','LineWidth',2); hold on % Plot data
plot(dataTest,'r-o','LineWidth',1.,'MarkerSize',2); % Plot regression
l1 = legend('Heat data','Regression')
