Computation of distance of a set of points from a fitting curve (of 2n grade like a parabola).

1 view (last 30 days)
Hi everybody, It seems to be an easy problem but I m in trouble:
I have to compute the distance btw the black points (distance for each of these points)( which have vector coordinates (flowrate, pPHeadLoss) ) from the red curve that is the fitting curve which has as coordinates the row vectors ((min(flowrate):0.0001:max(flowrate), y)
Actually I am truing to use pdist function but it doesnt work for me, what am I wrong?
Thanks in advance
[y,delta] = polyval(p, (min(flowrate):0.0001:max(flowrate)),S,mu);
% FIT
plot(min(flowrate):0.0001:max(flowrate), y , 'r'); % red curve
The full code is the next one:
flowdata = readtable("flodata.txt");
% column 1: pressure at pipe start [bars]
% column 2: acoustic noise at start [standard deviation]
% column 3: pressure at pipe end (10km from start) [bars]
% column 4: acoustic noise at pipe end[standard deviation]
% column 5: flow rate [m3/hour]
load('flowdata.mat');
% summary(flowdata)
% Vectors of Pressures
pressureIN = flowdata.pressureAtPipeStart_bars_;
pressureOUT= flowdata.pressureAtPipeEnd_10kmFromStart__bars_;
% Vectors of Acoustic Noises at IN and at OUT
aNoiseIN=flowdata.acousticNoiseAtStart_standardDeviation_;
aNoiseOUT=flowdata.acousticNoiseAtPipeEnd_standardDeviation_;
% Vector of flowrate
flowrate = flowdata.flowRate_m3_hour_;
% % Table of flowRate
% flowRate = table(flowrate);
% rawData = [pressureIN aNoiseIN pressureOUT aNoiseOUT flowrate];
% t = (linspace(0, length(cleanData)*10, length(cleanData)))/86400;
sample_distance = 10;
%t = (linspace(0, 7948790, 794879))/86400;
[samples,channels] = size(flowdata);
t = linspace(0, samples*sample_distance, samples)/86400;
% all main curves overlaid
figure()
plot(t, pressureIN, '-g',t, pressureOUT, '-r',t, flowrate, '-b');
axis tight;
title(' Input , Output Pressure , FR vs time');
legend('Input Pressure','Output Pressure','Flow Rate');
xlabel('Tempo [Days]');
ylabel('all');
PHeadLoss = pressureIN-pressureOUT;
densityplot(flowrate, PHeadLoss, [100,100])
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
rawData = [pressureIN pressureOUT flowrate];
% TOLGO I NaN
% Creo un array di indici logici che indica quali righe contengono almeno un NaN
idx_nan = any(isnan(rawData), 2);
cleanData = rawData;
% Elimino le righe selezionate
cleanData(idx_nan, :) = [];
[samples,channels] = size(cleanData);
t = linspace(0, samples*sample_distance, samples)/86400;
% Vectors of Pressures
pressureIN = cleanData(:,1);
pressureOUT= cleanData(:,2);
flowrate = cleanData(:,3);
PHeadLoss = pressureIN - pressureOUT;
%DATA
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
hold on
%densityplot(flowrate, PHeadLoss, [50,50]);
[p, S, mu] = polyfit(flowrate, PHeadLoss, 2)
[y,delta] = polyval(p,flowrate,S,mu);
%LINEAR FIT
plot(flowrate, y, 'r-');
%hold on
% Linear Fit of Data with 95% Prediction Interval
%plot(flowrate,y+2*delta,'m--',flowrate,y-2*delta,'m--')
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
densityplot(flowrate, PHeadLoss, [100,100])
% TOLGO I VALORI NEGATIVI
% Creo un array di indici logici che indica quali righe hanno almeno un valore negativo
idx_neg = any(cleanData < 0, 2);
cleanData = cleanData;
% Elimino le righe selezionate
cleanData(idx_neg, :) = [];
[samples,channels] = size(cleanData);
t = linspace(0, samples*sample_distance, samples)/86400;
pressureIN = cleanData(:,1);
pressureOUT= cleanData(:,2);
flowrate = cleanData(:,3);
PHeadLoss = pressureIN - pressureOUT;
%DATA
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
hold on
%densityplot(flowrate, PHeadLoss, [50,50]);
[p, S, mu] = polyfit(flowrate, PHeadLoss, 2)
[y,delta] = polyval(p,flowrate,S,mu);
%LINEAR FIT
plot(flowrate, y, 'r-');
%hold on
% Linear Fit of Data with 95% Prediction Interval
%plot(flowrate,y+2*delta,'m--',flowrate,y-2*delta,'m--')
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
densityplot(flowrate, PHeadLoss, [100,100])
% % Table of flowRate
% flowRate = table(flowrate);
% rawData = [pressureIN aNoiseIN pressureOUT aNoiseOUT flowrate];
% t = (linspace(0, length(cleanData)*10, length(cleanData)))/86400;
sample_distance = 10;
%t = (linspace(0, 7948790, 794879))/86400;
[samples,channels] = size(cleanData);
t = linspace(0, samples*sample_distance, samples)/86400;
% all main curves overlaid
figure()
plot(t, pressureIN, '-g',t, pressureOUT, '-r',t, flowrate, '-b');
axis tight;
title(' Input , Output Pressure , FR vs time');
legend('Input Pressure','Output Pressure','Flow Rate');
xlabel('Tempo [Days]');
ylabel('all');
% Prima considerazione
% large variations on the output pressure can be seen at days 1 to 15 ,
% not related to input pression
% we should not use that portion of the data
% It's a brutal cleaning but...
% I am sceptical about what happens during the first 15 days :
% lots of peaks are seen on the output pressure side that are
% not reflected / correlated to the input side ,
% so I prefer to discard the first 20 days (yes it's brutal !)
ind = t>13;
t = t(ind);
pressureIN = pressureIN(ind);
pressureOUT = pressureOUT(ind);
flowrate = flowrate(ind);
PHeadLoss = pressureIN-pressureOUT;
figure()
plot(t, pressureIN, '-g',t, pressureOUT, '-r',t, flowrate, '-b');
axis tight;
title(' Input , Output Pressure , FR vs time');
legend('Input Pressure','Output Pressure','Flow Rate');
xlabel('Tempo [Days]');
ylabel('all');
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
%DATA
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
hold on
%densityplot(flowrate, PHeadLoss, [50,50]);
[p, S, mu] = polyfit(flowrate, PHeadLoss, 2)
[y,delta] = polyval(p,flowrate,S,mu);
%LINEAR FIT
plot(flowrate, y, 'r-');
%hold on
% Linear Fit of Data with 95% Prediction Interval
%plot(flowrate,y+2*delta,'m--',flowrate,y-2*delta,'m--')
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
densityplot(flowrate, y, [100,100]);
% seconda considerazione
% keep data (valid) for gradient below a certain threshold (and non zero
% FR too btw)
% i also don't like those sudden drops to zero ,
% because to me its like dividing zero by zero ,
% when you want to make a relationship between two variables ,
% I avoid using background noise data , it will blurr the result.
% This function gradient calculates the gradient of a signal,
% which is the rate of change of the signal over time.
pressureIN_grad = abs(gradient(pressureIN));
pressureOUT_grad = abs(gradient(pressureOUT));
flowrate_grad = abs(gradient(flowrate));
threshold = 0.0073;
ind1 = pressureIN_grad<threshold*max(pressureIN_grad);
ind2 = pressureOUT_grad<threshold*max(pressureOUT_grad);
ind3 = flowrate_grad<threshold*max(flowrate_grad);
ind4 = flowrate> 0;
ind = ind1 & ind2 & ind3 & ind4;
t = t(ind);
pressureIN = pressureIN(ind);
pressureOUT = pressureOUT(ind);
flowrate = flowrate(ind);
figure()
plot(t, pressureIN, '.g',t, pressureOUT, '.r',t, flowrate, '.b');
axis tight;
title(' Input , Output Pressure , FR vs time');
legend('Input Pressure','Output Pressure','Flow Rate');
xlabel('Tempo [Days]');
ylabel('all');
PHeadLoss = pressureIN - pressureOUT;
densityplot(flowrate, PHeadLoss, [50,50])
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
%DATA
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
hold on
%densityplot(flowrate, PHeadLoss, [50,50]);
[p, S, mu] = polyfit(flowrate, PHeadLoss, 2)
[y,delta] = polyval(p,flowrate,S,mu);
%LINEAR FIT
plot(flowrate, y, 'r-');
%hold on
% Linear Fit of Data with 95% Prediction Interval
%plot(flowrate,y+2*delta,'m--',flowrate,y-2*delta,'m--')
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
densityplot(flowrate, y, [50,50]);
% terza considerazione
% PRESSURE HEAD LOSS (APPROX _no fattore di attrito, flusso del fluido e
% densità del fluido)
PHeadLoss = pressureIN - pressureOUT;
% consider only positive values
ind = PHeadLoss> 0;
t = t(ind);
pressureIN = pressureIN(ind);
pressureOUT = pressureOUT(ind);
flowrate = flowrate(ind);
PHeadLoss = pressureIN - pressureOUT;
densityplot(flowrate, PHeadLoss, [50,50])
ylabel('P Head Loss');
%DATA
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
figure('Name',' PRESSURE HEAD LOSS vs FLOWRATE')
plot(flowrate, PHeadLoss, '*k');
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
hold on
%densityplot(flowrate, PHeadLoss, [50,50]);
[p, S, mu] = polyfit(flowrate, PHeadLoss, 2)
n=size(flowrate)
[y,delta] = polyval(p, (min(flowrate):0.0001:max(flowrate)),S,mu);
% FIT
plot(min(flowrate):0.0001:max(flowrate), y , 'r');
%hold on
% Linear Fit of Data with 95% Prediction Interval
%plot(flowrate,y+2*delta,'m--',flowrate,y-2*delta,'m--')
title('Pressure Head Loss vs FLOW RATE');
xlabel('flow rate');
ylabel('P Head Loss');
densityplot(flowrate, y, [50,50]);

Answers (1)

Torsten
Torsten on 31 Jan 2024
Moved: Torsten on 31 Jan 2024
Use "polyval" to compute the polynomial values p at the flow rates f of the "black points".
The distance is then abs(p-Head Loss of the black points).
  2 Comments
Giuseppe Zumbo
Giuseppe Zumbo on 31 Jan 2024
Hi @Torsten, thanks for answeing but in this way this distance would not be the minimum btw the point of the dataset and the curve, but it's just the vertical distance, isn't it ?
To estimate the accuracy, DO I need this one or the minimum distance? THanks in advance
Torsten
Torsten on 31 Jan 2024
Edited: Torsten on 31 Jan 2024
"Polyfit" minimizes the sum of vertical distances squared between your measurement data and the polynomial curve. So you also should use the vertical distance in your evaluation, shouldn't you ?

Sign in to comment.

Products


Release

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!