Downsample data adaptively/"intelligently"

Andreas on 16 Sep 2021
Commented: Wolfie on 28 Jul 2022
I have a set of data that contains ~7800 data points (the red curve in the figure). I have reduced the number of data points to 19 with downsample() (blue curve). However, I would like to position the points more "intelligently". Is there a function in MATLAB that places the points using some kind of least-squares minimization or adaptive spacing? The data is currently stored in a struct (data.Stress and data.Strain).
The user must be able to fix the number of data points (e.g. 19 in this case).
The most important part of the curve is the beginning, but I think that a general minimization would give a good enough result.
Adding a point between the first and second points and removing one point where the curve is more or less linear would give a good-enough result. But modifying the curves manually isn't attractive, as there are quite a few curves to process.
  1 Comment
Mathieu NOE on 17 Sep 2021
Hello,
I would suggest a linear interpolation, but the new x vector should have more points at the beginning of the curve than at the end.
You can try different x-axis distributions (linearly decreasing or exponential).
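A minimal sketch of this suggestion, assuming a power-law spacing as one possible "denser at the start" distribution. The exponent p and the test curve are illustrative assumptions, not values from the thread:

```matlab
% Stand-in curve (assumption): steep at the start, flat at the end
x = linspace(0, 10, 500);
y = sqrt(x);

n = 12;   % number of points to keep (chosen by the user)
p = 2;    % hypothetical exponent; p > 1 clusters points near x(1)

% Map a uniform grid t in [0, 1] onto [x(1), x(end)] with t.^p
t  = linspace(0, 1, n);
xq = x(1) + (x(end) - x(1)) * t.^p;

% Linear interpolation of the original data at the new x values
yq = interp1(x, y, xq, 'linear');

plot(x, y, '-r', xq, yq, '-b*');
legend('original', 'power-law spacing');
```

A larger p pushes more points toward the beginning of the curve; p = 1 gives back the uniform spacing.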

Accepted Answer

Jan on 17 Sep 2021
Edited: Jan on 17 Sep 2021
This is not a trivial problem. In the general case it is a global optimization problem, and there can be a huge number of equivalent solutions.
The Douglas-Peucker algorithm is a fair approach if the input is given as a discrete set of points, see: https://www.mathworks.com/help/images/ref/reducepoly.html or https://www.mathworks.com/matlabcentral/fileexchange/61046-douglas-peucker-algorithm
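Note that reducepoly takes a tolerance rather than a point count. If the number of points must be fixed, one possible workaround is a greedy refinement in the spirit of Douglas-Peucker: start with the two end points and keep adding the sample that is worst approximated until n points are selected. The sketch below is not the Douglas-Peucker algorithm itself. It uses the vertical distance to the piecewise-linear approximation instead of the perpendicular distance, and it assumes that x is strictly increasing:

```matlab
x = linspace(0, 4*pi, 500);   % test data (assumption)
y = sin(x);
n = 21;                       % number of points to keep

keep = [1, numel(x)];         % always keep both end points
while numel(keep) < n
    % Vertical error of the current piecewise-linear approximation
    err = abs(y - interp1(x(keep), y(keep), x));
    err(keep) = 0;            % already selected samples carry no error
    [maxErr, k] = max(err);
    if maxErr == 0            % curve is already reproduced exactly
        break
    end
    keep = sort([keep, k]);   % add the worst approximated sample
end

plot(x, y, '-', x(keep), y(keep), 'ro-');
```

Each pass costs one interp1 call over all samples, which should be acceptable for a few thousand points and a few dozen selected points.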
An alternative: Calculate the cumulative sum of the absolute values of the 2nd derivative. Now find rounded equidistant steps on this line. The higher the curvature, the more points you get:
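x = linspace(0, 4*pi, 500);
y = sin(x);
n = 21; % Number of points to keep
ddy = gradient(gradient(y, x), x); % Numerical 2nd derivative
sy = cumsum(abs(ddy)); % Cumulative sum of |2nd derivative|
% Equidistant steps in sy, mapped back to the nearest sample index:
idx = interp1(sy, 1:numel(x), linspace(sy(1), sy(end), n), 'nearest');
plot(x, y);
hold on
plot(x(idx), y(idx), 'ro'); % Selected points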
This looks very nice even for a small number of points. To my surprise, it looks so good that it must be a standard approach. Does anybody know what this algorithm is called?
  1 Comment
Wolfie on 28 Jul 2022
This is nice, but if the 2nd derivative can be zero, interp1 will complain about grid vectors not being unique. Adding a small steady increment to sy avoids this, e.g.
sy = sy + linspace(0,max(sy)/10,numel(sy));
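Putting the accepted answer and this fix together for data stored as in the question (data.Strain and data.Stress) might look like the sketch below. The synthetic curve is only a stand-in for the real measurements, and the final unique call is an extra safeguard that is not part of the original answer:

```matlab
% Synthetic stand-in for the struct from the question (assumption)
data.Strain = linspace(0, 0.2, 7800);
data.Stress = 400 * (1 - exp(-60 * data.Strain)) + 200 * data.Strain;

n = 19;                 % requested number of points
x = data.Strain;
y = data.Stress;

ddy = gradient(gradient(y, x), x);   % numerical 2nd derivative
sy  = cumsum(abs(ddy));              % non-decreasing, may contain plateaus

% Small steady increment so that sy becomes strictly increasing
% (this does not help if max(sy) is 0, i.e. for a perfectly straight line)
sy = sy + linspace(0, max(sy)/10, numel(sy));

idx = interp1(sy, 1:numel(x), linspace(sy(1), sy(end), n), 'nearest');
idx = unique(idx);      % drop repeated indices, if any

plot(x, y, '-r', x(idx), y(idx), '-bo');
```

If unique removes repeated indices, fewer than n points remain. The increment also moves some points into the nearly linear part of the curve, so its size (max(sy)/10 here) acts as a tuning knob between curvature-based and uniform spacing.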

More Answers (1)

Andreas on 17 Sep 2021
Edited: Andreas on 17 Sep 2021
Thanks @Jan and @Mathieu NOE, I combined bits and pieces of your input into this (proof-of-principle) code:
x = linspace(0,10,500);
y = log(x);
plot(x,y,'-r','LineWidth',2)
hold on
n = 12;
pot = 2.0;
% logarithmic
scale = x(end)/(10^pot-1.0);
red_x = (logspace(0,pot,n)-1.0)*scale;
red_y = interp1(x,y,red_x,'nearest');
red_y(1) = -3.9; %"cheating" for this example, as ln(0) -> -inf
plot(red_x,red_y,'-b*','LineWidth',2)
% linear
red_x = linspace(0,x(end),n);
red_y = interp1(x,y,red_x,'nearest');
red_y(1) = -3.9;
plot(red_x,red_y,'-co','LineWidth',2)
plot(x,y,'-r','LineWidth',2)
legend('Function log(x)','logarithmic distribution','linear distribution')
The logarithmic distribution gives a better interpolation for my type of curves. Below is an example of my "real" curves.
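To compare the two distributions by a number rather than by eye, one option is the RMS error between the original curve and the piecewise-linear curve through the reduced points. The sketch below is an assumption-laden illustration: it uses log(1 + 5*x) as a test curve (finite at x = 0, so no "cheating" is needed), linear instead of 'nearest' interpolation, and a helper named rmsErr that is made up for this example:

```matlab
x = linspace(0, 10, 500);
y = log(1 + 5*x);              % assumed test curve, finite at x = 0
n = 12;
pot = 2.0;

% Logarithmic distribution, as in the code above
scale = x(end) / (10^pot - 1.0);
xLog  = (logspace(0, pot, n) - 1.0) * scale;
xLog(end) = x(end);            % guard against round-off beyond the data range

% Linear distribution
xLin = linspace(0, x(end), n);

% RMS error of the curve rebuilt from the reduced points (hypothetical helper)
rmsErr = @(xq) sqrt(mean((y - interp1(xq, interp1(x, y, xq), x)).^2));

errLog = rmsErr(xLog);
errLin = rmsErr(xLin);
fprintf('RMS error, logarithmic: %g\n', errLog);
fprintf('RMS error, linear:      %g\n', errLin);
```

The same error measure could serve as the objective if the exponent pot is tuned per curve, for example with fminbnd.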

Release

R2021a
