How to re-bin the 2D data?

blues on 30 Dec 2019
Edited: Adam Danz on 31 Dec 2019
Hi everyone,
I have 2D data: the first column (X) is the time and the second column (Y) is the corresponding data. Re-binning the X data is easy because its values are linear, but I could not figure out how to re-bin the Y data, which decays exponentially. The attached screenshot shows a piece of the data. How can I re-bin the Y data so that I can accurately pull out the Y values for each re-binned X value?
X = data(4:214, 1);
Xmax = max(X); Xmin = min(X);
N = 106; % no. of bins I want, this will give me bin size of 0.1 (say)
dy = (Xmax - Xmin)/ (N-1); % N-1 will be clear in the next line
Xedges = Xmin - dy/2 : dy : Xmax + dy/2;
Xedges = Xedges'; % change row to column matrix (transpose)
% re-bin Y data so that I can pull out Y values for each re-binned X column
Y = data(4:214, 2);
Ymax = max(Y); Ymin = min(Y);
N = 106; % no. of bins we want
.... % not sure how to proceed ahead

Adam Danz on 30 Dec 2019
Edited: Adam Danz on 30 Dec 2019
You can use discretize(X,edges) to bin the X data, then use the resulting bin index to group the Y data by bin. You probably cannot assume that every bin holds the same number of data points, so the grouped Y data must be stored in a cell array.
Here's a demo using your variable names.
[bins, Xedges] = discretize(X,N); % Xedges are not used here
Y = data(4:214, 2);
yBinned = arrayfun(@(i)Y(bins==i),unique(bins),'UniformOutput',false);
yBinned is an n-by-1 cell array where yBinned{j} holds the Y values in the j-th occupied bin. As long as no bin is empty, that is bin number j, whose edges are defined by Xedges([j,j+1]).
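If some bins might be empty, a variant that loops over every bin number keeps yBinned{j} aligned with bin j. The sketch below is only an illustration: the X and Y values are synthetic stand-ins (an assumed linear time column and an assumed exponential decay), not the poster's data.

```matlab
% Synthetic stand-in for the poster's data (assumed values, not the real file)
X = (0:0.05:10.5)';          % linearly spaced time column, 211 points
Y = exp(-0.3 * X);           % exponentially decaying signal
N = 106;                     % number of bins, as in the question

% Bin index for every X value, plus the edges that discretize chose
[bins, Xedges] = discretize(X, N);

% Loop over every bin number 1..N so that yBinned{j} always refers to bin j,
% even when a bin happens to be empty
yBinned = arrayfun(@(j) Y(bins == j), (1:N)', 'UniformOutput', false);

% One representative Y value per bin; mean of an empty bin is NaN
yMean = cellfun(@mean, yBinned);
```

Averaging is one reasonable way to reduce each bin to a single Y value; median or sum can be swapped in through the function handle passed to cellfun.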
blues on 30 Dec 2019
You can get Xedges from the code that I attached before, repeated here:
X = data(1:211, 1);
Xmax = max(X); Xmin = min(X);
N = 106; % no. of bins we want
dy = (Xmax - Xmin)/ (N-1); % N-1 will be clear in the next line
Xedges = Xmin - dy/2 : dy : Xmax + dy/2;
Xedges = Xedges';
Previously I read the data from the .xls file, excluding the header info. So, after removing the headers, the rows are data(1:211, 1), i.e., the same data as in the code.
I don't have a mat file.
Adam Danz on 31 Dec 2019
"You can get Xedges from the code that I attached before"
"I don't have a mat file."
It would take you less than 1 minute to create one using the save() function. When you share a mat file, we know we're all using the exact same data. When you give a CSV file instead, I have to read in the data myself, and since there are multiple ways to do that, we could end up with slightly different values. It also takes more time for us volunteers to read in the data. The idea is to make it easy for us to help you and to make sure we're all looking at the exact same thing.
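For reference, a minimal sketch of creating and checking such a file. The file name data.mat and the small placeholder matrix are assumptions for illustration; in practice "data" is whatever matrix is already in the workspace.

```matlab
% Placeholder matrix standing in for the real data (assumed values)
data = [0 1; 0.05 0.985; 0.10 0.970];

% Write only the variable "data" to data.mat (file name chosen for illustration)
save('data.mat', 'data');

% Load into a struct so existing workspace variables are not overwritten
S = load('data.mat');

% Confirm that the round trip preserved the values
roundTripOk = isequal(S.data, data);
```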
I copied your data from the csv file into the command window and named the variable "data". Then I ran the following two sections of code. The first section produces the variable yMean and the second section produces the variable yMean2. Then I compare the values.
X = data(1:211, 1);
Xmax = max(X); Xmin = min(X);
N = 106; % no. of bins we want
dy = (Xmax - Xmin)/ (N-1); % N-1 will be clear in the next line
Xedges = Xmin - dy/2 : dy : Xmax + dy/2;
Xedges = Xedges';
% VERSION 1
bins = discretize(X,Xedges);
Y = data(1:211, 2);
yBinned = arrayfun(@(i)Y(bins==i),unique(bins),'UniformOutput',false);
yMean = cellfun(@(x)mean(x(~isnan(x))),yBinned);
% VERSION 2
[n,bins2] = histc(X,Xedges);
yMean2 = accumarray(bins2,Y,[],@mean);
% COMPARE VERSIONS
isequal(bins,bins2) % = TRUE; so they are the same
isequal(yMean,yMean2) % = TRUE; so they are the same
As you can see, the two sections produce the same outputs. If you are getting different values, it could be due to any of the following reasons:
1. The inputs in your code differ between the two versions.
2. Your versions of the code don't match my versions.
3. I'm using R2019b and you're using R2016a. I doubt this is the problem.
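If the two versions disagree, one way to narrow down the cause is to compare the intermediate results in order: the bin indices first, then the per-bin means. The sketch below uses synthetic data and illustrative edges (both assumptions), and it swaps histc for histcounts, which MathWorks recommends in its place; it is not the exact code from the comparison above.

```matlab
% Synthetic stand-in for the thread's data (assumed values)
X = linspace(0.02, 9.99, 211)';          % time column
Y = exp(-0.3 * X);                       % decaying signal
Xedges = linspace(0, 10, 21)';           % 20 bins; edges chosen for illustration only

% Version 1: discretize, then average each occupied bin
bins = discretize(X, Xedges);
yMean = arrayfun(@(i) mean(Y(bins == i)), unique(bins));

% Version 2: bin indices from histcounts, then accumarray
[~, ~, bins2] = histcounts(X, Xedges);
yMean2 = accumarray(bins2, Y, [], @mean);

% Step 1: do both versions assign every point to the same bin?
sameBins = isequal(bins, bins2);

% Step 2: do the averages agree? Sizes differ if one version skips empty bins.
sameSize = isequal(size(yMean), size(yMean2));
if sameSize
    maxDiff = max(abs(yMean - yMean2));  % largest absolute disagreement
end
```

If sameBins is false, the inputs or the edges differ between the versions; if only the means differ, the difference lies in the averaging step.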