Troubles assigning colorbar values in a bivariate histogram plot

11 views (last 30 days)
Hello!
Im working with data of height and period of ocean waves, each pair have a power value associated, I have to get a bivariate histogram but instead of number of observations in colorbar i want that the scale represents the power associated.
I attached an example of the plots that i already have and the code to obtain it.
Xedges = [1:0.25:16];
Yedges = [0.00:0.025:1.6];
h = histogram2(Tp,Hs,Xedges,Yedges)
h.FaceColor = 'flat';
h.DisplayStyle = 'tile';
c=colorbar;
c.Label.String = 'Número de observaciones';
c.Label.FontSize = 14;
view(2)
I am new in matlab and maybe Im wrong with the function that i am using, i also tried with sccater3 but i have to group my data in bins
so this option doesnt work.
Thanks in advance for your help
  2 Comments
Max Murphy
Max Murphy on 4 Jan 2020
You could try using imagesc instead; without knowing the format of data in Tp or Hs it is hard to say otherwise.
Mar
Mar on 7 Jan 2020
Sorry, i just attach the data, first row is Hs, second Tp and the last one is power, i will check imagesc to see if I can achieve something,
Thanks a lot

Sign in to comment.

Accepted Answer

Adam Danz
Adam Danz on 8 Jan 2020
Edited: Adam Danz on 8 Jan 2020
Binned approach
The problems with using the binned bivariate histogram are
  1. The colors on the histrogram will be based on density rather than the power values.
  2. There are often more than 1 power value within each bin.
A solution to problem 1 is to use imagesc() instead.
A solution to problem 1 is to average the power values within each bin. Note that you're losing information here since the powers are averaged and the actual (x,y) coordinates of the data are binned. The scatter plot version does not have these losses.
Here's a demo using your data; you can add the contour from my other answer.
d = readmatrix('Dzi.txt');
Hs = d(:,1);
Tp = d(:,2);
power = d(:,3);
Xedges = [1:0.25:16]; % defined by OP
Yedges = [0.00:0.025:1.6]; % defined by OP
% determine which bin each Tp and Hs value is in
TpBins = discretize(Tp,Xedges);
HsBins = discretize(Hs,Yedges);
% group the Power values by bin
xyGroups = [TpBins(:),HsBins(:)];
[xyGroupUnq, ~, unqGroupID] = unique(xyGroups,'rows');
% average power values within groups
powerGroupMean = splitapply(@mean, power, unqGroupID);
% Reorganize power into matrix
powerMat = zeros(numel(Yedges)-1, numel(Xedges)-1);
idx = sub2ind(size(powerMat),xyGroupUnq(:,2),xyGroupUnq(:,1));
powerMat(idx) = powerGroupMean;
% Plot it
imagesc(Xedges, Yedges, powerMat)
set(gca,'YDir','normal')
cmap = jet(100);
cmap(1,:) = 1; % to make background white
colormap(cmap)
colorbar()
200108 111749-Figure 1.png

More Answers (2)

David Goodmanson
David Goodmanson on 7 Jan 2020
Edited: David Goodmanson on 8 Jan 2020
Hi Mar,
I have been looking at this with some data I made up, but now that you posted it I was able to plug it in.
I thought your plot was pretty good. From your data, the power is
Pdens = 4*Hs^2*Tp or Hs = (1/2)*sqrt(Pdens/Tp) [ Pdens in kW/m ]
with variations on the order of 15%. Is going to color-by-power to try and bring out that fairly minor effect worth the information on number of occurrences that you lose?
Incidentally, the dashed lines on your plot do not seem to be consistent with the formula above. Right or wrong, the plots in the code below use the formula above.
A loglog plot of the data may be useful here, because the power density curves turn into straight lines and the data for small Hs and Tp is not all squished into the corner. The binning is done by logs of the variables and comes out differently. Now the highest-occupancy bins have higher values of Hs and Tc than before. I believe this is because in terms of logs there is a lot more space for small-value bins and the small data get spread around among more bins. On the loglog plot the power density lines aren't labeled but they go from 20 to 40 dBW/m (so 30 dB is 1kW/m) in 5dB steps.
The code below uses x for Tp and y for Hs, as on a plot.
load('Dzi.txt')
yvals = Dzi(:,1);
xvals = Dzi(:,2);
Pwr = Dzi(:,3);
% initial histogram
figure(1)
h = histogram2(xvals,yvals,'DisplayStyle','tile')
hold on
xPwr = linspace(1,15,100);
yPwr = (1/2)*sqrt([.25 1 3 7 13]')./sqrt(xPwr);
plot(xPwr,yPwr,'k--')
xlim([1 16])
ylim([0 1.6])
grid on
colorbar
hold off
% duplicate histogram2 with patch
figure(2);
h2 = histogram2(xvals,yvals);
xbe = h2.XBinEdges;
ybe = h2.YBinEdges;
bco = h2.BinCounts'; % transpose is necessary
close(2)
bco = bco(:);
[x y] = meshgrid(xbe,ybe);
[xp yp] = mesh2patch(x,y);
ind = (bco == 0);
xp(:,ind) = [];
yp(:,ind) = [];
bco(ind) = [];
figure(2)
patch(xp,yp,bco,'EdgeColor','none')
hold on
xPwr = linspace(1,15,100);
yPwr = (1/2)*sqrt([.25 1 3 7 13]')./sqrt(xPwr);
plot(xPwr,yPwr,'k--')
xlim([1 16])
ylim([0 1.6])
grid on
colorbar
hold off
% loglog version
figure(4)
h4 = histogram2(log10(xvals),log10(yvals))
xbe = 10.^h4.XBinEdges;
ybe = 10.^h4.YBinEdges;
bco = h4.BinCounts'; % transpose is necessary
close(4)
[x y] = meshgrid(xbe,ybe);
[xp yp] = mesh2patch(x,y);
ind = (bco == 0);
xp(:,ind) = [];
yp(:,ind) = [];
bco(ind) = [];
figure(4)
ax = gca;
ax.XScale = 'log';
ax.YScale = 'log';
patch(xp,yp,bco,'EdgeColor','none')
hold on
xPwr = linspace(1,15,100);
dB = [20 25 30 35 40]; % power density, dBW/m
Pvals = 10.^(dB/10)/1000; % power density, KW/m
yPwr = (1/2)*sqrt(Pvals')./sqrt(xPwr);
plot(xPwr,yPwr,'k--')
xlim([1 16])
ylim([.01 1.6])
grid on
colorbar
hold off
function [xpatch ypatch] = mesh2patch(x,y)
% using x and y matrices provided by meshgrid, function creates
% corresponding patch matrices for every 'cell' in meshgrid.
% See notes below.
%
% [xpatch ypatch] = mesh2patch(x,y)
xlf = x(1:end-1,1:end-1); % col index for x
xlf = xlf(:);
xrt = x(1:end-1,2:end);
xrt = xrt(:);
xpatch = [xlf xrt xrt xlf]';
ydn = y(1:end-1,1:end-1); % row index for y
ydn = ydn(:);
yup = y(2:end,1:end-1);
yup = yup(:);
ypatch = [ydn ydn yup yup]';
end
% Meshgrid:
% For [x y] = meshgrid(xvec,yvec)
% coordinate x varies with the column index of x, and
% coordinate y varies with the row index of y.
% If xvec and yvec are vectors of increasing values as usual,
% then the lower left corner of a plot has the coordinates (x(1,1),y(1,1)).
% In a plot, since increasing y goes from down to up, the upper left
% corner of the y matrix is the lower left corner of the plot.
% Patches:
% If x and y have dimension mxn, then the patches, if they were
% the contents of a matrix, have dimension (m-1)x(n-1).
% The xpatch and ypatch matrices have dimension 4x((m-1)*(n-1)).
% The order of patches in xpatch and ypatch is such that if the patch
% colors are a function of x and y, e.g. ColoR = f(x1,y1), then
%
% patch(xpatch,ypatch,ColoR(:))
%
% gives the correct coloring. Note that x1 and y1
% have dimension (m-1)x(n-1), not mxn.
  2 Comments
Ahmed Alaa
Ahmed Alaa on 4 Oct 2021
Hi David,
In the follwoing line "yPwr = (1/2)*sqrt([.25 1 3 7 13]')./sqrt(xPwr)", why did you choose these values [.25 1 3 7 13] for Pdens?
How can I plot the wave energy isolines depicted in the figure below? As can be seen, the isolines show the Kw/m values in relation to Tp and Hs?
Thank you in advance for your reply.
David Goodmanson
David Goodmanson on 12 Oct 2021
Edited: David Goodmanson on 12 Oct 2021
Hi Ahmed,
I think I must have copied those values off of the Dzilam.jpg image that was posted with your original question.
The standard expression for P is
P = const*H^2*t
with const = 1/2, which the Site 3 plot in your comment just above is doing. However, the plot in Dzilam.jpg seems to have a constant = 4 approximately and so does your data. So I really don't know what is going on. It's easy enough to speculate about a factor of 8, though, since (peak-to-peak)^2 vs (0-to-peak)^2 is worth a factor of 4, and peak^2 vs (rms_peak)^2 is worth a factor of 2.

Sign in to comment.


Adam Danz
Adam Danz on 7 Jan 2020
Edited: Adam Danz on 8 Jan 2020
Scatter plot approach
"i also tried with sccater3 but i have to group my data in bins so this option doesnt work."
What's wrong with grouping the data in bins? You've got more than 61,000 unique values of power within a range of 45,400 values but that doesn't mean you need tens of thousands of unique color values to get a meaningful image of your data. And if you want that level of precision, you could greatly increase the number of color levels but you won't see much of a difference.
This solution uses scatter() and allows you to set as many color levels as you want. The image below is based on 100 color levels of the jet colormap. You could change the number of levels to 1000 or 10,000 and you probably won't notice a difference.
See the inline comments for details.
% Read in the matrix data from text file and split the columns into variables.
d = readmatrix('Dzi.txt');
Hs = d(:,1);
Tp = d(:,2);
power = d(:,3);
% set your colormap and the number of levels
cmap = jet(100); % you can use any colormap with any number of levels.
% scale the power values to the colormap
cIdx = round((power - min(power))/range(power)*(size(cmap,1)-1))+1;
% Plot the data
figure()
scatter(Tp, Hs, 15, cmap(cIdx,:),'filled')
xlabel('Tp')
ylabel('Hs')
% Add colorbar
set(gca,'Colormap',cmap)
cb = colorbar();
caxis([min(power),max(power)])
ylabel(cb, 'power')
One drawback to this method is that you lose the density of clustered data. You could overlay a contour map that shows the density by adding these lines.
hold on
[nCount,xEdges,yEdges] = histcounts2(Tp,Hs,50);
contour(xEdges(2:end)-(xEdges(2)-xEdges(1))/2, yEdges(2:end)-(yEdges(2)-yEdges(1))/2, nCount','-','Color',[.6 .6 .6])
  2 Comments
Mar
Mar on 8 Jan 2020
Hello Adam! thanks for your contribution, maybe i didnt explain properly, I want my data in bins that´s why i was using histogram2, anyway i found helpful your propose to overlay the contour map to show the density of the data.
At the end what i have in mind is a bivariate histogram in which the color bar represents power values and then add in each square of the histogram the probability that represents, but it seems to me that with a scatter plot with density contours (like the one that you post) the objective of the graph is achieved, that is to show the probability to found some power values.
Thank you so much
Adam Danz
Adam Danz on 8 Jan 2020
Edited: Adam Danz on 10 Jan 2020
" I want my data in bins that´s why i was using histogram2"
See my other answer I just added.
"the objective of the graph is achieved"
Great! Please concider accepting whichever answer is most suitable.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!