Can you organize scatter plot points?

Hi,
I would like to graph my data with the mean +/- SEM and display the data points. When I plot this, my data points overlap and are difficult to distinguish. Rather than a random distribution/jitter, I would like my data points for same/similar values to form nonoverlapping rows of points. Is this possible in MATLAB? Any help is greatly appreciated!
Thanks,
Dan
Example code:
subplot 133
% Plot horizontal line at 50%
yline(0.5,'k--','LineWidth',ylineLW)
hold on
% Plot data points
scatter(y1(:,1),data(:,1),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','randn')
scatter(y1(:,2),data(:,2),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','randn')
scatter(y1(:,3),data(:,3),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','randn')
scatter(y1(:,4),data(:,4),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','randn')
% Plot errorbar (SEM)
errorbar(y2(1),mData(1),SEM(1),'k_','LineWidth',errorbarLW)
errorbar(y2(2),mData(2),SEM(2),'r_','LineWidth',errorbarLW)
errorbar(y2(3),mData(3),SEM(3),'k_','LineWidth',errorbarLW)
errorbar(y2(4),mData(4),SEM(4),'r_','LineWidth',errorbarLW)
% Plot mean values
plot(y2(1),mData(1),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(2),mData(2),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(3),mData(3),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(4),mData(4),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
hold off
PS I attached one of my plots to help visualize the problem. Please let me know if you have any questions... thanks again!

12 Comments

Yeah, I have questions... :)
"I would like my data points for same/similar values to form nonoverlapping rows of points"
What does the above mean, exactly?
Plots look ok to me; what do you think is hard to distinguish? Can you sketch what you think one of these should look like instead?
I can plot my data in a single line ('XJitter' = 'none') or randomly distribute ('XJitter' = 'rand', 'randn', or 'density') my data in each column. My PI would like me to distribute the dots in uniform rows within each column to avoid overlapping dots for 'better' visualization. I attached the image my PI sent me as reference. I'm trying to avoid manually calculating the x position for each data point.
It would help if you also provided the data, so your code can be executed and tweaked a bit.
One idea... Have you tried using rand instead of randn for the XJitter property? Perhaps that will produce the non-overlapping effect you are after.
I attached example data... it's the same data used to generate the example plot above.
I have tried using none, rand, randn, and density, but none of them have produced column scatter plots with nonoverlapping data points.
Thanks again for your help!
Well, unless the x values for every point are identical, the points aren't going to be in columns. If that's what you want, replace each X vector with, say, mean(X).
I'd never noticed there being a 'jitter' property on scatter before; don't know when that might have been introduced--or maybe it's been there "since forever" and I just never saw it. Not sure of the point/intent.
OK, that's a start. Still can't execute your code, however. Please provide the script used to read the data from the file into the variables in your example code -- or just edit the example code so it can be executed.
Without a 'jitter' parameter the points are in a single column... 'jitter' allows you to disperse the points horizontally... The problem is that 'jitter' moves the points randomly, so data points may still overlap... I want to disperse my points in a more organized fashion. See the attached plot for a jitterless version.
My full code below...
close all;
clear;
clc;
%% Load data
data = xlsread('Example Data.xlsx', 'A2:L22');
%% Create Y values
[row, col] = size(data);
y1(1:row,1) = 1;
y1(1:row,2) = 2;
y1(1:row,3) = 4;
y1(1:row,4) = 5;
y2 = y1(1,:);
clear row
%% Calculate mean
mData = nanmean(data);
%% Calculate SEM
sd = nanstd(data); % named sd so the built-in std function is not shadowed
L = sum(~isnan(data),1); % number of non-NaN samples per column
SEM = sd./sqrt(L);
%% Plot data
% Plot settings
ylineLW = 2;
errorbarLW = 3;
plotLW = 3;
plotMS = 56;
titleFS = 18;
xnameFS = 18;
xlabelFS = 18;
ylabelFS = 24;
names = {'Sham','TBI',' ','Sham','TBI'};
% TOR
subplot 133
yline(0.5,'k--','LineWidth',ylineLW)
hold on
scatter(y1(:,1),data(:,1),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,2),data(:,2),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,3),data(:,3),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,4),data(:,4),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
errorbar(y2(1),mData(1),SEM(1),'k_','LineWidth',errorbarLW)
errorbar(y2(2),mData(2),SEM(2),'r_','LineWidth',errorbarLW)
errorbar(y2(3),mData(3),SEM(3),'k_','LineWidth',errorbarLW)
errorbar(y2(4),mData(4),SEM(4),'r_','LineWidth',errorbarLW)
plot(y2(1),mData(1),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(2),mData(2),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(3),mData(3),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(4),mData(4),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
hold off
title('Temporal Order Recognition','FontSize',titleFS)
set(gca,'xtick',[1:5],'xticklabel',names,'FontSize',xnameFS,'linewidth', 2)
ylabel('Discrimination Ratio','FontSize',ylabelFS)
xlabel('3 month 6 month','FontSize',xlabelFS)
ylim([0 0.9])
xlim([0 6])
% NOR
subplot 131
yline(0.5,'k--','LineWidth',ylineLW)
hold on
scatter(y1(:,1),data(:,5),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,2),data(:,6),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,3),data(:,7),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,4),data(:,8),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
errorbar(y2(1),mData(5),SEM(5),'k_','LineWidth',errorbarLW)
errorbar(y2(2),mData(6),SEM(6),'r_','LineWidth',errorbarLW)
errorbar(y2(3),mData(7),SEM(7),'k_','LineWidth',errorbarLW)
errorbar(y2(4),mData(8),SEM(8),'r_','LineWidth',errorbarLW)
plot(y2(1),mData(5),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(2),mData(6),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(3),mData(7),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(4),mData(8),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
hold off
title('Novel Object Recognition','FontSize',titleFS)
set(gca,'xtick',[1:5],'xticklabel',names,'FontSize',xnameFS,'linewidth', 2)
ylabel('Discrimination Ratio','FontSize',ylabelFS)
xlabel('3 month 6 month','FontSize',xlabelFS)
ylim([0 0.9])
xlim([0 6])
% NOL
subplot 132
yline(0.5,'k--','LineWidth',ylineLW)
hold on
scatter(y1(:,1),data(:,9),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,2),data(:,10),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,3),data(:,11),'black','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
scatter(y1(:,4),data(:,12),'red','filled','MarkerFaceAlpha',0.5,'MarkerEdgeAlpha',0.5,'XJitter','density')
errorbar(y2(1),mData(9),SEM(9),'k_','LineWidth',errorbarLW)
errorbar(y2(2),mData(10),SEM(10),'r_','LineWidth',errorbarLW)
errorbar(y2(3),mData(11),SEM(11),'k_','LineWidth',errorbarLW)
errorbar(y2(4),mData(12),SEM(12),'r_','LineWidth',errorbarLW)
plot(y2(1),mData(9),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(2),mData(10),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(3),mData(11),'k_','LineWidth',plotLW,'MarkerSize',plotMS)
plot(y2(4),mData(12),'r_','LineWidth',plotLW,'MarkerSize',plotMS)
hold off
title('Novel Object Location','FontSize',titleFS)
set(gca,'xtick',[1:5],'xticklabel',names,'FontSize',xnameFS,'linewidth', 2)
ylabel('Discrimination Ratio','FontSize',ylabelFS)
xlabel('3 month 6 month','FontSize',xlabelFS)
ylim([0 0.9])
xlim([0 6])
"see a jitterless plot."
I have to say that looks far preferable to me than any of the other results. My take is that "if data overlap, they overlap" and to show them as something else is a fabrication.
But, if you want something that isn't going to be random, add a fixed delta to each point that is within some tolerance of its neighbor(s). The number of points in the overlap region in the y direction sets how many offsets are required: for 2N+1 overlapping points, use [-N:N]*dx so the offsets are centered on 0.
Not trying to misrepresent or fabricate data... The idea would be to take the single line of points and spread them horizontally to visualize all of the points at a given value... Rather than trying to see subtle changes in opacity, you could clearly distinguish data points. You are 100% correct that similar y values complicate manually spreading the data points. I'm debating the idea with my PI and will likely go with something similar to the last plot or export the data to Prism for making figures for publication. Thanks for your help!
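A minimal sketch of that fixed-offset idea, under stated assumptions: the y values, the column position x0, the tolerance tol, and the spacing dx below are all made up for illustration, and grouping is done by chaining neighbors whose gap is at most tol (so a long chain of close values ends up in one row). With real data, remove NaNs before sorting.

```matlab
% Sketch only: spread near-identical y values into rows centered on the column.
y   = sort([0.50 0.51 0.52 0.70 0.71 0.90]).';  % made-up example values
x0  = 1;       % column position (assumption)
tol = 0.02;    % neighbors closer than this are treated as overlapping (assumption)
dx  = 0.075;   % horizontal spacing between points within a row (assumption)

% Start a new group wherever the gap to the previous point exceeds tol
newGroup = [true; diff(y) > tol];
groupId  = cumsum(newGroup);

% Give each group offsets centered on zero, e.g. 3 points -> [-1 0 1]*dx
xoff = zeros(size(y));
for g = 1:max(groupId)
    idx = find(groupId == g);
    n   = numel(idx);
    xoff(idx) = ((1:n) - (n+1)/2) * dx;
end

scatter(x0 + xoff, y, 'black', 'filled')
xlim([0 2])
```

Because the offsets depend only on the data, the layout is the same on every run, unlike 'XJitter'.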
What you describe is basically the same as a histogram. Why not just use the histogram() command?
We would like to plot the mean +/- error with overlay of the data points. Can you do this with histograms? I have made histograms, but never to do anything like this... Can you share an example? Thanks!
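One option that avoids a histogram altogether, assuming R2020b or later: as far as I know, swarmchart was added in that release and spreads points sideways according to local density, without random jitter. The sketch below uses made-up data for two hypothetical groups and overlays mean +/- SEM with errorbar; the marker sizes and jitter width are arbitrary choices.

```matlab
% Sketch only: swarm of points with mean +/- SEM on top (made-up data).
rng(0)                                               % reproducible example
y    = [0.6 + 0.1*randn(20,1), 0.5 + 0.1*randn(20,1)];  % hypothetical groups
xpos = [1 2];                                        % column positions

m   = mean(y, 1, 'omitnan');                         % mean per column
sem = std(y, 0, 1, 'omitnan') ./ sqrt(sum(~isnan(y), 1));  % SEM per column

hold on
for k = 1:numel(xpos)
    s = swarmchart(xpos(k)*ones(size(y,1),1), y(:,k), 36, 'filled');
    s.XJitterWidth    = 0.4;   % maximum width of the swarm
    s.MarkerFaceAlpha = 0.5;
end
errorbar(xpos, m, sem, 'k_', 'LineWidth', 2, 'MarkerSize', 30)
hold off
xlim([0 3])
xticks(xpos)
xticklabels({'Sham','TBI'})
```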


 Accepted Answer

I think it may make more sense for you to move to a proper violin plot. With this 3rd party file, for example
you can generate plots like the one below. The violin envelope clearly delineates the boundaries of the estimated distribution for each level y, which I think is ultimately what you want.
load Data.mat
hv=violinplot(data,{'Sham','TBI'});
ylabel 'Discrimination Ratio'
for i=1:2
hv(i).ViolinAlpha=0.1;
hv(i).ScatterPlot.MarkerFaceAlpha=0.7;
end
hv(1).ViolinColor='k';
hv(2).ViolinColor='r';
If you wish, the jittered data points can be overlaid as well:
[hv.ShowData]=deal(true); %remove if you don't want data points included

2 Comments

Thanks! I saw the violin plot script, but didn't realize that it could show data points. I'll run it by my PI but this might be the closest we are going to get in Matlab.
x=1; %assumed column position; x was not defined in the snippet as posted
y=sort(data(:,1));
delta=0.075;
dx=zeros(size(y));
dx(1:2)=delta*[-1 1].';
dx(6:end-3)=delta*repmat(-1:1,1,4);
hSc=scatter(x+dx,y,'black','filled');
xlim([0 5])
produces
I didn't work on the differences algorithm much, but look at something like
find(diff(y)<=thresh)
where thresh is about 0.02. If you want more separation, increase delta.
I can count 20 individual points in the above, although a few are still touching with above choices.
ADDENDUM: Remember the length(diff(y)) is one less than length(y) when locating indices to modify.
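To illustrate the indexing point in the addendum, here is a small sketch with made-up, already sorted values. Index k returned by find refers to the gap between y(k) and y(k+1), so both k and k+1 have to be collected to get every point that has a close neighbor.

```matlab
% Sketch only: map indices of small gaps back to indices of points.
y      = [0.10 0.11 0.30 0.31 0.32 0.60].';  % sorted, made-up values
thresh = 0.02;                               % assumed closeness threshold

k = find(diff(y) <= thresh);   % gap k lies between y(k) and y(k+1)
closeIdx = unique([k; k+1]);   % all points that have at least one close neighbor
```

Here k is [1;3;4] and closeIdx is [1;2;3;4;5], so only the last point is left at its original x position.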


More Answers (2)

This may be more along the lines of what you were originally looking for
load Data;
w=max( max(data,[],1) - min(data,[],1) ); %Maximum swarm width
xoffset=w; %separation between swarms
P=size(data,2);
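for n=1:P
    y=data(:,n);
    bins=linspace(min(y),max(y),15); %14 bins spanning the data range
    [counts,~,G]=histcounts(y,bins); %G(k) is the bin index of y(k)
    sumc=sum(counts);
    x=nan(size(y));
    for i=1:numel(counts)
        J=(G==i);
        N=counts(i);
        if N<2
            x(J)=0;
        else
            %spread the N points of this bin evenly, width proportional to N
            x(J)=(w*N/sumc)*linspace(-0.5,+0.5,N);
        end
    end
    x=x+n*xoffset;
    h(n)=scatter(x,y,'filled'); %index by column n, not the inner-loop index i
    axis equal
    hold on
end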
hold off
xticks((1:P)*xoffset)
xlim([xoffset/2,P*w+xoffset/2])
xticklabels({'Sham','TBI'})
Steven Lord on 7 May 2021
When I look at your plot it looks to me an awful lot like the boxplot thumbnail in the Plots tab of the Toolstrip. You may want to investigate if that can display the information in a less cluttered approach than a simple scatter plot.
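A rough sketch of that suggestion, with one substitution stated plainly: it uses boxchart (base MATLAB, R2020a or later as far as I know) rather than boxplot, which needs the Statistics and Machine Learning Toolbox. The data and group codes are made up, and the points are overlaid with the same 'XJitter' option used in the question.

```matlab
% Sketch only: box chart per group with the raw points overlaid (made-up data).
rng(1)                                              % reproducible example
y = [0.6 + 0.1*randn(20,1); 0.5 + 0.1*randn(20,1)]; % hypothetical values
g = [ones(20,1); 2*ones(20,1)];                     % 1 = Sham, 2 = TBI (assumed)

boxchart(g, y, 'BoxFaceAlpha', 0.1)
hold on
scatter(g, y, 20, 'k', 'filled', 'MarkerFaceAlpha', 0.5, ...
    'XJitter', 'density', 'XJitterWidth', 0.3)
hold off
xticks([1 2])
xticklabels({'Sham','TBI'})
ylabel('Discrimination Ratio')
```

Note that a box chart shows the median and quartiles, not mean +/- SEM, so it answers a slightly different question than the original plot.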

Release

R2020b
