Help on plotting a bar graph using a table

Dear Sir/Madam,
I'm having difficulty plotting the data in my table as a bar graph. I have renamed the columns of my table and wish to plot a bar graph of the data with the column heading under each bar. I'm a beginner in MATLAB and I've had a look at the bar help page, but can't seem to solve my issue.
I'm getting the error: Input arguments must be numeric, datetime, duration or categorical.
I would greatly appreciate help with my problem. Thank you in advance.
Best Regards,
Jeevs S
newNames={'Protein A','Protein B','Protein C','Protein D','Protein E','Protein F','Protein G'...
,'Protein H','Protein I','Protein J','Protein K','Protein L''Protein M','Protein N','Protein O','Protein P',...
'Protein Q','Protein R','Protein S','Protein T','Protein U','Protein V'};
T=array2table(data1, 'VariableNames', newNames);
bar(T);

2 Comments

What happened to the lesson about categorical variables we created in <help-with-changing-text-in-a-table#answer_963965> just a day or so ago?
>> tSinghBar=readtable('singhDataBar.xlsx');
>> head(tSinghBar)
ans =
2×21 table
Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11 Var12 Var13 Var14 Var15 Var16 Var17 Var18 Var19 Var20 Var21
____________________ ____________________ ______ ____ ____ ____ ___________________ ____ ___________________ ____________________ _____ ____________________ ______ _____ _____ _____ ___________________ _____ ___________________ ____________________ _____
1 2 NaN 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0.000110990692337116 0.000194477028347996 0.0001 0 0 0 9.6678862932733e-05 0 0.00026838526259599 0.000239223260270692 0 0.000194477028347996 0.0001 0 0 0 9.6678862932733e-05 0 0.00026838526259599 0.000239223260270692 0
>>
What does any of the above mean? How do you expect to plot whatever this is?
Similar to what we illustrated there, the way/place to put the variable names is in the table if you're going to make a table,
>> tSinghBar.Properties.VariableNames=compose("Protein %c",['A':'U'].')
tSinghBar =
2×21 table
Protein A Protein B Protein C Protein D Protein E Protein F Protein G Protein H Protein I Protein J Protein K Protein L Protein M Protein N Protein O Protein P Protein Q Protein R Protein S Protein T Protein U
____________________ ____________________ _________ _________ _________ _________ ___________________ _________ ___________________ ____________________ _________ ____________________ _________ _________ _________ _________ ___________________ _________ ___________________ ____________________ _________
1 2 NaN 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0.000110990692337116 0.000194477028347996 0.0001 0 0 0 9.6678862932733e-05 0 0.00026838526259599 0.000239223260270692 0 0.000194477028347996 0.0001 0 0 0 9.6678862932733e-05 0 0.00026838526259599 0.000239223260270692 0
>>
although there are only enough columns in the data file for A thru U, not V.
But, the above doesn't seem to be a candidate for a flat table at all -- if one gets out the crystal ball, perhaps the numbers are counts of the second record values???
>> tSinghBar{1,:}.'
ans =
1
2
NaN
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
>>
Well, no, that doesn't make any sense either; they're just ordinal numbers with one missing. We're stumped at what you think these data are, sorry...no idea what to try to do with it as is.
Dear dpb,
Apologies for the error in the data table. I have updated the data, given by 'data2'
T=readtable('data2.xlsx');
head(T);
Thank you for having a look at my problem
Jeevs S


 Accepted Answer

There are mismatches between the number of names and the number of variables, and a missing comma between ‘Protein L’ and ‘Protein M’. I am not certain what the first row is for, or how to use it here, since if used with bar3, it dominates the plot.
Anyway, try this —
T = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/999080/data1.xlsx');
T{:,end+1} = [22; rand*1E-4]; % Add Variable So That Everything Works
newNames={'Protein A','Protein B','Protein C','Protein D','Protein E','Protein F','Protein G'...
,'Protein H','Protein I','Protein J','Protein K','Protein L','Protein M','Protein N','Protein O','Protein P',...
'Protein Q','Protein R','Protein S','Protein T','Protein U','Protein V'};
T.Properties.VariableNames = newNames
T = 2×22 table
Protein A Protein B Protein C Protein D Protein E Protein F Protein G Protein H Protein I Protein J Protein K Protein L Protein M Protein N Protein O Protein P Protein Q Protein R Protein S Protein T Protein U Protein V
__________ __________ _________ _________ _________ _________ __________ _________ __________ __________ _________ __________ _________ _________ _________ _________ __________ _________ __________ __________ _________ __________
1 2 NaN 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22
0.00011099 0.00019448 0.0001 0 0 0 9.6679e-05 0 0.00026839 0.00023922 0 0.00019448 0.0001 0 0 0 9.6679e-05 0 0.00026839 0.00023922 0 1.9708e-05
figure
bar(T{2,:})
set(gca, 'XTick',1:numel(newNames), 'XtickLabel',newNames)
.
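To spell out why the call above works where bar(T) did not: a minimal sketch, assuming a small made-up table in place of the uploaded spreadsheet (the three names and the values are illustrative only). Parentheses return another table, which bar rejected in the release used in this thread, while braces return the underlying numeric array. The 1:numel(...) part only puts one tick under each bar so the labels line up in order.

```matlab
% Made-up stand-in for the uploaded spreadsheet (values are illustrative only)
names = {'Protein A','Protein B','Protein C'};
T = array2table([1 2 3; 1.1e-4 1.9e-4 1.0e-4], 'VariableNames', names);

asTable = T(2,:);   % parentheses keep the table container
asArray = T{2,:};   % braces return the underlying 1x3 double

figure('Visible','off')   % hidden figure so the sketch also runs without a display
bar(asArray)              % numeric input, so bar accepts it
% 1:numel(names) places one tick under each bar; XTickLabel names the ticks in order
set(gca, 'XTick', 1:numel(names), 'XTickLabel', names)
```

The number of labels has to equal the number of bars for the ticks and names to pair up one to one.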

8 Comments

Yeah, if you ignore that record and fix up the number of variables one way or t'other (I just named the ones in the file instead of making up another), then
tSinghBar.Properties.VariableNames=compose("Protein %c",'A'+ [0:size(tSinghBar,2)-1].');
bar(categorical(tSinghBar.Properties.VariableNames),tSinghBar{2,:})
produces the same plot for the first 20 instead of for 21 where the x axis is categorical and the names come along "for free".
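One caveat on the categorical axis, shown as a minimal sketch with made-up names and values: categorical sorts its categories alphabetically by default, so the bars follow that order. With names already in A, B, C... order this makes no difference, but if the labels are not alphabetical, passing the value set as a second argument keeps them in the order given.

```matlab
% Hypothetical labels in a deliberately non-alphabetical order, with made-up values
names = ["Protein C", "Protein A", "Protein B"];
vals  = [3e-4, 1e-4, 2e-4];

xSorted = categorical(names);          % categories come out sorted: A, B, C
xAsIs   = categorical(names, names);   % second argument fixes the category order: C, A, B

figure('Visible','off')   % hidden figure so the sketch also runs without a display
bar(xAsIs, vals)          % bars are drawn in category order
```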
Thank you both, dpb and Star Strider; both of your solutions have solved my problem. I greatly appreciate the massive help.
Apologies for the error in the data, I have updated the data in the original post.
In Star Strider's post, does the code below assign the names in 'newNames' to each bar, using numel?
1:numel(newNames)
Also, in dpb's solution, what does the first line of code do? I'm having trouble understanding the function inside compose, copied below:
tSinghBar.Properties.VariableNames=compose("Protein %c",'A'+ [0:size(tSinghBar,2)-1].');
Thank you both for helping me. Have a great week !
Mr Singh
There are 20 variables in the table and 22 values in ‘newNames’. They must match for this to work.
The compose call creates something approximating ‘newNames’, but with exactly as many names as there are variables in the table (which ‘newNames’ does not have).
T = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/999665/data2.xlsx')
T = 2×20 table
Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11 Var12 Var13 Var14 Var15 Var16 Var17 Var18 Var19 Var20
__________ __________ ______ ____ ____ ____ __________ ____ __________ __________ _____ __________ ______ _____ _____ _____ __________ _____ __________ __________
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0.00011099 0.00019448 0.0001 0 0 0 9.6679e-05 0 0.00026839 0.00023922 0 0.00019448 0.0001 0 0 0 9.6679e-05 0 0.00026839 0.00023922
% T{:,end+1} = [22; rand*1E-4]; % Add Variable So That Everything Works
newNames={'Protein A','Protein B','Protein C','Protein D','Protein E','Protein F','Protein G'...
,'Protein H','Protein I','Protein J','Protein K','Protein L','Protein M','Protein N','Protein O','Protein P',...
'Protein Q','Protein R','Protein S','Protein T','Protein U','Protein V'}
newNames = 1×22 cell array
{'Protein A'} {'Protein B'} {'Protein C'} {'Protein D'} {'Protein E'} {'Protein F'} {'Protein G'} {'Protein H'} {'Protein I'} {'Protein J'} {'Protein K'} {'Protein L'} {'Protein M'} {'Protein N'} {'Protein O'} {'Protein P'} {'Protein Q'} {'Protein R'} {'Protein S'} {'Protein T'} {'Protein U'} {'Protein V'}
T.Properties.VariableNames = newNames
The VariableNames property must contain one name for each variable in the table.
figure
bar(T{2,:})
set(gca, 'XTick',1:numel(newNames), 'XtickLabel',newNames)
I will revisit this when the necessary dimensions match.
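In the meantime, one way to sidestep the mismatch is to take only as many names as the table has variables. This is a minimal sketch, assuming a synthetic 2×20 array as a stand-in for data2.xlsx (the random values are not the real data); width(T) gives the number of table variables.

```matlab
% Synthetic stand-in for data2.xlsx: ordinal row plus made-up measurements
data = [1:20; rand(1,20)*1e-4];
T = array2table(data);

% 22 candidate names, 'Protein A' through 'Protein V'
newNames = compose("Protein %c", 'A' + (0:21).');

nVar = width(T);   % number of variables actually in the table (20 here)
assert(numel(newNames) >= nVar, 'Not enough names for the table variables.')
T.Properties.VariableNames = newNames(1:nVar).';   % use only the first nVar names

figure('Visible','off')   % hidden figure so the sketch also runs without a display
bar(T{2,:})
set(gca, 'XTick', 1:nVar, 'XTickLabel', T.Properties.VariableNames)
```

Taking the labels from T.Properties.VariableNames rather than from newNames means the tick labels can never disagree with the table.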
.
@gurjeevan singh -- break it down from inside out --
compose("Protein %c",'A'+ [0:size(tSinghBar,2)-1].')
size(tSinghBar,2) % number columns actually in the table
% could have used numel(tSinghBar.Properties.VariableNames) as well
'A'+ [0:size(tSinghBar,2)-1].' % build vector of that many elements
% try it at command line -- MATLAB adds automagically and converts to double
compose("Protein %c",'A'+ [0:size(tSinghBar,2)-1].'); % and convert to the string appending letter
% NB the -1 for zero-based counting to add to the starting letter
Then set the variable names in the table to the result. Done deliberately in part to illustrate such manipulations can be automated to use the available data -- you don't need to type all those in by hand and you can make the result match the table size read in to eliminate the mismatch problems.
We had already illustrated using colon expansion with a known range in the last Q? in creating the categorical variables set names as something like
"Protein "+['A':'C'].'
using the overloaded plus operator on the string class; this illustrated how to get the letter range programmatically as well.
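For anyone who wants to see the intermediate values, here is the same inside-out breakdown for just three names; a minimal sketch where n = 3 is an arbitrary choice standing in for size(tSinghBar,2).

```matlab
n = 3;                        % stand-in for size(tSinghBar,2)

offsets = (0:n-1).';          % [0; 1; 2], zero-based so the first letter stays 'A'
codes   = 'A' + offsets;      % char + double gives double: [65; 66; 67]
names   = compose("Protein %c", codes);   % %c turns each code back into a letter

% Same result via string concatenation on a known letter range
alt = "Protein " + ('A':'C').';

disp(names)
```

The transpose matters in the second form: each row of a char array becomes one string, so the column 'A';'B';'C' gives three names, whereas the row 'ABC' would be treated as a single piece of text.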
Thank you both for your explanations @dpb @Star Strider.
I'm working more on analysis with similar datasets. This is of great help!
Thank you !
OK, you fixed the mismatch it appears, but what's the point of having the ordinal numbers for the first row here? All that does is confuse readtable on import into thinking that you have two lines of data unless you go to the trouble of specifying 'NumHeaderLines',1 and even if you do that, you then get the non-helpful column variable names 'Var1', 'Var2', ...
Why don't you set the first row to the real variable names in the spreadsheet to start with -- would make it much easier to read/use from the user-friendly aspect inside Excel. Even if you just use 'A', 'B', etc., would seem to help immensely.(*)
Alternatively, as shown you can build names programmatically, but if you're going to do that and the Excel file is being built automatically not by hand, then why not just forget the header line entirely?
Also, depending on how/where this is going, a table may not be the best way to do this. As the example showed, if it is a table, plotting with the columns as the independent variable means pulling the data out as an array and manipulating the variable names, or keeping a duplicate set of variable names, which is redundant data to write out manually, as Star Strider did.
It would seem better in this case to orient the spreadsheet the other way: list the proteins going down by row and the values by column. Then you can use the first column as the independent variable "Protein" and the second for whatever it is that the number is measuring. If there are more observations or things measured, they become additional columns. That organization would simplify the code even further; you could even read the protein ID as categorical on import. It would also make the sheet easier to read: 20 rows show easily on a screen; 20 columns "not so much"...
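If the spreadsheet itself has to stay wide, the same row-per-protein layout can be built after import. This is a minimal sketch, assuming a small made-up wide table in place of the real file; the names wide, tall, 'Protein' and 'Measure1' are illustrative choices, not anything the data requires.

```matlab
% Made-up wide table: ordinal row on top, measurements underneath
wide = array2table([1:5; 1.1e-4 1.9e-4 1.0e-4 0 0]);

% One single-letter label per column, built from the column count
labels = compose("%c", 'A' + (0:width(wide)-1).');

% One row per protein; categorical(labels, labels) keeps the given order
tall = table(categorical(labels, labels), wide{2,:}.', ...
             'VariableNames', {'Protein', 'Measure1'});

figure('Visible','off')   % hidden figure so the sketch also runs without a display
bar(tall.Protein, tall.Measure1)
```

Further measurements would simply become extra columns of tall.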
(*) ADDENDUM: If there is some reason to use 1:N as an ID instead of the letter or other label, then I'd suggest (if also must keep the horizontal organization)
tData=readtable('yourfile.xlsx','ReadVariableNames',1);
Then the column names will be "1", "2", ... "N" which are easy to address dynamically and programmatically (not that other forms can't be, it is just especially trivial). I'd still think the column orientation would work far better here, though.
Thank you for the feedback. The data above is part of a larger dataset and converting to columns would not be ideal for the data given. However, I most definitely agree with your points above. Thank you for helping with my problem and giving more insight into MATLAB coding.
Attach a representative section of the dataset...there still may be "more better" ways to import/use it...we can only see through a tiny peephole here.


More Answers (1)

dpb
dpb on 16 May 2022
Edited: dpb on 16 May 2022
Pursuant to the previous comments; my suggestions would result in something like
>> tSinghBar=readtable('data2.xlsx');
>> tSinghBar.Protein=categorical(tSinghBar.Protein)
tSinghBar =
20×2 table
Protein Measure1
_______ __________
A 0.00011099
B 0.00019448
C 0.0001
D 0
E 0
F 0
G 9.6679e-05
H 0
I 0.00026839
J 0.00023922
K 0
L 0.00019448
M 0.0001
N 0
O 0
P 0
Q 9.6679e-05
R 0
S 0.00026839
T 0.00023922
>> bar(tSinghBar.Protein,tSinghBar.Measure1)
>>
I attached the updated/reformatted Excel file for your convenience/viewing pleasure...

Asked: 15 May 2022
Commented: dpb on 16 May 2022
