Comma delimited csv with letters and numbers

4 views (last 30 days)
I have a dataset coming in from a .csv file as a table. But when imported the data appears all on one column delimitd by commas'
in the format B,1,2,B,B. for 60 rows.
I am trying to separate each digit into a seperate column and be able to access the numberes while ignoring the B's
I've tried replacing the B's wiht NaN. but 'First argument must be a string array, character vector, or cell array of character vectors.'
I've tried Table2array but that just creates a 60x1 categorical variable without separating each B or Number into a column,
Is there anything else I could try?
  9 Comments
Athrey Ranjith Krishnanunni
Try this, with filename replaced with the full name of your file, and nHeaderLines replaced with the number of header (non-data) lines at the top of your csv file:
opts = detectImportOptions(filename,'NumHeaderLines',nHeaderLines);
Table = readtable(filename,opts);

Sign in to comment.

Accepted Answer

Cris LaPierre
Cris LaPierre on 7 Jan 2021
Because your columns all contain Bs, they are going to be read in as text instead of numbers. You mention you want to turn them into NaNs. In that case, you can force MATLAB to read the columns as doubles, causing the Bs to automatically be NaNs.
However, in your case, DATA_TYPE is one variable in a table of other variables.
load Datafile.mat
head(Frame)
ans = 8x11 table
DATA_TYPE MOVIE VarName3 VarName4 VarName5 VarName6 VarName7 VarName8 VarName9 VarName10 VarName11 ______________________________________________ _____ __________ __________ __________ ________ ________ __________ __________ __________ __________ B,B,B,B,B,0,0,0,0,0,0,0,B,B,B,B,B,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char} B,B,B,B,0,0,0,0,0,0,0,0,0,B,B,B,B,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char} B,B,B,B,0,0,0,0,0,0,0,0,0,0,B,B,B,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char} B,B,B,0,0,23,35,105,23,0,0,0,0,0,0,B,B,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char} B,B,B,0,0,23,27,19,19,0,0,0,0,0,0,B,B,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char} B,B,0,0,0,27,23,0,0,0,0,0,0,0,0,0,B,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char} B,B,0,0,23,23,23,0,0,0,0,0,0,0,0,0,B,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char} B,B,0,0,50,54,21,0,0,0,0,0,0,0,0,0,0,B,B,B,B NaN {0×0 char} {0×0 char} {0×0 char} NaN NaN {0×0 char} {0×0 char} {0×0 char} {0×0 char}
It likely got read in this way due to the formatting of the original file. We might be able to help with the original import if you share the original file. Still, it's possible to take the table you've shared and turn DATA_TYPE into a matrix of doubles.
Frame.DATA_TYPE = string(Frame.DATA_TYPE);
DT = rowfun(@(s)strsplit(s,','),Frame(:,"DATA_TYPE"));
DT.Var1 = double(DT.Var1);
DT = table2array(DT)
DT = 60×21
NaN NaN NaN NaN NaN 0 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 0 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 23 35 105 23 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 23 27 19 19 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN NaN 0 0 0 27 23 0 0 0 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 0 0 23 23 23 0 0 0 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 0 0 50 54 21 0 0 0 0 0 0 0 0 0 0 NaN NaN NaN NaN NaN NaN 0 0 45 50 33 0 0 0 0 0 0 0 0 0 0 NaN NaN NaN NaN NaN 0 0 19 36 36 19 0 0 0 0 0 0 0 0 0 0 0 NaN NaN NaN
  10 Comments
Cris LaPierre
Cris LaPierre on 7 Jan 2021
That works, too. I'd suggest using the setvartype function. This avoids needing to use repmat.
opts = setvartype(opts,'double');
Athrey Ranjith Krishnanunni
Yes, that's a good idea. Updated code:
opts = detectImportOptions('FowardWalk.csv','NumHeaderLines',33);
opts = setvartype(opts,'double');
% put below lines in a loop and change the DataLines parameter
% accordingly if you want to group the data by frames (you should
% add another index to "array" in that case)
opts.DataLines = [34 93; 96 155; 158 217; 220 279];
table = readtable('FowardWalk.csv',opts);
array = table2array(table);

Sign in to comment.

More Answers (0)

Products


Release

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!