How to split a string and make multiple rows from the original data for each split?
6 views (last 30 days)
Show older comments
Hi everyone,
I have an excel sheet with some students data and it has 11 columns and about 500 rows. I want to manipulate data in the grade and class columns.
The current format is the grade column is written in A,B,D,C format and class is written Chem, Math,Eng, Sci format.
Assuming the order is correct,
I want to split the grades and the classes at the comma and make each grade and class in a new row and duplicating the rest of the data like the name, count etc.
For example: the first row has Smith and his grades A,B,D,C as shown below
I want it to make 4 rows for Smith, each row represent one grade and one class like this
My code:
data = readtable("New Microsoft Excel Worksheet.xlsx")
Conv =string(data.Grades);
convList = [];
for i = 1:length(Conv)
convList = [convList; split(Conv(i), ",")];
end
Conv2 =string(data.Class);
convList2 = [];
for i = 1:length(Conv2)
convList2 = [convList2; split(Conv2(i), ",")];
end
convList2
This code will split string at the comma and I'm stuck trying to duplicate the data for each run. I would really appreciate any help
0 Comments
Accepted Answer
Karim
on 16 Aug 2022
Edited: Karim
on 16 Aug 2022
Hi, there are several methods to do this. One way is to create an index vector and use this to copy the variable. See below :)
data = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/1099480/New%20Microsoft%20Excel%20Worksheet.xlsx");
% split the grades and class into columns
Grades_split = split(string(data.Grades),',');
Class_split = split(string(data.Class),',');
% detele spaces in the 'class' strings
Class_split = strtrim(Class_split);
% reshape into a row vector
Grades_split = reshape(pagetranspose(Grades_split),[],1);
Class_split = reshape(pagetranspose(Class_split),[],1);
% create an index vector
MyIndex = repmat(1:height(data),4,1);
MyIndex = reshape(MyIndex,[],1);
% create the new table based on the index
NewTable = data(MyIndex,:);
% insert splitted data
NewTable.Grades = Grades_split;
NewTable.Class = Class_split;
NewTable
More Answers (1)
Cris LaPierre
on 16 Aug 2022
I think you could get there using the stack function. There is some preprocessing needed first, namely splitting the Grades into their own variables, but that can be done using split and splitvars. To work with strings, set your 'TextType' property to 'string' in readtable.
Here is a brief example.
data = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/1099480/New%20Microsoft%20Excel%20Worksheet.xlsx",...
'TextType','string')
data.Grades = split(data.Grades,",")
dataNew = splitvars(data,'Grades')
dataFinal = stack(dataNew,["Grades_"+(1:4)])
See Also
Categories
Find more on Text Files in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!