Speed Up String Conversion
Hi all. I am trying to speed up the string conversion of a table field, as below:
GoingUC=string(table2cell(Inps(:,5)));
Inps is a table with approximately 730,000 records and 13 fields. I have 6 categorical fields to convert, and the conversion is taking over 2.5 hours, so I wondered if there is a quicker way to do this. I need a string array for the following code, which converts the categorical strings to numbers using a map (which is quick):
[Unique_GoingU,~,GoingU_Numeric_Cats] = unique(GoingUC);
CTNM_GoingU=containers.Map(Unique_GoingU,num2cell(1:length(Unique_GoingU)));
NTD_GoingU=cell2mat(values(CTNM_GoingU,num2cell(GoingUC)));
It all works perfectly for my use, but it would be great if I could speed it up.
Steve Gray
2 Comments
Voss
on 1 May 2024
The third output from unique is the same as the end result (or the transpose of the end result, if GoingUC is a row vector), so using a Map is unnecessary.
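% random 10000x1 string array standing in for the real data
GoingUC = string(randi(10,10000,1));
% third output of unique: index of each element into Unique_GoingU
[Unique_GoingU,~,GoingU_Numeric_Cats] = unique(GoingUC);
% the original Map-based lookup, for comparison
CTNM_GoingU=containers.Map(Unique_GoingU,num2cell(1:length(Unique_GoingU)));
NTD_GoingU=cell2mat(values(CTNM_GoingU,num2cell(GoingUC)));
% both approaches give the same numeric codes
isequal(GoingU_Numeric_Cats,NTD_GoingU)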
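Applied to the original problem, this suggests the Map step can be dropped altogether. A minimal sketch, assuming a small made-up string column in place of the real data (the variable names follow the question; the sample values are hypothetical):

```matlab
% Hypothetical sample data standing in for the real string column
GoingUC = ["Good"; "Soft"; "Good"; "Heavy"; "Soft"];

% The third output of unique is the index of each element of GoingUC
% into Unique_GoingU, i.e. the numeric code the Map lookup produced
[Unique_GoingU,~,NTD_GoingU] = unique(GoingUC);

% Round trip: indexing the unique values by the codes recovers the input
roundTripOK = isequal(Unique_GoingU(NTD_GoingU),GoingUC)
```

Note that the codes follow the sorted order of the unique strings, which is the same ordering the Map was built from in the question.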
Accepted Answer
Voss
on 1 May 2024
Avoid using table2cell for this; instead, access the table data directly (using curly braces {} or, even better, dot indexing).
% 100000x1 table of categoricals
Inps = table(categorical(randi(10,100000,1)))
% using table2cell
tic
str1 = string(table2cell(Inps(:,1)));
toc
% using curly brace indexing
tic
str2 = string(Inps{:,1});
toc
% using dot indexing
tic
str3 = string(Inps.(1));
toc
Accessing the table data directly is more than 100 times faster, and produces the same result:
isequal(str1,str2,str3)
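For the case in the question, where several categorical variables need converting, the two suggestions can be combined in a loop. This is only a sketch under stated assumptions: the table below is random stand-in data, and catCols, codes, and uniq are hypothetical names; the real positions of the six categorical variables in Inps are not given in the question.

```matlab
% Hypothetical stand-in for Inps: three categorical variables
Inps = table(categorical(randi(10,1000,1)), ...
             categorical(randi(5,1000,1)), ...
             categorical(randi(3,1000,1)));

catCols = [1 2 3];   % assumed positions of the categorical variables
codes = zeros(height(Inps),numel(catCols));   % numeric codes, one column per variable
uniq = cell(1,numel(catCols));                % unique strings for each variable

for k = 1:numel(catCols)
    % dot indexing pulls the variable out directly, without table2cell
    str = string(Inps.(catCols(k)));
    % third output of unique gives the numeric code for every row
    [uniq{k},~,codes(:,k)] = unique(str);
end
```

Each column of codes then holds the numeric categories for one variable, and uniq{k}(codes(:,k)) recovers the original strings for variable k.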
3 Comments
Voss
on 1 May 2024
Edited: Voss
on 1 May 2024
You're welcome!
table2cell could be useful for collecting multiple variables of a table into a cell array, particularly if the variables contain different classes of data, although I would most likely just keep the data in table form.
T = table(rand(10,1),cellstr(char(65+randi([0,9],10,5))),string(rand(10,1)))
% table to cell keeps the data classes as they are in the table
C = table2cell(T(:,[1 2 3]))
% but the concatenation required when accessing directly converts
% numeric and cell char to string, in order to combine the
% numeric and cell char table variables with the string variable
T{:,[1 2 3]}
C = [T.(1) T.(2) T.(3)]