How to apply a set of operations to each field in a structured array?

Hi,
Let's say I have a structured array with multiple fields in it. Each of those can hold more fields or a timetable. I attached an example MAT-file for you to see, but here is the general idea:
Struct_arr:
  -Field1:
    -Subfield1:
      -Table1
      -Table2
    -Subfield2:
      -Table1
      -Table2
  -Field2:
    -Subfield1:
      -Table1
      -Table2
    -Subfield2:
      -Table1
      -Table2
Let's say I would like to perform the following operation on each of the tables, where I get the average of a certain table variable grouped by month (taken from the tstamp of the timetable):
groups = findgroups(extractAfter(string(eg_struct.Field1.SubField1.E1.a),3));
grouped_avg = splitapply(@mean,eg_struct.Field1.SubField1.E1.c,groups); %Same table variable is used (in this case lets call it c)
How can I repeat this operation for each table in the structures in an iterative manner?
Any help is appreciated,
Thanks!
  1 Comment
Eric Sofen on 13 Aug 2021
Dealing with structured deep hierarchical data like this can be tricky. I wrote out several ways of doing this and a few code snippets, and then ultimately decided that the recursive function is the most general way to tackle it.
But first, a few questions. Are all of the tables the same height (I think so, based on reusing groups)? Do they have the same variables? If the tables are all the same height, one option is to dig the timetables out of the struct and horzcat them together. The wrinkle here is that you'll need to rename variables so they're unique.
Another option (my preferred solution), especially if all the tables have the same variables, is to dig them out, add the fieldnames as grouping variables, vertcat them together, and then do your findgroups/splitapply with multiple grouping variables.
You've probably found that structfun only works on the top layer of fields. Rather than structfun(@(s) structfun(@...)), it's probably more readable and easier to write an explicit nested for loop, using fieldnames and dynamic field names (that is, the name is held in a variable and passed to the struct using parentheses: s.(fld)) to get the things to loop over.
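As a rough illustration of the nested-loop idea, here is a minimal sketch. It assumes exactly three levels (field, subfield, timetable) and that every timetable has a variable named c. The struct s, the timetable tt, and the collectors paths and avgs are made-up names for this example, not part of the attached file. Rows are grouped by year and month of the row times.

```matlab
% Hypothetical data: Field -> SubField -> timetable, each with a variable c.
tstamp = datetime(2021,1,1) + caldays([0; 10; 35; 40]);  % two January rows, two February rows
tt = timetable(tstamp, [1; 3; 5; 7], 'VariableNames', {'c'});
s = struct();
s.Field1.SubField1.E1 = tt;
s.Field1.SubField2.E1 = tt;
s.Field2.SubField1.E1 = tt;

paths = strings(0,1);   % one entry per table visited, e.g. "Field1.SubField1.E1"
avgs  = {};             % monthly means for the matching entry in paths
for f1 = string(fieldnames(s))'
    for f2 = string(fieldnames(s.(f1)))'
        for f3 = string(fieldnames(s.(f1).(f2)))'
            tbl = s.(f1).(f2).(f3);              % dynamic field names
            rt = tbl.Properties.RowTimes;        % row times, whatever the time dimension is called
            g = findgroups(year(rt), month(rt)); % one group per calendar month
            paths(end+1,1) = join([f1 f2 f3], ".");
            avgs{end+1,1}  = splitapply(@mean, tbl.c, g);
        end
    end
end
```

This only works when the depth is fixed and known in advance, which is the limitation the recursive approach is meant to remove.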
But even better, write something to do it recursively! The tricky bit is we don't necessarily know the depth of the nested structs in general, so we need to keep track of fieldnames and occasionally patch up the tables so that they always have the same variables.
load("~/Downloads/example.mat")
t = eg_struct.Field2.SubField2.E1([],:);
prevNames = string([]);
% put this in to make the struct an un-balanced hierarchy to make sure the
% recursion code handles it.
eg_struct.Field3 = eg_struct.Field2.SubField2.E1(1:10,:);
t = recursiveTableFromStruct(t,eg_struct,prevNames);
% With this many grouping variables, I think you're right that
% findgroups/splitapply is conceptually easiest.
t.Mon = t.a.Month;
t.Yr = t.a.Year;
groups = findgroups(t(:,["Mon","Yr","f1","f2","f3"]));
tmean = splitapply(@mean, t.c, groups)
function [t, prevNames] = recursiveTableFromStruct(t, s, prevNames)
for f = string(fieldnames(s))'
    if istimetable(s.(f))
        t1 = s.(f);
        names = [prevNames, f];
        % tag every row with the fieldnames on the path to this table
        for i = 1:numel(names)
            t1.("f"+i)(:) = categorical(names(i));
        end
        % before vertcatting, need to make sure the original table and the
        % new table aren't missing some variables tracking deeper levels of
        % fieldnames.
        if width(t) < width(t1)
            % t needs higher-numbered "fN" varnames.
            newvars = setdiff(string(t1.Properties.VariableNames),string(t.Properties.VariableNames));
            for i = newvars
                t.(i)(:) = missing;
            end
        elseif width(t1) < width(t)
            % t1 needs higher-numbered "fN" names
            newvars = setdiff(string(t.Properties.VariableNames),string(t1.Properties.VariableNames));
            for i = newvars
                t1.(i)(:) = missing;
            end
        end
        t = [t;t1];
    elseif isstruct(s.(f))
        % recurse
        prevNames = [prevNames, f];
        [t, prevNames] = recursiveTableFromStruct(t,s.(f),prevNames);
    else
        % neither a timetable nor a struct: stop with a descriptive error
        error("recursiveTableFromStruct:unsupportedType", "Field %s has unsupported class %s.", f, class(s.(f)))
    end
end
% remove a level of prevNames, because going up a level.
prevNames = prevNames(1:end-1);
end
If you don't want to rearrange the layout of your data, you could do the splitapply in the loop and assign the result to an output struct that's also built up using dynamic fieldnames.
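A minimal sketch of that variant, assuming every leaf is a timetable containing the requested variable. The function name monthlyMeanPerTable, the struct s, and the sample timetable are hypothetical names introduced for this example. The output struct mirrors the layout of the input, with each timetable replaced by its per-month means (grouped by year and month of the row times).

```matlab
% Hypothetical, deliberately unbalanced example data.
tstamp = datetime(2021,1,1) + caldays([0; 10; 35; 40]);  % two January rows, two February rows
tt = timetable(tstamp, [1; 3; 5; 7], 'VariableNames', {'c'});
s = struct();
s.Field1.SubField1.E1 = tt;
s.Field1.SubField1.E2 = tt;
s.Field2 = tt;                       % a timetable directly under the top level

out = monthlyMeanPerTable(s, "c");   % out.Field1.SubField1.E1, out.Field2, ...

function out = monthlyMeanPerTable(s, varName)
% Walk the struct s recursively and build a struct with the same field
% layout, replacing each timetable by the monthly means of varName.
out = struct();
for f = string(fieldnames(s))'
    node = s.(f);
    if istimetable(node)
        rt = node.Properties.RowTimes;
        g = findgroups(year(rt), month(rt));        % one group per calendar month
        out.(f) = splitapply(@mean, node.(varName), g);
    elseif isstruct(node)
        out.(f) = monthlyMeanPerTable(node, varName);   % go one level deeper
    else
        error("monthlyMeanPerTable:unsupportedType", ...
            "Field %s has unsupported class %s.", f, class(node));
    end
end
end
```

Because the result keeps the original hierarchy, no bookkeeping of field names is needed, at the cost of not being able to compare groups across tables in a single findgroups call.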
A couple related examples from other Answers posts: here and here.

Release

R2020b