Efficiently assign data into a struct?

Question

Mack on 16 Oct 2025


https://nl.mathworks.com/matlabcentral/answers/2180657-efficiently-assign-data-into-a-struct

Commented: dpb on 20 Oct 2025

I've got a really large set of kinematic data that I'm trying to organize into structs and sub-structs. Currently, I'm assigning each struct and sub-struct manually by calling from my data set. I'm doing this because in some instances, the data set was not recorded in the coordinate frame that I want the output to be in. For the purposes of this example, I'm only including one parent struct, however I have multiple (T1, T2, T3). Is there a better/more efficient way of doing this?

% Random data set
T1.data = randi(1000, 19);
% Finding size of data set (in my code it varies...)
T1.size = size(T1.data);
% Assign data to appropriate field
T1.var1.data = T1.data(1:end, [1 2:4]);
T1.var2.data = T1.data(1:end, [1 5:7]);
T1.var3.data = T1.data(1:end, [1 8:13]);
T1.var4.data = T1.data(1:end, [1 14:end]);
% assigning data into structs
T1.var1.x = zeros(T1.size(1),4);
T1.var1.x = T1.var1.data(1:end, 2:4);
T1.var2.x = zeros(T1.size(1),4);
T1.var2.x = T1.var2.data(1:end, 2:4);
T1.var3.x = zeros(T1.size(1),4);
T1.var3.x = T1.var3.data(1:end, 2:4);
T1.var3.y = zeros(T1.size(1),4);
T1.var3.y = T1.var3.data(1:end, 5:7);
% Redefining coordinate frame for var4
T1.var4.x = zeros(T1.size(1), 4);
T1.var4.x(1:end,1) = T1.var4.data(1:end, 3);
T1.var4.x(1:end,2) = T1.var4.data(1:end, 2);
T1.var4.x(1:end,3) = T1.var4.data(1:end, 4);
T1.var4.y = zeros(T1.size(1), 4);
T1.var4.y(1:end, 1) = T1.var4.data(1:end, 6);
T1.var4.y(1:end, 2) = T1.var4.data(1:end, 5);
T1.var4.y(1:end, 3) = T1.var4.data(1:end, 7);

My end goal is to take derivatives of the kinematics, find resultants and max/mins, and plot. But I'm hoping to organize the initial data in a better way. If anybody has any tips on how to make this code more efficient please let me know!

3 Comments

dpb on 17 Oct 2025

"Use one non-scalar structure."

Or a table with an additional indicator variable that says which variable each row belongs to, moving the meta-data from variable names into actual data. You would then be able to use rowfun and/or groupsummary and/or friends and do all the desired calculations as vector operations without having to iterate through the variable names.

dpb on 17 Oct 2025

Edited: dpb on 18 Oct 2025


To try to wrap my head around what your struct is trying to represent, I rearranged your code by varN which then looks like--

% Random data set
T1.data = randi(1000, 19);
% Finding size of data set (in my code it varies...)
[R,C]=size(T1.data);
Z(R,4)=0;       % a temporary initializing array size Rx4
% Assign data to appropriate field
T1.var1.data = T1.data(:, [1 2:4]);
T1.var1.x = Z;
T1.var1.x = T1.var1.data(:, 2:4);
T1.var2.data = T1.data(:, [1 5:7]);
T1.var2.x = Z;
T1.var2.x = T1.var2.data(:, 2:4);
T1.var3.data = T1.data(:, [1 8:13]);
T1.var3.x = Z;
T1.var3.x = T1.var3.data(:, 2:4);
T1.var3.y = Z;
T1.var3.y = T1.var3.data(:, 5:7);
T1.var4.data = T1.data(:, [1 14:end]);
% Redefining coordinate frame for var4
T1.var4.x = Z;
T1.var4.x(:,1:3) = T1.var4.data(:, [3 2 4]);
T1.var4.y = Z;
T1.var4.y(:,1:3) = T1.var4.data(:,[6 5 7]);

I think this could be simplified significantly; you're making multiple copies of the same data over and over, which is quite inefficient memory usage and adds to the complexity of addressing what it is you want.

You set up a 4-column array in which you later said the 4th column was for results; so one can assume from the subscripting of using 1 as the first column what you have are time, x,y,result? Excepting I don't see y for var1, var2 so they're only single-axis measurements, not 2-axis?

What are the varN? As noted, it appears to me you would be better off with a flat table with each of those as a (perhaps categorical) indicator variable, rather than storing meta-data in variable names, which forces variable addressing in one form or another. While it is possible to use variables as fieldnames, it may not be as convenient as grouping variables, but we don't know enough about what the data are and what the end analysis is to be to do more than conjecture. But certainly storing all the data and then copies of it multiple times seems pointless.


Answer 1

dpb on 18 Oct 2025


https://nl.mathworks.com/matlabcentral/answers/2180657-efficiently-assign-data-into-a-struct#answer_1571252

Edited: dpb on 18 Oct 2025


If I were doing this, I think I'd approach it more like

% Random data set
data = randi(1000, 19);
% Finding size of data set (in my code it varies...)
[R,C]=size(data);
vnames={'Time','Var','X','Y','Z','R'};                              % variable names for table
H=R*2+2*(R*2);                                                      % overall height 
tData=table('Size',[H,numel(vnames)], ...
            'VariableTypes',repmat({'double'},1,numel(vnames)), ...
            'VariableNames',vnames);                                % preallocate the table
% Assign data to appropriate field
tData.Time=repmat(data(:,1),6,1);                                   % time vector for each Var set
tData.Var=[ones(R,1);2*ones(R,1);3*ones(2*R,1);4*ones(2*R,1)];      % varN indicator variable each var
i1=2;
tData.X=reshape(data(:,i1:3:end),[],1);
i1=i1+1;
tData.Y=reshape(data(:,i1:3:end),[],1);
i1=i1+1;
tData.Z=reshape(data(:,i1:3:end),[],1);
tData.R=nan(height(tData),1);
[head(tData,5); tail(tData,5)]
ans = 10×6 table
    Time    Var     X      Y      Z      R 
    ____    ___    ___    ___    ___    ___

    721      1     165    406     91    NaN
    676      1     525    173    727    NaN
    516      1     513    345    828    NaN
    275      1      23     89    478    NaN
    205      1     657    941    195    NaN
    196      4      11    583    998    NaN
    963      4     949    708    643    NaN
    504      4     762    486    960    NaN
    331      4     659    532    124    NaN
    780      4     654    845    422    NaN

With the above, one can process with groupsummary or varfun and many other builtin tools for working with tables and grouping variables.
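As a rough sketch of what that processing could look like, the snippet below uses a tiny made-up table in the same layout as tData (the values are invented so the result is easy to check by hand). It fills the resultant column with vecnorm and then asks groupsummary for the max and min per Var; treat it as an illustration under those assumptions rather than the only way to do it.

```matlab
% Small flat table in the same layout as tData (made-up values)
Var = [1;1;1;2;2;2];
X   = [3;0;1;6;0;2];
Y   = [4;0;2;8;0;3];
Z   = [0;5;2;0;10;6];
t   = table(Var, X, Y, Z);

% Resultant magnitude of every row in one vectorized call (2-norm along dim 2)
t.R = vecnorm([t.X t.Y t.Z], 2, 2);

% Max and min of the resultant within each Var group
stats = groupsummary(t, 'Var', {'max','min'}, 'R');
disp(stats)
```

The output table has one row per Var with the variables GroupCount, max_R and min_R, so no looping over variable names is needed.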

I've certainly guessed at what some things are; particularly in making the assumption that having six columns for Var3 and Var4 instead of 3 was just a duplicated set of data; if they are something else, then create the proper variable for them. As well, for the initial demo I didn't do the Var4 reordering; that is certainly doable as in my earlier comment by creating the custom sequencing vector. Or, it might be simpler to do that reordering first in the raw data table if it is consistent.
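A minimal sketch of doing that reordering up front in the raw data, assuming the 19-column layout from the question (Var4 occupying columns 14 to 19, with the first two axes of each triplet swapped). The vector name order is just an illustrative choice.

```matlab
data = randi(1000, 19);               % stand-in for the recorded data
raw  = data;                          % copy kept only to check the result

% Sequencing vector: columns 1-13 unchanged, then each Var4 triplet
% with its first two axes swapped ([3 2 4] and [6 5 7] in var4.data terms)
order = [1:13, 15 14 16, 18 17 19];
data  = data(:, order);
```

After this, the same stride-3 reshape used above picks up Var4 in the desired coordinate frame with no special case.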

But, this removes all the duplicated data storage of your structure at the expense of one additional variable, Var that could be categorical. If you have multiple datasets as you mention with more than one T, then add that indicator variable as to which T it is (can be numeric or could be an identifiable ID, whatever you choose) and you again replace the meta-data and the very complicated storage pattern with a flat table with very simple addressing modes for virtually anything you care to do.
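To illustrate the multiple-T idea, here is one possible sketch. The cell array raw holds hypothetical matrices for T1, T2 and T3 (random stand-ins with different heights), and the Var grouping repeats the same guess about the column layout as above; Trial is an invented name for the indicator variable.

```matlab
% Hypothetical raw matrices for three trials, all with the 19-column layout
raw = {randi(1000, 19), randi(1000, 25, 19), randi(1000, 12, 19)};

parts = cell(numel(raw), 1);
for k = 1:numel(raw)
    d     = raw{k};
    R     = size(d, 1);
    nSets = (size(d, 2) - 1) / 3;          % number of x/y/z triplets (6 here)
    p       = table();
    p.Trial = repmat(k, nSets*R, 1);       % which T these rows came from
    p.Time  = repmat(d(:,1), nSets, 1);    % time vector repeated per triplet
    p.Var   = repelem([1;2;3;3;4;4], R);   % same grouping guess as above
    p.X     = reshape(d(:, 2:3:end), [], 1);
    p.Y     = reshape(d(:, 3:3:end), [], 1);
    p.Z     = reshape(d(:, 4:3:end), [], 1);
    parts{k} = p;
end

tAll       = vertcat(parts{:});            % one flat table for everything
tAll.Trial = categorical(tAll.Trial);
tAll.Var   = categorical(tAll.Var);
```

Grouping by both Trial and Var, for example groupsummary(tAll, {'Trial','Var'}, 'max', 'X'), then covers every dataset in one call.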

3 Comments

Matt J on 20 Oct 2025

Edited: Matt J on 20 Oct 2025

I tend to think you're better off keeping things in struct form (flat or otherwise), rather than using tables. Tables have some unfortunate performance defects:

https://www.mathworks.com/matlabcentral/answers/2152875-should-table-indexing-be-faster

dpb on 20 Oct 2025

My tendency is to use a table until I'm shown that performance for the particular application isn't "good enough"...if I knew a priori the data were going to be really, really huge, I might change in that instance.

Definitely would try to have it be as flat as possible with struct...trying to handle metadata in naming variables or fields (or file naming conventions is another fairly frequent proposed alternative) adds complexity however one chooses to try to access them.
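If staying with structs, a flatter layout might look something like the sketch below: one non-scalar struct array indexed by variable number instead of fields named var1..var4. The column lists are taken from the question's layout, and the names xCols, yCols and S are only illustrative.

```matlab
data = randi(1000, 19);               % stand-in data, 19 columns as in the question
time = data(:,1);                     % stored once, not copied per variable

% Source columns for each variable's x and y blocks; var4 uses the
% swapped order to redefine its coordinate frame
xCols = {2:4, 5:7, 8:10,  [15 14 16]};
yCols = {[],  [],  11:13, [18 17 19]};

S = struct('x', cell(4,1), 'y', cell(4,1));   % 4x1 struct array, empty fields
for k = 1:4
    S(k).x = data(:, xCols{k});
    if ~isempty(yCols{k})
        S(k).y = data(:, yCols{k});
    end
end

% One line now covers every variable, e.g. the peak x value of each
peakX = arrayfun(@(s) max(s.x(:)), S);
```

The variable number becomes an index, so later steps can loop or use arrayfun over S rather than spelling out each field name.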


Answer 2

Matt J on 17 Oct 2025


https://nl.mathworks.com/matlabcentral/answers/2180657-efficiently-assign-data-into-a-struct#answer_1571201

Edited: Matt J on 17 Oct 2025


You can replace all occurrences of '1:end' with ':' and condense your indexing operations. This, for example,

T1.var4.y = zeros(T1.size(1), 4);
T1.var4.y(1:end, 1) = T1.var4.data(1:end, 6);
T1.var4.y(1:end, 2) = T1.var4.data(1:end, 5);
T1.var4.y(1:end, 3) = T1.var4.data(1:end, 7);

can be replaced with,

T1.var4.y = zeros(T1.size(1), 4);
T1.var4.y(:,1:3)= T1.var4.data(:,[6 5 7]);

Also, things like this don't make sense,

% assigning data into structs
T1.var1.x = zeros(T1.size(1),4);   %remove this line?
T1.var1.x = T1.var1.data(1:end, 2:4);

The first line isn't accomplishing anything except expending CPU time, since you then overwrite T1.var1.x completely with a different matrix, of a different size.

2 Comments

Mack on 17 Oct 2025

Sorry, I should have been more clear on that. I'm setting up the T1.var1.x structs and so on to store kinematic data in the x, y, and z directions, and leaving a 4th column to later calculate the resultants.

dpb on 17 Oct 2025


But what @Matt J is pointing out is that when you write

% assigning data into structs
T1.var1.x = zeros(T1.size(1),4);   %remove this line?
T1.var1.x = T1.var1.data(1:end, 2:4);

what you end up with isn't what you think it is ...

T1.var1.data=randi(100,4);      % an arbitrary 4x4 array 
T1.var1.x=zeros(4);
T1.var1.x=T1.var1.data(:,2:4);  % assign into it
T1.var1
ans = struct with fields:
    data: [4×4 double]
       x: [4×3 double]

you see the resultant array is, as Matt says, overwritten in its entirety and is just the Nx3 array. You would have to write

T1.var1.x=zeros(4);                     % reinitialize the zero array
T1.var1.x(:,1:3)=T1.var1.data(:,2:4);   % assign into it
T1.var1
ans = struct with fields:
    data: [4×4 double]
       x: [4×4 double]

to overwrite only the three first columns.
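Following on from that, a small sketch of filling the reserved 4th column with the resultant afterwards. The matrix d is made-up [time x y z] data chosen so the answer is easy to verify, and using vecnorm for the resultant is my assumption about what "resultant" means here (the Euclidean magnitude).

```matlab
d = [0 3 4 0; 1 6 8 0; 2 1 2 2];      % made-up rows of [time x y z]

x = zeros(size(d,1), 4);              % columns: x, y, z, resultant
x(:,1:3) = d(:,2:4);                  % fill only the first three columns
x(:,4)   = vecnorm(x(:,1:3), 2, 2);   % row-wise magnitude into column 4
```

If the zeros preallocation is only there to hold the resultant, x = [d(:,2:4), vecnorm(d(:,2:4), 2, 2)] builds the same array in one step.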

As a side note that probably isn't particularly pertinent to your end problem with real data, I suspect that

% Random data set
T1.data = randi(1000, 19);

isn't doing precisely what you may think; the first argument is the maximum of the range of random integers between 1 and 1000, the second is the size that will by default be 19x19. This may be deliberate but one presumes in real life the height of the actual data will be much larger although you may have 19 data columns consistently?
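If a taller test array is wanted, randi accepts the row and column counts as separate arguments. The 500 rows here are an arbitrary choice for illustration.

```matlab
nRows = 500;                          % hypothetical number of samples
A = randi(1000, nRows, 19);           % integers in 1..1000, nRows-by-19
B = randi(1000, 19);                  % single size argument gives 19-by-19
```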
