Splitting a table into smaller ones based on two columns

Question

Caroline on 2 Oct 2019


https://nl.mathworks.com/matlabcentral/answers/483259-splitting-a-table-into-smaller-ones-based-on-two-columns

Commented: Adam Danz on 3 Oct 2019

Hello,

The code I use produces a master table with 8 columns and around 800 rows. I would like to sort the data in the following steps:

1. Take the first line of data from the master table and look at the values in columns six and eight.
2. Find all other rows in the table with a column six value within 0.5 of the line one column six value and a column eight value within 2 of the line one column eight value.
3. Create a new table with this data.
4. Delete the sorted data from the master table.
5. Repeat steps 1 to 4 until no data is left.
6. Be left with multiple tables.

Is there a way to do this?

Thank you for your help.


Answer 1

Adam Danz on 2 Oct 2019


https://nl.mathworks.com/matlabcentral/answers/483259-splitting-a-table-into-smaller-ones-based-on-two-columns#answer_394475

Edited: Adam Danz on 3 Oct 2019


That can easily be done in a loop, but it doesn't sound like a good idea. Breaking apart well-organized tabular data into sub-tables is like moving into a new house by unpacking each box in the driveway and carrying in each item individually rather than just carrying in the box. Keep the data together whenever possible.

Instead, each row of the table can be assigned a subgroup number, and you can then use those group numbers to pull out data as needed.

Here's a functional demo with comments to illustrate this method. 'rowGroup' is used to identify subtable rows.

% Create demo data
T = array2table(rand(20,8).*2); 
T{:,8} = T{:,8} * 5; 
% Identify the group number of each row based on
% col 6 & 8 values and their given tolerance levels.
rowGroup = zeros(size(T,1),1); % This will store the group number for each row
while any(rowGroup==0)
    % find next unassigned row, starting at the top
    rowNum = find(rowGroup==0,1,'first'); 
    % find all rows in col 6 that are within tolerance 
    group1idx = abs(T{rowNum,6} - T{:,6}) <= 0.5; 
    % find all rows in col 8 that are within tolerance 
    group2idx = abs(T{rowNum,8} - T{:,8}) <= 2.0; 
    % identify the rows that fit into this group
    rowGroup(group1idx & group2idx & rowGroup==0) = max(rowGroup)+1;    
end
% rowGroup is a column vector of group numbers (one per row of T) that identify the subgroups.
% Your values will differ due to using random data
% >> rowGroup(1:5)
% ans =
%      1 
%      2
%      3
%      4
%      4
% now you can access sub-groups of data like this 
T(rowGroup==1,:) % for group 1

To see the number of subgroups and the number of rows within each subgroup,

subgroupSummary = table((min(rowGroup):max(rowGroup))', ...
    histcounts(rowGroup,min(rowGroup):max(rowGroup)+1)', ...
    'VariableNames', {'Group', 'nRows'})
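
As a rough sketch of how the group numbers can be used without splitting the table, the snippet below computes per-group statistics and, only if really needed, builds the sub-tables on demand. The small table, its variable names 'A' and 'B', and the hand-written rowGroup vector are made-up stand-ins for illustration; in practice rowGroup would come from the loop above.

```matlab
% Hypothetical example data (not from the question)
T = array2table([(1:6)', [10 10.2 13 13.1 10.4 20]'], ...
    'VariableNames', {'A', 'B'});
rowGroup = [1; 1; 2; 2; 1; 3];   % assumed group number of each row

% Mean of column B within each group, computed on the intact table
groupMeans = splitapply(@mean, T.B, rowGroup);

% Number of rows in each group
groupCounts = accumarray(rowGroup, 1);

% If separate tables are required after all, collect them in a cell array
subTables = arrayfun(@(g) T(rowGroup == g, :), (1:max(rowGroup))', ...
    'UniformOutput', false);
```

splitapply expects the group numbers to be positive integers with no gaps, which is what the loop above produces because each new group is numbered max(rowGroup)+1.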


Caroline on 3 Oct 2019

Edited: Caroline on 3 Oct 2019

Thank you very much for your help; this solves my problem.

Adam Danz on 3 Oct 2019

Glad I could help out!


Answer 2

David K. on 2 Oct 2019


https://nl.mathworks.com/matlabcentral/answers/483259-splitting-a-table-into-smaller-ones-based-on-two-columns#answer_394476

Edited: David K. on 2 Oct 2019


I would do it as such:

n = 1;
newTables = {}; % cell array that will hold the sub-tables
a = masterTable; % placeholder name: substitute your own table here
% (I used table([1;2;3;4;5],[1;5;1;3;1]) and its two columns to sort of test)
while ~isempty(a) % loop while your table still has rows
    % Logical index of the rows that match row 1 within tolerance. The & between
    % the column 6 and column 8 tests might be an or (|) if you want all the rows
    % that satisfy the col 6 or col 8 requirement instead of col 6 AND col 8.
    % These indices can also be computed more cleanly the way Adam does.
    toRem = abs(a{:,6} - a{1,6}) < 0.5 & abs(a{:,8} - a{1,8}) < 2;

    newTables{n} = a(toRem,:); % put the found rows into a new table inside a cell array
        % a cell array is the easiest way to store different sized variables created in a loop
    a(toRem,:) = [];  % remove them from the original table
    n = n + 1;    % increment n
end

All of your resultant tables will now be in newTables.
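
For illustration, here is one way the resulting cell array might be inspected and, if needed, stacked back together. The three small tables below are hypothetical stand-ins for the output of the loop; the variable names nTables, rowsPerTable, firstTable and rejoined are assumptions made for this sketch.

```matlab
% Hypothetical stand-in for the loop output: three sub-tables of different heights
newTables = {table([1;2],[5;6]), table(3,7), table([4;5;6],[8;9;10])};

nTables = numel(newTables);                  % how many sub-tables were created
rowsPerTable = cellfun(@height, newTables);  % number of rows in each sub-table
firstTable = newTables{1};                   % curly braces return the table itself

% Stack the pieces back into a single table (they share the same variable names)
rejoined = vertcat(newTables{:});
```

Note that the rows of rejoined come out in grouped order, not in the order of the original master table.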

I agree with Adam that this would not be a great idea in theory, but here is how you could do it if you still really want to sort them this way.


Caroline on 3 Oct 2019

Thank you very much for your help.
