Adding an id for similar rows
Hello,
I have a data set with row and column numbers in reference to a grid, and then a value for that cell.
Example of data set (Attached .csv titled "input")
row col value;
124 95 7983;
124 96 16142;
124 97 94846;
124 98 4557;
124 95 8935;
124 96 496746;
124 97 352734;
124 98 25374;
124 95 790574;
124 96 346;
124 97 3745;
124 98 19774;
What I'd like to do is add a column: for the first occurrence of each row/column combination, populate the new column with 1; for the next repetition of those combinations, use 2, and so on. For this example data set, the result would look like the following:
ID row col value;
1 124 95 7983;
1 124 96 16142;
1 124 97 94846;
1 124 98 4557;
2 124 95 8935;
2 124 96 496746;
2 124 97 352734;
2 124 98 25374;
3 124 95 790574;
3 124 96 346;
3 124 97 3745;
3 124 98 19774;
I figured I could use the unique function to accomplish this, but I've been unsuccessful thus far. Does anyone have an easy way in mind to do this? The actual data files I'll be applying this to are around 7 million lines, with around 600 repetitions of the same row/column combinations but different values going with the combinations each time. Thanks so much for your time!
Accepted Answer
Guillaume
on 21 Apr 2015
One way to do it:
data = [124 95 7983; 124 96 16142; 124 97 94846; 124 98 4557; ...
        124 95 8935; 124 96 496746; 124 97 352734; 124 98 25374; ...
        124 95 790574; 124 96 346; 124 97 3745; 124 98 19774];
% prepare array to receive ids:
ids = zeros(size(data, 1), 1);
% map each row of data to the index of its unique (row, col) combination:
[~, ~, uid] = unique(data(:, [1 2]), 'rows');
% group together the row indices with the same uid, in ascending order:
grouprows = accumarray(uid, 1:size(data, 1), [], @(v) {sort(v)});
% for each cell in grouprows create array 1,2,3... of final ids:
groupids = cellfun(@(ix) (1:numel(ix))', grouprows, 'UniformOutput', false);
% fill up ids using grouprows as indices and groupids as values:
ids(vertcat(grouprows{:})) = vertcat(groupids{:});
% append ids to existing data (no semicolon, so the result is displayed):
newdata = [ids data]
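Since the real files are around 7 million lines, it may be worth trying a variant that avoids the cell array and the anonymous-function calls. The following is only a sketch of one alternative, not part of the answer above: it relies on MATLAB's sort keeping equal elements in their original order, and the names sortedUid, order, isStart, groupStart and occ are made up for illustration. Whether it is actually faster on your data is something you would need to time yourself.

```matlab
data = [124 95 7983; 124 96 16142; 124 97 94846; 124 98 4557; ...
        124 95 8935; 124 96 496746; 124 97 352734; 124 98 25374; ...
        124 95 790574; 124 96 346; 124 97 3745; 124 98 19774];
n = size(data, 1);
% index of the unique (row, col) combination for each row of data:
[~, ~, uid] = unique(data(:, [1 2]), 'rows');
% sort the group indices; ties keep their original relative order:
[sortedUid, order] = sort(uid);
% mark the first element of each group in the sorted list:
isStart = [true; diff(sortedUid) ~= 0];
startPos = find(isStart);
% for every sorted element, the position where its group starts:
groupStart = startPos(cumsum(isStart));
% occurrence number within the group: 1 for the first, 2 for the second, ...
occ = (1:n)' - groupStart + 1;
% scatter the occurrence numbers back to the original row order:
ids = zeros(n, 1);
ids(order) = occ;
newdata = [ids data];
```

On the 12-row example this gives the same ids column as the accumarray version (1 for the first four rows, 2 for the next four, 3 for the last four).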