Comparing each vector in a cell array to another vector
8 views (last 30 days)
Show older comments
Hello. I am wondering if there is an easier way to check vectors in my cell array are subsets of another given vector. I want to keep track of how many times the vectors in C are found in T, for many different T's. For example:
T = [1 2 3 4 5 6]
C = { [1 2], [3 4], [5,6], [7 8], [9, 10] }
% loop through each candidate
for j = 1:numCands
c = Ck{j};
% Candidate Counts. Check if candidate is a subset of the
% current transaction
if ismember(c,t)
candCounts(j) = candCounts(j) + 1;
end
% Candidate Tail Counts. Check if the tail of the candidate is
% a subset of the current transaction
if ismember(c(2:end),t)
candTailCounts(j) = candTailCounts(j) + 1;
end
end
What I have done is loop through each member of C and check using ismember() for each element. But this gets really long when C becomes very large. I was wondering if there was any clever way of vectorization which would result in a faster runtime. I would also like to add that T can be varying sized vectors. Not sure if that information is relevant or not.
Thanks in advance!
Edit: For future references here is what I ended up doing, which is similar to what the answers suggested:
% Make all candidates into matrix for faster calculation
allCands = vertcat(Ck{:});
% Get number of columns to get where there are subsets.
[~,col] = size(allCands);
% Find locations where the candidates are subsets of the
% transactions for both the candidates and tail counts.
memb1 = ismember(allCands,t);
memb2 = ismember(allCands(:,2:end),t);
% Subset check
subCheck1 = find(sum(memb1,2) == col);
subCheck2 = find(sum(memb2,2) == col - 1);
% Increment counts
candCounts(subCheck1) = candCounts(subCheck1) + 1;
candTailCounts(subCheck2) = candTailCounts(subCheck2) + 1;
1 Comment
Walter Roberson
on 5 Feb 2017
Edited: Walter Roberson
on 5 Feb 2017
Does it happen to be the case that C and T only have integer elements in the range 0 to 65535 ?
Are all of the entries in C the same length? Are they always row vectors?
Accepted Answer
Stephen23
on 5 Feb 2017
Edited: Stephen23
on 5 Feb 2017
Method one: one ismember call: the function ismember is going to to be slowest operation, so one simple speedup is to reduce how many times you call it: Note that a vector c will only be a member if the tail c(2:end) is a member, so you can test for the tail first, and only if the tail is a member do you need to check if the first element is also a member:
T = [1,2,3,4,5,6];
C = {[1,2],[3,4],[5,6],[7,8],[9,10]};
cntC = 0;
cntT = 0;
for k = 1:numel(C)
if ismember(C{k}(2:end),T)
cntT = cntT+1;
cntC = cntC+any(C{k}(1)==T);
end
end
Method two: Convert to a matrix: if all members of C are the same size vectors, then convert C to a matrix and perform one simple comparison using bsxfun:
M = vertcat(C{:});
U = reshape(T,1,1,[]);
X = any(bsxfun(@eq,M,U),3);
xT = all(X(:,2:end),2);
xC = xT&X(:,1);
cntT = sum(xT)
cntC = sum(xC)
5 Comments
Stephen23
on 5 Feb 2017
"The problem with that is if the elements occur out of order, you would still get a match."
That is what a subset is, and the original question specifically asks for "check vectors in my cell array are subsets of another given vector".
For a subset the order is irrelevant.
Stephen23
on 5 Feb 2017
Edited: Stephen23
on 5 Feb 2017
@Christopher Tran: I am happy to take a look. Please clarify one point: do you want to match the values in the vectors in the same order, or is the order irrelevant (as per set theory and the definition of subset). For example, is [2,1] a subset of [1,2,3] or not?
More Answers (1)
Walter Roberson
on 5 Feb 2017
If all members of C are the same length then you can use buffer() with overlap to create an array containing all the subsequences of that length, and transpose that and then convert C to a row matrix and then ismember() with rows option
1 Comment
See Also
Categories
Find more on Operators and Elementary Operations in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!