Function 'contains' did not work
The background is Cody problem 95.
The purpose is to check whether the vector s1 contains the vector s2 (or whether s2 covers s1).
The problem is that when the vector is very large, the function 'contains' does not give the correct result.
case 1
clear
s1 = 1:100;
s2 = [50 51];
s1str=num2str(s1);
s2str=num2str(s2);
tf1=contains(s1str,s2str)
tf2=contains(s2str,s1str)
tf1 SHOULD be 1, since [50 51] is part of s1, but the result is not correct.
case 2
s1 = 40:60;
s2 = [50 51];
s1str=num2str(s1);
s2str=num2str(s2);
tf1=contains(s1str,s2str)
tf2=contains(s2str,s1str)
When the size of the vector s1 decreases, the function works normally.
3 Comments
Dyuman Joshi
on 30 Jan 2024
"The background is CODY problem 95
The purpose is to confirm whether the vector s1 contains the vector s2. (or s2 cover s1)"
If you are talking about this problem - https://in.mathworks.com/matlabcentral/cody/problems/95, then you have misunderstood what is being asked. I suggest you go through the question and the test cases once again.
Also, if you want to check whether elements of a numeric array are present or not in another, you should use ismember or ismembertol, instead of converting to a char array and using contains().
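As a small illustration of the difference between the two functions suggested here, the sketch below uses made-up floating-point values (they are not from the question) where a rounding error makes the exact test fail while the tolerance-based test succeeds:

```matlab
% Illustrative values only; they do not come from the original question.
a = 0.1 + 0.2;          % not exactly 0.3 in double precision
b = [0.3 0.5];

exactHit = ismember(a, b);      % exact comparison of values
tolHit   = ismembertol(a, b);   % comparison within the default tolerance
```

For integer-valued vectors such as the ones in the question, ismember is normally sufficient.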
Accepted Answer
Dinesh
on 30 Jan 2024
Hi Alex,
The issue arises from the default behavior of "num2str" when converting arrays to strings. For an integer vector, "num2str" pads every element to the same field width, and that width is set by the largest element in the vector, not by the number of digits in each individual element. Because "s1" includes the three-digit value 100, '50' and '51' are separated by three spaces in "s1str", whereas in "s2str" they are separated by only two.
The "contains" function checks for exact substrings, and since the spacing differs, it does not find '50  51' (with two spaces) in "s1str" (where there are three spaces between '50' and '51'). To reliably check for the presence of the sequence of numbers, you would need to account for the variable spacing introduced by "num2str".
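The spacing difference can be inspected directly. The following is only a sketch: the variable names gap1, gap2, norm1, norm2 and tfNorm are made up for illustration, and the whitespace-normalizing workaround at the end is an assumption about one way to compensate, not something recommended in this thread.

```matlab
s1str = num2str(1:100);
s2str = num2str([50 51]);

% Pull out the whitespace that sits between '50' and '51' in each string
gap1 = regexp(s1str, '(?<=50)\s+(?=51)', 'match', 'once');
gap2 = regexp(s2str, '(?<=50)\s+(?=51)', 'match', 'once');
fprintf('spaces in s1str: %d, spaces in s2str: %d\n', numel(gap1), numel(gap2));

% Possible workaround (assumption): collapse each run of whitespace to one
% space and pad both ends, so that a partial number cannot produce a match
norm1 = [' ' regexprep(s1str, '\s+', ' ') ' '];
norm2 = [' ' regexprep(s2str, '\s+', ' ') ' '];
tfNorm = contains(norm1, norm2);
```

Even with normalized spacing, this remains a text comparison, so a numeric approach is usually easier to reason about.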
To consistently check if a vector contains another, regardless of size, use numerical operations like "ismember":
s1 = 1:100;
s2 = [50 51];
tf1 = all(ismember(s2, s1));
This code will return a logical "1" because "s2" is contained within "s1", which is the correct behavior for your case. When using numerical arrays, it's more reliable to use numerical comparisons rather than string functions. For the CODY problem 95, this method should give you the correct result even for very large vectors.
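One caveat worth keeping in mind: all(ismember(...)) tests membership only. It does not check that the elements appear in the same order or next to each other. A minimal sketch, with variable names chosen here for illustration:

```matlab
s1 = 1:100;

tfForward  = all(ismember([50 51], s1));   % adjacent and in order in s1
tfReversed = all(ismember([51 50], s1));   % same elements, reversed order
tfSpread   = all(ismember([10 90], s1));   % elements far apart in s1
```

All three tests succeed, so this approach fits only when order and adjacency do not matter.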
3 Comments
Dinesh
on 30 Jan 2024
@Alex, the check works again in case 2 because there are just 2 spaces between 50 and 51 in s1str, which is the same as the number of spaces between 50 and 51 in s2str. That is why it returns 1. But in case 1, there are 3 spaces between 50 and 51 in s1str. Therefore, string matching in case 1 didn't give the expected result.
The inconsistency arises because "num2str" does not use the same spacing for every vector: the spacing depends on the number of digits in the largest element. In case 1, s1 includes the three-digit value 100 while s2 holds only two-digit values, so the two strings are spaced differently. In case 2, both vectors hold only two-digit values, so the spacing matches, hence the different outcomes in Case 1 and Case 2.
Thanks for pointing out the critical requirement of preserving the order of elements in the vectors. To meet your requirements, we should implement a custom solution that checks for "s2" as a contiguous subsequence within "s1". Here's an adjusted MATLAB function:
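% Save as checkSubsequence.m, or place the function at the end of a script file.
function isSubsequence = checkSubsequence(s1, s2)
    % Returns true if s2 appears in s1 as a contiguous run, in the same order.
    isSubsequence = false;
    n2 = length(s2);
    for i = 1:(length(s1) - n2 + 1)
        % Compare the window of s1 that starts at position i against s2
        if isequal(s1(i:i+n2-1), s2)
            isSubsequence = true;
            break;
        end
    end
end
% Example call (from the Command Window or a separate script):
s1 = 1:100;
s2 = [50 51];
tf1 = checkSubsequence(s1, s2);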
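The same contiguous check can also be written without a loop. This is a rough sketch rather than a drop-in replacement: it assumes R2016b or later for implicit expansion and a non-empty s2, and the names idx, windows and tfVec are chosen here for illustration.

```matlab
s1 = 1:100;
s2 = [50 51];

s1 = s1(:).';  s2 = s2(:).';        % force row vectors
n = numel(s1);  m = numel(s2);

% Row k of idx holds the indices of the window of s1 that starts at k
idx = (1:n-m+1).' + (0:m-1);
windows = reshape(s1(idx), size(idx));

% A window matches when every element equals the corresponding element of s2
tfVec = any(all(windows == s2, 2));
```

The index matrix has (n-m+1)*m entries, so for very long vectors the loop with an early break may use less memory.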
Hope this helps!
More Answers (1)
VBBV
on 30 Jan 2024
If you use the compose function instead of num2str, then contains works correctly:
clear
s1 = 1:100;
s2 = [50 51];
s1str=compose('%d',(s1)); % converts each array element into its own char vector (cell array output)
s2str=compose('%d',(s2));
tf1=contains(s1str,s2str)
tf2=contains(s2str,s1str)
% case 2
s1 = 40:60;
s2 = [50 51];
s1str=compose('%d',(s1));
s2str=compose('%d',(s2));
tf1=contains(s1str,s2str)
tf2=contains(s2str,s1str)
num2str converts the entire array into one continuous string, so when checking for common elements in two different arrays, it's better to avoid num2str.
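Note that with cell arrays on both sides, contains returns one logical value per element of its first input rather than a single true/false. The sketch below shows one way to reduce this to a scalar. The names c1, c2, mask and tfAll are made up for illustration, and swapping in ismember for the final test is an assumption about what is wanted, not part of the answer above.

```matlab
s1 = 1:100;
s2 = [50 51];
c1 = compose('%d', s1);      % cell array with one char vector per element
c2 = compose('%d', s2);

mask  = contains(c1, c2);    % logical array, one entry per element of c1
tfAll = all(ismember(c2, c1));   % true if every element of c2 matches an element of c1 exactly
```

Because contains does substring matching, a pattern such as '5' would also match '50'. ismember on the cell arrays compares whole elements, although like the numeric version it does not check order or adjacency.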