How to compare words in a line
I have a text file with keywords on each line (think of the lines as rows and the keywords as columns). If the first keyword on a line matches the required word, there is no need to check the remaining keywords on that line (the next columns). The second, third, and subsequent lines should be checked the same way as the first. How do I write a loop for this task? You can assume the keywords are stored in a matrix. I think a for loop would be useful, but how exactly should I proceed?
2 Comments
Adam Danz
on 14 Aug 2019
Edited: Adam Danz
on 14 Aug 2019
We can easily make generic suggestions but without more detail or an example, the answer may not be specific enough to actually help.
After you read in your data and split the text by line, you can use ismember(), regexp(), or contains() to determine whether any word from a list of key words exists within each line.
If you need more specific advice, please provide a sample of your text and a sample of the list of key words.
Do you need case sensitivity? Is your only goal to determine whether any of the key words exist in each line (true/false output)?
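As a rough illustration of how the three functions mentioned above differ, here is a minimal sketch. The variables `lines` and `keys` and their contents are made up for the example and do not come from the thread; the keys are assumed to contain no regular-expression metacharacters. Note that contains() requires R2016b or later.

```matlab
% Hypothetical sample data (not from the thread)
lines = {'alpha beta gamma'; 'delta epsilon'; 'zeta eta'};
keys  = {'beta'; 'eps'};

% contains(): substring match, so 'eps' is found inside 'epsilon' (R2016b+)
hasSub = contains(lines, keys);

% regexp(): whole-word match using the word-boundary anchors \< and \>
% (assumes the keys contain no regex metacharacters)
pattern = ['\<(' strjoin(keys.', '|') ')\>'];
hasWord = ~cellfun(@isempty, regexp(lines, pattern, 'once'));

% ismember(): whole-token match after splitting each line on spaces
hasTok = cellfun(@(s) any(ismember(strsplit(s, ' '), keys)), lines);
```

With this data, contains() flags the first two lines, while the regexp() and ismember() variants flag only the first, because 'eps' is not a complete word on line 2. For case-insensitive matching, contains() accepts the 'IgnoreCase' option and regexp() has regexpi() as a counterpart.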
Accepted Answer
Adam Danz
on 15 Aug 2019
Edited: Adam Danz
on 16 Aug 2019
Attached are 2 text files. "txt.txt" contains the text you shared in your comment under your question. "dcm.txt" is a vertical list of key words.
The 2 blocks of code below read in each file and separate the text by line. They return a logical column vector "hasKey" where TRUE values indicate rows of txt that contain a key word.
For releases prior to R2016b
% Read in data
txt = fileread('txt.txt'); % Better to use a full path
key = fileread('dcm.txt'); % Better to use a full path
% separate text by line & remove newline char
txt = strtrim(regexp(txt,'\n','split').');
key = strtrim(regexp(key,'\n','split').');
% Loop through each line of txt and detect a key word
hasKey = false(size(txt));
for i = 1:numel(txt)
    lineParse = strtrim(regexp(txt{i},' +','split'));
    hasKey(i) = any(cellfun(@(x)any(strcmp(x,key)),lineParse));
    % hasKey(i) = any(ismember(lineParse, key)); OLD VERSION, DON'T USE THIS LINE
end
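The question also asks about skipping the rest of a line once a key word has been found. The loop above does not expose that step explicitly, so here is a minimal sketch of one way to do it with a nested loop and `break`. The in-memory `txt` and `key` values are hypothetical stand-ins for the two files, and `firstHit` is a name introduced only for this example.

```matlab
% Hypothetical in-memory data standing in for txt.txt and dcm.txt
txt = {'foo bar baz'; 'nothing here'; 'qux foo'};
key = {'bar'; 'qux'};

hasKey   = false(size(txt));
firstHit = cell(size(txt));   % first matching word per line, [] if none
for i = 1:numel(txt)
    words = strsplit(strtrim(txt{i}));   % split the line on whitespace
    for j = 1:numel(words)
        if any(strcmp(words{j}, key))    % whole-word comparison
            hasKey(i)   = true;
            firstHit{i} = words{j};
            break   % skip the remaining words on this line
        end
    end
end
```

For short lines the early exit makes little practical difference compared with the cellfun version, but it mirrors the logic described in the question and also records which word matched.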
For R2016b or later
Instead of strtrim(regexp()) you can use splitlines().
% separate text by line
txt = splitlines(txt); % R2016b or later
key = splitlines(key); % R2016b or later
Result for both methods
hasKey =
3×1 logical array
1
0
1
% Rows 1 and 3 contain a key word
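For reference, the R2016b-or-later pieces can be combined into one self-contained sketch. The file contents are replaced here with hypothetical strings built by sprintf() so the example runs without the attachments; in practice `txt` and `key` would come from fileread() as in the answer. The extra strtrim() call is an assumption added to drop stray carriage returns or trailing spaces.

```matlab
% Hypothetical file contents; in practice these come from fileread()
txt = sprintf('foo bar baz\nnothing here\nqux foo');
key = sprintf('bar\nqux');

% Separate text by line (R2016b or later) and trim surrounding whitespace
txt = strtrim(splitlines(txt));
key = strtrim(splitlines(key));

% Loop through each line of txt and detect a key word
hasKey = false(size(txt));
for i = 1:numel(txt)
    lineParse = strsplit(txt{i});   % split the line on whitespace
    hasKey(i) = any(cellfun(@(x) any(strcmp(x, key)), lineParse));
end
```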
5 Comments
per isakson
on 15 Aug 2019
Edited: per isakson
on 15 Aug 2019
Warning:
>> any(ismember( 'embedded words', 'bed' ))
ans =
logical
1
and
>> contains( 'embedded words', 'bed' )
ans =
logical
1
Is this what you want?
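To make the warning concrete: when both inputs are character vectors, ismember() compares individual characters, and contains() looks for substrings. The short sketch below contrasts both with a whole-word comparison; the variable names are chosen only for illustration.

```matlab
str   = 'embedded words';
words = strsplit(str);   % {'embedded', 'words'}

% Character-level test: true because 'b', 'e' and 'd' each occur in str
charLevel = any(ismember(str, 'bed'));

% Whole-word test: false because no word equals 'bed' exactly
wholeWord = any(strcmp(words, 'bed'));
```

Here `charLevel` is true and `wholeWord` is false, which is why comparing split words with strcmp() avoids the false positive.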
Adam Danz
on 16 Aug 2019
Edited: Adam Danz
on 16 Aug 2019
@nagasai thumati, you should definitely use the updated code. I only had to replace the last line within the for-loop.