Least Frequent Words in document
If I use topkwords to find the most frequent words, what code can I use to show the 10 least frequent words?
Answers (1)
  Snehal
 on 29 Jan 2025
Hi,
I understand that you want to display the 10 least frequent words in a given text.
You can still use the 'topkwords' function for this. Pass the bag-of-words model to 'topkwords' with k set to Inf so that it returns every word with its count. Then sort the resulting table by count in ascending order and take the first 10 rows.
The sample code below illustrates the approach:
% Sample text data
textData = "This is a sample text. This text is for testing if our approach can display the least frequent words correctly or not";
% Before calling 'topkwords', convert the text into a bag-of-words model
documents = tokenizedDocument(textData);
bag = bagOfWords(documents);
% Return all words with their counts (k = Inf), most frequent first
tbl = topkwords(bag, Inf);
% Sort by count in ascending order so the least frequent words come first
sortedTbl = sortrows(tbl, 'Count');
% Select the 10 least frequent words (or fewer if the vocabulary is smaller)
numLeastFrequent = min(10, height(sortedTbl));
leastFrequent = sortedTbl(1:numLeastFrequent, :);
% Display the least frequent words and their counts
disp(leastFrequent);
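As an alternative to sorting the full 'topkwords' table, the counts can be read straight from the bag-of-words model and the smallest ones picked with 'mink'. The sketch below is only one possible way to do this. It assumes Text Analytics Toolbox and a MATLAB release that includes 'mink', and the variable names (wordCounts, leastFrequent, and so on) are illustrative. The 'erasePunctuation' step is an optional addition that is not part of the approach above; without it, the "." in the sample text is counted as a word.

```matlab
% Sketch: pick the k least frequent words directly from the bag-of-words model.
% Assumes Text Analytics Toolbox; the variable names are illustrative.
textData = "This is a sample text. This text is for testing if our approach can display the least frequent words correctly or not";

% Optional (an addition, not in the original answer): drop punctuation
% so that "." is not counted as a word
documents = erasePunctuation(tokenizedDocument(textData));
bag = bagOfWords(documents);

% Total count of each vocabulary word across all documents
wordCounts = full(sum(bag.Counts, 1));

% mink returns the k smallest counts and their positions in the vocabulary
k = min(10, bag.NumWords);
[leastCounts, idx] = mink(wordCounts, k);

% Assemble a table with the same Word/Count layout used above
leastFrequent = table(bag.Vocabulary(idx)', leastCounts', ...
    'VariableNames', {'Word', 'Count'});
disp(leastFrequent);
```

Many words in a short text share the same count, so which of the tied words make it into the 10 rows depends on the order in which they are returned.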
Refer to the following documentation for more details:
- https://www.mathworks.com/help/textanalytics/ref/bagofwords.topkwords.html
- https://www.mathworks.com/help/textanalytics/ref/bagofwords.html
- https://www.mathworks.com/help/matlab/ref/double.sortrows.html
- https://www.mathworks.com/help/textanalytics/ref/tokenizeddocument.html
Hope this helps.