newBag = removeInfrequentWords(bag,count)
removes the words that appear at most count times in total from
the bag-of-words model bag. By default, the function is case
sensitive.
newBag = removeInfrequentWords(bag,count,'IgnoreCase',true)
removes the words that appear at most count times in total,
ignoring case. If words differ only by case, then their counts are
merged.
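As a rough illustration of the 'IgnoreCase' option, the sketch below compares the two syntaxes on a small bag. The documents and the variable names caseDocs, caseBag, bagSensitive, and bagInsensitive are made up for this illustration; only the function calls come from the syntaxes described above.

```matlab
% Hypothetical documents in which "short" and "example" appear
% with different capitalization.
caseDocs = tokenizedDocument([
    "Short example"
    "a short EXAMPLE"
    "another Example"]);
caseBag = bagOfWords(caseDocs);

% Case-sensitive (default): "Short" and "short" are counted separately.
bagSensitive = removeInfrequentWords(caseBag,1);

% Case-insensitive: counts of words that differ only by case are merged
% before the threshold is applied.
bagInsensitive = removeInfrequentWords(caseBag,1,'IgnoreCase',true);
```

In the case-sensitive call, every distinct spelling occurs only once, so all of them should be removed. With 'IgnoreCase' set to true, the merged counts for "short" (2) and "example" (3) exceed the threshold, so those two words should remain.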
Remove the words that appear two times or fewer from a bag-of-words model.
Create a bag-of-words model from an array of tokenized documents.
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"
    "another example"
    "a short example"]);
bag = bagOfWords(documents)
bag =
  bagOfWords with properties:

          Counts: [4x8 double]
      Vocabulary: ["an"    "example"    "of"    "a"    "short"    "sentence"    "second"    "another"]
        NumWords: 8
    NumDocuments: 4
Remove the words that appear two times or fewer from the bag-of-words model.
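A minimal sketch of this step, reusing bag from the previous step and the removeInfrequentWords(bag,count) syntax described above, with count set to 2:

```matlab
% Remove every word whose total count across all documents is 2 or less.
count = 2;
newBag = removeInfrequentWords(bag,count)
```

Tallying the four documents by hand, "example", "a", and "short" each appear three times, while every other word appears at most twice, so these three words should be the only ones left in newBag.Vocabulary.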