Train a Sentiment Classifier

This example shows how to train a classifier for sentiment analysis using an annotated list of positive and negative sentiment words and a pretrained word embedding.

The pretrained word embedding plays several roles in this workflow. It converts words into numeric vectors and forms the basis for a classifier. You can then use the classifier to predict the sentiment of other words using their vector representation, and use these classifications to calculate the sentiment of a piece of text. There are four steps in training and using the sentiment classifier:

  • Load a pretrained word embedding.

  • Load an opinion lexicon listing positive and negative words.

  • Train a sentiment classifier using the word vectors of the positive and negative words.

  • Calculate the mean sentiment scores of the words in a piece of text.

To reproduce the results in this example, set rng to 'default'.

rng('default')

Load Pretrained Word Embedding

Word embeddings map words in a vocabulary to numeric vectors. These embeddings can capture semantic details of the words so that similar words have similar vectors. They also model relationships between words through vector arithmetic. For example, the relationship "king is to queen as man is to woman" is described by the equation king - man + woman = queen.

Load a pretrained word embedding using the fastTextWordEmbedding function. This function requires Text Analytics Toolbox™ Model for fastText English 16 Billion Token Word Embedding support package. If this support package is not installed, then the function provides a download link.

emb = fastTextWordEmbedding;
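
As a rough check of the analogy described above, the following sketch performs the vector arithmetic with word2vec and maps the result back to the nearest vocabulary word with vec2word. It assumes the fastText support package is installed. The variable names kingVec, manVec, womanVec, and nearest are illustrative and not part of the original example, and the word returned depends on the embedding.

```matlab
% Load the embedding (requires the fastText support package).
emb = fastTextWordEmbedding;

% Look up the vectors for the three words in the analogy.
kingVec = word2vec(emb,"king");
manVec = word2vec(emb,"man");
womanVec = word2vec(emb,"woman");

% Find the vocabulary word whose vector is closest to king - man + woman.
nearest = vec2word(emb,kingVec - manVec + womanVec)
```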

Load Opinion Lexicon

Load the positive and negative words from the opinion lexicon (also known as a sentiment lexicon) from https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html. First, extract the files from the .rar file into a folder named opinion-lexicon-English, and then import the text.

Load the data using the function readLexicon listed at the end of this example. The output data is a table with variables Word containing the words, and Label containing a categorical sentiment label, Positive or Negative.

data = readLexicon;

View the first few words labeled as positive.

idx = data.Label == "Positive";
head(data(idx,:))
ans=8×2 table
        Word         Label  
    ____________    ________

    "a+"            Positive
    "abound"        Positive
    "abounds"       Positive
    "abundance"     Positive
    "abundant"      Positive
    "accessable"    Positive
    "accessible"    Positive
    "acclaim"       Positive

View the first few words labeled as negative.

idx = data.Label == "Negative";
head(data(idx,:))
ans=8×2 table
        Word          Label  
    _____________    ________

    "2-faced"        Negative
    "2-faces"        Negative
    "abnormal"       Negative
    "abolish"        Negative
    "abominable"     Negative
    "abominably"     Negative
    "abominate"      Negative
    "abomination"    Negative

Prepare Data for Training

To train the sentiment classifier, convert the words to word vectors using the pretrained word embedding emb. First, remove the words that do not appear in the word embedding emb.

idx = ~isVocabularyWord(emb,data.Word);
data(idx,:) = [];

Set aside 10% of the words at random for testing.

numWords = size(data,1);
cvp = cvpartition(numWords,'HoldOut',0.1);
dataTrain = data(training(cvp),:);
dataTest = data(test(cvp),:);
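
The partition above is drawn from the word count alone, so the proportion of positive and negative words can differ slightly between the training and test sets. As an optional alternative that this example does not use, passing the labels to cvpartition requests a stratified split. The sketch below uses toy labels with hypothetical counts rather than the lexicon, and the names labels, cvpStrat, trainCounts, and testCounts are illustrative.

```matlab
% Toy labels (hypothetical): 80 positive and 120 negative words.
labels = categorical([repmat("Positive",80,1); repmat("Negative",120,1)]);

% Passing the grouping variable asks cvpartition to stratify by class.
rng('default')
cvpStrat = cvpartition(labels,'HoldOut',0.1);

% Count the classes in each subset (order follows categories(labels)).
trainCounts = countcats(labels(training(cvpStrat)))
testCounts = countcats(labels(test(cvpStrat)))
```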

Convert the words in the training data to word vectors using word2vec.

wordsTrain = dataTrain.Word;
XTrain = word2vec(emb,wordsTrain);
YTrain = dataTrain.Label;

Train Sentiment Classifier

Train a support vector machine (SVM) classifier that classifies word vectors into positive and negative categories.

mdl = fitcsvm(XTrain,YTrain);

Test Classifier

Convert the words in the test data to word vectors using word2vec.

wordsTest = dataTest.Word;
XTest = word2vec(emb,wordsTest);
YTest = dataTest.Label;

Predict the sentiment labels of the test word vectors.

[YPred,scores] = predict(mdl,XTest);
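
The columns of scores follow the class order stored in the ClassNames property of the trained model, so it is worth checking that order before indexing a column. The sketch below illustrates this on a toy two-class problem with made-up data. The names X, Y, mdlToy, classOrder, labelsToy, and scoresToy are illustrative. In the workflow above, inspecting mdl.ClassNames shows which column holds the Positive score.

```matlab
% Toy data (hypothetical): two well-separated clusters in two dimensions.
rng('default')
X = [randn(50,2)+2; randn(50,2)-2];
Y = categorical([repmat("Positive",50,1); repmat("Negative",50,1)]);

% Train an SVM and read the class order used for the score columns.
mdlToy = fitcsvm(X,Y);
classOrder = mdlToy.ClassNames

% Column k of scoresToy holds the score for classOrder(k).
[labelsToy,scoresToy] = predict(mdlToy,[2 2; -2 -2]);
```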

Visualize the classification accuracy in a confusion matrix.

figure
confusionchart(YTest,YPred);
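
The confusion chart can be complemented by a single accuracy figure. The sketch below uses toy labels (hypothetical values) so that it runs on its own. In the workflow above, the same expression would be applied to YPred and YTest.

```matlab
% Toy true and predicted labels (hypothetical values).
YTestToy = categorical(["Positive";"Negative";"Negative";"Positive";"Negative"]);
YPredToy = categorical(["Positive";"Negative";"Positive";"Positive";"Negative"]);

% Accuracy is the fraction of predictions that match the true labels.
accuracy = mean(YPredToy == YTestToy);   % 4 of 5 match, so 0.8
```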

Visualize the classifications in word clouds. Plot the words with positive and negative sentiments in word clouds with word sizes corresponding to the prediction scores.

figure
subplot(1,2,1)
idx = YPred == "Positive";
wordcloud(wordsTest(idx),scores(idx,1));
title("Predicted Positive Sentiment")

subplot(1,2,2)
wordcloud(wordsTest(~idx),scores(~idx,2));
title("Predicted Negative Sentiment")

Calculate Sentiment of Collections of Text

To calculate the sentiment of a piece of text, for example a review, predict the sentiment score of each word in the text and take the mean sentiment score.

Load the Airbnb Summary Review data (Boston, Massachusetts, United States, 06 October, 2017) from http://insideairbnb.com/get-the-data.html. Read the data into a table and specify to read the text data as string.

filename = "reviews.csv";
dataReviews = readtable(filename,'TextType','string');

Extract the text data from the comments variable and view the first few reviews.

textData = dataReviews.comments;
textData(1:10)
ans = 10×1 string array
    "Pretty nice, quiet, cozy place to stay. Toiletries, snacks, coffee, WiFi, cable TV, iron was all included. One of the best things for me is how quiet it was even in the daytime. Coded door locks so no need for keys, my belongings were always safe and Andre and his wife are really good host. I stayed 7 days and never had a problem. I'll stay again if and when I had the chance."
    "The host was extremely welcoming and obliging. The neighborhood is quiet and charming, perfect for a quiet visit. Short walk to MBTA transportation."
    "Nice and easy stay - with good accommodations especially the cable TV "
    "The host has been very accommodating and helpful. The description in the ad is accurate. The room is very clean and the neighborhood is quiet."
    "It's a great quiet stay."
    "Couldn't have been happier. The apartment was well renovated, very clean and convenient to great spots. The kitchen was stocked with all the basics and a huge grocery store was around the corner so we were able to easily cook at the house. Estee also provided some great local recommendations. Wine, snacks, coffee and games were great extras. Uber ride to downtown was $8. Would most definitely stay here again."
    "The apartment is very nice- as described and very convenient. The real superstar of the listing though is the host; Estee was  phenomenal. She was very responsive and even let us know when she might not be able to be reached for a short duration of time. She provided great recommendations and tips for getting around. We had a MINOR issue, which she went out of her way to resolve very quickly. ↵↵Both bedrooms are a good size, and one has a lovely vanity. Everything is brand new - bathroom and kitchen. Estee had the kitchen stocked with staples (salt, pepper, olive oil, ketchup) and treats too! There are so many details throughout the place where she goes above and beyond. Parking on the street was easy. We hardly needed to move the car though because there was so much within walking distance. The description of a 10 minute walk to the T is accurate. ↵↵100% would stay here again. Thank you for a wonderful stay, Estee!"
    "This is a brand new gorgeous place, very clean, bright and welcoming. Estee especially knows how to make guests comfortable, there were many thoughtful touches and she recommended a delicious Indian restaurant.    There is a supermarket within 5 minutes walking distance and we used Uber to get around - downtown Boston took less than 15 minutes. Best place I have stayed in so far. Thank you Estee!"
    "Estee and Josh are great hosts. Very welcoming. Made us feel like we were staying with long time friends. Apartment very centrally located. Off street parking surprisingly easy (for Boston). Loads of restaurants within walking distance"
    "Estee was super sweet and so very accommodating! The apartment was nicely renovated and the kitchen had all our basic needs + treats as well! My family and I stayed here because of a college graduation and because street parking in front of her place was easy and everything was within walking distance, it made our stay a lot easier! Would definitely stay here again!  "

Create a function which tokenizes and preprocesses the text data so it can be used for analysis. The function preprocessReviews, listed at the end of the example, performs the following steps in order:

  1. Convert the text data to lowercase using lower.

  2. Tokenize the text using tokenizedDocument.

  3. Erase punctuation using erasePunctuation.

  4. Remove stop words (such as "and", "of", and "the") using removeStopWords.
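
The four steps can be tried on a single sentence before running them on the full data set. This is a minimal sketch on a made-up review sentence. The variable names str, doc, and tokens are illustrative, and the exact tokens that remain depend on the stop word list of the installed toolbox release.

```matlab
% A made-up review sentence (not taken from the data set).
str = "The host was GREAT, and the room was very clean!";

% Steps 1 and 2: convert to lowercase and tokenize.
doc = tokenizedDocument(lower(str));

% Step 3: erase punctuation.
doc = erasePunctuation(doc);

% Step 4: remove stop words.
doc = removeStopWords(doc);

% Inspect the remaining tokens as a string array.
tokens = string(doc)
```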

Use the preprocessing function preprocessReviews to prepare the text data. This step can take a few minutes to run.

documents = preprocessReviews(textData);

Remove the words from the documents that do not appear in the word embedding emb.

idx = ~isVocabularyWord(emb,documents.Vocabulary);
documents = removeWords(documents,idx);

To visualize how well the sentiment classifier generalizes to the reviews, classify the sentiment of the words that occur in the reviews but not in the training data, and visualize them in word clouds. Use the word clouds to manually check that the classifier behaves as expected.

words = documents.Vocabulary;
words(ismember(words,wordsTrain)) = [];

vec = word2vec(emb,words);
[YPred,scores] = predict(mdl,vec);

figure
subplot(1,2,1)
idx = YPred == "Positive";
wordcloud(words(idx),scores(idx,1));
title("Predicted Positive Sentiment")

subplot(1,2,2)
wordcloud(words(~idx),scores(~idx,2));
title("Predicted Negative Sentiment")

To calculate the sentiment of a given piece of text, compute the sentiment score for each word in the text and calculate the mean sentiment score.

For a selection of the documents, calculate the mean sentiment score. For each document, convert the words to word vectors, predict the sentiment scores of the word vectors, and then calculate the mean of the per-word scores.

idx = [7 34 331 1788 1820 1831 2185 21892 63734 76832 113276 120210];

% Preallocate one mean score per selected document.
sentimentScore = zeros(1,numel(idx));
for i = 1:numel(idx)
    words = string(documents(idx(i)));
    vec = word2vec(emb,words);
    [~,scores] = predict(mdl,vec);
    sentimentScore(i) = mean(scores(:,1));
end
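
The averaging step can be illustrated in isolation. The sketch below uses made-up per-word scores, not output from the trained classifier. It also shows that a document with no remaining words yields NaN, which may need handling if some reviews lose all of their words during preprocessing. The names wordScores, docScore, emptyScores, and emptyScore are illustrative.

```matlab
% Toy per-word scores for one document (hypothetical values).
wordScores = [1.2; -0.4; 0.7; 2.1];

% The document score is the mean of the per-word scores.
docScore = mean(wordScores);      % (1.2 - 0.4 + 0.7 + 2.1)/4 = 0.9

% A document with no words left has an empty score vector.
emptyScores = zeros(0,1);
emptyScore = mean(emptyScores);   % NaN, because the mean of no values is 0/0
```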

View the predicted sentiment scores with the text data. Scores greater than 0 correspond to positive sentiment, scores less than 0 correspond to negative sentiment, and scores close to 0 correspond to neutral sentiment.

[sentimentScore' textData(idx)]
ans = 12×2 string array
    "0.85721"      "The apartment is very nice- as described and very convenient. The real superstar of the listing though is the host; Estee was  phenomenal. She was very responsive and even let us know when she might not be able to be reached for a short duration of time. She provided great recommendations and tips for getting around. We had a MINOR issue, which she went out of her way to resolve very quickly. ↵↵Both bedrooms are a good size, and one has a lovely vanity. Everything is brand new - bathroom and kitchen. Estee had the kitchen stocked with staples (salt, pepper, olive oil, ketchup) and treats too! There are so many details throughout the place where she goes above and beyond. Parking on the street was easy. We hardly needed to move the car though because there was so much within walking distance. The description of a 10 minute walk to the T is accurate. ↵↵100% would stay here again. Thank you for a wonderful stay, Estee!"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
    "2.0453"       "Estee was the perfect Airbnb host. The apartment was comfortable, spacious, and convenient, and Estee went to great lengths to make sure that we felt at home. She also provided great tips for us about the area. Would definitely love to stay here again."                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
    "-0.37918"     "The apartment is not apropriate for 5 people. Is too little and We were no comfortable. The bathroom was no clean. There was a door in the kitchen Broken.  Is Too noisy. The elevator is Too small just for 2 people. "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
    "0.94799"      "Truly a quaint place in Beacon Hill. Comfortable walking distance from MGH, Boston Common, and Suffolk University. The studio type place is great for a couple's weekend.  The wifi was excellent as was the tv and comfort of the bed.  ↵The limitations and recommendations for improvement include:↵1- improving in cleanliness as the floor was dirty enough that you couldn't walk around without shoes↵2- would recommend bringing your own basic toiletries as there was no hand soap in the bathroom.  (We were too busy to contact jj but he is quick to respond to other matters so maybe he would have made arrangements to provide you with it.)↵3- storage space is limited so a prolonged stay would be challenging↵Overall, this a very functional stay."                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
    "-0.077053"    "the neibourhood is perfect!!!!!. as it is very close to Bowdoin T STATION and Park T station, walking distance from everywhere we wanted to go, quincy market, downtown, chinatown, newsbury street and every thing. the appartment IT IS NOT on Temple street rear... it is on Coolidge st, facing a quite silent and lonely and big parking lot. (:/)  it was ok...though. coming and going was easy. JJ was really quick responsive when internet and CAble TV suddenly stopped but he was very helpful trying to solve it. that was very good. ... Other issues: By the house roules and the descriptions of previews guests I supposed the appartment was inmaculated and the cleaning was really fond... BUT IT WASN´T. we found previous litter in the trash bins... kitchen and bathroom... the brown chocolate cuchions didn´t smell as if they were clean. There were uncovered sheets, and blanquets and who knows what else under the bed, that I tried not to  sweep the floor in its direction in case  I made them dirty with the dust and gravel that was already inside when we got into the apparment. I went to the closet looking for the broom and shovel and I found them... the broom plenty of dirt and lint and entangled long hairs and stuff, and the shovel broken... very discusting. It is a pitty such an amazing location dealing with all these ackward details that are not ok at all. I think everything I mentioned can be  solved easily in a very simple and cheap and loving form so the place becomes the perfect spot to spent your vacations in Boston."
    "0.17846"      "Although we didn't meet JJ, we felt he was very quick to respond. Checking in and out was a breeze. The location is convenient and walkable to everything we wanted to see.  Studio was small but certainly comfortable for the 2 of us. ↵We didn't see any info regarding wifi (nor did we ask) so we didn't use it. The bathroom sink drain looked pretty dirty, no hand soap or wash cloths. It's a basement so there is not much natural light which was fine but even the lights in the space still made it feel very dark. We also found no place to hang our stuff or stash our clothes. A lot of the drawers had clothes in them already. ↵All in all, we liked the place. Would recommend it if a few changeable things were addressed. "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
    "-0.31603"     "In the apartment it was very dirty .↵we walked in and there instincts to rotten melon .↵the sink was full of dirty dishes.↵the microwave and mini oven were dirty you could make it nothing to eat.↵on the herd was a coffeepot with moldy coffee.↵on the first day of our "vacation" we first had to make everything clean that we can feel comfortable.↵carissa wanted the city to show us what to do something .↵But in the week she was not at home , and when they came home she was for days in her room she only came out to make himself something to eat and dishes piled up again."                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
    "-4.0895"      "Blackmail!"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
    "1.658"        "Outstanding stay.  The apartment is world-class - very, very nice.  Tremendous views of the harbor from a very cool apartment in a very cool, brand new retro building.  I had not spent time in the South Boston waterfront neighborhood previously and loved it - great cafes, restaurants, pubs, renovated lofts.  A terrific area.↵↵In addition, John was an ideal host.  Incredibly responsive and helpful.  Provided excellent recommendations in terms of spots to visit in the neighborhood as well as very clear directions relating to the logistics of checking in, wifi access, heating/cooling, etc.↵↵Finally, John is an engaging and interesting person who is an absolute pleasure to spend time with."                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
    "1.7102"       "I had an amazing stay at Carney's. The hosts are friendly and very meticulous. They made sure everything was proper from the kitchen needs to the bedroom needs. Also, they made us feel like we are at home. My parents had come for my graduation and they were pleased that I did not book a hotel and instead chose to stay here. I would recommend everyone to book a room if ever they plan to come to Boston and enjoy an enriching confortable experience. "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
    "0.67654"      "My husband and I came to Boston for our 1 year anniversary. We're so glad we found Elisabeth's place! Immediately when we got dropped off by our cab, Elisabeth came outside and walked us to the house.  She's incredibly nice, personable, and her place was beautiful and very clean.  We felt very comfortable staying with her and she was nice enough to give us a few recommendations around town.  Her place is a short walk to the Red Line T station and the neighborhood where she lives has everything you need close by.  Gas station and convenience store was literally down the street and lots of smaller mom and pop restaurants.  We hated to leave so early but would definitely love to come back! "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
    "-0.21651"     "My fiancé and I had just gotten engaged and wanted to stay somewhere a bit more upscale for our last night in Boston. We looked and found this "penthouse" and from arrival were let down. While the host was pleasant enough, she was hard to contact, the address was wrong, and she even had to have a neighbor show us around the place. Which would not have been weird if he wasn't doing laundry during our stay. We were promised the entire condo but the host stopped by as well, not that we minded that part, but it added to the weirdness. We were not able to use the refrigerator to store leftovers due to the HORRIBLE smell coming from it. It was so bad we turned the air off and opened the little balcony door. The condo looked too loved in to justify paying what we did. Also very confused about the $50 cleaning fee that was obviously not used before our stay, so a bit unhappy that we overpaid for a dirty place. "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              

Sentiment Lexicon Reading Function

This function reads the positive and negative words from the sentiment lexicon and returns a table. The table contains variables Word and Label, where Label contains categorical values Positive and Negative corresponding to the sentiment of each word.

function data = readLexicon

% Read positive words, skipping comment lines that start with ';'
fidPositive = fopen(fullfile('opinion-lexicon-English','positive-words.txt'));
C = textscan(fidPositive,'%s','CommentStyle',';');
fclose(fidPositive);
wordsPositive = string(C{1});

% Read negative words, skipping comment lines that start with ';'
fidNegative = fopen(fullfile('opinion-lexicon-English','negative-words.txt'));
C = textscan(fidNegative,'%s','CommentStyle',';');
fclose(fidNegative);
wordsNegative = string(C{1});

% Create table of labeled words
words = [wordsPositive;wordsNegative];
labels = categorical(nan(numel(words),1));
labels(1:numel(wordsPositive)) = "Positive";
labels(numel(wordsPositive)+1:end) = "Negative";

data = table(words,labels,'VariableNames',{'Word','Label'});

end

Preprocessing Function

The function preprocessReviews performs the following steps:

  1. Convert the text data to lowercase using lower.

  2. Tokenize the text using tokenizedDocument.

  3. Erase punctuation using erasePunctuation.

  4. Remove stop words (such as "and", "of", and "the") using removeStopWords.

function [documents] = preprocessReviews(textData)

% Convert the text data to lowercase.
cleanTextData = lower(textData);

% Tokenize the text.
documents = tokenizedDocument(cleanTextData);

% Erase punctuation.
documents = erasePunctuation(documents);

% Remove a list of stop words.
documents = removeStopWords(documents);

end
