Getting error message "Index exceeds the number of array elements. Index must not exceed 0."

T = readtable("Data.xlsx");
data = readtable('Data.xlsx','TextType','string');
textData = data.Properties.Description;
textData(1:10)
cleanedDocuments = tokenizedDocument(textData);
cleanedDocuments(1:10)
cleanedDocuments = addPartOfSpeechDetails(cleanedDocuments);
cleanedDocuments = removeStopWords(cleanedDocuments);
cleanedDocuments(1:10)
cleanedDocuments = normalizeWords(cleanedDocuments,'Style','lemma');
cleanedDocuments(1:10)
cleanedDocuments = erasePunctuation(cleanedDocuments);
cleanedDocuments(1:10)
cleanedBag = bagOfWords(cleanedDocuments);
cleanedBag = removeInfrequentWords(cleanedBag,2);
[cleanedBag,idx] = removeEmptyDocuments(cleanedBag);
labels(idx) = [];
cleanedBag;
  3 Comments
the cyclist on 9 Sep 2023
Thousands of data points in an Excel file is not too many to upload, and that's the fastest way for us to help you.
You could also just upload a few rows of the file, if that gives the same error. (If that does not give the same error, then you've taken a step toward debugging the problem.)
Also, which line gives that error?


Answers (2)

Walter Roberson on 9 Sep 2023
Edited: Walter Roberson on 9 Sep 2023
By default, readtable() uses detectImportOptions or one of its variations. For an xlsx file, a spreadsheetImportOptions object gets created. That kind of import options object has no property that controls where in the xlsx file to look for information to store in the table's Description property.
readtable() in turn has no option to indicate where to look for that information.
Which is to say that the table property 'Description' is initialized to empty, but your code expects it to have at least 10 elements. Indexing an empty value with textData(1:10) is what produces "Index must not exceed 0."
There is a property with a related name, data.Properties.VariableDescriptions, which can contain a description for each variable. readtable() can set VariableDescriptions under at least some conditions: either conditions are just right for automatic detection of variable descriptions, or the detected variable names include at least one name that is not a valid MATLAB identifier. In the latter case, the default is to generate valid MATLAB variable names for the columns and to write the detected variable names into the VariableDescriptions property.
Note that data.Properties.Description is not the same as data.Description, which is what you would use if you had a variable whose name was Description.
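
A minimal sketch of that distinction, assuming the spreadsheet has a text column that readtable() names Description. That column name is a guess, so check the VariableNames output against your own file first.

```matlab
% Sketch only: "Description" as a column name is an assumption about Data.xlsx.
data = readtable("Data.xlsx", "TextType", "string");

disp(data.Properties.VariableNames)          % column names readtable() detected
disp(data.Properties.VariableDescriptions)   % original headers, if any were stored
disp(isempty(data.Properties.Description))   % table-level metadata, empty unless you set it

textData = data.Description;                 % the column, not the table property

% Never index past the end of what was actually read.
n = min(10, numel(textData));
textData(1:n)
```

Clamping the preview range with min() also keeps the script working on files with fewer than 10 rows.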

david cowan on 19 Nov 2023
[cleanedBag,idx] = removeEmptyDocuments(cleanedBag);
labels(idx) = [];
Are there no empty documents?
Is labels not the same size as cleanedBag?
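
One way to check both questions, as a sketch. The posted code never defines labels, so reading it from a table column named Category is purely an assumption for illustration.

```matlab
% Sketch only: "Category" is a hypothetical column; use wherever your labels come from.
labels = data.Category;

% labels must line up one-to-one with the documents before anything is removed.
assert(numel(labels) == cleanedBag.NumDocuments, ...
    'labels and cleanedBag have different numbers of entries');

% idx lists the documents that were removed; delete the matching labels.
[cleanedBag, idx] = removeEmptyDocuments(cleanedBag);
labels(idx) = [];
```

If no documents are empty, idx comes back empty and the last line leaves labels unchanged.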

Release

R2023a
