Difference between readmatrix() and readtable()

290 views (last 30 days)
Robert Kugler
Robert Kugler on 24 Apr 2019
Commented: dpb on 6 May 2019
Hi community,
After reading that xlsread() is not recommended anymore, I tried to get my excel table read through the readmatrix() function. However, I don't seem to get my excel file read in the correct way. The output that readmatrix gives me is never what what I can read in the excel file. I assume this has something to do with the coding, so I tried to get the output right through the "OutputType" option, but this did not yield satisfactory results either.
Later then, I found the readtable() function, and this seems to give me what I want. However, I still need to understand a bit more why it does so, and that brings me to my basic question:
I read the documentations on both functions and from what I understand the most basic difference between them seems to be that readmatrix() returns an array and readtable() returns a table. But why does this cause me trouble in getting the data from the excel file returned correctly/the way I need it when I use readmatrix()? What is the underlying difference in how Matlab handles the data in this case? And, if readtable() actually turns out to the function that is better suited in this case, for what prupose would readmatrix() be better suited then? Could you give an example?
Note on the data I am intending to import from the excel file: it is a relatively simple database-like structured sheet where each row is an item and each column represents a propoerty of that item (e.g.: item may be a certain material, and the colums can be "price", "weight", etc.). The data contained in the property colums can be of both text or numeric type, or empty.
Thanks in advance for your kind support!
  4 Comments
dpb
dpb on 6 May 2019
If there are some user tools for defining and then cleaning up import option objects, that's a big step, potentially...(I'm stuck at R2017b for the time being for other reasons so can't go investigate just now).
While it's doable as is, the interface to the object through the supplied functions is very painful to deal with for complex files with large numbers of variables that may have issues such as outlined earlier (been there, done that number of times on the forum). Being able to make such changes interactively to specific variables would be great.
The same caveat as above must still be observed, however, and isn't documented nor discussed at all that I am aware of -- once one does this, one MUST retain that import object for use with all files of the type; rescanning another file may break the association thus so carefully and painfully created.

Sign in to comment.

Accepted Answer

dpb
dpb on 24 Apr 2019
Edited: dpb on 26 Apr 2019
It appears readmatrix is limited to returning one type of data in the output array as the 'OutputType' named parameter is limited to a scalar string/cell string. Therefore, there's no way to tell it to return the various types of data that may be in different columns in a spreadsheet(*).
readtable on the other hand, can and does support variables of different type by column and so can handle a spreadsheet similar to your description and is certainly the best option for that type of a spreadsheet when you want all the data and not just a subset of a given data type. A table, however, is limited to have the same number of rows in every variable so if the spreadsheet isn't perfectly regular, it will fill in a bunch of missing values down to the furtherest extent in the spreadsheet unless you limit the range. Sometimes this is ok, other times it's a pain.
xlsread is slow but has the facility to return all of what is in the spreadsheet; it does so by returning numeric and text fields separately, however, whereas a table contains them in one data structure retaining their type (with some help, sometimes from an import object with, particularly, dates). If you want the full spreadsheet unadulterated, the last optional raw output array is a cell array of the whole spreadsheet from which one can retrieve any piece of information from any row/col cell regardless of its type or whether it is consistent with the type of those cells around it. Sometimes, this feature is also invaluable. xlsread is the one option that existed before the introduction of any of the more general data classes except for cell arrays which is why it returns separate variables; there was no other option at the time.
The doc implies readmatrix will return a table if one passes the optional opts structure but I see little reason then for it in lieu of readtable which also can use an opts object.
Seems to me TMW is continuing to just add stuff just to be adding it without any clearly defined (to the user base, anyway) direction of where they're headed and why they don't just improve existing functions instead of overloading namespace and bloating code size with multiple essentially duplicated functionalities.
(*) And, of course, for an array other than a cell array, there can only be one data class for the array, anyway. Without the importoptions optional input, readmatrix just punts and returns only what numeric data it can find without offering the option of returning the text data or the combined that does xlsread. This, somehow, seems a step backwards and why not just rewrite xlsread if want to improve it? Instead, for historical reasons, it can never be actually removed and we've just got yet another eventually more or less orphaned function hanging around like textread (which, incidentally, still has some features that make it more useful on occasion than the favored textscan).

More Answers (0)

Products


Release

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!