readtable error : read all columns into a single column

I am using readtable to read a *.txt file into table
data = readtable('test.txt', 'Format','auto')
the data file have 10 columns, but in the loaded data table, all columns are cancatecated into a single column, any ideas how to fix this?
thank you.

4 Comments

The data looks like this:
After table in Matlab looks like this:
Please attach a sample data file.
I suspect the issue is that, since there are variable numbers of items in each row, it has not been able to successfully determine the number of variables in the file.
A first attempt might be to specify the expected number of variables.
data = readtable('test.txt', "ExpectedNumVariables",10)
Just a comment that I count more than 10 columns of data in the screenshot you shared.
You forgot to attach the data file so we have to guess...
First, try
data = readtable('test.txt');
with no additional parameters; generally the input scanner is pretty competent about finding the delimiter on its own.
If that fails, then telling it what the delimiter is will be next; the image you attached is from Excel where you imported it, it is NOT the text file; the Q? is whether Excel was able to detect the delimiter on its own or did you have to use the whizard to separate columns -- if the text file is fixed-width, that is generally the case and would be one likely cause for your result in MATLAB.
Please see the attached .txt file, adding "ExpectedNumVariables" won't solve the issue.
I also tried "Format"
readtable(filepath,'Format','%s%s%s%f%f%f%f%f%f%f%f%f%f')
This wont work either.

Sign in to comment.

Answers (2)

It looks like MATLAB is not correctly identifying the delimitar, which is a tab. You can manually specify that.
Some fo the datatypes are also incorrectly selected. You can use theImport Tool to ineratively set the import parameters and then generate a script.
You could also do that manually.
opts = delimitedTextImportOptions("NumVariables",13);
opts.Delimiter = "\t";
opts.VariableNamesLine = 1;
opts.DataLines = [2,inf];
opts = setvartype(opts,["datetime", "string", "string", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double"]);
data = readtable("data_20091130.txt",opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
data = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 "03216N103" <missing> NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 30-Nov-2009 "016275109" <missing> NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002034957" <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 30-Nov-2009 <missing> <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 30-Nov-2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 30-Nov-2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 30-Nov-2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 30-Nov-2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 30-Nov-2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 30-Nov-2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386

1 Comment

Using your second approach also works, but with some additional specifications.
readtable("data_20091130.txt","Format",'%{MM/dd/yyyy}D%s%s%f%f%f%f%f%f%f%f%f%f',...
'Delimiter','\t','TextType','string')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly __________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 11/30/2009 "03216N103" "" NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 11/30/2009 "016275109" "" NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002034957" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 11/30/2009 "" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 11/30/2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 11/30/2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 11/30/2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 11/30/2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 11/30/2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 11/30/2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386

Sign in to comment.

filename = 'data_20091130.txt';
% detect import options, specifying tab as delimiter:
opts = detectImportOptions(filename,'Delimiter','\t');
% set variable 2 and 3's data to be interpreted as character vectors:
opts = setvartype(opts,opts.VariableNames([2 3]),{'char','char'});
% read the file:
T = readtable(filename,opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ _____________ ___________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 {'03216N103'} {0×0 char } NaN NaN -0.26873 NaN NaN 1 NaN {0×0 char} NaN NaN 30-Nov-2009 {'016275109'} {0×0 char } NaN NaN 0.028927 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'860341106'} {'B298752'} NaN NaN -0.19644 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'816074405'} {'B01RQH8'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002034957'} {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'00846U101'} {'2520153'} 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 {0×0 char} 7 NaN 30-Nov-2009 {0×0 char } {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'004403101'} {'2005650'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'043436104'} {'2855855'} 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 {0×0 char} 10 NaN 30-Nov-2009 {'013104104'} {'2012467'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002824100'} {'2002305'} -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 {0×0 char} 10 0.49791 30-Nov-2009 {'00081T108'} {'B0G7SZ5'} -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 {0×0 char} 3 NaN 30-Nov-2009 {'G0450A105'} {'2740542'} -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 {0×0 char} 5 NaN 30-Nov-2009 {'025195405'} {'B064919'} -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 {0×0 char} 1 NaN 30-Nov-2009 {'008190100'} {'2034524'} -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 {0×0 char} 9 NaN 30-Nov-2009 {'032654105'} {'2032067'} 1.3453 -0.11685 0.14799 2.7512 0 1 0 {0×0 char} 9 0.65386

Tags

Asked:

on 9 Nov 2023

Commented:

on 10 Nov 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!