Read file and find maximum value
Hello all,
I have files that contain numeric and text values. What I want is to read them and find the maximum value of the second column. I have attached a file for better understanding. What I have tried so far is:
files = dir('*.txt');
i=1;
for file = files'
      csv = dlmread(file.name,'');
end
But I am getting the error "Mismatch between file and format string. Trouble reading 'Numeric' field from file"
Thank you in advance.
1 Comment
dpb on 5 Nov 2015
Is this one file or a whole bunch of files concatenated together?
Either way, per
>> help dlmread
dlmread Read ASCII delimited file.
   ...
  All data in the input file must be numeric. dlmread does not operate 
  on files containing nonnumeric data, even if the specified rows and
  columns for the read contain numeric data only.
The simplest way to read depends on the answer to the above question, however...
Answers (1)
per isakson on 5 Nov 2015 (Edited: per isakson on 5 Nov 2015)
The file, 11a-LEP.txt, contains multiple blocks. Each block consists of a header and two columns of numerical data. The first row of each data block consists of two zeros, 0 0.
The function, cssm, concatenates the numerical data vertically. Try
>> [ num, buf ] = cssm( '11a-LEP.txt' );
>> whos
  Name        Size            Bytes  Class     Attributes
  buf        39x1             11232  cell                
  num       429x2              6864  double              
>>
where
function    [ num, buf ] = cssm( filespec )
    str = fileread( filespec );
%   cac = regexp( str, '(?<=\s+X +LEP\s+).+?(?=(\s+X +LEP\s+)|$)', 'match' );
    cac = regexp( str, '(?<=LEP\s+).+?(?=(\s+X)|$)', 'match' );
%
    buf = cell( length( cac ), 1 );
    for jj = 1 : length( cac )    % loop over all blocks
        buf(jj) = textscan( cac{jj}, '%f%f', 'CollectOutput',true );
    end
    num = cell2mat( buf );
end
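Once the data is in this form, the maximum of the second column can be taken either over all blocks or block by block. The following is only a sketch: the values in buf are made-up stand-ins for the output of cssm (one N-by-2 matrix per block, each starting with the row 0 0), and the variable names maxAll, maxPerBlock and maxPerBlockNoZero are illustrative.

```matlab
% Hypothetical stand-in for the output of cssm: two blocks,
% each starting with the "0 0" row described above.
buf = { [0 0; 1 -2.5; 2 -1.0] ; [0 0; 1 -4.0; 2 -3.5] };
num = cell2mat( buf );

% Maximum of the second column over all blocks
maxAll = max( num(:,2) );

% Maximum of the second column for each block separately
maxPerBlock = cellfun( @(b) max( b(:,2) ), buf );

% Same, but skipping the leading "0 0" row of each block.
% Assumes every block has at least two rows.
maxPerBlockNoZero = cellfun( @(b) max( b(2:end,2) ), buf );
```

Skipping the first row only matters when the remaining values in the column are negative, in which case the leading zero would otherwise always be reported as the maximum.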
Note:   The code contains two alternative calls of regexp. The first, which is commented out, is significantly slower. Another fast alternative is
    cac = regexp( str, '\s+X +LEP\s+', 'split' ); cac(1)=[];
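To see what the 'split' variant produces, here is a small sketch on an in-memory string instead of the file. The sample text is an assumption about the layout (a header line containing "X" and "LEP" preceded by whitespace, followed by two numeric columns); the real file may differ.

```matlab
% Hypothetical sample text with two blocks; the layout is an assumption.
str = sprintf( ' X   LEP\n0 0\n1 -2.5\n X   LEP\n0 0\n1 -4.0\n' );

% Split at the headers; the piece before the first header is dropped.
cac = regexp( str, '\s+X +LEP\s+', 'split' ); cac(1) = [];

% Parse each block into an N-by-2 matrix, as in cssm
buf = cell( length( cac ), 1 );
for jj = 1 : length( cac )
    buf(jj) = textscan( cac{jj}, '%f%f', 'CollectOutput',true );
end
num = cell2mat( buf );
```

Note that this pattern requires whitespace before the X, so a header at the very start of the file with no leading whitespace would not be matched by it.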
If it's okay to concatenate the blocks there is a simple solution:
    fid = fopen('11a-LEP.txt');
    cac = textscan( fid, '%f%f', 'CollectOutput',true, 'CommentStyle','X' );
    fclose( fid );
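The same call can be wrapped in a loop over all text files in a folder, which is what the question was originally after. This is a minimal sketch under stated assumptions: the temporary folder, the file name sample.txt and its contents are made up so the example is self-contained, and maxCol2 is an illustrative name. For real data, point folder at the directory that holds the files and skip the part that writes the sample.

```matlab
% Create a temporary folder with one hypothetical sample file
folder = tempname;
mkdir( folder );
fid = fopen( fullfile( folder, 'sample.txt' ), 'w' );
fprintf( fid, 'X   LEP\n0 0\n1 -2.5\n2 -1.0\nX   LEP\n0 0\n1 -4.0\n' );
fclose( fid );

% Loop over all .txt files and store the maximum of the second column
files   = dir( fullfile( folder, '*.txt' ) );
maxCol2 = nan( numel( files ), 1 );
for k = 1 : numel( files )
    fid = fopen( fullfile( folder, files(k).name ) );
    cac = textscan( fid, '%f%f', 'CollectOutput',true, 'CommentStyle','X' );
    fclose( fid );
    num = cac{1};                       % all blocks concatenated, N-by-2
    if ~isempty( num )
        maxCol2(k) = max( num(:,2) );   % stays NaN if nothing was read
    end
end

rmdir( folder, 's' );                   % remove the temporary folder
```

Because the blocks are concatenated, the result is the maximum over all blocks of a file, including the leading 0 0 rows.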
 
"maximum value of the second column"   Block-wise or over all blocks? And what about the zero in the first row of each block, which is the maximum value?

