Reading a complex text file and building a matrix
    Sanchit Sharma
 on 30 Jul 2019
  
    
    
    
    
    Commented: per isakson
      
      
 on 2 Aug 2019
            Hello MATLAB experts,
I am stuck on a problem and would appreciate your help. I am trying to read a complex file (attached: example.txt). The full file has millions of lines; I truncated it to only 2000.
My aim is simple:
- If a line is '--- detector1 ---',
- increment 'numberofgammaclusters'.
- Then X = first numerical value, Y = second numerical value, and A(X,Y) = third numerical value.
- Keep reading until the next '--- detector1 ---' is encountered, then repeat steps 2 and 3.
The sample code I am trying is below. Any help improving the code, or advice on the approach, is hugely appreciated.
A= zeros(256, 256);
E = importdata('gamma.txt', ' ');
numberofgammaclusters=0;
for i=1:1082952
    if E.textdata(:,2)==contains('detector1')
        numberofgammaclusters=numberofgammaclusters+1;
        A()= % The values at second last column % Part of the code I don't know how to write
    end
end
Thanks very much in advance.
Regards,
Sanchit Sharma
Accepted Answer
  per isakson
      
      
 on 30 Jul 2019
        
      Edited: per isakson
      
      
 on 31 Jul 2019
  
      Try this
%%
chr = fileread( 'example.txt' );
clusters = strsplit( chr, '--- detector1 ---\r\n' );
clusters(1) = [];
clear('chr');
numberofgammaclusters = length( clusters );
A = nan( 256, 256 );
for jj = 1 : numberofgammaclusters
    cac = strsplit( clusters{jj},'\r\n' );
    for ii = 1 : length( cac )
        if not( contains( cac{ii}, '===' ) )
            vec = textscan( cac{ii}, 'PixelHit%f%f%f%f', 'Delimiter',',' );
            A(vec{1},vec{2}) = vec{3}; 
        else
            break
        end
    end
end
This script requires some memory, but I think it will be ok. 
Second thought. Replace
            vec = textscan( cac{ii}, 'PixelHit%f%f%f%f', 'Delimiter',',' );
            A(vec{1},vec{2}) = vec{3}; 
by
            vec = sscanf( cac{ii}, 'PixelHit%f,%f,%f,%f' );
            A(vec(1),vec(2)) = vec(3); 
to avoid vec being a cell array; sscanf returns a plain numeric vector.
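To illustrate the sscanf variant, here is a minimal sketch. The sample line and its field values are invented for illustration, so the real file's layout may differ.

```matlab
% Hypothetical sample line; the field values are invented for illustration.
sample = 'PixelHit 3, 7, 42, 0';

% sscanf matches the literal text in the format and returns the numbers
% as a plain numeric column vector, so no cell indexing is needed.
vec = sscanf(sample, 'PixelHit%f,%f,%f,%f');

A = nan(256, 256);
A(vec(1), vec(2)) = vec(3);   % row 3, column 7 receives the value 42
```

Note that this sketch uses indices that are already one-based; a zero in the first or second field would make the assignment fail.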
In response to comments
Here is a script that is somewhat more robust. MATLAB indexing is one-based, but in your file X (and maybe Y) takes the value zero, so I added "+1" to both indices.
%%
chr = fileread( 'gamma.txt' );
clusters = regexp( chr, '--- detector1 ---[ ]*\r*\n', 'split' );
clusters(1) = [];
clear('chr');
numberofgammaclusters = length( clusters );
A = zeros( 256, 256 );
for jj = 1 : numberofgammaclusters
    cac = regexp( clusters{jj},'\r*\n','split' );
    for ii = 1 : length( cac )
        if not( contains( cac{ii}, '===' ) )
            vec = sscanf( cac{ii}, 'PixelHit%f,%f,%f,%f' );
            A(vec(1)+1,vec(2)+1) = A(vec(1)+1,vec(2)+1) + vec(3); 
        else
            break
        end
    end
end
imagesc( A );
% pick a colormap and show "zero" (approx. A(X,Y)<1) as white
mymap = colormap( parula(1e5) );
mymap(1,:)=1;
colormap( mymap )
colorbar
% flip the YAxis
ax = gca;
ax.YAxis.Direction = 'normal';
which outputs a color-mapped image of A with a colorbar (figure not included).
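As a small check of the "+1" shift and of the accumulation, the sketch below runs the same update rule on a few made-up hits. It also shows accumarray, which is not used in the script above, as one possible vectorized way to sum repeated hits. The names hits, r, c and B are hypothetical.

```matlab
% Made-up hits: columns are X, Y, value (X and Y may be zero).
hits = [0 0 10;
        0 0  5;
        2 5  7];

% Loop version, using the same update rule as the script above.
A = zeros(256, 256);
for k = 1:size(hits, 1)
    r = hits(k, 1) + 1;              % shift to a one-based row index
    c = hits(k, 2) + 1;              % shift to a one-based column index
    A(r, c) = A(r, c) + hits(k, 3);  % accumulate instead of overwriting
end

% Vectorized alternative: accumarray sums values that share a subscript.
B = accumarray(hits(:, 1:2) + 1, hits(:, 3), [256 256]);
```

Both A and B hold 15 at (1,1), because the two hits at X = 0, Y = 0 are summed, and 7 at (3,6).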

14 Comments
  per isakson
      
      
 on 2 Aug 2019
				[ ]* stands for zero or more spaces. It's easy to miss trailing spaces, since they don't show in the editor.  
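A minimal sketch of how that pattern behaves, assuming a made-up two-cluster text in which the first header has trailing spaces and Windows line endings while the second has neither:

```matlab
% Invented sample text; the PixelHit values are for illustration only.
txt = sprintf(['--- detector1 ---  \r\nPixelHit 1, 2, 3, 0\n', ...
               '--- detector1 ---\nPixelHit 4, 5, 6, 0\n']);

% [ ]* absorbs optional trailing spaces, \r* an optional carriage return.
parts = regexp(txt, '--- detector1 ---[ ]*\r*\n', 'split');

% The text before the first header is empty here, so drop it.
parts(1) = [];

first  = strtrim(parts{1});
second = strtrim(parts{2});
```

Both headers are matched by the same expression, so parts ends up with one cell per cluster.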
More Answers (1)
  Bob Thompson
      
 on 30 Jul 2019
        
      Edited: Bob Thompson
      
 on 30 Jul 2019
  
      I have not been able to open your example file; it's a limitation on my end.
That being said, this is how I would look at doing what I understand you're looking for.
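fid = fopen('gamma.txt');
line = fgetl(fid);
c = 1;                            % line counter
while ischar(line)                % fgetl returns -1 (numeric) at end of file
    if strncmp(line,'PixelHit',8) % safe even when the line is shorter than 8 characters
        tmp = regexp(line,' ','split');   % assumes space-separated fields
        A(str2num(tmp{2}),str2num(tmp{3})) = str2num(tmp{4});
    end
    line = fgetl(fid);
    c = c+1;
end
fclose(fid);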
It might need some minor editing, because I couldn't use your example file, but the basic concept is sound. If you're looking to capture other data, just add an elseif condition.
This will take some time, but any method (as far as I know) for reading a 2-million-line text file is going to take some time.
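As a rough sketch of the elseif idea, the example below reads a small temporary file line by line and counts the '--- detector1 ---' headers while filling A. The file contents are invented, the comma-separated PixelHit layout is an assumption borrowed from the accepted answer, and the "+1" shift follows the same answer.

```matlab
% Write a small, made-up test file (contents are for illustration only).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '--- detector1 ---\nPixelHit 0, 1, 5, 0\nPixelHit 2, 3, 7, 0\n');
fprintf(fid, '--- detector1 ---\nPixelHit 0, 1, 4, 0\n');
fclose(fid);

A = zeros(256, 256);
numberofgammaclusters = 0;

fid = fopen(fname, 'r');
tline = fgetl(fid);
while ischar(tline)                      % fgetl returns -1 at end of file
    if strncmp(tline, 'PixelHit', 8)
        vec = sscanf(tline, 'PixelHit%f,%f,%f,%f');
        A(vec(1)+1, vec(2)+1) = A(vec(1)+1, vec(2)+1) + vec(3);
    elseif contains(tline, 'detector1')  % extra condition for cluster headers
        numberofgammaclusters = numberofgammaclusters + 1;
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

Reading one line at a time keeps memory use low, at the cost of one fgetl call per line.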