How to read text files from sub-sub folders

Hi,
I want to read text files from sub-sub folders:
Architecture:
Mainfolder
    Tool1
        sub-subFolder1
        sub-subFolder2
        .....
        .....
    Tool2
        sub-subFolder1
        sub-subFolder2
        .....
        .....
    ......
1. Read the text files for each sub-folder (i.e., Tool1, Tool2, etc.)
2. Output one file per sub-folder: Tool1.xlsx, Tool2.xlsx
I use the following code, but I cannot access the sub-sub folders.
% - Define output header.
header = {'RainFallID', 'IINT', 'Rain Result', 'Start Time', 'Param1.pipe', ...
    '10 Un Para2.pipe', 'Verti 2 mixing.dis', 'Rate.alarm times'} ;
Mainfolder='Mainfolder';
outLocatorFolder='OutputFolder';
nHeaderCols = numel( header ) ;
% - Build listing sub-folders of main folder.
% D_main = dir( 'D:\Mekala_Backupdata\Matlab2010\Mainfolder' ) ;
D_main = dir(Mainfolder ) ;
D_main = D_main(3:end) ; % Eliminate "." and ".."
% - Iterate through sub-folders and process.
for dId = 1 : numel( D_main )
    % - Build listing files of sub-folder.
    D_sub = dir( fullfile(Mainfolder, D_main(dId).name, '*.txt' )) ;
    nFiles = numel( D_sub ) ;
    keyboard
    % - Prealloc output cell array.
    data = cell( nFiles, nHeaderCols ) ;
    % - Iterate through files and process.
    for fId = 1 : nFiles
        % - Read input text file.
        inLocator = fullfile(Mainfolder, D_main(dId).name, D_sub(fId).name ) ;
        content = fileread( inLocator ) ;
        % - Extract relevant data.
        rainfallId = str2double( regexp( content, '(?<=RainFallID\s+:\s*)\d+', 'match', 'once' )) ;
        iint = regexp( content, '(?<=IINT\s+:\s*)\S+', 'match', 'once' ) ;
        rainResult = regexp( content, '(?<=Rain Result\s+:\s*)\S+', 'match', 'once' ) ;
        startTime = strtrim( regexp( content, '(?<=Start Time\s+:\s*).*?(?= -)', 'match', 'once' )) ;
        param1Pipe = str2double( regexp( content, '(?<=Param1.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        tenUn = str2double( regexp( content, '(?<=10 Un Para2.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        verti2 = regexp( content, '(?<=Verti 2 mixing.dis\s+\S+\s%\s+)\S+', 'match', 'once' ) ;
        rateAlarm = strtrim( regexp( content, '(?<=Rate.alarm times\s+\S+\s+)[^\r\n]+', 'match', 'once' )) ;
        % - Populate data cell array.
        data(fId,:) = {rainfallId, iint, rainResult, startTime, ...
            param1Pipe, tenUn, verti2, rateAlarm} ;
    end
    % - Output to XLSX.
    % outLocator = fullfile( 'D:\Mekala_Backupdata\Matlab2010\OutputFolder', sprintf( '%s.xlsx', D_main(dId).name )) ;
    outLocator = fullfile(outLocatorFolder, sprintf( '%s.xlsx', D_main(dId).name )) ;
    fprintf( 'Output XLSX: %s ..\n', outLocator ) ;
    xlswrite( outLocator, [header; data] ) ;
end
Many thanks in advance.

Accepted Answer

Image Analyst on 4 Oct 2017
You need to use ** in dir() instead of *. See attached demo.
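A minimal sketch of that suggestion, assuming MATLAB R2016b or later (where dir accepts the ** wildcard and returns a folder field). Mainfolder comes from the question; toolName is a placeholder standing in for D_main(dId).name:

```matlab
% List every .txt file below one tool folder, at any depth.
Mainfolder = 'Mainfolder' ;      % root folder, as in the question
toolName   = 'Tool1' ;           % placeholder for D_main(dId).name

% '**' matches the folder itself and all of its nested sub-folders.
D_sub = dir( fullfile( Mainfolder, toolName, '**', '*.txt' )) ;

for fId = 1 : numel( D_sub )
    % The file name alone no longer identifies the location, so build
    % the path from the folder field returned by dir.
    inLocator = fullfile( D_sub(fId).folder, D_sub(fId).name ) ;
    fprintf( '%s\n', inLocator ) ;
end
```

In the loop from the question, this would mean changing the dir pattern and building inLocator from the folder field; the rest of the per-tool processing could stay as it is.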

More Answers (1)

Cedric Wannaz on 4 Oct 2017
Edited: Cedric Wannaz on 4 Oct 2017
Look at the EDIT 4:09pm block in the thread. Update the pseudo-code
    Iterate through sub folders of 'Mainfolder'
        Iterate through files of sub folder
            Extract data from file and store in data array
        Export data array to relevant Excel file
specifically for your new problem, and it should show you how to restructure and update the former code. First, remove all the code that is not needed for crawling through the folders and files, and run what remains to check that it crawls as desired.
Big hint: you should be able to add a level of FOR loop. Define D_sub at a strategic place:
for dmId = 1 : numel( D_main )
    D_sub = dir( fullfile( Mainfolder, D_main(dmId).name )) ;
    D_sub = D_sub(3:end) ; % Eliminate "." and ".."
iterate through its elements (sub-sub-folders):
    for dsId = 1 : numel( D_sub )
        D_subsub = dir( fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, '*.txt' )) ;
        nFiles = numel( D_subsub ) ;
and finally iterate through D_subsub elements (the text files):
        for fId = 1 : nFiles
            inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
            content = fileread( inLocator ) ;
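Put together, the three fragments above give a crawl-only skeleton along the following lines. This is a sketch for checking the traversal, not the final code. One deliberate change: the isdir/ismember filter is swapped in for D(3:end), because "." and ".." are not guaranteed to be the first two entries returned by dir. The counter nTotal is added here for illustration only.

```matlab
Mainfolder = 'Mainfolder' ;

% Level 1: tool folders (Tool1, Tool2, ...).
D_main = dir( Mainfolder ) ;
D_main = D_main([D_main.isdir] & ~ismember( {D_main.name}, {'.', '..'} )) ;

for dmId = 1 : numel( D_main )
    % Level 2: sub-sub-folders of the current tool folder.
    D_sub = dir( fullfile( Mainfolder, D_main(dmId).name )) ;
    D_sub = D_sub([D_sub.isdir] & ~ismember( {D_sub.name}, {'.', '..'} )) ;

    nTotal = 0 ;   % illustrative counter: text files found for this tool
    for dsId = 1 : numel( D_sub )
        % Level 3: text files of the current sub-sub-folder.
        D_subsub = dir( fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, '*.txt' )) ;
        for fId = 1 : numel( D_subsub )
            inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
            fprintf( '%s\n', inLocator ) ;
            nTotal = nTotal + 1 ;
        end
    end
    fprintf( '%s: %d text file(s)\n', D_main(dmId).name, nTotal ) ;
end
```

Once the printed paths look right, the extraction code can go back inside the innermost loop, with the rows from all sub-sub-folders of one tool appended to a single data array that is exported after the dsId loop ends.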
Note that if you have a recent version of MATLAB (R2016b or later), you can replace most calls to FULLFILE by the value of the folder field of the relevant output of a former DIR, e.g.:
inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
could be replaced by:
inLocator = fullfile( D_subsub(fId).folder, D_subsub(fId).name ) ;
Finally, note that if you have a lot of different situations with varying depths of nested folders, a better approach would be to build a recursive crawler, but this is a bit more complex.
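As a rough illustration of what such a recursive crawler could look like, here is a small sketch. The function name collectTxtFiles is hypothetical (it is not a MATLAB built-in and does not come from the thread), and the code assumes it is saved in its own file, collectTxtFiles.m:

```matlab
function locators = collectTxtFiles( folder )
% COLLECTTXTFILES  Hypothetical helper: return a cell array with the paths
% of all .txt files located in folder or in any folder nested below it.

    % Text files directly in this folder.
    D_txt = dir( fullfile( folder, '*.txt' )) ;
    D_txt = D_txt(~[D_txt.isdir]) ;
    locators = cellfun( @(name) fullfile( folder, name ), {D_txt.name}, ...
        'UniformOutput', false ) ;

    % Sub-folders of this folder, excluding "." and "..".
    D = dir( folder ) ;
    D = D([D.isdir] & ~ismember( {D.name}, {'.', '..'} )) ;

    % Recurse into each sub-folder and append whatever it returns.
    for dId = 1 : numel( D )
        locators = [locators, collectTxtFiles( fullfile( folder, D(dId).name ))] ; %#ok<AGROW>
    end
end
```

With this in place, a call such as collectTxtFiles( fullfile( Mainfolder, 'Tool1' )) would return every text file of one tool regardless of nesting depth, so the per-tool processing would need a single loop over the returned paths.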