Load multiple files, but it only returns the data in the last file

I have code that loads a data array from a single file. Now I need to modify it to load several files in the same format. In this example, each file has the same number of rows (1600) but a different number of columns.
So far the code recognizes that there are 5 files in the directory, but it returns only the data from the last file rather than a combined data array of all of them.
Original Code for one file:
filename='A Socorro 1200 Jul 26 0800.mlog';
fid=fopen(filename,'r'); %Open the file for reading; fopen returns an integer file identifier (-1 if the file cannot be opened).
for k=1:8;
tline=fgetl(fid); %tline = fgetl(fileID) reads and returns the next line of the specified file, removing the newline characters. fileID is an integer file identifier obtained from fopen.
end
data=zeros(1600,2459);
k=1;
while tline~=-1 %tline is a character vector unless the line contains only the end-of-file marker. In this case, tline is the numeric value -1.
datastr=regexp(tline,'(?<=">).*(?=<D)','match'); %Match regular expression (case sensitive)
tempdata=textscan(datastr{1},'%f','delimiter',','); %Read formatted data from text file or string. %f specify a string of floating-point numbers.
data(:,k)=tempdata{1};
k=k+1;
tline = fgetl(fid);
end
Here is the modified code:
files=dir('/Users/ichen/Documents/MATLAB/PNT/WSMR_July 2015 MLogs/Vehicle A/A Socorro 1200 Jul 26 *.mlog');
for i=1:length(files)
filename = files(i).name;
fid=fopen(filename,'r'); %Open the file for reading; fopen returns an integer file identifier (-1 if the file cannot be opened).
for k=1:8;
tline=fgetl(fid); %tline = fgetl(fileID) reads and returns the next line of the specified file, removing the newline characters. fileID is an integer file identifier obtained from fopen.
end
ncols=length(files);
nrows=1600; %Number of data points for each test set
data=zeros(nrows,ncols);
k=1;
while tline~=-1 %tline is a character vector unless the line contains only the end-of-file marker. In this case, tline is the numeric value -1.
datastr=regexp(tline,'(?<=">).*(?=<D)','match'); %Match regular expression (case sensitive)
tempdata=textscan(datastr{1},'%f','delimiter',','); %Read formatted data from text file or string. %f specify a string of floating-point numbers.
data(:,k)=tempdata{1};
k=k+1;
tline = fgetl(fid);
end
Here is the structure of the data files. In the end, I am trying to get a 1600x10220 data array. Thanks!
  • A Socorro 1200 Jul 26 0800.mlog 1600 x 2459
  • A Socorro 1200 Jul 26 0900.mlog 1600 x 2460
  • A Socorro 1200 Jul 26 1000.mlog 1600 x 2462
  • A Socorro 1200 Jul 26 1100.mlog 1600 x 2460
  • A Socorro 1200 Jul 26 1200.mlog 1600 x 379
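The target width of 10220 follows from adding the per-file column counts listed above. A minimal check, where the variable names `cols`, `total` and `starts` are illustrative and not part of the original code:

```matlab
% Column counts of the five files, taken from the list above
cols = [2459 2460 2462 2460 379];

% Total number of columns in the combined array (10220)
total = sum(cols);

% First column that each file would occupy in the combined array
starts = cumsum([1, cols(1:end-1)]);
```

If the files are appended in this order, the last file would fill columns `starts(end)` through `total`.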

Accepted Answer

Geoff Hayes on 27 Jul 2017
Ivy - on every iteration of your for loop, you are resetting data
data=zeros(1600,2459);
which then will only retain the data from the last file. You will need to do this just once before the for loop begins. But...look closely at the following
ncols=length(files);
nrows=1600; %Number of data points for each test set
data=zeros(nrows,ncols);
You are setting the number of columns to the number of files that you have...which is not the number of columns in each file. You will need to either set this to the sum of all the columns (for all files) or dynamically add a column on each iteration of your while loop...which is kind of what you are doing right now with your k and
data(:,k)=tempdata{1};
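The two options can be compared on small made-up matrices. This is only a sketch: `blocks` is a hypothetical stand-in for the per-file data, not something read from the .mlog files.

```matlab
% Hypothetical per-file blocks: same row count, different column counts
blocks = {ones(4,3), 2*ones(4,2), 3*ones(4,5)};

% Option 1: preallocate once using the summed column count
total = sum(cellfun(@(b) size(b,2), blocks));
data  = zeros(4, total);
k = 1;                                  % running column index, set once
for i = 1:numel(blocks)
    for c = 1:size(blocks{i}, 2)
        data(:, k) = blocks{i}(:, c);   % k keeps counting across blocks
        k = k + 1;
    end
end

% Option 2: start empty and append one column at a time
grown = [];
for i = 1:numel(blocks)
    for c = 1:size(blocks{i}, 2)
        grown(:, end+1) = blocks{i}(:, c); %#ok<SAGROW>
    end
end
```

Both produce the same 4x10 array. Preallocating is generally faster for large arrays because growing an array forces MATLAB to reallocate memory as it expands.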
  2 Comments
Geoff Hayes on 28 Jul 2017
Ivy's answer moved here
Geoff, thanks for the quick suggestion. I have now updated the code by moving data=zeros(sample,sets) in front of the for loop and setting the total number of columns to the sum over all files. However, it now returns 1600x2462, which is the largest number of records in any one file, and I suspect the other files are still being overwritten by later files.
Any suggestions at this point?
testfiles=dir('/Users/ichen/Documents/MATLAB/PNT/WSMR_July 2015 MLogs/Vehicle A/A Socorro 1200 Jul 26 *.mlog');
sets=10220; %Total number of test sets from specified *.mlog files
sample=1600; %Number of data points for each test set
data=zeros(sample,sets);
for i=1:length(testfiles);
filename = testfiles(i).name;
fid=fopen(filename,'r'); %Open the file for reading; fopen returns an integer file identifier (-1 if the file cannot be opened).
for k=1:8;
tline=fgetl(fid); %tline = fgetl(fileID) reads and returns the next line of the specified file, removing the newline characters. fileID is an integer file identifier obtained from fopen.
end
k=1;
while tline~=-1 %tline is a character vector unless the line contains only the end-of-file marker. In this case, tline is the numeric value -1.
datastr=regexp(tline,'(?<=">).*(?=<D)','match'); %Match regular expression (case sensitive)
tempdata=textscan(datastr{1},'%f','delimiter',','); %Read formatted data from text file or string. %f specify a string of floating-point numbers.
data(:,k)=tempdata{1};
k=k+1;
tline = fgetl(fid);
end
end
Geoff Hayes on 28 Jul 2017
Ivy - k is being reset to one on each iteration of the for loop. Since you are using this as your column index, then you are going to overwrite each column from the previous iterations. Like with the initialization of data, do this outside of the for loop
data = zeros(sample, sets);
k = 1;
Try this and see what happens!
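Putting both suggestions together, the loop might look like the sketch below. It makes a few assumptions that are not in the thread: `folder` is a placeholder for wherever the .mlog files live (set to the current folder here), the full path is built with `fullfile` so that `fopen` does not depend on the current folder, `ischar(tline)` replaces the comparison with -1 as the end-of-file test, lines that do not match the pattern are skipped, and each file is closed with `fclose`.

```matlab
folder  = pwd;                              % assumption: folder containing the .mlog files
pattern = 'A Socorro 1200 Jul 26 *.mlog';
testfiles = dir(fullfile(folder, pattern));

sample = 1600;                  % number of data points for each test set (rows)
sets   = 10220;                 % total number of test sets over all files (columns)
data   = zeros(sample, sets);   % allocated once, before the for loop
k      = 1;                     % column index, also initialized once

for i = 1:numel(testfiles)
    fid = fopen(fullfile(folder, testfiles(i).name), 'r');
    if fid == -1
        warning('Could not open %s', testfiles(i).name);
        continue
    end
    for h = 1:8                 % skips 7 header lines; the 8th line read is the first data line
        tline = fgetl(fid);
    end
    while ischar(tline)         % fgetl returns numeric -1 at end of file
        datastr = regexp(tline, '(?<=">).*(?=<D)', 'match');
        if ~isempty(datastr)
            tempdata = textscan(datastr{1}, '%f', 'Delimiter', ',');
            data(:, k) = tempdata{1};   % k continues from the previous file
            k = k + 1;
        end
        tline = fgetl(fid);
    end
    fclose(fid);
end
```

After the loop, `k-1` is the number of columns actually filled, which gives a quick way to confirm that all 10220 test sets were read.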

More Answers (1)

Ivy Chen on 28 Jul 2017
Got it, really appreciate your help on this!
