how to concatenate multiple CSV files with mix of numeric / non-numeric into one final table

1 view (last 30 days)
Hi all,
I have 60 CSV files that contain one column of text and the remaining 2 columns are numeric.
The first row in all CSV files contains 2 headers. The names of the CSV files have different lengths with this type of sequence: Test1.csv, Test2.csv, Test3.csv, Test4.csv, and so on. Below is the code that I'm using, but somehow I couldn't get rid of the 2nd header.
Code:
INfolder='C:\Users\SamanthaTzeeSan\Documents\MATLAB\Example_3';
cd(INfolder);
% get a list of the source files, which shall be concatenated
csvFiles=dir('Test*.csv');
numfiles=length(csvFiles); % total number of the source files
% create an empty target file to store the result
fid_out=fopen('concatenated_files3.csv','w');
%
for jj=1:numfiles % loop over all source files
fid_in=fopen(csvFiles(jj).name,'r'); % open the current source file
% read the file, interpret the bytes as characters, input the result in the
% variable, str
str=transpose(fread(fid_in,'*char'));
fclose(fid_in); % close the current source file
% keep the header lines of the first source file and remove it for the other
% source files
if not (jj==1)
% line breaks are indicated with either the two bytes "\r\n" or by single
% byte "\n". ix1 is the start position of the first group of "\n" and
% "\r" (any number and any order)
ix1=regexp(str,'[\r\n]++','once');
% keep the bytes from the position, ix1, to the end (strip off the
% header lines)
str=str(ix1:end);
end
%
% find the starting position, ix2, of the trailing group of "\n" and "\r"
ix2=regexp(str,'[\r\n]++$','once');
% there might not be any line breaks at the end of the file
if not (isempty(ix2))
str(ix2:end)=[]; %strip off the line breaks at the end
end
% write the remaining row of characters to the target file
fwrite(fid_out,str,'*char');
end
fclose('all'); %close the target file (and others)
Thanks, Sam
  2 Comments
Star Strider
Star Strider on 10 Jan 2016
Can you use xlsread to read your .csv files? (You have to have Microsoft Excel installed.) If so, that would make your task significantly easier.
per isakson
per isakson on 10 Jan 2016
  • It's much easier to help if you upload a couple of sample files.
  • Do you rather want a working solution or help to debug your code?

Sign in to comment.

Answers (1)

Walter Roberson
Walter Roberson on 11 Jan 2016
textscan() each file with HeaderLines 2 and format '%[^,],%f,%f' and default delimiter (*not* ',')
Unless, that is, your first column of text can contain commas. In that case we need to know which escaping mechanism you are using. %q for the case of double-quote delimited text.
If you are sure the first column does not contain blanks then you can use a format of '%s%f%f' with 'Delimiter',','

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!