Combine two excel files with different row and column lengths into one excel file

5 views (last 30 days)
Good day, everyone.
Attached are two excel files that I plan to combine the selected data into one excel file.
From Dummy_A, I only want: Day, Hour, Time. Basically, Dummy_A will serve as the base data.
From Dummy_B, I only want: Data_A, Data_B and Data_C. But the data must be based on their time in Dummy_B.
At the end, I expect my final excel file will be like Dummy_Example.
I tried to use xlsread and xlswrite but it can't be done because the dimension of array each excel file is not the same. I don't want to do manually copy paste from excel because attached are only one day data and I need to manage 365 days data. Hopefully can get some helps from the community. Thank you in advanced.

Accepted Answer

Cris LaPierre
Cris LaPierre on 14 Feb 2022
Edited: Cris LaPierre on 14 Feb 2022
Using readtable and outerjoin, I was able to create the table in MATLAB. Now you can just write it back to Excel using writetable. When joining tables, I find it easiest to use the Join Tables live task in a live script to figure out the correct settings. Once obtained, I just turned the task into editable code.
Here is what I came up with.
% Load the data
dataA = readtable("Dummy_A.xlsx");
dataA.TIME = timeofday(datetime(dataA.TIME,'ConvertFrom','datenum'));
dataA.TIME.Format = 'hh:mm'
dataA = 1440×3 table
DAY HOUR TIME ___ ________ _____ 1 {'12AM'} 00:00 1 {'12AM'} 00:01 1 {'12AM'} 00:02 1 {'12AM'} 00:03 1 {'12AM'} 00:04 1 {'12AM'} 00:05 1 {'12AM'} 00:06 1 {'12AM'} 00:07 1 {'12AM'} 00:08 1 {'12AM'} 00:09 1 {'12AM'} 00:10 1 {'12AM'} 00:11 1 {'12AM'} 00:12 1 {'12AM'} 00:13 1 {'12AM'} 00:14 1 {'12AM'} 00:15
dataB = readtable("Dummy_B.xlsx");
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
dataB.TIME = timeofday(datetime(dataB.TIME,'ConvertFrom','datenum','Format','HH:mm'))
dataB = 46×6 table
DAY GPSTOW TIME Data_A Data_B Data_C ___ __________ ________ ______ ______ ________ 1 3.2196e+05 17:26:00 1 27.71 0.023809 1 3.2244e+05 17:34:00 1 26.376 0.024555 1 3.2292e+05 17:42:00 1 25.094 0.029246 1 3.2376e+05 17:56:00 1 21.652 0.026394 1 3.2382e+05 17:57:00 1 21.314 0.022237 1 3.2394e+05 17:59:00 1 20.656 0.028268 1 3.2406e+05 18:01:00 1 20.139 0.028141 1 3.2442e+05 18:07:00 1 18.528 0.019971 1 3.2514e+05 18:19:00 1 15.439 0.028761 1 3.252e+05 18:20:00 1 15.271 0.10344 1 3.264e+05 18:40:00 1 10.845 0.11038 1 3.2646e+05 18:41:00 1 10.635 0.015652 1 3.2682e+05 18:47:00 1 9.3665 0.026989 1 3.273e+05 18:55:00 1 8.1648 0.026735 1 3.2778e+05 19:03:00 1 7.1181 0.10343 1 3.2784e+05 19:04:00 1 6.992 0.011088
% Merge the two tables
joinedData = outerjoin(dataA,dataB,"Type","left","Keys",["DAY","TIME"],...
"MergeKeys",true,"RightVariables",["Data_A","Data_B","Data_C"])
joinedData = 1440×6 table
DAY HOUR TIME Data_A Data_B Data_C ___ ________ _____ ______ ______ ______ 1 {'12AM'} 00:00 NaN NaN NaN 1 {'12AM'} 00:01 NaN NaN NaN 1 {'12AM'} 00:02 NaN NaN NaN 1 {'12AM'} 00:03 NaN NaN NaN 1 {'12AM'} 00:04 NaN NaN NaN 1 {'12AM'} 00:05 NaN NaN NaN 1 {'12AM'} 00:06 NaN NaN NaN 1 {'12AM'} 00:07 NaN NaN NaN 1 {'12AM'} 00:08 NaN NaN NaN 1 {'12AM'} 00:09 NaN NaN NaN 1 {'12AM'} 00:10 NaN NaN NaN 1 {'12AM'} 00:11 NaN NaN NaN 1 {'12AM'} 00:12 NaN NaN NaN 1 {'12AM'} 00:13 NaN NaN NaN 1 {'12AM'} 00:14 NaN NaN NaN 1 {'12AM'} 00:15 NaN NaN NaN
writetable(joinedData,'Dummy_C.xlsx')
Once comment. The NaN values don't get written to Excel. The one difference to point out, then, is that your example file does not contain values for Data_A while this example code does.
  4 Comments
Cris LaPierre
Cris LaPierre on 15 Feb 2022
Edited: Cris LaPierre on 15 Feb 2022
It looks like you should use the 'excel' option rather than 'datenum'. Currently those that look like duplicates are read in as 00:17:59 and 00:20:59, but the code only displays 'hh:mm' (it doesn't round when setting display format).
See below for how to update the code:
dataA = readtable("Dummy_A.xlsx");
dataA.TIME = timeofday(datetime(dataA.TIME,'ConvertFrom','excel'));
% visualize the table using the updated option
dataA(15:25,:)
ans = 11×3 table
DAY HOUR TIME ___ ________ ________ 1 {'12AM'} 00:14:00 1 {'12AM'} 00:15:00 1 {'12AM'} 00:16:00 1 {'12AM'} 00:17:00 1 {'12AM'} 00:18:00 1 {'12AM'} 00:19:00 1 {'12AM'} 00:20:00 1 {'12AM'} 00:21:00 1 {'12AM'} 00:22:00 1 {'12AM'} 00:23:00 1 {'12AM'} 00:24:00
Ann
Ann on 16 Feb 2022
Hi Cris, converting the 'excel' turns out great. Thanks for sharing the knowledge because I also overlooked on the different seconds in the time series. Thank you so much and have a great day!

Sign in to comment.

More Answers (1)

KSSV
KSSV on 14 Feb 2022
T1 = readtable('Dummy_A.xlsx') ;
T2 = readtable('Dummy_B.xlsx') ;
T3 = readtable('Dummy_Example.xlsx') ;
DAY = T1.DAY ;
HOUR = T1.HOUR ;
TIME = T1.TIME ;
Data_A = interp1(T2.TIME,T2.Data_A,T1.TIME) ;
Data_B = interp1(T2.TIME,T2.Data_B,T1.TIME) ;
Data_C = interp1(T2.TIME,T2.Data_B,T1.TIME) ;
T = table(DAY,HOUR,TIME,Data_A,Data_B,Data_C) ;
writetable(T,'test.xlsx')
  1 Comment
Ann
Ann on 14 Feb 2022
Hi KSSV, thanks for your reply and I find that this just merge not according to the specific time. Thank you for your effort to try. Have a nice day.

Sign in to comment.

Products


Release

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!