How to match two different matrices

Mohamed Nedal on 17 Jul 2018
Commented: jonas on 19 Jul 2018
Hello, I have two matrices with different numbers of rows. The scenario looks like this:
x = [...];
y = [...];
size(x) = 5800 * 16
size(y) = 450 * 14
% X & Y have dates & times in the first six columns in this form:
% year, month, day, hour, minute, second
% Each column represents a variable
% Each row represents a data sample
% A model to predict a variable in (X) after some time
...
X_time + some_time = predicted_time; % in hours
% "X_time" is the time of (X)
% "Y_time" is the time of (Y)
% Match that predicted time with the time of (Y) within a range of +/- 11 hours
for i = 1:length(x)
for j = 1:length(y)
if (predicted_time >= Y_time-11) && (Y_time+11 >= predicted_time) is True
MATCHED = [x(i,:) y(j,:) predicted_time];
end
end
end
How can I make this work? I have tried many variations, but none of them works properly.
  10 Comments
jonas on 17 Jul 2018
Edited: jonas on 17 Jul 2018
So, what you need to do is:
  1. Loop through each storm in SET2
  2. Calculate the corresponding time until it reaches the location of SET1
  3. Find the storm that is closest in time to this value
Right?
These are simple steps, and it seems to me that this is almost what Albert Fan proposed some comments ago. If you provide some sample data to work with, I'm sure someone will give you code now that the problem is clearly stated. I guess we also need your model though, unless you can provide the modelled time-slots.
After this discussion, the initial code actually makes some sense :)
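A minimal sketch of these three steps on made-up data, following the direction the thread settles on later (predict from SET1, then search SET2). The times and the `travel` durations are hypothetical stand-ins for the real data and the model output, so treat this as an illustration rather than the final code.

```matlab
% Toy data (hypothetical): start times for SET1 and observed times for SET2
t0 = datetime(2018,7,1,0,0,0);
t1 = t0 + hours([0; 100; 300]);    % SET1 start times
t2 = t0 + hours([62; 170; 500]);   % SET2 observed times
travel = hours([60; 65; 70]);      % hypothetical modelled travel times

% Step 2: predicted arrival time for every SET1 event
t_pred = t1 + travel;

% Step 3: for each prediction, find the closest SET2 event
closest = zeros(numel(t_pred), 1);
gap = hours(zeros(numel(t_pred), 1));
for i = 1:numel(t_pred)
    [gap(i), closest(i)] = min(abs(t2 - t_pred(i)));
end

% Accept a pair only if the gap is within the tolerance
matched = gap < hours(11);
```

Here the third prediction has a closest SET2 event, but its gap is far larger than 11 hours, so it is flagged as unmatched.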
Mohamed Nedal on 17 Jul 2018
@jonas Yes. As you said, I need to loop through each storm in SET1, calculate the time until it reaches the location of SET2, and then find the storm in SET2 that is closest in time to that value.
  • Please find the attached files. Below is the code I have written so far, but it still doesn't work as it should: matchedSet contains only zeros.
tic
close all; clear; clc
%%Read Data
soho = xlsread('start.xlsx'); % initial storms data
shocks = xlsread('end.xlsx', 1); % final storms data
%%Start Time
yr1 = soho(4:end,1);
M1 = soho(4:end,2);
d1 = soho(4:end,3);
hh1 = soho(4:end,4);
mm1 = soho(4:end,5);
ss1 = soho(4:end,6);
%%End Time (start of Shocks)
yr2 = shocks(4:end,1);
M2 = shocks(4:end,2);
d2 = shocks(4:end,3);
hh2 = shocks(4:end,4);
mm2 = shocks(4:end,5);
ss2 = shocks(4:end,6);
%%Parameters
% CMEs
CPA = soho(4:end,7);
w = soho(4:end,8);
vl = soho(4:end,9);
vi = soho(4:end,10);
vf1 = soho(4:end,11);
v20Rs = soho(4:end,12);
a1 = soho(4:end,13);
mass = soho(4:end,14);
KE = soho(4:end,15);
MPA = soho(4:end,16);
% Shocks
vfinal = shocks(4:end,11);
T =shocks(4:end,13);
N = shocks(4:end,14);
%%Initial Values
AU = 149599999.99979659915; % Sun-Earth distance in km
d = 0.76 * AU; % cessation distance in km
%%Pre-Allocating Variables
a_calc = zeros(size(vl));
squareRoot = zeros(size(vl));
A = zeros(size(vl));
B = zeros(size(vl));
ts = zeros(size(vl));
t_hrs = zeros(size(vl)); % predicted transit time in hours
t_mn = zeros(size(vl)); % predicted transit time in minutes
%%G2001 Model
% calculations
for i = 1:length(vl)
a_calc(i) = power(-10,-3) * ((0.0054*vl(i)) - 2.2); % in km/s2
squareRoot(i) = sqrt(power(vl(i),2) + (2*a_calc(i)*d));
A(i) = (-vl(i) + squareRoot(i)) / a_calc(i);
B(i) = (AU - d) / squareRoot(i);
ts(i) = A(i) + B(i); % in seconds
t_mn(i) = ts(i) / 60; % in minutes
end
clear i;
%%Show the predicted travel time
% CME-ICME Matching
matchedSet = zeros(length(shocks), 33); % the final set of CME-ICME pairs
for n = 1:length(yr2)
if yr2(n) == yr1(n)
for m = 1:length(M2)
if M2(m) == M1(m)
for k = 1:length(d2)
if d2(k) == d1(k)
if (t_mn(k) >= (((hh2(k)*60)+mm2(k)+(ss2(k)/60))-11)) && ((((hh2(k)*60)+mm2(k)+(ss2(k)/60))+11) >= t_mn(k))
matchedSet(k, 1:16) = soho(k,:);
matchedSet(k, 18:32) = shocks(k,:);
end
end
end
end
end
end
end
clear n; clear m; clear k;
toc
I really appreciate that :)


Accepted Answer

jonas on 18 Jul 2018
Edited: jonas on 18 Jul 2018
I've made an attempt to fix your code and match the two time vectors. I converted your time vectors to datetime format and fixed the matching algorithm. The matching loops through the modelled time vector, which is based on the longer time vector (SET1), and finds the closest match in the shorter time vector (SET2). A match is stored only if the absolute difference is smaller than 11 hours.
The output, id, is a two-column matrix in which each row [id1 id2] holds a pair of matched indices: the row of SET1 and the corresponding row of SET2.
NOTE: id has more rows than SET2, which indicates that some elements of SET2 are matched more than once.
%% Read Data
soho = xlsread('start.xlsx'); % initial storms data
shocks = xlsread('end.xlsx', 1); % final storms data
%% Start Time
% EDITED: the six date/time columns are converted to datetime vectors
t1 = datetime(soho(4:end,1:6));
t2 = datetime(shocks(4:end,1:6));
% ORIGINAL CODE
%% Parameters
% CMEs
CPA = soho(4:end,7);
w = soho(4:end,8);
vl = soho(4:end,9);
vi = soho(4:end,10);
vf1 = soho(4:end,11);
v20Rs = soho(4:end,12);
a1 = soho(4:end,13);
mass = soho(4:end,14);
KE = soho(4:end,15);
MPA = soho(4:end,16);
% Shocks
vfinal = shocks(4:end,11);
T =shocks(4:end,13);
N = shocks(4:end,14);
%% Initial Values
AU = 149599999.99979659915; % Sun-Earth distance in km
d = 0.76 * AU; % cessation distance in km
%%Pre-Allocating Variables
a_calc = zeros(size(vl));
squareRoot = zeros(size(vl));
A = zeros(size(vl));
B = zeros(size(vl));
ts = zeros(size(vl));
t_hrs = zeros(size(vl)); % predicted transit time in hours
t_mn = zeros(size(vl)); % predicted transit time in minutes
%% G2001 Model
% calculations
for i = 1:length(vl)
    a_calc(i) = -1e-3 * ((0.0054*vl(i)) - 2.2); % in km/s^2
    squareRoot(i) = sqrt(vl(i)^2 + (2*a_calc(i)*d));
    A(i) = (-vl(i) + squareRoot(i)) / a_calc(i);
    B(i) = (AU - d) / squareRoot(i);
    ts(i) = A(i) + B(i); % in seconds
    t_mn(i) = ts(i) / 60; % in minutes
end
clear i;
% modelled arrival time = start time + predicted transit time
t_model = t1 + minutes(t_mn);
%% Show the predicted travel time
% CME-ICME Matching
% EDITED FROM HERE ON
id = nan(numel(t_model), 2);
for i = 1:numel(t_model)
    % closest SET2 time to the i-th modelled arrival time
    [MinDiff, ind] = min(abs(t2 - t_model(i)));
    if MinDiff < hours(11)
        id(i,1:2) = [i ind];
    else
        id(i,1:2) = [i NaN];
    end
end
id(isnan(id(:,2)),:) = []; % drop the rows without a match
plot(id(:,1), id(:,2), '.')
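For comparison, the same nearest-time search can be written without an explicit loop. This is only a sketch on made-up data: the values of `t_model` and `t2` below are hypothetical, `t0` is an arbitrary reference time introduced for the example, and the subtraction relies on implicit expansion of numeric arrays (R2016b or later).

```matlab
% Hypothetical stand-ins for t_model (predictions) and t2 (observations)
t0 = datetime(2018,7,1,0,0,0);
t_model = t0 + hours([60; 165; 370; 168]);
t2 = t0 + hours([62; 170; 500]);

% Work in numeric hours relative to a common reference time
hModel = hours(t_model - t0);   % N-by-1 double
h2 = hours(t2 - t0);            % M-by-1 double

% N-by-M matrix of absolute gaps, then the closest SET2 entry per row
gapHours = abs(hModel - h2.');
[minGap, ind] = min(gapHours, [], 2);

% Keep only the pairs closer than 11 hours
keep = minGap < 11;
id = [find(keep), ind(keep)];
```

With these toy values, rows 2 and 4 of the predictions both land on row 2 of `t2`, which is the kind of repeated match the NOTE above describes. The full gap matrix has N-by-M entries, so for very large data sets the loop may be the more memory-friendly choice.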
  8 Comments
Mohamed Nedal on 19 Jul 2018
Okay, I'll do that.
Thank you so much for your help.
I'll use this code in the analysis phase of my paper and I was thinking of acknowledging you if you don't mind.
If it's okay with you, please send me your information such as the first name, the last name, the organization, the specialty, and the email address.
jonas on 19 Jul 2018
That's really kind of you, I'm flattered. However, this is fairly standard stuff, so there is absolutely no need for me to take up space in your paper. I'm a final-year PhD student myself and, although I have never asked a question here, this forum has helped me a ton throughout my work. I'm just happy to give something back.
Btw, a final note: since this is for a scientific paper, don't forget that a pair is not included as a match if the error is 11 hours or more. Also double-check why the number of "matched" entries is larger than the total number of entries in SET2 (I think it's about 550 matches compared to 450 unique entries in SET2).
Good luck in your work and let me know if you need more help!
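A minimal sketch of that double-check, assuming `id` has the same two-column [SET1 row, SET2 row] layout as in the answer. The values below are made up for illustration and are not the real results.

```matlab
% Hypothetical id matrix: [SET1 row, SET2 row]
id = [1 1; 2 2; 4 2; 7 3; 9 3; 10 3];

% Count how often each SET2 row appears in the second column
[set2Rows, ~, grp] = unique(id(:,2));
counts = accumarray(grp, 1);

% SET2 rows that were matched by more than one SET1 row
repeated = set2Rows(counts > 1);

% Number of distinct SET2 rows that received at least one match
numUnique = numel(set2Rows);
```

If `numUnique` is smaller than the number of rows in `id`, several SET1 events share the same closest SET2 event, which would explain more matches than there are entries in SET2.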
