How do I implement a genetic algorithm for parameter fitting to an already existing ODE system>

So I am working on expanding an already working ODE system. I have my rate equations and fluxes for three additional differential equations. I would like to estimate 8 unique constants in the 3 rate equations which is all building upon a 20 equation ODE system. I also have vectorized how I would like my new 3 equations to behave for a specific input. How can I implement the genetic algorithm to focus on parameter fitting for this. I can't reveal specifically each portion of my code but i can put some filler code to demonstrate my issue. thanks in advance!
%raw data of how i want the model to behave
minutes = [10, 15, 20, 40, 60, 80]';
neweq1= [60,20,10,6,5,4.9 ]';
neweq2=[40,90,95,99,99.5]';
neweq3= [60,70,20,19.5,19]';
%fitfun= norm(fun(t)-expected value)
coeff= ga(fitfun,8);
function o= fitness (x,t,y)
y0=[200 0 .17 209.9 94.83 483.73 1.51 14.76 0 200 184.7 15.31 750 0 0 0 300 0 3000 0 100 0 0];
tspan=[0 100];
[t,y]=ode45 (@(t,y) fun, tspan, y0)
function dy=fun(t,y,input)
dy=zeros(23,0);
%20 rate equations
%my new 3 equations
%x is an array populated with the parameters i want to estimate and is a part of my new 3 equations
end
end

 Accepted Answer

Here is some prototype code I developed a while ago that you can adapt to your system —
t=[0.1
0.2
0.4
0.6
0.8
1
1.5
2
3
4
5
6];
c=[0.902 0.06997 0.02463 0.00218
0.8072 0.1353 0.0482 0.008192
0.6757 0.2123 0.0864 0.0289
0.5569 0.2789 0.1063 0.06233
0.4297 0.3292 0.1476 0.09756
0.3774 0.3457 0.1485 0.1255
0.2149 0.3486 0.1821 0.2526
0.141 0.3254 0.194 0.3401
0.04921 0.2445 0.1742 0.5277
0.0178 0.1728 0.1732 0.6323
0.006431 0.1091 0.1137 0.7702
0.002595 0.08301 0.08224 0.835];
ftns = @(theta) norm(c-kinetics(theta,t));
PopSz = 50;
Parms = 10;
optsAns = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10); % Options Structure For 'Answers' Problems
% opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
% optshf = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'HybridFcn',@fmincon, 'PlotFcn',@gaplotbestf, 'PlotInterval',1); % With 'HybridFcn'
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
Start Time: 2022-10-14 13:00:28.3645
[theta,fval,exitflag,output,population,scores] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],optsAns);
Optimization terminated: average change in the fitness value less than options.FunctionTolerance.
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
Stop Time: 2022-10-14 13:01:13.5609
GA_Time = etime(t1,t0)
GA_Time = 45.1964
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
Elapsed Time: 4.519638500000000E+01 00:00:45.196
fprintf('Fitness value at convergence = %.4f\nGenerations \t\t = %d\n\n',fval,output.generations)
Fitness value at convergence = 0.3752 Generations = 2919
fprintf(1,'\tRate Constants:\n')
Rate Constants:
for k1 = 1:length(theta)
fprintf(1, '\t\tTheta(%2d) = %8.5f\n', k1, theta(k1))
end
Theta( 1) = 0.00063 Theta( 2) = 1.18633 Theta( 3) = 3.51726 Theta( 4) = 11.19108 Theta( 5) = 1.47414 Theta( 6) = 0.63178 Theta( 7) = 1.10958 Theta( 8) = 0.02015 Theta( 9) = 0.04379 Theta(10) = 0.00001
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
figure
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, compose('C_%d',1:size(c,2)), 'Location','N')
function C=kinetics(theta,t)
c0 = theta(7:10);
% c0=[1;0;0;0];
[T,Cv]=ode45(@DifEq,t,c0);
%
function dC=DifEq(t,c)
dcdt=zeros(4,1);
dcdt(1)=-theta(1).*c(1)-theta(2).*c(1);
dcdt(2)= theta(1).*c(1)+theta(4).*c(3)-theta(3).*c(2)-theta(5).*c(2);
dcdt(3)= theta(2).*c(1)+theta(3).*c(2)-theta(4).*c(3)+theta(6).*c(4);
dcdt(4)= theta(5).*c(2)-theta(6).*c(4);
dC=dcdt;
end
C=Cv;
end
I estimate the initial conditions for the differential equations as the last ‘n’ values of the parameter vector, where ‘n’ are the number of differential equations in the systen, since it is easier to estimate them as parameters than to fix them as constants. It is also more robust.
I set ‘PopSz’ to 50 so it will run here in the time permitted. Increase it to at least 100 to get better initial results.
Use the ‘optshf’ options to use a hybrid function to fine-tune the estimated parameters using the fmincon function. (I use ‘optsAns’ here because some options I use offline are not permitted with the online Run feature.) Use ‘opts’ or ‘optshf’ instead offline.
.

6 Comments

ok! thank you so much. for my "c" values, do i need a set of values for every equation of my dy function or are the last three good enough? is there any way i can make this work so that i don't have to put a value for every equation in my 23 dy equations?
My pleasure!
for my "c" values, do i need a set of values for every equation of my dy function or are the last three good enough?
You have to have one ‘dy’ equation for every column of ‘c’. that you want to fit. You can have more ‘dy’ equations of you want, providing they all share the same st of parameters, however the ones used to fit the data have to match the columns in ‘c’.
is there any way i can make this work so that i don't have to put a value for every equation in my 23 dy equations?
I have no idea because I do not know what you are doing. As I mentioned, there must be one ‘dy’ equation for every column of ‘c’ that you want to fit. If you only want to fit three columns, you only need three differential equations. If all the equations in the system share the same parameters, and all those parameters can be estimated accurately by fitting three columns with three differential equations, then fitting three columns to estimate the parameters is good enough, providing they will accurately estimate all the necessary system parameters.
You will have to fit as many columns of ‘c’ as necessary to estimate all the necessary system parameters. However it would be best to fit all 23 of the colums with 23 separate ‘dy’ equations for the most optimal results.
.
Ok that sounds good!
How can I modify my fitness function and nested functions so I am only comparing the results of the three dy equations in which i am using. Also, would you happen to know what this error messages mean?
Arrays have incompatible sizes for this operation.
Error in hippo_wntnew_yap_ga>@(theta)norm(expected-fitness(time,theta)) (line 31)
ftns= @(theta) norm (expected-fitness(time,theta));
Error in createAnonymousFcn>@(x)fcn(x,FcnArgs{:}) (line 11)
fcn_handle = @(x) fcn(x,FcnArgs{:});
Error in makeState (line 58)
firstMemberScore = FitnessFcn(state.Population(initScoreProvided+1,:));
Error in galincon (line 24)
state = makeState(GenomeLength,FitnessFcn,Iterate,output.problemtype,options);
Error in ga (line 416)
[x,fval,exitFlag,output,population,scores] = galincon(FitnessFcn,nvars, ...
Error in hippo_wntnew_yap_ga (line 37)
[theta,fval,exitflag,output, population, scores]= ga(ftns, Parms, [], [], [],[],zeros(Parms,1),Inf(Parms,1),[],[],optsans);
Caused by:
Failure in initial user-supplied fitness function evaluation. GA cannot
continue.
thanks
To return only the equations you want to fit, change this:
C=Cv;
so if you only wnated to return the last three (in this example), it might be:
C=Cv(:,[2 3 4]);
since the ordinary differential equation integration functions return column-wise solutions.
The error messages mean that the dimensions of the data and the objective function output do not match. If you only want to fit three columns of data, and your data have more columns than that, or you are fitting more columns than you have data for them, that error would be thrown.
Another possibility is that even if the column dimensions match, the row dimensions may not match if the differential equation integrating function encountered a singularity and stopped before it had finished the integration. In that event, just start the ga process over. If it keeps occurring, it may be necessary to look closely at the initial population to be certain that the parameter ranges are realistic.
I cannot tell from the posted error messages exactly what the problem is.
Ah ok. Thank you!
I fixed my issue witht the code as I needed to interpolate the answer from the ODE so I could more easily compare it with my expected data and that got me an answer!

Sign in to comment.

More Answers (1)

Hi,
As per my understanding, you would like to implement a genetic algorithm to fit parameters.
Following are some resources that might help you with your issue.
  1. Genetic Algorithm - MATLAB & Simulink (mathworks.com)
  2. What Is a Genetic Algorithm? - Video - MATLAB (mathworks.com) - This contains how various approaches of genetic algorithm can be applied to given equations.
  3. Simple code for genetic algorithm - File Exchange - MATLAB Central (mathworks.com)- This contains a sample code which shows how genetic algorithm can be used to minimize or maximize equations.

Products

Release

R2022b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!