# Specifying a cost function to optimize a simbiology model response to plausible ranges

1 view (last 30 days)

Show older comments

Fearass Zieneddin
on 27 May 2022

Commented: Arthur Goldsipe
on 7 Jun 2022

I am trying to generate a virual population simulation for my model using a uniform distribution for some parameter set from my simbiology model. Then I specify some if statements to reduce the simdata to fall within biological bounds and perform some calculations such as: mean, median, max, and min. Next, I would like to speicfy the cost funciton to optimize my model response with respect to the chosen parameter set. How can I correctly specify the cost function in terms of the simulaiton data? The cost functions is given as: and ; where l and u are lower and upper bounds for plausible ranges of model states. I've provided the simdataReduced and timeVector to help with solving the problem. How can I specify Mi(p) in this case?

%% New script to perform analysis on the simdataReduced

stopTime = simdataReduced(1).Time(end);

timeVector = linspace(0,stopTime,700);

simdataResampled = resample(simdataReduced, timeVector);

% "Stack" the matrices of simulation results and average them

% Calculate the 25th and 75th percentiles

stackedData = cat(3, simdataResampled.Data);

meanData = mean(stackedData, 3);

maxData = max(stackedData, [],3);

minData = min(stackedData, [],3);

medianData= median(stackedData, 3);

prc75= prctile(stackedData, 75, 3);

prc25= prctile(stackedData, 25, 3);

%%--------------- function script---------------------------------------------------

function J = computeCost(X, y, theta)

%COMPUTECOST Compute cost for linear regression

% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the

% parameter for linear regression to fit the data points in X and y

% Initialize some useful values

m = length(y); % number of training examples (length of simdataResampled)

% You need to return the following variables correctly

J = 0;

% ====================== YOUR CODE HERE ======================

% Instructions: Compute the cost of a particular choice of theta

% You should set J to the cost.

prediction = X.*theta;

%sqrError = (prediction - y).^2;

c= 0.5*(lb+up)

sqrError= (((prediction- c).^2) - (ub-c).^2);

J = sum(max(sqrError), 0);

% =========================================================================

end

##### 3 Comments

Arthur Goldsipe
on 27 May 2022

### Accepted Answer

Arthur Goldsipe
on 1 Jun 2022

Fearass and I had a chat about this. I figured out that the cost function is essentially a modification of some of squares of residuals. The are reduced so that they are zero when the simulated values are within the plausible range.

With that information, I realized that represents the simulation results of a particular state at a particular time and for a particular parameter set. In code, this would be simdataResampled(p).Data(t,i) for timepoint t, state i, and parameter sample p. The code to calculate the cost function could take advantage of "implicit expansion", and using the variables defined in the example code it might be written something like this: sum(max(0,(stackedData-c).^2-(u-c).^2),'all')

##### 6 Comments

Arthur Goldsipe
on 7 Jun 2022

I can think of several ways to do this. The key thing is that you need to do something to keep track of the parameter mapping when you filter down the SimData. Let me first offer a simple change but then suggest a more "MATLAB-y" way to approach this that might help you long term.

The simplest thing you could do would be to store additional information when you add an element to simdataResampled. You create a variable that helps you map back to the sample number (for example, sampleIndex(j) = i;). You could also store the parameter values themselves in the UserData property of the SimData. For example, if you use sampleValues from my previous comment, you could write simdataResampled(j).UserData = sampleValues(i,:).

But a more MATLAB way to approach this is to vectorize as much of your code as possible, and to avoid "growing" vectors one element at a time the way you currently create simdataReduced. This will have the added benefit. Specifically, I would create a logical vector isPlausible to indicate whether each of your original samples is plausible. I would then use that vector (after your for loop) to create simdataReduced and optionally to identify the parameter values associated with the reduced sample set. Specifically:

%% 2. Set up plausible bounds for biological outcomes

% reduce the population size based on constraints

n = length(simdata);

isPlausible = false(n,1);

for i = 1:length(simdata) % Looping over total number of subjects (n in the present case)

[~,x]= max(simdata(i, 1).Data(:,7)); % concentrations

[~,z]= max(simdata(i, 1).Data(:,4)); %blood levels for some biological outcome

idx= find(simdata(i,1).Time >=15 & simdata(i,1).Time <=20)

%Define the index for recovery after 15-20 days

% recover at least %90 after 15-20 days we REJECT the parameter sets

isPlausible(i) = simdata(i,1).Data(end,1) >= 0.90 && simdata(i, 1).Data(x, 7)<= 15100 ...

&& simdata(i, 1).Data(x, 7)>=1500 ...

&& simdata(i,1).Data(z,4)>=232 && simdata(i,1).Data(z,4)<= 341 ...

&& simdata(i,1).Data(idx(1),1) >= 0.90;

end

sampleValues = generate(samples);

simdataReduced = simdata(isPlausible);

sampleValuesReduced = sampleValues(isPlausible,:);

### More Answers (0)

### Communities

More Answers in the SimBiology Community

### See Also

### Categories

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!