# Apply function row-wise on a tall array

Clemens Gersch on 8 Apr 2020
Commented: Edric Ellis on 8 Apr 2020
Hi,
I have a tall array T with some variables, let's say A, B and C. I also have a function f(a,b,c) that only works with scalars a, b and c and cannot be vectorized.
Now I would like to run this function on every row of the tall array yielding a column D.
I tried something like:
% Fill D with nans.
nanLike = @(in) nan(size(in),'like',in);
T.D = matlab.tall.transform(nanLike, T.A);
% Iterate through rows and overwrite nan with function value.
for i = 1:gather(size(T,1))
    a = T.A(i);
    b = T.B(i);
    c = T.C(i);
    [a,b,c] = gather(a,b,c);
    T.D(i) = f(a,b,c);
end
But that neither works nor can it be efficient.
Do you have any idea?
Here is my real function f, so you can see that it cannot be vectorized easily (of course, if you can help me vectorize it, that would also solve the problem):
function [opprice] = opprice(S, K, r, TtM, sigma, d, Type, ExerciseType, steps, inputcheck)
%OPPRICE calculates the price of a stock option given the specified
%parameters using BS for European and CRR for American options.
% S: stock price
% K: strike price
% r: risk-free rate
% TtM: time to Maturity in years
% sigma: volatility, (0, inf)
% d: dividend yield, [0, inf)
% Type: {'C', 'P'}
% ExerciseType: {'A', 'E'}
% steps: height of the binomial tree
% inputcheck: if input validity should be checked {true, false}

%% Input Validity Check.
if nargin <= 9
    inputcheck = true;
end
if inputcheck == true
    if nargin < 9
        error('Not enough inputs specified.');
    end
    if numel(S)~= 1 || ~isnumeric(S) || isnan(S) || S <= 0
        opprice = NaN;
        return;
    end
    if numel(K)~= 1 || ~isnumeric(K) || isnan(K) || K <= 0
        opprice = NaN;
        return;
    end
    if numel(r)~= 1 || ~isnumeric(r) || isnan(r)
        opprice = NaN;
        return;
    end
    if numel(TtM)~= 1 || ~isnumeric(TtM) || isnan(TtM) || TtM <= 0
        opprice = NaN;
        return;
    end
    if numel(sigma)~= 1 || ~isnumeric(sigma) || isnan(sigma) || sigma <= 0
        opprice = NaN;
        return;
    end
    if numel(d)~= 1 || ~isnumeric(d) || isnan(d) || d < 0
        opprice = NaN;
        return;
    end
    if ~any([strcmp(Type,'P'), strcmp(Type,'C')])
        opprice = NaN;
        return;
    end
    if ~any([strcmp(ExerciseType,'A'), strcmp(ExerciseType,'E')])
        opprice = NaN;
        return;
    end
    if strcmp(ExerciseType,'A')
        if steps < 1 || floor(steps) ~= steps
            opprice = NaN;
            return;
        end
    end
end

%% Do Pricing.
switch ExerciseType
    case {'American', 'A'}
        % Precalculate components.
        dT = TtM ./ steps;
        u = exp(sigma .* sqrt(dT));
        p0 = (u.*exp(-d .* dT) - exp(-r .* dT)) ./ (u.^2 - 1);
        p1 = exp(-r .* dT) - p0;
        % Stock prices at time T.
        S_T = S.*u.^(2*(0:steps)'-steps);
        % Option type related binomial calculations.
        switch Type
            case {'Call','C'}
                % Option prices at time T.
                p = max(S_T - K, 0);
                % Move from latest to earliest node.
                for j = steps:-1:1
                    % Binomial value.
                    p = p0 * p(2:j+1) + p1 * p(1:j);
                    % Exercise value.
                    exercise_value = S.*u.^(2*(0:(j-1))'-(j-1)) - K;
                    p = max(p, exercise_value);
                end
            case {'Put','P'}
                % Option prices at time T.
                p = max(K - S_T, 0);
                % Move from latest to earliest node.
                for j = steps:-1:1
                    % Binomial value.
                    p = p0 * p(2:j+1) + p1 * p(1:j);
                    % Exercise value.
                    exercise_value = K - S.*u.^(2*(0:(j-1))'-(j-1));
                    p = max(p, exercise_value);
                end
        end
        % Get price today from tree
        opprice = p(1);
    case {'European', 'E'}
        divisor = sigma * sqrt(TtM);
        d_1 = (log(S/K) + (r - d ...
            + sigma^2/2) * TtM) / divisor;
        d_2 = d_1 - divisor;
        switch Type
            case {'Call','C'}
                opprice = exp(-d*TtM) * S * normcdf(d_1) ...
                    - exp(-r*TtM) * K * normcdf(d_2);
            case {'Put','P'}
                opprice = exp(-r*TtM) * K * normcdf(-d_2) ...
                    - exp(-d*TtM) * S * normcdf(-d_1);
        end
end
end

Edric Ellis on 8 Apr 2020
You're on the right track with matlab.tall.transform, but you should call your function from inside the function you hand to it. The function invoked by matlab.tall.transform is given blocks of the underlying data as ordinary in-memory arrays or tables, so you can iterate over the rows of each block (relatively) efficiently. Something like this perhaps:
% Make a tall table
T = tall(table(rand(100,1), rand(100,1), rand(100,1), 'VariableNames', {'A', 'B', 'C'}));
% Use tall transform to create variable D.
T.D = matlab.tall.transform(@iCallRowWiseFcn, T, 'OutputsLike', {1});
% Display summary of T with new variable
summary(T);
function d = iCallRowWiseFcn(t)
    % Here, t is a block of our original table. So, we need to call
    % our function on each row.
    d = NaN(height(t), 1);
    [a,b,c] = deal(t.A, t.B, t.C); % extract variables for convenience
    for idx = 1:height(t)
        d(idx) = a(idx) + b(idx) + c(idx);
    end
end
##### 2 Comments (1 older comment not shown)
Edric Ellis on 8 Apr 2020
I'm not an expert, but I think it might make a difference in recent releases depending on the underlying datastore. You could of course simply pass in T(:,{'A', 'B', 'C'}) to make a temporary tall table with only those variables - it certainly shouldn't do any harm.
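For illustration, here is a minimal sketch of that idea: only the needed variables are passed to matlab.tall.transform, and arrayfun does the per-row iteration inside each block. The function f below is a hypothetical stand-in for any scalar-only function, and the variable names A, B and C follow the example above. This is a sketch under those assumptions, not a tuned implementation.

```matlab
% Hypothetical stand-in for a scalar-only function; substitute your own.
f = @(a, b, c) a + b * c;

% Small in-memory table wrapped in tall, as in the answer above.
T = tall(table(rand(100,1), rand(100,1), rand(100,1), ...
    'VariableNames', {'A', 'B', 'C'}));

% Per-block function: t is an in-memory table holding one block of rows.
% arrayfun calls f once per row, always with scalar inputs.
rowFcn = @(t) arrayfun(f, t.A, t.B, t.C);

% Pass a temporary tall table containing only the variables f needs.
T.D = matlab.tall.transform(rowFcn, T(:, {'A', 'B', 'C'}), 'OutputsLike', {1});

% Gathering is only sensible here because the example data is small.
result = gather(T);
```

arrayfun requires f to return a scalar for every row. If f can return something else, the explicit loop from the answer is the safer choice.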