matlab.tall.reduce behavior

Théo Ranson on 5 May 2022
Answered: Githin George on 2 Nov 2023
Hi,
I'm trying to use the matlab.tall.reduce function to analyse a large set of CSV files. As the 'reducefcn' I want to use 'mean', in order to average all the partial results from a custom function that filters the data. So I have commands like these:
dataStore = tabularTextDatastore('my_csv_path\*csv','ReadSize',N);
data = tall(dataStore);
signal1 = data.mySignal1;
signal2 = data.mySignal2;
signal = signal1 - signal2;
time = data.Time;
myOutput1 = matlab.tall.reduce(@myCustomFunction,@mean,signal,time);
myOutput1Gathered = gather(myOutput1);
% Where myCustomFunction is a function that processes the data block by block
function myOutput = myCustomFunction(signal,time)
% ...
end
I expect the result to be the same as:
myOutput2 = matlab.tall.transform(@myCustomFunction,signal,time); % Same custom function
myOutput2Gathered = gather(myOutput2);
myOutput2Gathered = mean(myOutput2Gathered);
However, the two results differ by a lot: the result from the second block is correct, while the result from the first is wrong.
Is there something that I missed in the behavior of the reduce algorithm?
Thanks,
Théo

Answers (1)

Githin George on 2 Nov 2023
Hello,
I understand you are getting different results when computing the mean of a transformed tall array with two different methods, “matlab.tall.reduce” and “matlab.tall.transform”.
The “matlab.tall.reduce” function applies the custom function to each block of the tall array, then repeatedly applies the reduction function to the concatenated partial results until a single result is obtained. With “mean” as the reduction function, this computes a mean of partial means, which equals the overall mean only when every partial result covers the same number of elements.
On the other hand, when you use “matlab.tall.transform” followed by gather and mean, the outputs of the custom function for all blocks are concatenated, gathered into memory, and averaged once, so every element carries the same weight.
You can use the following code example to obtain the same mean with “matlab.tall.reduce”.
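To make the difference concrete, here is a minimal sketch with two made-up blocks of unequal size. The variable names and values are hypothetical and only illustrate why a mean of block means can differ from the overall mean, and why carrying [sum, count] pairs avoids the problem.

```matlab
% Hypothetical data: two blocks of unequal size (3 elements and 1 element).
block1 = [1; 2; 3];
block2 = 10;

% Mean of the per-block means, as a @mean reduction would combine them.
meanOfMeans = mean([mean(block1); mean(block2)]);   % (2 + 10)/2 = 6

% Mean over all elements, as mean(gather(...)) computes it.
overallMean = mean([block1; block2]);               % 16/4 = 4

% Carrying [sum, count] per block and adding the rows recovers the overall mean.
partials = [sum(block1), numel(block1); sum(block2), numel(block2)];
totals = sum(partials, 1);                          % [16 4]
meanFromTotals = totals(1) / totals(2);             % 4
```

The two approaches agree only when all blocks contribute the same number of elements, which is generally not guaranteed for tall arrays or for a function that filters out samples.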
% Change the reduce function to sum along dim 1. Once the partial outputs
% are reduced, we obtain the total sum and the total number of elements
% in a 1x2 row vector
reducefcn = @(x) sum(x,1);
myOutput1 = matlab.tall.reduce(@sumcount, reducefcn, signal, time);
myOutput1Gathered = gather(myOutput1);
% Calculating Mean
myOutput1Mean = myOutput1Gathered(1)/myOutput1Gathered(2);
% Create partial output containing the sum and number of elements
function out = sumcount(signal,time)
filtered = myCustomFunction(signal,time);
% Count the elements returned by the custom function, so that the result
% matches mean(gather(...)) even if the function drops samples
out = [sum(filtered(:)) numel(filtered)];
end
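As a quick sanity check of the sum-and-count idea, here is a small self-contained sketch on in-memory data converted to a tall array. The data `x` and the anonymous functions are made up for illustration and stand in for your filtered signal and for `sumcount`; this is not your actual pipeline.

```matlab
% Hypothetical in-memory data standing in for the filtered signal.
x = (1:1000)';
tx = tall(x);

% Per-block partial result: [sum, count] as a 1-by-2 row.
fcn = @(b) [sum(b, 1), size(b, 1)];
% Reduction: add the stacked partial rows column-wise.
reducefcn = @(p) sum(p, 1);

totals = gather(matlab.tall.reduce(fcn, reducefcn, tx));
meanFromReduce = totals(1) / totals(2);

% Compare with the in-memory mean, allowing for rounding differences.
assert(abs(meanFromReduce - mean(x)) < 1e-9);
```

Because addition gives the same total however the blocks are grouped, the reduced [sum, count] row does not depend on how the tall array is partitioned.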
I hope this helps.

Release

R2021a