How to parallelize MATLAB function on the CPU

Question

SH on 10 Mar 2023


https://nl.mathworks.com/matlabcentral/answers/1926220-how-to-parallelize-matlab-function-on-the-cpu

Answered: Raymond Norris on 10 Mar 2023

Dataset.mat

I have a MATLAB function and a dataset, and I would like to run the function in parallel on the CPU. Specifically, I would like to compare the performance of running the function with and without parallelization.

I have also attached my dataset (Dataset.mat).

How can I do that in MATLAB?

load('Dataset.mat')
clusternumber = 6;
function [Score] = Scorefunction(Dataset,clusternumber)
dataset_len = size(Dataset,1);
Score = zeros(1,clusternumber);
for j=1:clusternumber
    [cluster_assignments,centroids] = kmeans(Dataset,j);
    distance_within=zeros(dataset_len,1);
    distance_between=Inf(dataset_len,j);
    for i=1:dataset_len
        for jj=1:j
            boo=cluster_assignments==cluster_assignments(i);
            Xsamecluster=Dataset(boo,:);
            if size(Xsamecluster,1)>1
                distance_within(i)=sum(sum((Dataset(i,:)-Xsamecluster).^2,2))/(size(Xsamecluster,1)-1);
            end
            boo1= cluster_assignments~=cluster_assignments(i);
            Xdifferentcluster=Dataset(boo1 & cluster_assignments ==jj,:);
            if ~isempty(Xdifferentcluster)
                distance_between(i,jj)=mean(sum((Dataset(i,:)-Xdifferentcluster).^2,2));
            end
        end
    end
   
    minavgDBetween = min(distance_between, [], 2);
    silh = (minavgDBetween - distance_within) ./ max(distance_within,minavgDBetween);
    Score(j) =mean(silh);
end
end

1 Comment

SH on 10 Mar 2023

@Image Analyst @Walter Roberson Can you please help?


Answer 1

Arka on 10 Mar 2023


https://nl.mathworks.com/matlabcentral/answers/1926220-how-to-parallelize-matlab-function-on-the-cpu#answer_1189945


Hi,

You can benefit from parallel computing by using the functions provided in the Parallel Computing Toolbox.

I have modified the code to include the required functions:

load('Dataset.mat')
clusternumber = 6;
% Create an independent job on the default cluster
c = parcluster;
job = createJob(c);
% Create a single task in the job that runs Scorefunction and returns one output
task = createTask(job, @Scorefunction, 1, {Dataset, clusternumber});
submit(job); % run the job
wait(job); % wait for the job to finish
results = fetchOutputs(job); % get the results (a cell array)
disp(results);
delete(job); % delete the job
function [Score] = Scorefunction(Dataset,clusternumber)
dataset_len = size(Dataset,1);
Score = zeros(1,clusternumber);
for j=1:clusternumber
    [cluster_assignments,centroids] = kmeans(Dataset,j);
    distance_within=zeros(dataset_len,1);
    distance_between=Inf(dataset_len,j);
    for i=1:dataset_len
        for jj=1:j
            boo=cluster_assignments==cluster_assignments(i);
            Xsamecluster=Dataset(boo,:);
            if size(Xsamecluster,1)>1
                distance_within(i)=sum(sum((Dataset(i,:)-Xsamecluster).^2,2))/(size(Xsamecluster,1)-1);
            end
            boo1= cluster_assignments~=cluster_assignments(i);
            Xdifferentcluster=Dataset(boo1 & cluster_assignments ==jj,:);
            if ~isempty(Xdifferentcluster)
                distance_between(i,jj)=mean(sum((Dataset(i,:)-Xdifferentcluster).^2,2));
            end
        end
    end
   
    minavgDBetween = min(distance_between, [], 2);
    silh = (minavgDBetween - distance_within) ./ max(distance_within,minavgDBetween);
    Score(j) =mean(silh);
end
end

To learn more about Parallel Computing Toolbox and its functions, please check out the MathWorks documentation links below:

https://www.mathworks.com/products/parallel-computing.html

https://www.mathworks.com/help/parallel-computing/createtask.html

https://www.mathworks.com/help/parallel-computing/createjob.html
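
One caveat with a job that holds a single task: that task runs on one worker, so the loop inside Scorefunction is not split up. A minimal sketch of one way to spread the work is to create one task per candidate cluster count. The names here are assumptions for illustration only: scoreForK is a hypothetical helper that is not part of the thread, it swaps the hand-written distance loops for the built-in silhouette function (whose default metric is squared Euclidean distance), and synthetic data stands in for Dataset.mat so the snippet is self-contained.

```matlab
% Sketch: one task per candidate cluster count, so tasks can run on separate workers.
% Assumptions: synthetic data stands in for Dataset.mat; scoreForK is a
% hypothetical helper (not from the thread) built on the silhouette function.
rng(0);                                        % reproducible synthetic data
Dataset = [randn(60,2); randn(60,2)+5; randn(60,2)-5];
clusternumber = 6;

c = parcluster;                                % default cluster profile
job = createJob(c);

% Build one argument list per task: {Dataset,1}, {Dataset,2}, ...
taskArgs = cell(1, clusternumber);
for k = 1:clusternumber
    taskArgs{k} = {Dataset, k};
end
% A cell array of cell arrays creates one task per inner cell array
createTask(job, @scoreForK, 1, taskArgs);

submit(job);                                   % run the job
wait(job);                                     % wait for all tasks to finish
results = fetchOutputs(job);                   % one row per task
Score = [results{:}];                          % collect into a row vector
disp(Score);
delete(job);

function s = scoreForK(Dataset, k)
% Mean silhouette value of a k-means clustering with k clusters.
idx = kmeans(Dataset, k);
s = mean(silhouette(Dataset, idx));
end
```

If the workers cannot find scoreForK, saving it in its own file on the MATLAB path should help. Job submission and worker startup add overhead, so for a small dataset this may still be slower than the plain loop.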

4 Comments

Arka on 10 Mar 2023

Edited: Arka on 10 Mar 2023


You can use parfor as well, but my implementation took ~10 seconds less than parfor. I assume creating the pool of workers takes some extra time.

I have altered the code to add the stopwatch timer like so:

tic
task = createTask(job, @Scorefunction, 1, {Dataset, clusternumber});
submit(job); % run the job
wait(job); % wait for the job to end
results = fetchOutputs(job); % get the results
toc

This will print the time elapsed between the start of the timer and end of the timer. You can do the same thing for the sequential processing function call, and then compare the times.

To learn more about the stopwatch timer, please check out the MathWorks documentation links below:

https://www.mathworks.com/help/matlab/ref/tic.html

SH on 10 Mar 2023

@Arka But the parallel version takes more time than the original serial one, which does not seem right.


Answer 2

Raymond Norris on 10 Mar 2023


https://nl.mathworks.com/matlabcentral/answers/1926220-how-to-parallelize-matlab-function-on-the-cpu#answer_1190505


As @Arka mentioned, consider using the Parallel Computing Toolbox. In your case, I would suggest looking at rewriting your outer for-loop as a parfor.

To begin with, let's write a helper function:

function t = run_me
load('Dataset.mat')
clusternumber = 6;
t0 = tic;
Scorefunction(Dataset,clusternumber);
t = toc(t0);

I'll run it once as is, then modify Scorefunction by changing

for j=1:clusternumber

to

parfor j=1:clusternumber

Before running it a second time, I started a parallel pool of 6 workers (the pool size should evenly divide the number of parfor iterations, which is 6 in this case).

Here's my complete run, showing a speedup of about 3.2x (the actual tic/toc should be directly around the parfor-loop, but I didn't want to modify your code more than I needed to).

>> t = run_me
t =
15.5585
>> 
>> parpool('local',6);
Starting parallel pool (parpool) using the 'local' profile ...
    Connected to the parallel pool (number of workers: 6).
>> tparallel = run_me
tparallel =
4.8057
>> 
>> t/tparallel
ans =
3.2375
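
As noted above, tic/toc would ideally sit directly around the loop being compared. A minimal sketch of that comparison follows. It is not the code used for the run above: scoreForK is a hypothetical helper that is not part of the thread and uses the built-in silhouette function in place of the hand-written distance loops, and synthetic data stands in for Dataset.mat so the snippet is self-contained.

```matlab
% Sketch: time a serial for-loop and a parfor-loop over the cluster counts,
% with tic/toc directly around each loop.
% Assumptions: synthetic data stands in for Dataset.mat; scoreForK is a
% hypothetical helper (not from the thread) built on the silhouette function.
rng(0);                                        % reproducible synthetic data
Dataset = [randn(60,2); randn(60,2)+5; randn(60,2)-5];
clusternumber = 6;

% Serial baseline
scoreSerial = zeros(1, clusternumber);
t0 = tic;
for k = 1:clusternumber
    scoreSerial(k) = scoreForK(Dataset, k);
end
tSerial = toc(t0);

% Start the pool before timing so pool startup is not counted
if isempty(gcp('nocreate'))
    parpool;
end

% Parallel version: iterations are independent, so parfor can distribute them
scoreParallel = zeros(1, clusternumber);
t1 = tic;
parfor k = 1:clusternumber
    scoreParallel(k) = scoreForK(Dataset, k);
end
tParallel = toc(t1);

fprintf('serial: %.3f s, parfor: %.3f s, ratio: %.2f\n', ...
    tSerial, tParallel, tSerial/tParallel);

function s = scoreForK(Dataset, k)
% Mean silhouette value of a k-means clustering with k clusters.
idx = kmeans(Dataset, k);
s = mean(silhouette(Dataset, idx));
end
```

Because kmeans starts from random initial centroids, the serial and parallel scores need not match exactly. On a small dataset like this synthetic one, the parfor version may show little or no speedup, since communication overhead can outweigh the work per iteration.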
