Why is the Matches variable too large?

I was playing around with some code that simulates the birthday paradox. Here's the piece of code:
ready = false;
while ~ ready
% User inputs and defining variables
birthday_repeats = input('select the number of birthday repeats from 10-500:');
if birthday_repeats > 500 || birthday_repeats < 10 || isempty(birthday_repeats) || round(birthday_repeats) ~= birthday_repeats|| isnan(birthday_repeats)
disp('error')
continue
end
sample_size = input('Select the sample size from 2-365: ');
if sample_size > 365 || sample_size < 2|| isempty(sample_size) || round(sample_size) ~= sample_size || isnan(sample_size)
disp('error')
continue
end
matches = zeros(1,sample_size);
N = 1000;
days = randi(1,365);
% simulation run
for k = 1:birthday_repeats % desired number of trials
matches = 0;
for j = 1:days
if test(days)
matches(sample_size) = matches(sample_size) + 1;
end
end
match_tally = matches(sample_size)/N;
end
I'm not sure why the variable 'matches(sample_size)' is too large after each iteration of the loop. Here's the 'test' function:
function out = test(data)
out = false;
n = length(data);
for k = 1:n
for i = k+1:n
if data(k) == data(i)
out = true;
break
end
end
end
end

 Accepted Answer

Because you initialise the variable matches twice.
You first set it as a vector, and then on every loop iteration you reset it to a scalar.
matches = zeros(1,sample_size);
N = 1000;
days = randi(1,365);
for k = 1:birthday_repeats
matches = 0; %-------------------- DELETE THIS LINE --------------------%
for j = 1:days
if test(days)
matches(sample_size) = matches(sample_size) + 1;
end
end
match_tally = matches(sample_size)/N;
end

5 Comments

If I just set matches to 0 will the loop work?
No, if you set it to zero you create matches as a 1x1 array (a scalar).
If sample_size is, for example, 10, by doing:
matches(sample_size)
You are trying to access the 10th element of a 1x1 vector.
I don't know how the rest of the code continues, so I can't really suggest the right path to take, but if you need the variable matches to be a number and not a vector, do the following:
matches = 0;
N = 1000;
days = randi(1,365);
for k = 1:birthday_repeats
for j = 1:days
if test(days)
matches = matches + 1;
end
end
match_tally = matches/N;
end
Note that I initialise
matches = 0;
outside the loop, not inside as you did before, because otherwise it would reset to zero at every iteration.
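To make the scalar-versus-vector point concrete, here is a minimal sketch. The value sample_size = 10 is only an assumed example, and the try/catch is there just so the script keeps running after the failed index.

```matlab
% Illustrative sketch: sample_size = 10 is an assumed example value.
sample_size = 10;

% Case 1: matches is a scalar, so element 10 does not exist yet.
matches = 0;
try
    matches(sample_size) = matches(sample_size) + 1; % reading matches(10) fails
catch err
    disp(err.message) % reports an out-of-range index
end

% Case 2: matches is preallocated as a vector, so the same line works.
matches = zeros(1, sample_size);
matches(sample_size) = matches(sample_size) + 1;
disp(matches(sample_size))
```

In the first case the right-hand side reads an element that does not exist, so the assignment never happens. In the second case the element exists and is incremented normally.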
Here's the actual code:
% User inputs and defining variables
disp('Welcome to the Birtday Paradox')
ready = false;
while ~ ready
% User inputs and defining variables
birthday_repeats = input('select the number of birthday repeats from 10-500:');
if birthday_repeats > 500 || birthday_repeats < 10 || isempty(birthday_repeats) || round(birthday_repeats) ~= birthday_repeats|| isnan(birthday_repeats)
disp('error')
continue
end
sample_size = input('Select the sample size from 2-365: ');
if sample_size > 365 || sample_size < 2|| isempty(sample_size) || round(sample_size) ~= sample_size || isnan(sample_size)
disp('error')
continue
end
days = randi(365,sample_size);
tally = zeros(1,sample_size);
match = 0;
% simulation run
for k = 2:sample_size
birthdays = randi(365,1,k);
for j = 1:birthday_repeats% desired number of trials
if test(days)
match(sample_size) = match(sample_size) + 1;
end
end
match_tally = match/N;
end
% plotting graphs
domain = 2:sample_size;
figure(1)
plot(domain,match_tally(2:sample_size))
xlabel('Number of people Chosen')
title('Birthday Paradox simulation')
ylabel('Probability')
break
end
I tried your change and the match variable is still too large.
You are running a lot of unnecessary for loops in which each iteration just overwrites the result of the previous one.
For this reason, it is not quite clear what you want to plot in the end.
Is this what you are looking for:
%% Input:
% disp('Welcome to the Birtday Paradox')
% birthday_repeats = input('select the number of birthday repeats from 10-500:');
% if birthday_repeats > 500 || birthday_repeats < 10 || isempty(birthday_repeats) || round(birthday_repeats) ~= birthday_repeats|| isnan(birthday_repeats)
% error('Wrong input');
% end
% sample_size = input('Select the sample size from 2-365: ');
% if sample_size > 365 || sample_size < 2|| isempty(sample_size) || round(sample_size) ~= sample_size || isnan(sample_size)
% error('Wrong input');
% end
% Uncomment previous and comment this:
birthday_repeats = 500;
sample_size = 23;
%% Process:
domain = 2 : sample_size;
match_tally = zeros(1,sample_size-1);
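for j = 2 : sample_size % j = number of people in the group
matches = zeros(1,birthday_repeats); % one match flag per trial
for k = 1 : birthday_repeats
birthdays = randi(365,1,j); % j random birthdays, repeats allowed
% Edges 1:366 give one bin per day; with 1:365 the last bin would merge days 364 and 365
matches(k) = any(histcounts(birthdays,1:366) > 1);
end
match_tally(j-1) = sum(matches)/birthday_repeats; % fraction of trials with a shared birthday
end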
expected_probability = 1 - prod(((365-sample_size+1):365)./365);
%% Plot:
figure;
hold on;
plot(domain,match_tally)
yline(expected_probability)
text(2,.97*expected_probability,['Expected probability for ' num2str(sample_size) ' people: ' num2str(expected_probability)])
grid on;
xlabel('Number of people chosen')
ylabel('Probability')
title('Birthday Paradox simulation')
subtitle([num2str(birthday_repeats) ' evaluations for each sample'])
Yeah, I just didn't understand how to get my loop working, or the formula for finding the probability of matches. Thank you so much G.
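For reference, the probability formula used in the code above can be checked on its own. The sketch below assumes 365 equally likely birthdays and a group of n = 23 people (the same value hard-coded in the answer). The chance that nobody shares a birthday is the product of (365 - i)/365 for i = 0 to n-1, and the chance of at least one match is one minus that.

```matlab
% Exact birthday-paradox probability, assuming 365 equally likely days.
n = 23; % assumed group size, matching sample_size in the answer above

% Probability that all n birthdays are different:
% (365/365) * (364/365) * ... * ((365-n+1)/365)
p_no_match = prod((365 - (0:n-1)) / 365);

% Probability that at least two people share a birthday.
p_match = 1 - p_no_match;

fprintf('%.4f\n', p_match) % prints 0.5073
```

This is the same quantity as expected_probability in the answer, just written with the factors in descending order. The simulated curve should approach it as birthday_repeats grows.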

More Answers (1)

days = randi(1,365);
That does not request a random number between 1 and 365. That requests 365 x 365 random numbers in the range 1 to 1.
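A quick way to see the difference between the argument orders is to check the sizes directly. This is a small sketch; the group size of 23 is just an assumed example.

```matlab
% randi(imax, n) returns an n-by-n matrix of integers from 1 to imax.
A = randi(1, 365);      % 365-by-365 matrix, every entry is 1
b = randi(365);         % a single random integer between 1 and 365
c = randi(365, 1, 23);  % 1-by-23 row of random integers between 1 and 365

disp(size(A))           % 365 365
disp(all(A(:) == 1))    % 1 (true), since the only possible value is 1
disp(size(c))           % 1 23
```

The first argument is the upper limit of the random integers and the remaining arguments are the output dimensions.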

2 Comments

Will this:
days = randi(365,1,365)
days = 1×365
10 154 189 303 94 360 307 61 191 200 359 232 126 224 303 144 130 192 122 130 171 283 142 151 33 306 9 362 210 111
generate a 1x365 matrix containing random numbers?
Yes, you can see from the summary output
days = 1x365
that it has generated a 1 x 365 array.
Note that many elements in days will be repeated. If you need a random permutation of the day numbers instead, use randperm.
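The contrast between the two functions can be sketched as follows. The group size n = 23 is an assumed example value.

```matlab
n = 23; % assumed group size

with_repeats = randi(365, 1, n);  % sampling with replacement, duplicates possible
no_repeats   = randperm(365, n);  % n distinct values drawn from 1..365

% randperm never repeats a value, so this is always true.
disp(numel(unique(no_repeats)) == n)

% For randi this may be true or false, depending on the draw.
disp(numel(unique(with_repeats)) == n)
```

For a birthday-paradox simulation the repeats are the whole point, so randi is the appropriate choice there. Birthdays drawn with randperm could never produce a match.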

Release

R2023a
