Tricky randomization issue - help needed!

Dear Matlab folks,
I am struggling with the following issue:
I created 80 parings of numbers. These pairings should stay the same but the order of the pairings should be randomized.
So, I have a variable containing these pairings: sequence = [pairing_1 pairing_2 pairing_3 and so forth].
How can I randomize the order of these pairings? For example, it would look like this then: randomized_sequence = [pairing_2 pairing_3 pairing_1 .....]
And now for the tricky part:
The numbers of the pairings reach from 1 to 40. Every number from 1 to 40 is combined with one random even number between 1 and 40 and one random uneven number between 1 and 40. For 1, I could have the pairings 1 & 18 and 1 & 9 for example. But 1 must not be paired with 1! No number can be matched with itself. Furthermore, 1 must not be paired with 2 and 2 must not be paired with 1, the sequence doesn't matter. 3 must not be paired with 4, 5 must not be paired with 6 and so forth until 39 must not be paired with 40 and every time, the sequence doesn't matter. So, 39 and 40 is not possible as well as 40 and 39.
The numbers that must not be paired are:
1 2 | 3 4 | 5 6 | 7 8 | 9 10 | 11 12 | 13 14 | 15 16 | 17 18 | 19 20 | 21 22 | 23 24 | 25 26 | 27 28 | 29 30 | 31 32 | 33 34 | 35 36 | 37 38 | 39 40
So 2 and 3 can be paired, for example!
Now the problem is, if I have these 2 pairings 14 16 and 16 9 and after the shuffling of the pairings 14 16 lands before 16 9 then I have a 16 16 pairing as well which must not happen. Such unwanted pairings can happen a lot through the shuffling. For example, the 14 of 14 16 must not be preceeded by 14 or 13 and the 16 of 14 16 must not be followed by 16 or 15.
I was thinking about maybe coding something like: "If 1 is followed/preceded by 1 or 2 then shuffle again. If 2 is followed/preceded by 1 or 2 then shuffle again. If 3 is followed/preceded by 3 or 4 then shuffle again...." and so forth until "If 40 is followed/preceded by 40 or 39 then shuffle again."
What do you think, is there a way to do it like that? Or even a way easier solution, I did not think about?
I very much appreciate any help, this is bothering me since quite some time now.
Thanks in advance,
Rupert

7 Comments

Rik
Rik on 25 Aug 2020
Edited: Rik on 25 Aug 2020
It may take a long time to generate valid shiuffle if you need to exclude many of them.
How did you store the pairs? I hope you used an array instead of numbered variables.
Can you give a small example of input and valid output? You can make the example smaller (e.g. 6 instead of 40).
Thanks for your answer, Rik!
I stored the pairs in 80 variables (firstpairing= 1 5; secondpairing = 1 16; firstpairing_1 = 2 15; secondpairing_1 = 2 8 and so forth...) and all these 80 variables in another variable called sequence. I don't really know the difference between an array and a variable, honestly.
I created the pairings like this (reduced the numbers to 6):
congruent=[1 3 5];
incongruent=[2 4 6];
% The first 4 stimuli: Stimulus 1 is paired with one con and one incon stimulus
p1_1=congruent(2:length(congruent));
p1_2=incongruent(2:length(incongruent));
pick1=randi(length(p1_1));
pick2=randi(length(p1_2));
nowpick1=p1_1(pick1);
nowpick2=p1_2(pick2);
firstpair=(1);
firstpair(2)=nowpick1; % Con pairing
secondpair=(1);
secondpair(2)=nowpick2; % Incon pairing
Then I have two pairings like 1 5 and 1 4. Repeated that until 6.
1 3 | 1 6 | 2 5 | 2 6 | 3 1 | 3 6 | 4 1 | 4 2 | 5 1 | 5 4 | 6 1 | 6 2
Would be a possible outcome.
Now this sequence has to be shuffled while the pairs stay the same and in the shuffled result 1 is not followed/preceded by 1 or 2, 2 is not followed preceded by 1 or 2, 3 is not followed/preceded by 3 or 4, 4 is not followed preceded by 3 or 4, 5 is not followed/preceded by 5 or 6 and 6 is not followed/preceded by 5 or 6.
Of course, with only 6 numbers there are not too many valid sequences left, but there are way more possibilities with 40 numbers.
OK, let's first create an array with all pairs:
N=6;
all_ids=(1:N)';
odd=all_ids(1:2:end);
even=all_ids(2:2:end);
pair_with_even=even(randi(end,N,1));
pair_with_odd=odd(randi(end,N,1));
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
Now you want to shuffle this. What exactly do you mean by preceding? Should the list be made linear, like you wrote in text?
Also note that it is generally a bad idea to use variables with numbers in them if you plan to do something with these numbers. Try to use arrays (i.e. matrices) instead of a long list of variables. And if you don't know the difference (after seeing my code): I would urge you to do the Matlab OnRamp tutorial. It is free and will teach you the Matlab basics.
(also: avoid the length function; you should use numel or size instead, because length can cause unexpected behavior if your input is not a vector)
Thank you for your help!
The list should be linear (I guess) since the goal is that Matlab, specifically the Psychtoolbox, will present the stimuli (pictures) in the computed order. I have 40 stimuli which I named 1 to 40.
So, if I have a list that fulfills the requirements of the pseudorandomization (like 1 5 22 24 35 19 6 4 and so forth), I want to tell Matlab to present the pictures in exactly that order.
By "preceding" I mean, that the numbers in that list must not be preceded by certain numbers. For example, 20 must not be preceded by 20 or 19.
So by preceding you mean preceding, well, that cleared up your definition.
Let's linearize the array I generated:
reshape(list_of_pairs.',1,[])
%returns this list if rng(1) is put at the start of the code:
%1 1 1 4 2 3 2 6 3 3 3 2 4 3 4 2 5 3 5 2 6 5 6 2
Does it matter your horizontal bars are gone?
Can you try to write a function that will check if a sequence is valid? At least in words?
No, the bars were just there to make it easier to see which numbers are meant as pairs!
My idea was to let Matlab check If 1 is followed/preceded by 1 or 2. If so, then shuffle again. If 2 is followed/preceded by 1 or 2 then shuffle again. If 3 is followed/preceded by 3 or 4 then shuffle again....and so forth until if 40 is followed/preceded by 40 or 39 then shuffle again.
This way Matlab would shuffle the sequence until it's valid. Of course, if there are only 6 numbers this might take a while. But with 40 numbers, there are many possibilities. What do you think of it?
It does take a while: 20370 tries if you shuffle the list, 884 tries if you don't.

Sign in to comment.

 Accepted Answer

Rik
Rik on 26 Aug 2020
Edited: Rik on 10 Sep 2020
Because we now have a proper definition of the problem we can start writing the code for the solution:
rng(1)%set a random seed so the results are repeatable
N=6;%define the problem size here
all_ids=(1:N)';
odd=1:2:N;even=2:2:N;
% %these are the steps in the while condition:
% step1=sequence+mod(sequence,2);%'round' to an even number: [1 2 3] --> [2 2 4]
% step2=diff(step1);%find delta between each successive number
% step3=step2==0;%if the delta was zero, that means either [1 2] (or [2 1]) or [2 2]
% sequence_is_invalid=any(step3);%any invalid pair makes the whole sequence invalid
n=0;
sequence=[1 1];%enter loop by using an invalid sequence
while any(diff(sequence+mod(sequence,2))==0)
n=n+1;
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
sequence=reshape(list_of_pairs.',1,[]);
end
clc,fprintf('generating the sequence below took %d tries\n',n),fprintf('%d ',sequence);fprintf('\n')
You can also shuffle the order of list_of_pairs:
shuffled=list_of_pairs(randperm(end),:);

27 Comments

Thank you for your answer, I can check it out tonight or tomorrow morning at the latest.
I really appreciate your help!
Hi Rik!
After some time, I'm finally able to focus on the code again.
I'm still figuring out, what exactly your code does and what's the rationale behind it. Anyway, I noticed that it gets an error when I'm trying to run it in Matlab:
"Error using horzcat
Dimensions of arrays being concatenated are not consistent.
Error in RIKs_idea (line 16)
list_of_pairs=[all_ids pair_with_odd; all_ids pair_with_even];"
I don't get why there's an error with the command "horzcat" although this command is not even in the code. Furthermore, all arrays have the same number of values. I thought, maybe it doesn't work because "all_ids" is vertically arranged while the other arrays are horizontally arranged. But transposing "all_ids" doesn't work. Then the code just doesn't do anything and Matlab is busy until I stop it.
What do you think?
I guess, you do not get this error?
Best,
R
This code actually is calling horzcat: with the square brackets you either use horzcat or vertcat when concatenating two arrays.
My code does some assumptions about the shapes. Check if all shapes are the same at the beginning of the loop. It is also worth checking what the probablity of a correct sequence is. You may also want to use the debugger to execute the code step by step.
Thanks for the clarification concerning "horzcat"!
What do you mean by "shapes"?
And when using the debugger I just get the same information, namely that the error is in line 16.
What I meant by shape is the sizes of the arrays involved. I tested the code I posted, so if you changed anything you need to make sure that the shapes stay the same (column vectors stay column vectors, ect). So what changes did you make?
Got it. I didn't change anything!
When I try to run the code the way you posted it, I get the error message posted above. Since the error seems to be in line 16 (horzcat), and "all_ids" is vertically arranged while the other arrays are horizontally arranged, I tried to transpose "all_ids". This doesn't change the size of the arrays, but it doesn't work anyway. Then the code just doesn't do anything and Matlab is busy until I stop it.
Actually, when I transpose "all_ids", the single steps work. Maybe the while loop is just too demanding for my laptop and that's why it seems to not do anything? When I run it, I have the pause sign instead of the play sign and it stays that why until I lose my patience (few minutes).
Apparently I didn't post the final version. You need to transpose the pair_with variables, so they're column vectors. Then you get the results I mentioned in this comment: 884 tries.
I'll edit the answer.
Ah, okay! Now it works.
That solution is way easier to get the first sequence! Thank you for sharing your expertise.
Now the problem is the following:
No number can be paired with itself and additionally, the numbers that must not be paired are:
1 2 | 3 4 | 5 6 | (1 must not be paired with 2, 3 not with 4, 5 not with 6)
So, invalid sequences are 1 1, 1 2, 2 1, 2 2, 3 3, 3 4, 4 3, 4 4, 5 5, 5 6, 6 5, 6 6
So 2 and 3 can be paired, for example!
Now, if I have a sequence like that: 1 5 1 6 2 3 2 6 3 1 3 6 4 5 4 2 5 1 5 4 6 3 6 4
It still has to be shuffled, because there's an obvious pattern in it. Every number stands for a picture that is presented and the participants should not learn any pattern.
I want to keep the pairings ( 1 5, 1 6, 2 3, 2 6 and so forth) and shuffle them.
But: It is possible that 1 5 lands before 5 4. And then I have a 5 following/preceding a 5 and that must not happen. So it doesn't matter that 1 5 and 5 4 are separate pairings. In the final sequence no number can follow or precede itself and 1 cannot follow/precede 2, 3 cannot follow/precede 4 and 5 cannot follow/precede 6.
I was thinking about maybe coding something like: "If 1 is followed/preceded by 1 or 2 then shuffle again. If 2 is followed/preceded by 1 or 2 then shuffle again. If 3 is followed/preceded by 3 or 4 then shuffle again...." and so forth.
What do you think about it?
The final sequence will also be with 40 numbers instead of 6, so I think then it's way easier to find a valid sequence.
Just shuffle the pairs when they are still in a matrix format:
list_of_pairs=list_of_pairs(randperm(end),:);%shuffle pairs
Setting N to 40 results in 3075 tries, which I think is reasonable, seeing as each try takes a few miliseconds (just over 17ms on my system).
If my answer solved your question, feel free to mark it as accepted answer. If not, feel free to post a comment with your remaining issues.
Yes, now it works! Thank you so much! :)
Only one point left: I need several sequences for several participants. I would put the while loop into a for n=1:x sequence.
But how can I do that without having my sequence overwritten with every execution of the while loop?
The same way you would in any loop: store the output of each iteration in an array. If you want to make sure every run is indepentent you could even move the code that creates a sequence to a separate function:
function sequence=create_sequence(N)
%write single line explanation here
%
%write explanation of input and expected outputs here
all_ids=(1:N)';
odd=1:2:N;even=2:2:N;
% %these are the steps in the while condition:
% step1=sequence+mod(sequence,2);%'round' to an even number: [1 2 3] --> [2 2 4]
% step2=diff(step1);%find delta between each successive number
% step3=step2==0;%if the delta was zero, that means either [1 2] (or [2 1]) or [2 2]
% sequence_is_invalid=any(step3);%any invalid pair makes the whole sequence invalid
sequence=[1 1];%enter loop by using an invalid sequence
while any(diff(sequence+mod(sequence,2))==0)
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
list_of_pairs=list_of_pairs(randperm(end),:);%shuffle pairs
sequence=reshape(list_of_pairs.',1,[]);
end
end
Works perfectly fine!
Thank you very much that you took your time to help me Rik! Very kind of you. :)
I have another issue, Rik. I would very much appreciate your expertise!
I modified your code a bit, since the circumstances have changed.
Instead of 40 stimuli named 1 to 40, I renamed them 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34, [...], 101, 102, 103 and 104.
Every number has to be paired with an even and an odd number, this stays the same. But now every number must not be followed or preceded by any number to which the difference is <=3.
I adjusted your code the following way:
rng(1)%set a random seed so the results are repeatable
N=40;%define the problem size here
all_ids=[11:14, 21:24, 31:34, 41:44, 51:54, 61:64, 71:74, 81:84, 91:94, 101:104]';
odd=all_ids(1:2:end)';even=all_ids(2:2:end)';
n=0;
sequence=[1 1];
while any(diff(sequence)<=3)
n=n+1;
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
list_of_pairs=list_of_pairs(randperm(end),:);%shuffle pairs
sequence=reshape(list_of_pairs.',1,[]);
end
clc,fprintf('generating the sequence below took %d tries\n',n),fprintf('%d ',sequence);fprintf('\n')
But whenever I run the code, nothing happens. The play button becomes the pause button and that's it. Nothing appears in the command window and no errors are found. I can only pause and quit the debugging mode and start it anew but that doesn't change anything.
Do you have any idea, what's going on here?
You forgot to format your question. I'm in a good mood, so I'm not going to ignore your comment, but next time I might.
You modified the stopping condition. Did you make sure that the code ever exits the loop? Your description doesn't sound like it. Side-note: you don't need the rng call in your production code, it is only helpful when reproducing an issue, like on this forum.
What you need to do is go through the steps again. I have posted the steps I took to get to this condition in my previous comment. You either need to addapt these steps, or you need to add a new condition. If you make your condition a scalar you can use || or && to modify the current condition.
%example (breaks the loop after 10000 iterations):
n=0;
%while this_is_true AND this_is_true
while any(diff(sequence+mod(sequence,2))==0) && n<10000
n=n+1;
%rest of the previous code here
end
I'm sorry, won't forget to format from now on!
You're right, it seems like the code does not exit the loop. But why does this code exit the loop after 140 tries :
N=40;
all_ids=(1:N)';
odd=1:2:N;even=2:2:N;
n=0;
sequence=[1 1];
while any(diff(sequence+mod(sequence,2))==0)
n=n+1;
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
sequence=reshape(list_of_pairs.',1,[]);
end
And this one doesn't at all:
N=40;
all_ids=[11:14, 21:24, 31:34, 41:44, 51:54, 61:64, 71:74, 81:84, 91:94, 101:104]';
odd=all_ids(1:2:end)';even=all_ids(2:2:end)';
n=0;
sequence=[1 1];
while any(diff(sequence)<=3)
n=n+1;
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
list_of_pairs=list_of_pairs(randperm(end),:);
sequence=reshape(list_of_pairs.',1,[]);
end
I assume that in your code, after 140 tries a sequence has been found in which the difference between 2 consecutive numbers is never 0. And then the while loop breaks.
Why doesn't the while loop in my code break after a sequence has been found in which the difference between 2 consecutive numbers is never smaller than or equal to 3?
You assume wrong. There is a difference between diff(sequence+mod(sequence,2)) and diff(sequence).
I would suggest you use the format I showed about how to add a condition.
So, it's wrong that the while loop shuffles the sequence 140 times (in the case of N=40) until a sequence has been computed in which the difference between 2 consecutive numbers is never 0?
I think, in my code the "any" command is not useful anymore. I thought that means "any value of the following variable". But help any tells me otherwise.
My idea was that if any value of diff(sequence) is 3, 2, 1, 0, -1, -2 or -3 (can I write these numbers in one line?) then shuffle again.
Could you give me a hint about how to code that properly?
I tried you suggestion with adding the condition n<10000.
while any(diff(sequence)<=3) && n<10000
n=n+1;
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
list_of_pairs=list_of_pairs(randperm(end),:);%shuffle pairs
sequence=reshape(list_of_pairs.',1,[]);
end
But since any(diff(sequence)<=3) does not do what I thought it does, the loop just stops after 10000 iterations a gives me an invalid sequence.
No, you added an example second condition to your not working code. You need to add a condition to the working code. If you want to avoid values between -3 and 3 you can compare the absolute value to 3. Since sequence is an array, you need to find a way to condense that to a scalar logical; e.g. with any.
while any(diff(sequence+mod(sequence,2))==0) || any(abs(diff(sequence))<=3)
Thanks for the hint with the absolute value!
So abs(diff(sequence))<=3 gives me a 1 whenever the difference in the sequence between to numbers is <= 3 and otherwise a 0. And any(abs(diff(sequence))<=3) is true when there is a 1 in my sequence.
The while loop keeps going as long as any(abs(diff(sequence))<=3) is true(1), right?
while any(abs(diff(sequence))<=3)
n=n+1;
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
list_of_pairs=list_of_pairs(randperm(end),:);%shuffle pairs
sequence=reshape(list_of_pairs.',1,[]);
end
But why doesn't it stop as soon as any(abs(diff(sequence))<=3) is not true anymore? The code shuffles the numbers many times, it can't be that hard to get a sequence in which the difference between two consecutive numbers is never <=3.
Your condition is surprisingly strict. The code below will result in a histogram. The median is 16 and the SD is about 3.75. Using this online calculator you can see that the probability to find fewer than 2 invalid position is 0.0001. That means you need to run 5k iterations before even getting 2 invalid positions.
rng(1)%set a random seed so the results are repeatable
N=40;%define the problem size here
all_ids=[11:14, 21:24, 31:34, 41:44, 51:54, 61:64, 71:74, 81:84, 91:94, 101:104]';
odd=all_ids(1:2:end)';even=all_ids(2:2:end)';
n=0;
sequence=[1 1];
invalid_bins=zeros(1,N);
while any(abs(diff(sequence))<=3) && n<10000
n=n+1;
pair_with_even=even(randi(end,N,1)).';
pair_with_odd=odd(randi(end,N,1)).';
list_of_pairs=[all_ids pair_with_odd;all_ids pair_with_even];
list_of_pairs([1:2:end 2:2:end],:)=list_of_pairs;
list_of_pairs=list_of_pairs(randperm(end),:);%shuffle pairs
sequence=reshape(list_of_pairs.',1,[]);
k=sum(abs(diff(sequence))<=3);
invalid_bins(k)=invalid_bins(k)+1;
end
figure(1),clf(1),bar(invalid_bins)
I really didn't assume that. Thanks for clearing that up! Without you, it would have taken me a long time to realize, that a valid sequence is just virtually impossible.
I guess I have to think about another approach.
There are factorial(40) possible sequences. I don't know how many would be valid, but I suspect many. The problem is that many more are invalid. I don't know a good way to approximate how many would be valid, but shuffling until you get what you want is only a feasible solution if the relative probability is not too skewed. If you need help to devise a method to construct a sequence you should open a new question instead of posting here, as the solutions are unlikely to be related. Feel free to post a link here though (and make sure to post a link there to this post).
Just got a valid sequence. :)
It took exactly 33 023 370 tries.
Just read your latest answer, thank you for your input!
I opened a new question!
Here's the link:
https://de.mathworks.com/matlabcentral/answers/598429-tricky-randomization-issue-who-can-help

Sign in to comment.

More Answers (1)

Mohammad Sami
Mohammad Sami on 25 Aug 2020
Edited: Rik on 25 Aug 2020
One option can be to generate a sequence longer then you need and then eliminate values until the sequence is valid. Thereafter you can take desired length of sequence. If it's shorter generate again and concatenate it.
DesiredLength = 200;
N = 2 * DesiredLength;
Seq = randi([1 40],N,1);
while(any(abs(diff(Seq))<=1))
I = [false;abs(diff(Seq)) <= 1];
Seq(I) = [];
if(length(Seq)<DesiredLength)
Seq = [Seq;randi([1 40],N,1)];
end
end
Final = Seq(1:DesiredLength);

5 Comments

I assume you were on mobile, so I took the liberty of editing your code.
Thank you very much for your answer, Mohammad!
Unfortunately, I am a Matlab novice and I have no idea what the particular steps from the 4th line onwards in your code are actually doing.
Would you care to comment that in a short way next to the lines? That would be very helpful!
Based on your description I assumed that the sequence is valid if the absolute difference between two consecutive number is > 1. So the code is essentially checking if the sequence is valid in the while loop. If there is any value where the absolute difference is less then or equal to 1 it is removed. The if statement just checks if the sequence is longer then the length you need after removing invalid values. In that case it generates and append more data. The while loop will terminate once there is no invalid pairing left.
That is a very good idea, but the sequence can also be valid if the absolute difference between two numbers is 1. For example 2 can follow or precede 3, 4 can follow or precede 5, 6 can follow or precede 7 and so forth.
The numbers that must not be paired are:
1 2 | 3 4 | 5 6 | 7 8 | 9 10 | 11 12 | 13 14 | 15 16 | 17 18 | 19 20 | 21 22 | 23 24 | 25 26 | 27 28 | 29 30 | 31 32 | 33 34 | 35 36 | 37 38 | 39 40

Sign in to comment.

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!