Find the index of the first element of each run of repeated values in an array (full of repeated values)

The array has millions of elements and consists of small segments, each made up of repeated values. Part of the array is shown here as an example:
B=[0 0 0 0 0 0 0 0.6775 ...
0.6775 0.6775 0.6775 0.6775 0.6775 0.6775 0.6775 0.7575 ...
0.7575 0.7575 0.7575 0.7575 0.7575 0.7575 0.7575 0.7575 ...
0.7575 0.7575 0.7575 0.7575 0.7575 0.6775 0.6775 0.6775 ...
0.6775 0.6775 0.6775 0.6200 0.6275 0.6275 0.6275 0.6275 ...
0.6275 0.5900 0.5975 0.5975 0.5975 0.5975 0.5975 0.5875 ...
0.5875 0.5875 0.5875 0.5875 0.5875 0.5875 0.5875 0.5875 ...
0.5875 0.5875 0.5875 0.5875 0.5875 0.5875 0.5875 0.5875 ...
0.5875 0.5575 0.5575 0.5575 0.5575 0.5575 0.5175 0.5175 ...
0.5175 0.5175 0.5175 0.4875 0.4875 0.4875 0.4875 0.4875 ...
0.4975 0.4975 0.4975 0.4975 0.4975 0.5075 0.5075 0.5075 ...
0.5075 0.5075];
My problem is to find the beginning index of each segment (with >=2 identical elements inside) over the whole long array. A segment that has only one value should be ignored. In this example, the initial solution should be as below:
indexB=[1 8 16 30 37 43 48 66 71 76 81 86];
These are the starts of the segments of 0s, 0.6775s, 0.7575s, 0.6775s again (B(36)=0.62 is ignored), 0.6275s (B(42)=0.59 is ignored), 0.5975s, 0.5875s, 0.5575s, 0.5175s, 0.4875s, 0.4975s, and 0.5075s.
Problem 1 is solved with the help of Alex. Thx.
Problem 2 is about finding the missing indices where two segments with the same repeated value sit next to each other, so that they look like one long segment. It's important to find these missing indices for further data processing.
Now we need to add the missing index in the middle of each long segment, as below:
NewIndex = diff([indexB,numel(B)+1]); % NewIndex=[7 8 14 7 6 5 18 5 5 5 5 5];
ind=find(NewIndex>=mean(NewIndex*2)-2); %ind = [3 7]
% interpolate new indices into indexB
newIndexB = [indexB(1:ind(1)), indexB(ind(1))+round((indexB(ind(1)+1)-indexB(ind(1)))/2), indexB(ind(1)+1:ind(2)), indexB(ind(2))+round((indexB(ind(2)+1)-indexB(ind(2)))/2), indexB(ind(2)+1:end)];
The final solution should be
newIndexB=[1 8 16 16+7 30 37 43 48 48+9 66 71 76 81 86];
I have used a for loop to find both the initial solution and the final solution. Is there another way, without a for loop, that would save time?
  3 Comments
Limei Cheng on 12 Aug 2020
Hi KSSV,
Thx for your suggestion. However, I think my problem was misunderstood. Indexing these repeated values is tricky (see the steps between the initial solution and the final solution). Also, I'm not looking for the unique elements; I'm looking for indices that capture the repeated pattern inside the array.


Accepted Answer

J. Alex Lee on 12 Aug 2020
It's too early for me to attempt to understand your 2nd problem, but for the first problem try using "diff":
indexB0 = [1,1+find(diff(B))]
This will not filter out your "single" sequences, but you can remove those by applying diff to indexB0 and keeping only the starts whose segment length is at least 2.
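A minimal sketch of that filtering step, assuming B is a row vector. The names `runLen` and `indexB`, and the shortened test array, are introduced here for illustration and are not part of the original answer:

```matlab
% Shortened example: a segment of 0s, a lone 0.62, then a segment of 0.6275s.
B = [0 0 0 0.62 0.6275 0.6275];

% Start index of every segment, including single-element ones.
indexB0 = [1, 1 + find(diff(B))];          % [1 4 5]

% Length of each segment: distance to the next start (or to the end of B).
runLen = diff([indexB0, numel(B) + 1]);    % [3 1 2]

% Keep only the segments that contain at least two equal elements.
indexB = indexB0(runLen >= 2);             % [1 5]
```

Both steps are vectorised, so no for loop is needed. If the values in B come from floating-point computation rather than being exact copies of each other, it may be safer to compare against a tolerance, for example `abs(diff(B)) > tol` with a `tol` chosen for the data.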
  1 Comment
Limei Cheng on 12 Aug 2020
Thank you, Alex. The 2nd problem is about finding the missing indices where two segments with the same repeated value sit next to each other. It's important to find these missing indices for further data processing.
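One possible loop-free version of that second step, shown only as a sketch. It assumes each overly long segment should be split exactly once at its rounded midpoint, and it reuses the threshold from the question as is. The names `segLen`, `midIdx`, and `nB` are introduced here for illustration:

```matlab
indexB = [1 8 16 30 37 43 48 66 71 76 81 86];  % initial solution from the question
nB     = 90;                                   % numel(B) for the example array

% Length of each segment, measured from one start to the next.
segLen = diff([indexB, nB + 1]);               % [7 8 14 7 6 5 18 5 5 5 5 5]

% Segments considered "too long", using the threshold from the question.
ind = find(segLen >= mean(segLen*2) - 2);      % [3 7]

% Rounded midpoint of every long segment, computed for all of them at once.
midIdx = indexB(ind) + round(segLen(ind)/2);   % [23 57]

% Merge the new starts with the existing ones; sort restores the order.
newIndexB = sort([indexB, midIdx]);
```

Because the midpoints are taken from `segLen` rather than from `indexB(ind+1)`, the same lines also work for any number of long segments, including the case where the last segment is one of them.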


Release

R2020a