Deleting specific repeating sequence from a column vector

Hello,
I have raw data from a transmission. It contains a header with several values before transmitting the actual data. I would like to remove the header information from the data. The complication is that the packet size is unknown and can change between transmissions, so the code has to be automatically adaptable.
An example of a transmission:
Data = [
0 %Packet number
0
64 % Magic word
0
10000
84
26114
-26
12697 % Data start
14701
-10130
-11550
...
1 % Packet number
0
64 % Magic word
0
10000
540
27693
-165
12697 % Data start
14699
-10132
-11552
...
2 % Packet number
0
64 % Magic word
0
10000
625
-32028
-191
12697 % Data start
14699
-10132
];
and so on. Each packet begins with a header containing the packet number, then 0, followed by 64 and several other values, before the actual payload (the last 4 values of each packet in the example).
I want to remove all the header sequence to get to the raw data, which I then can reshape into 4 columns containing each of the data values (i.e. Nx4 matrix)
I can easily identify the magic word (64) and remove it from the array:
idx = find(DataBIN == 64); % This results in idx of 468278x1
DataBIN(idx) = [];
But how can I do the same for a specific sequence? I can't figure out how to make this work in an elegant and fast way, especially since the only constants are several zeroes and 64, the rest of the values can change.
The slow way is:
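Data = [];
for i = 1:length(idx)-1
    % Payload of packet i: starts 6 elements after this magic word
    % and ends 3 elements before the next one
    Data = [Data ; DataRAW(idx(i)+6:idx(i+1)-3)];
end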
But this takes ages.
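Much of the cost of a loop like this typically comes from growing Data on every iteration, which forces repeated reallocation and copying. Below is a minimal sketch that keeps the loop but collects the pieces in a preallocated cell array and concatenates only once. DataRAW here is a made-up two-packet example (8 header values plus 4 payload values per packet), and `pieces` and `stops` are names introduced for illustration only. Unlike the loop above, it also picks up the payload of the final packet.

```matlab
% Made-up input: two packets of 8 header values + 4 payload values each
DataRAW = [0;0;64;0;10000;84;26114;-26; 11;12;13;14; ...
           1;0;64;0;10000;540;27693;-165; 21;22;23;24];
idx = find(DataRAW == 64);                  % anchors on the magic word

n = numel(idx);
pieces = cell(n, 1);                        % preallocate one cell per packet
stops = [idx(2:end) - 3; numel(DataRAW)];   % last payload index of each packet
for i = 1:n
    pieces{i} = DataRAW(idx(i)+6 : stops(i));
end
Data = vertcat(pieces{:});                  % single concatenation at the end
```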
A faster way, I believe, would be to find all indexes of 64 as an anchor, then remove everything from idx-2 to idx+5. I guess I could somehow expand the idx array with a for loop, where each index is expanded to the range idx-2 to idx+5, then the next idx, and so on. So for each existing index I should add 7 more around it (8 per header in total), resulting in a 3746224x1 idx array (which I'm also not yet sure how to do). This would be similar to the DataBIN(idxNEW) = [] operation and much quicker, but it's back to the for loop, which will take a long time for very long transmissions.
Might there be a more elegant solution?

Accepted Answer

Stephen23 on 26 Feb 2024
Edited: Stephen23 on 26 Feb 2024
"Might there be a more elegant solution?"
You don't need a loop; just remove them all at once, e.g.:
Data = [
0 %Packet number
0
64 % Magic word
0
10000
84
26114
-26
12697 % Data start
14701
-10130
-11550
...
1 % Packet number
0
64 % Magic word
0
10000
540
27693
-165
12697 % Data start
14699
-10132
-11552
...
2 % Packet number
0
64 % Magic word
0
10000
625
-32028
-191
12697 % Data start
14699
-10132
99999
];
idx = find(Data==64);
idy = find(Data==0)-1;
idz = intersect(idx,idy);
Data((-2:5)+idz) = [];
Data = reshape(Data,[],numel(idz)).'
Data = 3×4
       12697       14701      -10130      -11550
       12697       14699      -10132      -11552
       12697       14699      -10132       99999
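As a small sketch of what `(-2:5)+idz` does (assuming R2016b or later, where implicit expansion is available): adding a row vector of offsets to a column vector of anchor positions yields one row of indices per header. The anchor positions below are made up for illustration, and `offsets` and `hdr` are hypothetical names.

```matlab
% Hypothetical anchor positions of the magic word in a short vector
idz = [3; 15; 27];      % column vector, as find returns for a column input
offsets = -2:5;         % header spans 2 elements before to 5 after the anchor
hdr = offsets + idz;    % implicit expansion: numel(idz)-by-numel(offsets)
disp(size(hdr))         % 3 8
disp(hdr(1,:))          % 1 2 3 4 5 6 7 8
```

Indexing with the whole matrix, as in Data(hdr) = [], then deletes every listed element in one operation.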
  1 Comment
Egor Losev on 27 Feb 2024
Edited: Egor Losev on 27 Feb 2024
Thank you very much! Exactly what I was looking for. I just couldn't figure out an efficient way to extend the indices around the number 64.
I think idy and idz are redundant though. As long as I can get an anchor on every 64 (idx), and then do
idz = (-2:5)+idx;
Data(idz) = [];
it will create an Nx8 matrix with rows of 8 indices, one row for each instance of "64" (a fixed-length sequence) in the data, and then remove these sequences.
Awesome, thank you!
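One case where an extra check on the anchor may still matter is a payload value that happens to equal 64. The sketch below uses made-up data in which the first payload contains 64, and confirms each candidate by requiring a zero on both sides of the magic word. That stricter test is an assumption based on the header layout in the question, not part of the accepted answer, and it remains a heuristic; `k` and `isHdr` are names introduced for illustration.

```matlab
% Two made-up packets: 8 header values followed by 4 payload values each.
% The first payload deliberately contains the value 64.
Data = [0;0;64;0;10000;84;26114;-26; 11;64;13;14; ...
        1;0;64;0;10000;540;27693;-165; 21;22;23;24];

k = find(Data == 64);                       % candidate anchors: 3, 10, 15
k = k(k > 2 & k + 5 <= numel(Data));        % header must fit inside the vector
isHdr = Data(k-1) == 0 & Data(k+1) == 0;    % magic word sits between two zeros
idz = k(isHdr);                             % confirmed anchors: 3, 15

Data((-2:5) + idz) = [];                    % drop all headers at once
Data = reshape(Data, 4, []).';              % one packet payload per row
disp(Data)
```

With the anchor taken from every 64 instead, the candidate at position 10 would also be expanded to a range of 8 indices and part of the payload would be deleted along with the headers.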

Release

R2023a
