Removal of duplicate data

Hello everyone!
I'm working on a project where the goal is to receive data from a flying object. The data is received by different stations along the flight path, and it needs to be plotted in MATLAB. The data is stored in a .txt file, and I have managed to import it into MATLAB. The received data contains information such as the voltage, the RSSI and the time between packets, among other things.
The problem with multiple stations is that the same data sent by the flying object is collected at more than one station. My issue is how to remove this duplicate data so that it is not plotted twice.
I have attached a picture of the .txt file. Column 2 holds the station numbers and column 3 holds the packet numbers. The red line shows the separation between stations, and the duplicate data is between the blue lines.
As I'm not very skilled in MATLAB, could someone point me in the right direction on how to remove this data?
Also, would it be possible to make a plot so that the graph starts with the first station, continues with the next, and so on?
Sincerely, Jørgen
(English is not my first language, so please excuse me)
  2 Comments
Walter Roberson on 24 Mar 2021
I see you have packet number 54 detected by station 170 and station 187. How do you decide which of the two to keep? For example do you want to keep the one with highest RSSI? Is there a time stamp and you want to keep the one with the earliest time stamp?
Jørgen Sørebø Myhre on 24 Mar 2021
That's correct, packets 37-54 detected by station 187 are duplicates. I would basically like to remove these duplicates.
So it would look something like this:
Station 170 receives packets 1-54, station 187 receives packets 55-108, and so on for all the following stations.
The times are split into LSB and MSB, in columns 4 and 5 respectively. These are just the times between each packet sent.


Accepted Answer

William Rose on 24 Mar 2021
I assume your data is in an array called data() with 9 columns and many rows, and column 2 is the station number.
Sort data by the columns with priority 1,3,4,5,6,7,8,9,2. Because column 2 is the last sort key, rows that differ only in column 2 will be adjacent after the sort. Then compare each row to the next row. If they are identical except for column 2, delete the next row. Repeat until the next row does not match, then proceed to the next row, and so on.
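The sort-and-compare steps above could be sketched roughly as follows. This is only an illustration on made-up numbers, not the poster's actual file: the toy matrix and the variable names `sorted`, `others`, `isDup` and `deduped` are assumptions. Instead of deleting rows one at a time in a loop, it flags every row that matches the row before it on all columns except column 2.

```matlab
% Toy data (made up): 9 columns, column 2 = station, column 3 = packet number.
data = [ ...
    1 170 53 10 0 5 6 7 8;
    1 170 54 11 0 5 6 7 8;
    1 187 54 11 0 5 6 7 8;   % same as the row above except for the station
    1 187 55 12 0 5 6 7 8];

% Sort with column 2 as the lowest-priority key, so rows that differ
% only in the station number end up next to each other.
sorted = sortrows(data, [1 3 4 5 6 7 8 9 2]);

% Compare each row with the previous one on every column except column 2.
others = sorted(:, [1 3:9]);
isDup = [false; all(others(2:end, :) == others(1:end-1, :), 2)];

% Drop the flagged rows; within each group the lowest station number survives.
deduped = sorted(~isDup, :);
```

Note that this only catches rows that are identical in every column other than column 2. If a value such as RSSI differs between stations for the same packet, those rows will not be treated as duplicates.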
  4 Comments
Jørgen Sørebø Myhre on 25 Mar 2021
First of all, thanks to both of you for taking your time to help me.
I have my data saved in "Raw". Should I replace "YourData" with that?
I also get an error with "Subset". Might that be something I need to replace as well?
Walter Roberson on 25 Mar 2021
Subset = Raw(:, 2);
[~, IA] = unique(Subset, 'rows', 'stable');
selected_entries = Raw(IA,:);
This assumes that column 2 by itself is enough to determine uniqueness.
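For illustration, here is a small sketch of the same `unique` approach on made-up data, followed by a station-by-station plot like the one asked about in the question. Several things here are assumptions: the sketch keys on column 3, since the question describes column 2 as the station number and column 3 as the packet number; column 6 is a hypothetical measured value; and the numbers themselves are invented.

```matlab
% Made-up data for illustration: column 2 = station, column 3 = packet,
% column 6 = a measured value (hypothetical, e.g. voltage).
Raw = [ ...
    0 170 53 0 0 3.9;
    0 170 54 0 0 3.8;
    0 187 53 0 0 3.9;   % duplicate of packet 53
    0 187 54 0 0 3.8;   % duplicate of packet 54
    0 187 55 0 0 3.7];

% Keep the first occurrence of each packet number, in the original order.
[~, IA] = unique(Raw(:, 3), 'stable');
selected_entries = Raw(IA, :);

% Plot value against packet number, one line per station,
% with stations taken in the order they first appear.
stations = unique(selected_entries(:, 2), 'stable');
figure; hold on
for k = 1:numel(stations)
    rows = selected_entries(:, 2) == stations(k);
    plot(selected_entries(rows, 3), selected_entries(rows, 6), '.-');
end
hold off
xlabel('Packet number'); ylabel('Value');
legend(compose('Station %d', stations));
```

With 'stable', the copy that is kept is whichever one appears first in Raw, so the row order of the file decides which station "wins" for a duplicated packet.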

Release

R2019a
