How to traverse table row based on sets
2 views (last 30 days)
Show older comments
Let's say that I have a table in an excel spreadsheet that looks like this (based off of real GPS data, but for the sake of this question, all of these are just random numbers ):
|Set number| PRN | X-coordinate | y-coordinate | z-coordinate ...
| 1 | 17 | 24715 | -1470517 | 223695 |
| 1 | 17 | 24715 | -1470517 | 223695 |
...
| 2 | 17 | 24715 | -1470517 | 223695 |
| 2 | 17 | 24715 | -1470517 | 223695 |
So I know that I can read in the data using the build in MATLAB command:
gpsdata = readtable('spreadsheet.xlsx')
I also know I can loop through the table by row using the command:
for row = 1:height(gpsdata)
However, more specifically, I want to loop through all the table based off the set numbers in each row. One snag is that the number of set numbers varies. Sometimes there will be 8 rows of set 1, other times there will be 9 rows of set 2, etc.
My specific aim is to do specific operations on each of the sets. Looking at the table, normally there is a different PRN for each satellite. That means each measurement is a different satellite. The set is just a way to acknowledge that it is measuring the same thing. For example, the set 1s will have about 8 different satellites all measuring whatever is at set 1. However, each of the satellites will have a different measurement. My goal is to take all the measurements and use iterative least squares to estimate the exact measurement. However, that involves getting measurements from each of the different satellites within each row of the set.
Is there a way that I could loop through based off each of the set numbers to make sure that the ones that I am using are the same for each set? That is, what is the best way to operate on all the rows with set data 1s, then operate on all the data with set 2s, etc.?
While I could do each set manually, I feel as though there is a better, more programmatic way to approach this.
Thanks for your time and help in advanced! I really appreciate it!
15 Comments
jonas
on 8 Oct 2018
I'll write a more elaborate response in a bit. For now, just two comments. First:
grpstats(gpsdata(2:108,1:17), gpsdata.x1)
Wrong syntax. If you insert a table, then the grouping variable arg should be a string describing the variable name. If you want the column gpsdata.x1 as your grouping variable, then the correct syntax is:
grpstats(gpsdata(:,1:17), 'x1')
This should work as long as each of the 17 first columns contain numerical data only, and one variable is named x1.
Second, I assumed you had imported the table correctly but realized now that this is not the case. Follow Guillaumes advice. If it still does not work, then use the 'range' argument of readtable to skip the first line of the sheet. As for now, all your variables are imported as strings, as the first row in your table is a string. Note that it doesn't matter that you write:
gpsdata(2:108,1:17)
each column is still interpreted as strings.
Answers (2)
Guillaume
on 8 Oct 2018
It's unclear what problem you are actually facing with Jonas' examples. As Jonas says, you can easily use any of the grouping functions that work with tables. Not having the stats toolbox, I use varfun and rowfun. Here is a rowfun example to works with your attached excel file. I use it to calculate the mean 3D distance (from origin) of the satellites in each group. Adapt as required:
%function to use with rowfun. Defined as anonymous function here. Could be an m file instead
%takes 3 column vector inputs, x,y and z coordinates. Return a scalar
func = @(x, y, z) mean(sqrt(x.^2 + y.^2 + z.^2));
%import file
opts = detectImportOptions('GPS_data.xlsx'); %works better at figuring out that the header is on row 2
satdata = readtable('GPS_data.xlsx', opts);
%apply function to each set of satellites
rowfun(func, satdata, 'GroupingVariables', 'SetNumber', 'InputVariables', {'X_coordinate', 'Y_coordinate', 'Z_coordinate'}, 'OutputVariableNames', 'distance')
8 Comments
Guillaume
on 9 Oct 2018
I'm sorry but Ew! One rule that you need to put into practice immediately: numbered variables are always the wrong approach. sv1 should be sv(1), sv{1} or sv(1, :) or something similar. And every time you're repeating more or less the same lines of code, you're doing it wrong. Computers are very good at repeating things, let them do the work.
Your script, written in proper matlab:
% Given measurements
x_true = [1132049, -4903445, 3905453, 85000]; % meters
x_init = [0, 0, 0, 0];
% SV locations
sv = [15764733, -1592675, 21244655
6057534, -17186958, 19396689
4436748, -25771174, 1546041
-9701586, -19687467, 15359118
23617496, -11899369, 1492340
14540070, -12201965, 18352632];
% x_old values
num_it = 5;
x_old = zeros(num_it, 4);
% sigma
sd=5;
% Generate Measurements
rho = sqrt(sum((sv - x_true(1:3)) .^ 2, 2)) + x_true(4) + sd*randn(1);
% Estimation
for i = 1:num_it
rho_hat = sqrt(sum((sv - x_init(1:3)) .^ 2, 2)) + x_init(4);
% partial derivatives
h = [-(sv - x_init(1:3)) ./ (rho_hat - x_init(4)), ones(size(sv, 1), 1)];
% x_init - x_hat
dy = rho - rho_hat;
%x_hat = x_init + inv(H.'*inv(R)*H)*H.'*inv(R)*(y_1 - h)
dx = inv(h'*inv(sd^2)*h)*h'*inv(sd^2)*dy;
x_init = x_init + dx.';
x_old(i, :) = x_init;
end
Also note the warnings in the editor. I'm fairly certain you shouldn't be using inv but that's outside of my domain of expertise.
What I'm missing is what that script has to do with the table. What are the inputs and outputs of that scripts and how do they match the table?
jonas
on 8 Oct 2018
Edited: jonas
on 8 Oct 2018
Method 1 - grpstats
For the sake of simplicity, let's use only the first 5 columns in this example.
%%Import data
opts = detectImportOptions('GPS_data.xlsx');
T = readtable('GPS_data.xlsx', opts);
T = T(:,1:5);
T =
109×5 table
SetNumber PRN X_coordinate Y_coordinate Z_coordinate
_________ ___ ____________ ____________ ____________
1 17 2.4716e+06 -1.4705e+07 2.237e+07
1 6 -5.2026e+06 -2.5099e+07 6.9419e+06
...
Continuing...
%%Split and apply custom function
grpstats(T,'SetNumber',@func);
function out=func(x)
out=mean(x);
end
You can write your own function, just remember that the input, x, is always a column vector containing one group and a single variable. This is of course limiting, because you cannot perform complex operations on multiple variables in your function, or at least I do not know how. For this, it is better to use another method.
Method 2 - findgroups/splitapply
Read the data just like before, but transform the table into an array.
A=table2array(T);
Now, split the dataset in groups and perform some operation with all 4 variables (excluding Set Number) at once.
%%First column is grouping set
G = findgroups(A(:,1));
out1 = splitapply(@func,A(:,2:end),G);
function out1=func(vars)
vars
%%perform some calculations here
end
You can see that I have written out the variables passed to the custom function. What is nice here is that you can process all four variables in the same function. You can pass multiple outputs back to the main script if you so desire, however each output should be scalar.
vars =
1.0e+07 *
0.0000 0.2501 -1.4700 2.2370
0.0000 -0.5196 -2.5091 0.6975
...
I've never used rowfun personally but I suppose it's very similar to the second method.
8 Comments
See Also
Categories
Find more on Data Import from MATLAB in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!