How to traverse table row based on sets

Question

0 votes

GPS_data.xlsx

Let's say that I have a table in an excel spreadsheet that looks like this (based off of real GPS data, but for the sake of this question, all of these are just random numbers ):

| 1 | 17 | 24715 | -1470517 | 223695 |

...

| 2 | 17 | 24715 | -1470517 | 223695 |

So I know that I can read in the data using the build in MATLAB command:

gpsdata = readtable('spreadsheet.xlsx')

I also know I can loop through the table by row using the command:

for row = 1:height(gpsdata)

However, more specifically, I want to loop through all the table based off the set numbers in each row. One snag is that the number of set numbers varies. Sometimes there will be 8 rows of set 1, other times there will be 9 rows of set 2, etc.

My specific aim is to do specific operations on each of the sets. Looking at the table, normally there is a different PRN for each satellite. That means each measurement is a different satellite. The set is just a way to acknowledge that it is measuring the same thing. For example, the set 1s will have about 8 different satellites all measuring whatever is at set 1. However, each of the satellites will have a different measurement. My goal is to take all the measurements and use iterative least squares to estimate the exact measurement. However, that involves getting measurements from each of the different satellites within each row of the set.

Is there a way that I could loop through based off each of the set numbers to make sure that the ones that I am using are the same for each set? That is, what is the best way to operate on all the rows with set data 1s, then operate on all the data with set 2s, etc.?

While I could do each set manually, I feel as though there is a better, more programmatic way to approach this.

Thanks for your time and help in advanced! I really appreciate it!

15 Comments
Show 13 older comments Hide 13 older comments

billyjthorton on 7 Oct 2018

Edited: billyjthorton on 7 Oct 2018

My specific aim is to do specific operations on each of the sets. Looking at the table, normally there is a different PRN for each satellite. That means each measurement is a different satellite. The set is just a way to acknowledge that it is measuring the same thing. For example, the set 1s will have about 8 different satellites all measuring whatever is at set 1. However, each of the satellites will have a different measurement. My goal is to take all the measurements and use iterative least squares to estimate the exact measurement. However, that involves getting measurements from each of the different satellites within each row of the set. Does that make more sense?

As an aside, I looked into the functions and they don't work quite out of the box. That is, I can't do grpstats(gpsdata) or findgroups(gpsdata). I think grpstats is closer to what I would like to achieve, but when I do something like grpstats(gpsdata,1) I get the error "Unrecognized variable name 'Set Number' ". Any assistance you could provide given the table set up as shown above?

jonas on 8 Oct 2018

Edited: jonas on 8 Oct 2018

Open in MATLAB Online

SetNumber is the name of the variable that you want to use as grouping variable. The syntax is as follows>

grpstats(x,groupingvariable,stats)

how you specify the groupingvariable depends upon the class of x. If x is a table, then groupingvaraible is a variable name. When importing with readtable, "SetNumber" was assigned automatically as variablename to my 1st column, so that's where that came from. You can get a list of the variable names like this (T is your table):

T.Properties.VariableNames

If x is instead a matrix, then groupingvariables is also a matrix (or vector), with the same amount of rows as the matrix.

Either way, x cannot include non-numerical columns, and will return an error if your table includes a column of strings. You can take a subset of your table by using braces, e.g. T(:,1:5), will pass the first 5 columns. This subset must include the groupingvariable and should not include any column that is non-numeric, because how do you calculate the mean of a string?

Going to get some sleep, will check in tomorrow morning. Cheers.

billyjthorton on 8 Oct 2018

Edited: billyjthorton on 8 Oct 2018

Open in MATLAB Online

It took me some time, but I realized I was having trouble with what you mention above would be the issue. I'm running into non-integer values that are causing the problem. However, it seems to be a hurdle I can't quite fix.

For example:

grpstats(gpsdata(1:110,1:17), gpsdata.Var1)

This produces: "Unrecognized variable name 'Set Number'". As you can see in the attached test data, that string is present. However, even if I shift it and do:

grpstats(gpsdata(2:110,1:17), gpsdata.Var1)

The exact same error is present. Even if I take only the numeric data, copy it into another spreadsheet and try to process it with something like:

grpstats(gpsdata(2:108,1:17), gpsdata.x1)

I still get an error: "Duplicate variable name: 'x0'".

I can't quite group the data successfully. In doing more research on some MATLAB functions, I learned about the find function. It seems possible to use the "find" function to find all rows where the Set Number is equal to each number (1-12) and grouped the data that way. However, that seems to take in only numeric values as well. Given that when I use readtable to bring in the spreadsheet, it's in table format, I can't seem to convert it (e.g. table2array) into a version that is actually usable by find either.

jonas on 8 Oct 2018

Edited: jonas on 8 Oct 2018

Open in MATLAB Online

grpstats(gpsdata(1:110,1:17), gpsdata.Var1)

"Unrecognized variable name 'Set Number'"

There is no string called 'Set Number' in that line of code. Also, Set Number (with a blank) is not a valid variable name, so I'm not surprised it did not work anyway. Check your variable names in MATLAB, as readtable parse them when reading from excel.

"As you can see in the attached test data, that string is present. However, even if I shift it and do:"

grpstats(gpsdata(2:110,1:17), gpsdata.Var1)

The above is the exact same as the previous line. I don't understand what you tried.

"Even if I take only the numeric data, copy it into another spreadsheet and try to process it with something like:"

grpstats(gpsdata(2:108),1:17), gpsdata.x1)

Something wrong with this line of code. It makes no sense, the braces are placed incorrectly.

Regarding the find function, there are indeed many ways to group and work with your data. The find function (or logical indexing) could be useful would you choose to move forward with a for loop. I suggested the above approaches as they are quite effecient and elegant ways to deal with grouped data, without using a loop. If you have experience with for loops, then perhaps that is the way to go. I do not understand why you'd want to find strings, as your grouping variable is a double.

Unless you give a more detailed account of how you want to process the groups, then I do not know what more advice to give you. As of right now, we are stuck on issues related to syntax and not the actual problem in question.

billyjthorton on 8 Oct 2018

Edited: billyjthorton on 8 Oct 2018

I apologize, I made a mistake with the braces on the third command. If you look closely between the first and second command, you'll noticed that it has been shifted by one to avoid a row. I provided the exact data I'm dealing with. Look at the original question as well as a post above and you'll see an attached spreadsheet with the data. I shifted the data I was looking at by a row to avoid "Set Number", which, while not in the code, is in the data set. As for the third command, I thought I was clear on what it was doing. It was designed to run in the above dataset, without any of the strings (just grabbing the block of data that I was going to use anyways), but it's more or less just a modified version of the first two and no success on that front either. I firmly believe I should be able to do this without trying to remove the strings and just select the data that I want to use as a sub-matrix.

I apologize, I don't know how much more clear to be? I have written a more detailed explanation of exactly what I want to do. Alas, I will try again:

I would like to capture the grouping of all the data under set "1", then do a capture of all the data under set "2" and so on and so forth. Once I have other a sub array or indices, I can then take the data and perform operations. Particularly, go row by row and extract the GPS data to perform geolocation using an ILS method. Does that help answer what you are looking for?

Is there any way you'd be willing to open up the data I sent and be able to provide more specific help? I'm really trying to learn, but I'm really struggling using this data set.

jonas on 8 Oct 2018

Open in MATLAB Online

I'll write a more elaborate response in a bit. For now, just two comments. First:

grpstats(gpsdata(2:108,1:17), gpsdata.x1)

Wrong syntax. If you insert a table, then the grouping variable arg should be a string describing the variable name. If you want the column gpsdata.x1 as your grouping variable, then the correct syntax is:

grpstats(gpsdata(:,1:17), 'x1')

This should work as long as each of the 17 first columns contain numerical data only, and one variable is named x1.

Second, I assumed you had imported the table correctly but realized now that this is not the case. Follow Guillaumes advice. If it still does not work, then use the 'range' argument of readtable to skip the first line of the sheet. As for now, all your variables are imported as strings, as the first row in your table is a string. Note that it doesn't matter that you write:

gpsdata(2:108,1:17)

each column is still interpreted as strings.

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Guillaume on 8 Oct 2018

Open in MATLAB Online

1 vote

It's unclear what problem you are actually facing with Jonas' examples. As Jonas says, you can easily use any of the grouping functions that work with tables. Not having the stats toolbox, I use varfun and rowfun. Here is a rowfun example to works with your attached excel file. I use it to calculate the mean 3D distance (from origin) of the satellites in each group. Adapt as required:

%function to use with rowfun. Defined as anonymous function here. Could be an m file instead
%takes 3 column vector inputs, x,y and z coordinates. Return a scalar
func = @(x, y, z) mean(sqrt(x.^2 + y.^2 + z.^2));
%import file
opts = detectImportOptions('GPS_data.xlsx');  %works better at figuring out that the header is on row 2
satdata = readtable('GPS_data.xlsx', opts);
%apply function to each set of satellites
rowfun(func, satdata, 'GroupingVariables', 'SetNumber', 'InputVariables', {'X_coordinate', 'Y_coordinate', 'Z_coordinate'}, 'OutputVariableNames', 'distance')

8 Comments
Show 6 older comments Hide 6 older comments

billyjthorton on 8 Oct 2018

Open in MATLAB Online

Each of the coordinates of each set will be used to feed an ILS algorithm. More particularly, it would be good to create a variable for each set of x, y and z coordinates that can be used to feed the algorithm.

Here is a program that I wrote that does it for a set amount of coordinates:

% Given measurements
x_true = [1132049; -4903445; 3905453; 85000]' % meters
x_init = [0;0;0;0]
% SV locations
sv1=[15764733;-1592675;21244655]
sv2=[6057534;-17186958;19396689]
sv3=[4436748;-25771174;1546041]
sv4=[-9701586;-19687467;15359118]
sv5=[23617496;-11899369;1492340]
sv6=[14540070;-12201965;18352632]
% x_old values
num_it = 5
x_old=zeros(num_it+1,4)
% sigma
sd=5;
% Generate Measurements
rho1=sqrt((sv1(1)-x_true(1))^2 + (sv1(2)-x_true(2))^2 + (sv1(3)-x_true(3))^2) + x_true(4)+sd*randn(1)
rho2=sqrt((sv2(1)-x_true(1))^2 + (sv2(2)-x_true(2))^2 + (sv2(3)-x_true(3))^2) + x_true(4)+sd*randn(1)
rho3=sqrt((sv3(1)-x_true(1))^2 + (sv3(2)-x_true(2))^2 + (sv3(3)-x_true(3))^2) + x_true(4)+sd*randn(1)
rho4=sqrt((sv4(1)-x_true(1))^2 + (sv4(2)-x_true(2))^2 + (sv4(3)-x_true(3))^2) + x_true(4)+sd*randn(1)
rho5=sqrt((sv5(1)-x_true(1))^2 + (sv5(2)-x_true(2))^2 + (sv5(3)-x_true(3))^2) + x_true(4)+sd*randn(1)
rho6=sqrt((sv6(1)-x_true(1))^2 + (sv6(2)-x_true(2))^2 + (sv6(3)-x_true(3))^2) + x_true(4)+sd*randn(1)
% Estimation
for i = 1:5
   rho1_hat=sqrt((sv1(1)-x_init(1))^2 + (sv1(2)-x_init(2))^2 + (sv1(3)-x_init(3))^2) + x_init(4)
   rho2_hat=sqrt((sv2(1)-x_init(1))^2 + (sv2(2)-x_init(2))^2 + (sv2(3)-x_init(3))^2) + x_init(4)
   rho3_hat=sqrt((sv3(1)-x_init(1))^2 + (sv3(2)-x_init(2))^2 + (sv3(3)-x_init(3))^2) + x_init(4)
   rho4_hat=sqrt((sv4(1)-x_init(1))^2 + (sv4(2)-x_init(2))^2 + (sv4(3)-x_init(3))^2) + x_init(4)
   rho5_hat=sqrt((sv5(1)-x_init(1))^2 + (sv5(2)-x_init(2))^2 + (sv5(3)-x_init(3))^2) + x_init(4)
   rho6_hat=sqrt((sv6(1)-x_init(1))^2 + (sv6(2)-x_init(2))^2 + (sv6(3)-x_init(3))^2) + x_init(4)
   % partial derivatives
   h1=[ -[(sv1(1)-x_init(1)) (sv1(2)-x_init(2)) (sv1(3)-x_init(3))]/(rho1_hat-x_init(4)) 1]
   h2=[ -[(sv2(1)-x_init(1)) (sv2(2)-x_init(2)) (sv2(3)-x_init(3))]/(rho2_hat-x_init(4)) 1]
   h3=[ -[(sv3(1)-x_init(1)) (sv3(2)-x_init(2)) (sv3(3)-x_init(3))]/(rho3_hat-x_init(4)) 1]
   h4=[ -[(sv4(1)-x_init(1)) (sv4(2)-x_init(2)) (sv4(3)-x_init(3))]/(rho4_hat-x_init(4)) 1]
   h5=[ -[(sv5(1)-x_init(1)) (sv5(2)-x_init(2)) (sv5(3)-x_init(3))]/(rho5_hat-x_init(4)) 1]
   h6=[ -[(sv6(1)-x_init(1)) (sv6(2)-x_init(2)) (sv6(3)-x_init(3))]/(rho6_hat-x_init(4)) 1]
   h=[h1;h2;h3;h4;h5;h6]
   % x_init - x_hat
   dy=[rho1-rho1_hat;rho2-rho2_hat;rho3-rho3_hat;rho4-rho4_hat;rho5-rho5_hat;rho6-rho6_hat]
   %x_hat = x_init + inv(H.'*inv(R)*H)*H.'*inv(R)*(y_1 - h)
   dx=inv(h'*inv(sd^2)*h)*h'*inv(sd^2)*dy
   x_init=x_init+dx
   x_old(i+1,:)=x_init'
end
% Show Results
format long
iteration_results=x_old

Now I just need to implement it so that all the SVs will be each of the coordinate pairs and then proceed into the loop as normal

billyjthorton on 9 Oct 2018

Edited: billyjthorton on 9 Oct 2018

Yes, that is correct. What is the alternative to indexed variable names in this case? Leverage each as a function of the loop variable? What would be the loop variable in this case?

Guillaume on 9 Oct 2018

Open in MATLAB Online

I'm sorry but Ew! One rule that you need to put into practice immediately: numbered variables are always the wrong approach. sv1 should be sv(1), sv{1} or sv(1, :) or something similar. And every time you're repeating more or less the same lines of code, you're doing it wrong. Computers are very good at repeating things, let them do the work.

Your script, written in proper matlab:

% Given measurements
x_true = [1132049, -4903445, 3905453, 85000]; % meters
x_init = [0, 0, 0, 0];
% SV locations
sv = [15764733, -1592675,   21244655
      6057534,  -17186958,  19396689
      4436748,  -25771174,  1546041
      -9701586, -19687467,  15359118
      23617496, -11899369,  1492340
      14540070, -12201965,  18352632];
% x_old values
num_it = 5;
x_old = zeros(num_it, 4);
% sigma
sd=5;
% Generate Measurements
rho = sqrt(sum((sv - x_true(1:3)) .^ 2, 2)) + x_true(4) + sd*randn(1);
% Estimation
for i = 1:num_it
   rho_hat = sqrt(sum((sv - x_init(1:3)) .^ 2, 2)) + x_init(4);
   % partial derivatives
   h = [-(sv - x_init(1:3)) ./ (rho_hat - x_init(4)), ones(size(sv, 1), 1)];
   % x_init - x_hat
   dy = rho - rho_hat;
   %x_hat = x_init + inv(H.'*inv(R)*H)*H.'*inv(R)*(y_1 - h)
   dx = inv(h'*inv(sd^2)*h)*h'*inv(sd^2)*dy;
   x_init = x_init + dx.';
   x_old(i, :) = x_init;
end

Also note the warnings in the editor. I'm fairly certain you shouldn't be using inv but that's outside of my domain of expertise.

What I'm missing is what that script has to do with the table. What are the inputs and outputs of that scripts and how do they match the table?

Sign in to comment.

Answer 2

jonas on 8 Oct 2018

Edited: jonas on 8 Oct 2018

Open in MATLAB Online

0 votes

Method 1 - grpstats

For the sake of simplicity, let's use only the first 5 columns in this example.

%%Import data
opts = detectImportOptions('GPS_data.xlsx'); 
T = readtable('GPS_data.xlsx', opts);
T = T(:,1:5);

T =

109×5 table
    SetNumber    PRN    X_coordinate    Y_coordinate    Z_coordinate
    _________    ___    ____________    ____________    ____________
        1        17      2.4716e+06     -1.4705e+07       2.237e+07 
        1         6     -5.2026e+06     -2.5099e+07      6.9419e+06 
    ...

Continuing...

%%Split and apply custom function
grpstats(T,'SetNumber',@func);
function out=func(x)
  out=mean(x);
end

You can write your own function, just remember that the input, x, is always a column vector containing one group and a single variable. This is of course limiting, because you cannot perform complex operations on multiple variables in your function, or at least I do not know how. For this, it is better to use another method.

Method 2 - findgroups/splitapply

Read the data just like before, but transform the table into an array.

A=table2array(T);

Now, split the dataset in groups and perform some operation with all 4 variables (excluding Set Number) at once.

%%First column is grouping set
G = findgroups(A(:,1));
out1 = splitapply(@func,A(:,2:end),G);
function out1=func(vars)
vars
%%perform some calculations here
end

You can see that I have written out the variables passed to the custom function. What is nice here is that you can process all four variables in the same function. You can pass multiple outputs back to the main script if you so desire, however each output should be scalar.

vars =

0e+07 *
0000    0.2501   -1.4700    2.2370
0000   -0.5196   -2.5091    0.6975
    ...

I've never used rowfun personally but I suppose it's very similar to the second method.

8 Comments
Show 6 older comments Hide 6 older comments

billyjthorton on 9 Oct 2018

Open in MATLAB Online

Ah, that makes sense. Thank you for your help and patience! It follow at least from the error code, but the issue seems to be in the function itself. I followed your code earlier:

function out1=func(vars)
xcoord = vars(:,2)
ycoord = vars(:,3)
zcoord = vars(:,4)
out1 = xcoord + ycoord + zcoord 
end

With still the same scalar error. But isn't out1 a scalar in this case already? Additionally, right now it only provides a loop through the first group. Theoretically, should it be looping through all of them? I believe this scalar error is getting in the way, but I was trying to make sure what to expect.

jonas on 9 Oct 2018

Edited: jonas on 9 Oct 2018

Ugh, thats not scalar. Scalar is a single element. That is a matrix. Poor example from my end.

Sign in to comment.

How to traverse table row based on sets

15 Comments
Show 13 older comments Hide 13 older comments

Answers (2)

8 Comments
Show 6 older comments Hide 6 older comments

8 Comments
Show 6 older comments Hide 6 older comments

Categories

Tags

Community Treasure Hunt

How to traverse table row based on sets

15 Comments Show 13 older comments Hide 13 older comments

Answers (2)

8 Comments Show 6 older comments Hide 6 older comments

8 Comments Show 6 older comments Hide 6 older comments

Categories

Tags

See Also

Community Treasure Hunt

15 Comments
Show 13 older comments Hide 13 older comments

8 Comments
Show 6 older comments Hide 6 older comments

8 Comments
Show 6 older comments Hide 6 older comments