Cumsum of array, sum of previous data

I have a database in which each row is a tennis match, with the columns ID_Tournament, ID_Round, DATE_Game, ID1 Player1, ID2 Player2, FirstServe Player1, DoubleFault Player1, FirstServe Player2, and DoubleFault Player2. I have imported the file into MATLAB, and now I want to create new columns next to the last one (column I).
Each new column should hold the sum of that player's statistics from previous matches. Here is an example based on the image. I would like to know how many first serves (FS) Player 1 had before row 11 (ID1 488). In J11 I should read 47, because only one match occurred before row 11. In row 16, player ID1 488 previously had 2 matches, so J16 should be 47+52=99. In each row I want to read the new data about previous matches. How can this be scripted? How can I set a parameter that indicates how many days to look back?
On MATLAB Answers a user gave me the script below. It runs, but it does not do quite what I need. I think it is a good start, though:
load('demo.mat');
d = demoS1{:,[4,6]};
[a,ii] = sortrows(d,1);
[~,~,c] = unique(a(:,1));
d1 = accumarray(c,a(:,2),[],@(x){cumsum(x(:))});
d1 = cat(1,d1{:});
[~,i1] = sort(ii);
demoS1.last_column = d1(i1);
The script above sums the previous matches and also the match being analyzed, which is incorrect. I also can't set (or don't know how to set) how many days (or how many previous matches) the script should include in the sum.
I need your help, guys. Thanks in advance.

Accepted Answer

Brandon Eidson on 11 Sep 2017
The contribution from the previous MATLAB Answers post provides a very elegant solution that uses some of MATLAB's most powerful features (vectorization and logical indexing). However, depending on your experience level with MATLAB, such code can be difficult to understand and modify for your own purposes.
Below is some less elegant code that may be a bit more readable and accomplishes what you are looking for (assuming you meant J16 should be 47 + 38).
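load('demo.mat');
numRows = height(demoS1);               % one row per match
demoS1.newColumn = zeros(numRows,1);
for k = 1:numRows
    id = demoS1.ID1(k);                 % player 1 in the current match
    indexes = demoS1.ID1(1:k-1) == id;  % earlier rows with the same player 1
    demoS1.newColumn(k) = sum(demoS1.FS_1(indexes));  % sum over earlier rows only
end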
If you anticipate working with enormous datasets, it is best to learn the initial approach, which avoids the "for" loop. If you are interested, a good way to begin mastering such techniques is to take the free training course, MATLAB Onramp.
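For readers who want to try the loop-free route, here is a minimal sketch of one way to adapt the accumarray/cumsum idea so that the current match is excluded: take the running total per player and subtract the current row's own value. The table T and its values are made up for illustration and stand in for demoS1; prevFS is a hypothetical column name.

```matlab
% Hypothetical sample data standing in for demoS1 (values are made up).
ID1  = [488; 101; 488; 101; 488];
FS_1 = [ 47;  60;  38;  55;  52];
T = table(ID1, FS_1);

n = height(T);
% Sort by player ID, then by original row number, so each player's
% matches are contiguous and stay in their original order.
[sorted, order] = sortrows([T.ID1, (1:n)']);
[~, ~, g] = unique(sorted(:,1));        % group number per sorted row
fs = T.FS_1(order);                     % first serves in sorted order

% Running total within each player group (includes the current match).
incl = accumarray(g, fs, [], @(x){cumsum(x)});
incl = vertcat(incl{:});

% Subtract the current match so only earlier matches are counted.
excl = incl - fs;

% Put the results back in the original row order.
T.prevFS = zeros(n, 1);
T.prevFS(order) = excl;
```

With this sample data, the third row (the second match of player 488) gets 47 and the fifth row gets 47 + 38 = 85, which matches what the loop version produces.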

More Answers (1)

Mirko Piccolo on 11 Sep 2017
Edited: Mirko Piccolo on 11 Sep 2017
It works! Thank you, Brandon! I will look into MATLAB Onramp as you suggested. Some final questions: 1. When is a dataset considered "enormous"? 2. How can I set a parameter that indicates how many days to look back? 3. What if I have to check for the player in both ID1 and ID2? In your script the sum only counts matches where the player is in ID1, but what if he appears under ID2 in earlier matches?
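Questions 2 and 3 can probably be handled by extending the loop from the accepted answer. The following is only a sketch under stated assumptions: the column names DATE_Game, ID2 and FS_2 are guessed from the description in the question, lookbackDays and prevFS_1 are hypothetical names, the sample values are made up, and "previous" is decided by date rather than by row position, so matches played on the same day are not counted.

```matlab
% Hypothetical sample data; DATE_Game, ID2 and FS_2 are assumed column names.
DATE_Game = datetime(2017, 1, 1) + days([0; 4; 19; 39]);
ID1  = [488; 101; 488; 488];
ID2  = [101; 488; 202; 101];
FS_1 = [ 47;  60;  38;  52];
FS_2 = [ 50;  44;  41;  39];
T = table(DATE_Game, ID1, ID2, FS_1, FS_2);

lookbackDays = 30;                      % hypothetical look-back parameter
n = height(T);
T.prevFS_1 = zeros(n, 1);
for k = 1:n
    id = T.ID1(k);                      % player 1 in the current match
    % Days between the current match and every match (positive = earlier).
    daysAgo = days(T.DATE_Game(k) - T.DATE_Game);
    inWindow = daysAgo > 0 & daysAgo <= lookbackDays;
    asP1 = inWindow & T.ID1 == id;      % earlier matches as player 1
    asP2 = inWindow & T.ID2 == id;      % earlier matches as player 2
    % Use FS_1 where the player was listed first, FS_2 where listed second.
    T.prevFS_1(k) = sum(T.FS_1(asP1)) + sum(T.FS_2(asP2));
end
```

In this sample, row 3 (player 488) picks up 47 from row 1, where he was player 1, and 44 from row 2, where he was player 2, for a total of 91. Row 4 only counts row 3, because the first two matches are more than 30 days old. Setting lookbackDays = Inf would count every earlier match. The same loop body with ID1/ID2 and FS_1/FS_2 swapped would fill a corresponding column for player 2.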
