Sorting time series data by category

Hi, I need to sort and count a time series data set. For example, 5 different species are captured on camera throughout the year, and each record has its own time, name, size, and direction of movement. How can I query the data for all occurrences of species A and their associated data? After querying by species, I would like to produce a daily count, e.g., species A was recorded x times per day throughout the time period.

3 Comments

All depends on what data you've got stored and how...
The data is from a standard table (Excel). The categories are in columns, and each row represents a unique fish passage event. Each fish passage event contains numeric and text data, e.g., mm/dd/yyyy hh:mm:ss, chinook salmon, 65cm, up.
Sounds like a table would be the ticket in MATLAB, with things such as species stored as categorical variables that you can use for grouping, selection, etc. ...


 Accepted Answer

Let's say you're starting with the spreadsheet version of something like this:
Time,Species,Size,Direction
10/25/2016 10:11:12,chinook salmon,65cm,up
10/25/2016 13:14:15,chinook salmon,66cm,up
10/25/2016 16:17:18,steelhead trout,67cm,down
First read the data in as a table.
>> T = readtable('fishData.csv')
T =
           Time                 Species          Size     Direction
    ___________________    _________________    ______    _________
    10/25/2016 10:11:12    'chinook salmon'     '65cm'    'up'
    10/25/2016 13:14:15    'chinook salmon'     '66cm'    'up'
    10/25/2016 16:17:18    'steelhead trout'    '67cm'    'down'
In recent versions of MATLAB, the first column in the file will automatically be read as a datetime variable in T; in earlier versions you can convert it. The other three come in as text (cell arrays of character vectors, hence the quotes in the display). It will be more convenient to have two categoricals and a numeric.
>> T.Species = categorical(T.Species);
>> T.Size = str2double(strrep(T.Size,'cm',''));
>> T.Direction = categorical(T.Direction);
>> T
T =
           Time                Species        Size    Direction
    ___________________    _______________    ____    _________
    10/25/2016 10:11:12    chinook salmon     65      up
    10/25/2016 13:14:15    chinook salmon     66      up
    10/25/2016 16:17:18    steelhead trout    67      down
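For the earlier-version case mentioned above, where the Time column arrives as text rather than datetime, the conversion might look like the following. This is only a sketch: the small table built here is a stand-in for what readtable would return, and the 'InputFormat' string is an assumption that has to match the layout in your own file.

```matlab
% Stand-in for a Time column that was read in as text (assumed layout)
timeText = {'10/25/2016 10:11:12'; '10/25/2016 13:14:15'; '10/25/2016 16:17:18'};
T = table(timeText, 'VariableNames', {'Time'});

% Convert the text to datetime; InputFormat must match the file's layout
T.Time = datetime(T.Time, 'InputFormat', 'MM/dd/yyyy HH:mm:ss');

% Optional: display the times the same way they appear in the file
T.Time.Format = 'MM/dd/yyyy HH:mm:ss';
```

Once Time is a datetime, functions such as dateshift work on it directly.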
Now you can select or sort on time, species, size or direction, such as
>> T(T.Species == 'chinook salmon',:)
ans =
           Time               Species        Size    Direction
    ___________________    ______________    ____    _________
    10/25/2016 10:11:12    chinook salmon    65      up
    10/25/2016 13:14:15    chinook salmon    66      up
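The selection above covers one condition. Sorting and combining several conditions can be sketched in the same spirit. The table is rebuilt here from the three example rows so the snippet runs on its own; the names bySize and bigChinookUp are just illustrative.

```matlab
% Rebuild the small example table (same three rows as above)
Time = datetime({'10/25/2016 10:11:12'; '10/25/2016 13:14:15'; '10/25/2016 16:17:18'}, ...
    'InputFormat', 'MM/dd/yyyy HH:mm:ss');
Species = categorical({'chinook salmon'; 'chinook salmon'; 'steelhead trout'});
Size = [65; 66; 67];
Direction = categorical({'up'; 'up'; 'down'});
T = table(Time, Species, Size, Direction);

% Sort by size, largest first
bySize = sortrows(T, 'Size', 'descend');

% Combine conditions: chinook salmon moving up and larger than 65 cm
bigChinookUp = T(T.Species == 'chinook salmon' & T.Direction == 'up' & T.Size > 65, :);
```

Because Species and Direction are categorical, the == comparisons against text work directly, and the logical results can be combined with & or | before indexing into the rows.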
You can also get daily counts by species. If you also want something like the mean size, varfun would be one way to do that; its GroupCount column gives the number of records for each day and species:
>> T.Day = dateshift(T.Time,'start','day'); T.Day.Format = 'MM/dd/yyyy';
>> varfun(@mean,T,'GroupingVariables',{'Day' 'Species'},'InputVariables','Size')
ans =
       Day            Species        GroupCount    mean_Size
    __________    _______________    __________    _________
    10/25/2016    chinook salmon     2             65.5
    10/25/2016    steelhead trout    1             67
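If only the daily count for a single species is needed, one possible alternative to varfun is findgroups with splitapply (both require R2015b or later). The sketch below is not the method used in the answer, and it adds one hypothetical record on 10/26/2016 so that the result spans two days.

```matlab
% Example rows from above, plus one HYPOTHETICAL record on the next day
Time = datetime({'10/25/2016 10:11:12'; '10/25/2016 13:14:15'; ...
                 '10/25/2016 16:17:18'; '10/26/2016 09:00:00'}, ...
    'InputFormat', 'MM/dd/yyyy HH:mm:ss');
Species = categorical({'chinook salmon'; 'chinook salmon'; ...
                       'steelhead trout'; 'chinook salmon'});
T = table(Time, Species);
T.Day = dateshift(T.Time, 'start', 'day');

% Keep only one species, then count the rows that fall on each day
isChinook = T.Species == 'chinook salmon';
[G, Day] = findgroups(T.Day(isChinook));
Count = splitapply(@numel, T.Time(isChinook), G);
dailyCounts = table(Day, Count);
```

Days on which the species was never recorded do not appear in dailyCounts at all, since the groups are formed only from days that occur in the data.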
If you have access to R2016b, you probably want to look at the new timetable type, which makes things like daily counts simpler using retime or synchronize.
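As a rough illustration of the timetable route (R2016b or later), the sketch below uses retime, which aggregates a single timetable, instead of the synchronize function named above. Only the numeric Size variable is kept, which is a simplification made here so that the 'count' aggregation has a single plain variable to work on.

```matlab
% Same three example rows as above
Time = datetime({'10/25/2016 10:11:12'; '10/25/2016 13:14:15'; '10/25/2016 16:17:18'}, ...
    'InputFormat', 'MM/dd/yyyy HH:mm:ss');
Species = categorical({'chinook salmon'; 'chinook salmon'; 'steelhead trout'});
Size = [65; 66; 67];

% Timetable of one species, keeping just the numeric Size variable
isChinook = Species == 'chinook salmon';
TT = timetable(Time(isChinook), Size(isChinook), 'VariableNames', {'Size'});

% Aggregate into daily bins; with 'count', Size now holds the number of records per day
daily = retime(TT, 'daily', 'count');
```

After retime, the Size column no longer contains sizes but the number of events in each daily bin, so renaming it (for example to Count) may make the result easier to read.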
Hope this helps.

1 Comment

Wow! Thank you so much, Peter. I will run through the workflow today and let you know how it goes. I had worked my way into a method using strings, but it was a little messy. Again, much appreciated. Best, Ryan



Asked: 24 Oct 2016

Commented: 26 Oct 2016
