Filename to categorical in table

Dylan den Hartog on 16 Oct 2020
Edited: Seth Furman on 16 Oct 2020
I created a datastore with accelerometer data from different files (S1_walking.csv, S1_sitting.csv and S1_laying.csv). Then I used the readall function to create a table with all the data together.
Now I want to add a categorical column that holds a specific part of the file name for the corresponding rows in the table. So every row of the 'walking data' should get the categorical 'walking', every row of the 'sitting data' the categorical 'sitting', and every row of the 'laying data' the categorical 'laying'.
How can I do this easily by using the part of the filenames?
  1 Comment
Mathieu NOE on 16 Oct 2020
I guess you want to do a string manipulation like this:
filename = 'S1_walking.csv'
walking_tick = contains(filename,'walking') % contains replaces findstr, which is no longer recommended
This returns 1 (true), so you can use it to check (tick) the condition.
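If the goal is the label itself rather than a true/false flag, one option is to pull out the text between the underscore and the extension with extractBetween. This is only a sketch, and it assumes every file name follows the S1_<activity>.csv pattern from the question. The variable names activity and activityCat are illustrative.

```matlab
% File names taken from the question
files = ["S1_walking.csv" "S1_sitting.csv" "S1_laying.csv"];

% Text between the first "_" and the following "." in each name
activity = extractBetween(files, "_", ".");

% Convert the labels to a categorical array
activityCat = categorical(activity);
```

File names with more than one underscore or dot would need a different boundary choice.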


Answers (1)

Seth Furman on 16 Oct 2020
Edited: Seth Furman on 16 Oct 2020
Assuming we know which rows contain data from each file, we can create a categorical variable and add it directly to the table.
For example,
>> % Here we assume 'data' is our table.
>> % Rows 1-3 are 'walking data', 4-8 are 'sitting data', and 9-16 are 'laying data'.
>> data = table((1:16)')
data =
16×1 table
Var1
____
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>> files = ["S1_walking.csv" "S1_sitting.csv" "S1_laying.csv"];
>> data.SourceFile = categorical([repmat(1,3,1);repmat(2,5,1);repmat(3,8,1)],[1 2 3],files)
data =
16×2 table
Var1 SourceFile
____ ______________
1 S1_walking.csv
2 S1_walking.csv
3 S1_walking.csv
4 S1_sitting.csv
5 S1_sitting.csv
6 S1_sitting.csv
7 S1_sitting.csv
8 S1_sitting.csv
9 S1_laying.csv
10 S1_laying.csv
11 S1_laying.csv
12 S1_laying.csv
13 S1_laying.csv
14 S1_laying.csv
15 S1_laying.csv
16 S1_laying.csv
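The row ranges in the example above are typed in by hand. If the number of rows contributed by each file is known, the same column can be built from those counts with repelem. This is a sketch under that assumption; the counts vector here simply repeats the 3, 5 and 8 rows used above, and the names counts, idx and labels are illustrative.

```matlab
files  = ["S1_walking.csv" "S1_sitting.csv" "S1_laying.csv"];
counts = [3 5 8];   % assumed number of rows per file, in the same order as files

% Repeat each file index once per row that the file contributes
idx = repelem((1:numel(files))', counts(:));

% Map the indices 1..3 to the file names as category names
labels = categorical(idx, 1:numel(files), files);

data = table((1:16)');
data.SourceFile = labels;
```

This keeps the row counts in one place, so adding a fourth file only means extending files and counts.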
Alternatively, we could add the categorical filename data by calling transform on the datastore with a custom transform function.
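ds = datastore("sampleDir");
% "IncludeInfo",true passes file information (including the file name) to transformFcn.
ds = transform(ds,@transformFcn,"IncludeInfo",true);
data = readall(ds)
function [data,dsInfo] = transformFcn(data,dsInfo)
% Keep only the file name, without folder or extension.
[~,filename] = fileparts(dsInfo.Filename);
% Repeat the name once per row and store it as a categorical column.
data.SourceFile = categorical(repmat(string(filename),height(data),1));
end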
In my case (with different sample data than in the first example), this outputs the following:
data =
16×2 table
Var1 SourceFile
____ __________
5 S1_laying
6 S1_laying
7 S1_laying
8 S1_laying
9 S1_laying
10 S1_laying
5 S1_sitting
6 S1_sitting
7 S1_sitting
8 S1_sitting
9 S1_sitting
10 S1_sitting
1 S1_walking
2 S1_walking
3 S1_walking
4 S1_walking
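The SourceFile values above still carry the "S1_" prefix, while the question asks for just 'walking', 'sitting' and 'laying'. One way to get there, sketched here on a small stand-in table rather than the real datastore output, is to rename the categories with renamecats after stripping everything up to the first underscore. The Activity column name is an assumption for illustration.

```matlab
% Small stand-in for the table returned by readall(ds)
SourceFile = categorical(["S1_laying"; "S1_laying"; "S1_sitting"; "S1_walking"]);
data = table((1:4)', SourceFile);

% Current category names, e.g. 'S1_laying'
oldNames = categories(data.SourceFile);

% Keep only the part after the first underscore, e.g. 'laying'
newNames = extractAfter(oldNames, "_");

% Same categorical data, with shortened category names
data.Activity = renamecats(data.SourceFile, newNames);
```

Renaming works per category rather than per row, so it stays cheap for large tables. It would fail if two files reduced to the same short name (for example S1_walking and S2_walking); in that case converting with string and calling categorical again would merge them instead.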
