- Load in a given table (say jj=1)
- create TT2{1}
- find the min of TT{1}.temperature
- add a third column to TT{1} containing this min in every row
- repeat for mean, max
- repeat for TT{2:34}

# How to find the min, max and mean values of 34 timetables stored in a 1 x 34 cell and add them as extra columns to the respective timetables?

17 views (last 30 days)

Show older comments

I have 34 CSV files that each consist of N rows and 3 columns. My code loops through one CSV file at a time (storing them in a 1 x 34 cell) and firstly converts them into a N x 2 timetable, TT{jj}. I then filter this timetable by a timerange (the difference between two dates and times in format dd/MM/uuuu HH:mm) to leave all the values within that timerange which is stored in TT2{jj}. The timetables have 2 columns (date and time, temperature).

I now want to know how to find the min, max and mean values of the temperature column for each timetable and how to add these values to the original timetable to create a N x 5 timetable (date and time, temperature, min temp, max temp, mean temp) which would look like this:

e.g. for jj=1, the final table would look like:

Date/Time Temperature Min Max Mean

21/02/2020 08:00 20 16 20 18.25

21/02/2020 08:03 16

21/02/2020 08:06 18

21/02/2020 08:09 19

etc etc

Then it would loop again for jj=2 etc

The loop to create TT2{jj} is:

for jj = 1:34

thisfile{jj} = files{jj}.name; % creates a cell containing each CSV file name in directory

T{jj} = readtable(thisfile{jj},'Headerlines',19,'ReadVariableNames',true); % converts CSV to timetable ignoring rows 1-19

TT{jj} = table2timetable(T{jj}(:,[1 3])); % convert table to timetable and ignore column 2

TT2{jj} = TT{jj}(TR,:); % creates timetable containing all rows found within timerange TR

end

At the end of the loop (above) I export the 34 tables to one spreadsheet that has 34 tabs.

I have no idea how to proceed so any help would be appreciated.

##### 0 Comments

### Answers (3)

Sindar
on 30 Apr 2020

Is this what you want to do:

If so, then:

for jj = 1:34

thisfile{jj} = files{jj}.name; % creates a cell containing each CSV file name in directory

T{jj} = readtable(thisfile{jj},'Headerlines',19,'ReadVariableNames',true); % converts CSV to timetable ignoring rows 1-19

TT{jj} = table2timetable(T{jj}(:,[1 3])); % convert table to timetable and ignore column 2

TT2{jj} = TT{jj}(TR,:); % creates timetable containing all rows found within timerange TR

TT2{jj}.min_T=repelem(min(TT2{jj}{:,2}), size(TT2{jj},1), 1);

TT2{jj}.max_T=repelem(max(TT2{jj}{:,2}), size(TT2{jj},1), 1);

TT2{jj}.mean_T=repelem(mean(TT2{jj}{:,2}), size(TT2{jj},1), 1);

end

If you want to modify T (or likewise TT):

...

T{jj}.min_T=repelem(min(TT2{jj}{:,2}), size(T{jj},1), 1);

...

##### 4 Comments

Guillaume
on 1 May 2020

The simplest thing would be to add one column to each timetable to indicate the timetable of origin, then concatenate all these timetables into one timetable. Then with just one call to groupsummary, you can get your desired output.

However, if you get the mean, min and max for each timetable, so get one scalar value for each stat per timetable, I'm a bit unclear why you still want to store a datetime. Doesn't it become meaningless?

Anyway:

TT = cell(size(files));

for fileidx = 1:numel(files)

t = readtable(files(fileidx).name, 'Headerlines', 19, 'ReadVariableNames', true);

TT{fileidx} = table2timetable(t(:, [1, 3]))

TT{fileidx}.FileIndex(:) = fileidx; %add column with file number

end

alltimetables = vertcat(TT{:}); %concatenate all in one timetable

alltimetables = alltimetables(TR, :); %keep only desired timerange

ttstats = groupsummary(alltimetables, 'FileIndex', {'mean', 'min', 'max'}) %get mean min max for each FileIndex

Note that if you're using sufficiently recent version of matlab I'd replace the loop by:

TT = cell(size(files));

opts = detectImportOptions(files(1).name, 'NumHeaderLines', 19, 'ReadVariableNames', true);

opts.SelectedVariableNames = [1, 3]; %don't bother reading 2nd column

for fileidx = 1:numel(files)

TT{fileidx} = readtimetable(files(fileidx).name, opts);

TT{fileidx}.FileIndex(:) = fileidx; %add column with file number

end

%rest of code stays the same

Peter Perkins
on 5 May 2020

As others have said, it seems to make little sense to create new variables in each timetable, each of which are a column vector of a constant. Maybe you want something like the following.

First, make something like your data:

n = 3;

tt_list = cell(n,1);

for i = 1:3

X = rand(5,1);

Time = datetime(2020,5,i)+days(rand(5,1));

tt_list{i} = timetable(Time,X);

end

Now get the stats for each timetable, and put those in a table that also includes your cell array of timetables:

t = table(tt_list,zeros(n,1),zeros(n,1),zeros(n,1),'VariableNames',["Data" "Mean" "Min" "Max"]);

for i = 1:n

t.Mean(i) = mean(tt_list{i}.X);

t.Min(i) = min(tt_list{i}.X);

t.Max(i) = max(tt_list{i}.X);

end

From that, you end up with

>> t

t =

3×4 table

Data Mean Min Max

_______________ _______ _______ _______

{5×1 timetable} 0.67375 0.4607 0.94475

{5×1 timetable} 0.56289 0.15039 0.9865

{5×1 timetable} 0.52956 0.26661 0.91785

That's the brute force way. As Guillaume suggests, you might find it convenient to put all your timetables in one longer one. The following gets you essentially yhe same table as above.

tt = vertcat(tt_list{:});

tt.Source = repelem(1:n,5)';

fun = @(x) deal(mean(x),min(x),max(x));

t = rowfun(fun,tt,'GroupingVariable','Source','NumOutputs',3, ...

'OutputFormat','table','OutputVariableNames',["Mean" "Min" "Max"])

>> t =

3×5 table

Source GroupCount Mean Min Max

______ __________ _______ _______ _______

1 5 0.67375 0.4607 0.94475

2 5 0.56289 0.15039 0.9865

3 5 0.52956 0.26661 0.91785

I used rowfun; splitapply or groupsummary would also work.

### See Also

### Categories

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!