Could someone please help me speed up my code?
I have code that will end up taking days to execute. The problem is that I have to pass a huge number of values to "mean" and a couple of other MATLAB functions. I have 12 .mat files to process, but I can't get through even one file in less than a few days. I really need help finding a way to speed everything up. To check the code, just create a file named KCTDI001A.mat containing a 14002450x2 random number matrix in a variable called FileData.
clear
clc
addpath('C:\filepath');
AllFiles = [];
filenames = dir('C:\filepath');
profile clear
profile on
for ii = 3:length(filenames); % Start at third file (i.e., don't include "." and "..")
    %Get the filename
    filename_timestamp = filenames(ii).name;
    Index = findstr('A',filename_timestamp);
    n = str2num(filename_timestamp(6:Index(1)-1));
    File_Name = sprintf('KCTDI0%dA.mat',n);
    DataFile = load(File_Name);
    ACC_Data = DataFile.FileData(:,:);
    for k = 1:13978444;
        x = (ACC_Data(1+(k-1):24007+(k-1),:));
        RMS(k,:) = sqrt(mean(x(:,:).^2));
    end
    new_name = sprintf('KCTDI0%dA_RMS.mat',n);
    save(new_name,'RMS');
end
profile off
profile viewer
5 Comments
What is the purpose of the (:, :) in DataFile.FileData(:,:)? Is it
- to reshape an ND array into a 2D array, in which case an explicit reshape would make it a lot clearer, or
- no purpose at all, in which case it's the same as DataFile.FileData and is just there to confuse the reader?
In any case, it's certainly not needed for x(:, :).^2, since x is guaranteed to be 2D.
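To illustrate the two cases, here is a small sketch. The array names and sizes are made up for the example; only the (:, :) indexing comes from the question.

```matlab
A3 = rand(4, 3, 2);                   % hypothetical 3-D array
B  = A3(:, :);                        % trailing dimensions are folded into the columns
sizeB = size(B);                      % 4 rows, 3*2 = 6 columns
sameAsReshape = isequal(B, reshape(A3, size(A3, 1), []));

A2 = rand(4, 3);                      % already 2-D
unchanged = isequal(A2(:, :), A2);    % (:, :) is a no-op on a 2-D matrix
```

If FileData is already a 2-D matrix, ACC_Data = DataFile.FileData; does the same job.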
Tony Pate
on 27 Jun 2016
Your code includes use of the profiler so presumably that should turn your "I think" into something more certain as to where the bottleneck is?
sqrt is not an especially fast operation when done many times, but often it cannot be avoided. I assume in this case you do need the square-rooted result? In some cases, e.g. sorting of results or similar comparisons, the square root is unnecessary because the ordering is the same without it.
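A minimal sketch of that last point, using made-up mean-square values: because the square root is monotonic on non-negative numbers, sorting before or after taking it should give the same order.

```matlab
rng(0);                               % fixed seed so the run is repeatable
ms      = rand(10, 1);                % hypothetical non-negative mean-square values
rmsVals = sqrt(ms);

[~, idxMS]  = sort(ms);               % order by mean square
[~, idxRMS] = sort(rmsVals);          % order by RMS
sameOrder = isequal(idxMS, idxRMS);   % the two orderings should agree
```

This only helps when the values are compared or ranked. If the RMS values themselves are the result, as they appear to be here, the square root is still needed.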
Guillaume
on 27 Jun 2016
Also, using addpath just so you can load files from a different directory is not a good idea. This would be much better:
root = 'C:\filepath';
filenames = dir(root);
for ...
    ...
    DataFile = load(fullfile(root, File_Name));
Also, note that your code will never load a file named KCTDI001A.mat (as you suggest creating), since for n = 1 the name your sprintf generates is KCTDI01A.mat (one fewer 0).
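One possible way to combine both points is sketched below. It assumes the files follow a KCTDI<number>A.mat pattern, as in the question; the wildcard pattern, the use of fileparts, and the output naming are illustrative choices, not the original poster's code.

```matlab
root  = 'C:\filepath';                            % placeholder path from the question
files = dir(fullfile(root, 'KCTDI*A.mat'));       % matching files only, so no '.' and '..'

for ii = 1:numel(files)
    [~, baseName] = fileparts(files(ii).name);        % e.g. 'KCTDI001A'
    DataFile = load(fullfile(root, files(ii).name));  % full path, no addpath needed
    % ... compute RMS from DataFile.FileData here ...
    newName = fullfile(root, [baseName '_RMS.mat']);  % e.g. 'KCTDI001A_RMS.mat'
    % save(newName, 'RMS');
end

% If the name has to be rebuilt from the number, %03d keeps the zero padding.
paddedName = sprintf('KCTDI%03dA.mat', 1);
```

Reusing the name returned by dir avoids parsing the number out of the file name at all, so the zero-padding mismatch cannot occur.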
Tony Pate
on 27 Jun 2016
Accepted Answer
More Answers (3)
Roger Stafford
on 27 Jun 2016
1 vote
It is the repeated ‘mean’ of 24007 elements at a time, taken 13978444 times, that is the time-consuming part of your computation. You would greatly increase your speed if you computed the column-wise cumulative sum of the squares of the elements and used differences of that cumulative sum to get the equivalent of each mean. There is of course some loss of accuracy over such a large number of cumulative sums, but perhaps that would be acceptable to you. If not, you could break the data into overlapping blocks, each only a relatively small multiple of 24007 rows long, and restart the cumulative sum in each block. You would still gain a lot of speed that way. Adding almost the same set of numbers over and over to form your means is bound to be inefficient.
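A minimal sketch of the cumulative-sum idea on a small random stand-in for the data. The variable names X, C and RMS_cumsum, the reduced sizes, and the comparison loop are assumptions made for illustration.

```matlab
rng(0);
X = randn(1000, 2);          % small stand-in for ACC_Data (the real data is 14002450x2)
N = 50;                      % window length (24007 in the question)
K = size(X, 1) - N + 1;      % number of full windows

% Cumulative sum of squares with a leading row of zeros, so that
% C(j+1,:) holds the sum of squares of rows 1..j.
C = [zeros(1, size(X, 2)); cumsum(X.^2, 1)];

% The sum over rows k..k+N-1 is C(k+N,:) - C(k,:).
RMS_cumsum = sqrt((C(N+1:end, :) - C(1:K, :)) / N);

% Direct evaluation for comparison (slow, but fine at this size).
RMS_direct = zeros(K, size(X, 2));
for k = 1:K
    RMS_direct(k, :) = sqrt(mean(X(k:k+N-1, :).^2, 1));
end
maxDiff = max(abs(RMS_cumsum(:) - RMS_direct(:)));
```

On the full data set the entries of C grow large, which is where the accuracy loss mentioned above comes from; restarting the cumulative sum in blocks keeps them smaller.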
Jan Orwat
on 27 Jun 2016
1 vote
- If you have to use a loop, preallocate the variable RMS, because it is growing on every iteration. With 14M iterations that alone may take "ages".
- It looks like you are computing a moving average. Vectorize the code. Try movmean if you have MATLAB R2016a or newer. You can also do it via convolution, using conv/conv2, filter/filter2, fft, etc.
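As a rough sketch of the vectorized route, here are the movmean and conv2 variants side by side on a small random stand-in. The data, the window length, and the names RMS_mov and RMS_conv are assumptions for the example.

```matlab
rng(0);
X = randn(1000, 2);          % small stand-in for ACC_Data
N = 50;                      % window length (24007 in the question)

% Moving mean of the squares over rows k..k+N-1; 'discard' drops the
% partial windows at the end, leaving size(X,1)-N+1 rows.
RMS_mov = sqrt(movmean(X.^2, [0 N-1], 1, 'Endpoints', 'discard'));

% Same quantity by convolving each column with a length-N averaging kernel.
RMS_conv = sqrt(conv2(X.^2, ones(N, 1) / N, 'valid'));

maxDiff = max(abs(RMS_mov(:) - RMS_conv(:)));
```

With a window as long as 24007, direct convolution still does work proportional to the window length for every output row, so movmean or an FFT-based convolution is likely the better fit for the full data.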
I found this to run much faster (about 23 s on my machine): preallocate RMS_new, move the squaring and the division by N (to compute the mean) out of the loop, and then in each iteration subtract a single element from the previous mean and add a single element to it; finally take the square root.
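K = 13978444;                            % number of windows
N = 24007;                               % window length
ACC_DataN = (ACC_Data.^2)/N;             % square and scale once, outside the loop
RMS_new = nan(K, size(ACC_DataN, 2));    % preallocate the result
RMS_new(1, :) = sum(ACC_DataN(1:1+N-1, :));   % mean square of the first window
for i = 2:K
    % Slide the window by one row: drop the oldest term, add the newest
    RMS_new(i,:) = RMS_new(i-1,:) - ACC_DataN(i-1,:) + ACC_DataN(i+N-1,:);
end
RMS_new = sqrt(RMS_new);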
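Since the running update accumulates round-off over millions of steps, it may be worth spot-checking a few windows against the direct definition. The sketch below does that on a small random stand-in; the sizes, the names MS, checkRows and RMS_ref, and the max(MS, 0) clamp are additions for illustration and not part of the answer above.

```matlab
rng(0);
X = randn(2000, 2);                 % small stand-in for ACC_Data
N = 100;                            % window length (24007 in the question)
K = size(X, 1) - N + 1;             % number of full windows

XN = (X.^2) / N;                    % pre-squared and pre-scaled
MS = nan(K, size(XN, 2));           % preallocated mean-square values
MS(1, :) = sum(XN(1:N, :), 1);
for i = 2:K
    MS(i, :) = MS(i-1, :) - XN(i-1, :) + XN(i+N-1, :);
end
RMS_new = sqrt(max(MS, 0));         % clamp tiny negative round-off before sqrt

% Spot-check the first, middle and last windows against the direct definition.
checkRows = [1, round(K/2), K];
RMS_ref = zeros(numel(checkRows), size(X, 2));
for c = 1:numel(checkRows)
    k = checkRows(c);
    RMS_ref(c, :) = sqrt(mean(X(k:k+N-1, :).^2, 1));
end
maxDiff = max(max(abs(RMS_new(checkRows, :) - RMS_ref)));
```

The clamp matters only when a window's true mean square is close to zero, where accumulated round-off could push the running value slightly negative and make sqrt return a complex number.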
1 Comment
Tony Pate
on 27 Jun 2016