Importing large .raw file

Anas Khan on 11 Nov 2020
Edited: Anas Khan on 12 Nov 2020
I am attempting to import a large (>6 GB) .raw file containing time series data from neurons. The goal of my script is to store the voltage/time traces from every electrode in each of the 96 wells of my culture plate in a matrix that I can work with later. I have worked out all of the bugs in the code, but it now takes an egregious amount of time to finish loading the data. Is there any way to speed this up? Thanks!
%% Script to take raw voltage data and convert into spike data for a rasterplot
%RawV=AxisFile('Filename.raw').RawVoltageData.LoadData('A1','13',[0 300]); % Loads first 5 minutes of raw voltage data from "Filename.raw", well A1, electrode 13 and stores it in the variable "RawV"
% The above command returns a 4-D cell array in which all cells are empty
% except for RawV{1,1,1,3}, where the indices are {well row, well column, electrode column, electrode row}
%[t,v]=RawV{1,1,1,3}.GetTimeVoltageVector; % Retrieves voltage (v) and time (t) vectors from the above raw data set
%plot(t,v) % Plots voltage against time to visualize work so far
%%
Rows = {'A','B','C','D','E','F','G','H'}; % Cell array of well row letters
Cols = 1:12; % Well columns 1 to 12
elecC = 1:3; % Electrode columns 1 to 3
elecR = 1:3; % Electrode rows 1 to 3
cntr = 1;
for y = Cols % Iterate over each element in Cols
    for x = Rows % Iterate over each element in Rows (x is a 1x1 cell)
        for z = elecC % Iterate over each element in elecC
            for q = elecR % Iterate over each element in elecR
                RC_str = cell2mat([x,num2str(y)]); % Combines row and column to identify the well, e.g. 'A1' from 'A' and 1, as a char vector
                eRC_str = [num2str(z),num2str(q)]; % Same thing for electrode column and row, e.g. '13'
                if strcmp(eRC_str,'32')
                    continue
                end
                RawV = AxisFile('Maestro_(000).raw').RawVoltageData.LoadData(RC_str,eRC_str,[0 300]); % Loads data from file 'Maestro_(000).raw': well RC_str, electrode eRC_str
                RowVal = find(strcmp(Rows,x)); % Converts 'A' to 1, 'B' to 2, etc.
                [t,v] = RawV{RowVal,y,z,q}.GetTimeVoltageVector; % Extracts voltage vs time data from RawV: RowVal, y, z, q specify well and electrode
                Vdata(:,cntr) = v;
                cntr = cntr + 1;
            end
        end
    end
end
  2 Comments
Walter Roberson on 11 Nov 2020
How is AxisFile() implemented? If it is not using fseek() to position to locations in the file, then it is probably going to be faster to load larger parts of the file at a time.
Based on the form of your expressions, it looks to me as if AxisFile() might be based on a Java class?
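To illustrate what positioning with fseek() buys you, here is a minimal, self-contained sketch. It is not how AxisFile() is implemented; the temporary file, the int16 sample format, and the variable names are all assumptions made up for the example. It writes a small binary file, then jumps directly to a byte offset and reads only the samples that are needed.

```matlab
% Create a small temporary binary file of int16 samples
% (a hypothetical stand-in for a real recording).
fname = [tempname, '.raw'];
fid = fopen(fname, 'w');
fwrite(fid, int16(1:1000), 'int16');
fclose(fid);

% Jump straight past the first 500 samples and read the next 100,
% without reading anything that comes before them.
bytesPerSample = 2;                               % int16 is 2 bytes
fid = fopen(fname, 'r');
status = fseek(fid, 500 * bytesPerSample, 'bof'); % 0 on success, -1 on failure
chunk = fread(fid, 100, 'int16=>int16');          % column vector of 100 samples
fclose(fid);
delete(fname);
```

A reader that seeks like this only pays for the bytes it actually reads, which is why the presence or absence of fseek() matters for a file of several gigabytes.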
Anas Khan on 11 Nov 2020
Thanks so much for your reply! Forgive me, I am quite new to the programming world. A lot of the jargon goes over my head. I looked through the AxisFile() function and saw that it is using fseek() somewhere. I have attached the .m file also!


Accepted Answer

Walter Roberson on 12 Nov 2020
You should avoid creating the reading object on every iteration, avoid re-doing work such as rebuilding the same strings, and pre-allocate the output array.
I suspect there are more efficiencies, but it would be necessary to study the AxisFile class more carefully.
I am not clear as to why electrode column 3, row 2 is being skipped. Skipping it leaves 8 of the 9 electrode positions in each well, which makes it more difficult to keep track of which column holds which data.
%% Script to take raw voltage data and convert into spike data for a rasterplot
%RawV=AxisFile('Filename.raw').RawVoltageData.LoadData('A1','13',[0 300]); % Loads first 5 minutes of raw voltage data from "Filename.raw", well A1, electrode 13 and stores it in the variable "RawV"
% The above command returns a 4-D cell array in which all cells are empty
% except for RawV{1,1,1,3}, where the indices are {well row, well column, electrode column, electrode row}
%[t,v]=RawV{1,1,1,3}.GetTimeVoltageVector; % Retrieves voltage (v) and time (t) vectors from the above raw data set
%plot(t,v) % Plots voltage against time to visualize work so far
%%
Rows = {'A','B','C','D','E','F','G','H'}; % Cell array of well row letters
Cols = 1:12; % Well columns 1 to 12
elecC = 1:3; % Electrode columns 1 to 3
elecR = 1:3; % Electrode rows 1 to 3
cntr = 1;
RawVoltageData = AxisFile('Maestro_(000).raw').RawVoltageData; % Create the reading object once, outside the loops
for y = Cols % Iterate over each element in Cols
    ystr = sprintf('%d', y);
    for RowVal = 1:numel(Rows) % Iterate over the row indices, so RowVal is numeric
        x = Rows{RowVal};
        RC_str = [x, ystr]; % Combines row and column to identify the well, e.g. 'A1' from 'A' and 1, as a char vector
        for z = elecC % Iterate over each element in elecC
            zstr = sprintf('%d', z);
            for q = elecR % Iterate over each element in elecR
                qstr = sprintf('%d', q);
                eRC_str = [zstr, qstr]; % Same thing for electrode column and row, e.g. '13'
                if strcmp(eRC_str,'32')
                    continue
                end
                RawV = RawVoltageData.LoadData(RC_str,eRC_str,[0 300]); % Loads data from file 'Maestro_(000).raw': well RC_str, electrode eRC_str
                [t,v] = RawV{RowVal,y,z,q}.GetTimeVoltageVector; % Extracts voltage vs time data from RawV: RowVal, y, z, q specify well and electrode
                if cntr == 1
                    Vdata = zeros(length(v),96); % pre-allocate!
                end
                Vdata(:,cntr) = v;
                cntr = cntr + 1;
            end
        end
    end
end
Vdata = Vdata(:,1:cntr - 1); % we allocated for maximum size, but some items were skipped so trim down
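One way to keep track of which column of Vdata holds which well and electrode is to build a list of labels in the same loop order as the script. The sketch below is only an illustration: the `labels` variable and the 'A1_11' naming scheme are made up for the example, and it does not touch the .raw file at all.

```matlab
Rows  = {'A','B','C','D','E','F','G','H'};
Cols  = 1:12;
elecC = 1:3;
elecR = 1:3;

labels = cell(1, numel(Cols) * numel(Rows) * numel(elecC) * numel(elecR));
cntr = 1;
for y = Cols
    for RowVal = 1:numel(Rows)
        for z = elecC
            for q = elecR
                if z == 3 && q == 2
                    continue % electrode '32' is skipped, as in the script
                end
                % Hypothetical label format: well, underscore, electrode, e.g. 'A1_13'
                labels{cntr} = sprintf('%s%d_%d%d', Rows{RowVal}, y, z, q);
                cntr = cntr + 1;
            end
        end
    end
end
labels = labels(1:cntr - 1); % trim the unused slots left by the skipped electrode
```

With this in place, `labels{k}` names the well and electrode stored in `Vdata(:,k)`, and `find(strcmp(labels,'B1_11'))` looks up the column for a given trace.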
  1 Comment
Anas Khan on 12 Nov 2020
Edited: Anas Khan on 12 Nov 2020
Really appreciate your time! Electrode 32 is being skipped because it is turned off in the acquisition software, so it contains no data, and it was causing the script to terminate with an error until I inserted the line to skip it. I do have one question: I think Vdata should have many more than 96 columns, right? There are 96 wells, but each well has 8 electrodes, so 96 * 8 = 768 columns. I assume the columns start with electrodes 11, 12, 13, 21, 22, etc. for the A1 well, then continue with the B1 well until the entire first column of the plate is populated. Then, if I understand correctly, the script hops over to the top of the 2nd column (A2) of the plate. I tried to initialize Vdata outside of the for loop like this:
Vdata = NaN(3750000,768);
Here 3750000 is the "length(v)" you have, but MATLAB gave me an error saying the array was too large, so I took it out.
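The column count and the "array too large" error can both be checked with a little arithmetic. The sketch below assumes the 3750000 samples per trace quoted above and MATLAB's default double class (8 bytes per element). Storing the data as single is shown only as one possible way to halve the footprint; whether single precision is adequate for this data is an assumption that would need checking.

```matlab
nWells      = 8 * 12;               % rows A-H times columns 1-12 = 96 wells
nElectrodes = 3 * 3 - 1;            % 3x3 electrode grid minus the skipped '32' = 8
nCols       = nWells * nElectrodes; % 768 columns
nSamples    = 3750000;              % length(v) reported in the thread

bytesDouble = nSamples * nCols * 8; % double: 8 bytes per element
bytesSingle = nSamples * nCols * 4; % single: 4 bytes per element
fprintf('double: %.2f GB, single: %.2f GB\n', bytesDouble / 1e9, bytesSingle / 1e9);

% Small demonstration that zeros() accepts a class name; the real array would be
% zeros(nSamples, nCols, 'single'), which is not allocated here.
demo = zeros(1000, nCols, 'single');
info = whos('demo');
```

At roughly 23 GB for doubles, the full matrix is more than most workstations can hold in memory, which is consistent with the error. Loading and processing one well (or one plate column) at a time would avoid holding every trace at once.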

Release

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!