How to search for channel name and numerical data in resulting struct after importing multiple data files?

Questions:
1.) In avoiding using eval to dynamically name variables, how do I search a resulting struct of data and labels to link a channel name to a data column and then analyze multiple channels all having the same name?
2.) How do I properly write an if or switch case statement to deal with importing a single or multiple data files when the resulting workspace object is either a character array for a single file or a cell array for multiple files?
Background:
Currently using Matlab R2014b. I'm trying to write a script to select which data files, import / load the data, and place the data into a matrix or array or struct or whatever is most useful and appropriate for signal analysis, processing, and plotting afterward.
My data files are an export from a data acquisition tool (ATI VISION). The generated .mat file creates one cell array and one matrix. The cell array contains the text names of the data channels. This is n x 1 in size, where n is the number of channels exported. The matrix contains the numerical data and is m x n in size, where n is again the number of exported data channels and m is the number of samples.
The cell array of names has nearly zero consistency in the organization of the names (not alphabetical, not any order representing a channel number in the recording tool). The only consistency is that the nth row in the cell array shows no name "[]" but is always the "time" channel, and this always corresponds to column 1 in the data matrix. I've attached two mat files for reference. You can see in one file the first three rows are 'AngleSlipPoint2', 'AngleSlipPoint1', and 'AngleSlip', but in the other file the first three rows are 'PosLon', 'FRSpeed', and 'AccActPos'. I know my script can't be as easy as always accessing column 2 for x data and column 6 for y data. I need to search the cell array to link a name to a column in each individual data file.
In the two weeks that I've now been teaching myself how to write scripts and analyze data with Matlab I've apparently learned to do things the ill-advised way. The first obstacle I tried to address was aligning the data name in the cell array with the appropriate column in the data matrix. Because the cell array appears to be an array of text and not characters or strings I could find no other way to pull out the names, link them to data, and generate a variable than to use the frequently unrecommended eval function.
%% Extract Data
uiload
NumVars = numel(Data_Labels); % Establish number of variables to be created.
Time = Data(:,1); % Time values are always the first column of the Data matrix, so it's easy to define and create.
for k = 1:NumVars-1 % Time already created and exists as final channel, so we only need to generate variables for the remaining n-1 variables.
eval([Data_Labels{k},'=[Data(:,k+1)]']); % Extract variable names and populate with data in workspace.
end
This worked for a single data file as it gets me a workspace full of variables and correctly populates them with the numerical data. I can integrate, derive, filter, and plot whatever I want. It fails miserably as soon as I attempt to load a second file as the next import will overwrite everything created from the first file. Hard to compare longitudinal acceleration in 2wd and 4wd when the newest import overwrites the old, and it's understandably stupid to write the script to append a 1/2/3 to the end of the name so I can have multiple instances in the workspace.
This is what I've come up with for importing multiple files. Still using the two attached files as my test files for writing the script.
%% Select files for 2WD analysis
[selected2wdFiles,pathName2wd] = uigetfile('*.mat','Select 2WD data files for analysis','MultiSelect','on');
if isequal(selected2wdFiles, 0)
disp('No Files Selected')
return;
end
for m = 1:length(selected2wdFiles)
data2wd(m) = load(fullfile(pathName2wd, selected2wdFiles{m}));
end
%% Select files for 4WD analysis
[selected4wdFiles,pathName4wd] = uigetfile('*.mat','Select 4WD data files for analysis','MultiSelect','on');
if isequal(selected4wdFiles, 0)
disp('No Files Selected')
return;
end
for n = 1:length(selected4wdFiles)
data4wd(n) = load(fullfile(pathName4wd, selected4wdFiles{n}));
end
This generates two structs, data2wd and data4wd, which contain the loaded cell arrays and data matrices. Unfortunately this script only works if I am selecting multiple files. If I only select one file it fails because the resulting item is a character array instead of a cell array. I haven't tried to script around that, but I suppose a switch case or if statement should work. Question #2 above...any suggestions?
The next step / steps is where I am lost. I believe I have avoided dynamically named variables, but I don't know how to go about extracting my longitudinal acceleration data from each data set. The specific channel name in the cell array of text is going to be 'AccelForward'. I know I need to search the cell array in row 1, column 2 of the struct to find the row number containing that name. This will tell me which column to access in the matrix stored in row 1, column 1 of the struct. Because it is a cell array of text the strfind command doesn't work. They aren't strings. Similarly they aren't characters either, so the related char commands don't work. Without using eval to extract things, how do I go about searching an array of text?
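For the search described here, a minimal sketch using strcmp, which accepts a cell array directly and returns false for any cell that does not hold text (such as the empty time entry). The variables below are synthetic stand-ins for the Data_Labels and Data variables in one exported file; with the struct from the loop above the same lines would use data2wd(m).Data_Labels and data2wd(m).Data.

```matlab
% Synthetic stand-ins for one exported file (assumed layout, not real data).
Data_Labels = {'PosLon'; 'FRSpeed'; 'AccelForward'; []};  % last entry: unnamed time channel
Data = [(0:0.01:0.04).', rand(5,3)];                       % column 1 is time

% Locate the label. Non-text cells such as [] simply compare as false.
row = find(strcmp(Data_Labels, 'AccelForward'));

% Time occupies column 1, so label k belongs to data column k+1.
col = row + 1;
accel = Data(:, col);
```

If the label is missing from a file, row comes back empty, so checking isempty(row) before indexing avoids a confusing indexing error.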
Once I can find the name, identify the data column, and then locate the actual data, how do I manipulate it without falling back on dynamically named workspace variables? I feel like I'm going to end up pulling these columns of data back into the workspace as AccelForward_1, AccelForward_2, etc., and then it gets more complicated and dynamic because I will have 2wd and 4wd data being compared and plotted against each other. What's the correct way to identify the data, manipulate the data, store the new data, and then access it later for plotting? Do I just keep generating more structs or arrays or matrices to stuff the data into and avoid a ridiculous workspace full of variables?
Now that I'm done writing a novel I suppose I simply don't know what I don't know and it makes it difficult to search and find answers. If anyone can put some labels on the forks in the road and send me in a useful direction I'd appreciate it. Thank you.
  1 Comment
Stephen23 on 5 Apr 2019
"The only consistency is that the nth row in the cell array shows no name "[]" but is always the "time" channel, and this always corresponds to column 1 in the data matrix."
Ouch!


Accepted Answer

Stephen23 on 5 Apr 2019
Edited: Stephen23 on 5 Apr 2019
You are right to avoid dynamically accessing variable names (e.g. using eval, assignin, evalin, and load without an output variable). Read this to know some of the reasons why:
Here is one simple solution for your task, using a non-scalar structure and dynamic fieldnames:
Using structure fields makes the order of the columns in the numeric matrix totally irrelevant.
[F,P] = uigetfile('*.mat','2WD','MultiSelect','on');
if isnumeric(F)
error('User quit')
elseif ischar(F)
F = {F};
end
S = struct('filename',F);
for ii = 1:numel(F)
T = load(fullfile(P,F{ii}));
L = [{'Time'};T.Data_Labels(1:end-1)]; % fix "Time" column mismatch
for jj = 1:numel(L)
S(ii).(L{jj}) = T.Data(:,jj);
end
end
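One caveat with the loop above: dynamic fieldnames only work when every label is a valid MATLAB identifier. The labels in the two attached files appear to be valid, but if an export ever contains spaces, leading digits, or duplicates, a sketch like the following could sanitize them first. The label list here is hypothetical, and matlab.lang.makeValidName and matlab.lang.makeUniqueStrings are, as far as I recall, available from R2014a onward.

```matlab
% Hypothetical labels, two of which are not valid field names.
labels = {'Time'; 'FR Speed'; '2ndGear'; 'AccelForward'};

% Replace or prefix invalid characters, then make sure no two names collide.
fields = matlab.lang.makeValidName(labels);
fields = matlab.lang.makeUniqueStrings(fields);

% Every entry can now be used as S(ii).(fields{jj}).
allValid = all(cellfun(@isvarname, fields));
```

The sanitized names can differ from the original labels, so it may be worth keeping the original label list alongside the data for reference.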
The imported data is easy to access in the structure: you only need the index (one per file) and the fieldname (one per data column), e.g.:
>> S(1).filename
ans =
MKZ_2WD_LevelSnowAccel.mat
>> S(1).AccelForward([1:4,end-4:end])
ans =
-0.18
-0.18
-0.18
-0.18
... lots of lines
-1.92
-1.72
-1.2
-2.24
-4.1
>> S(1).Time([1:4,end-4:end])
ans =
-5.1505
-5.1405
-5.1305
-5.1205
... lots of lines
23.01
23.02
23.03
23.04
23.05
>> S(2).filename
ans =
MKZ_4WD_LevelSnowAccel.mat
>> S(2).AccelForward([1:4,end-4:end])
ans =
-0.07
-0.07
-0.07
-0.07
... lots of lines
0.23
0.19
0.3
0.33
0.27
>> S(2).Time([1:4,end-4:end])
ans =
-5.3711
-5.3611
-5.3511
-5.3411
... lots of lines
24.069
24.079
24.089
24.099
24.109
You could also do something similar with tables, timetables, or by rearranging the columns of the numeric array to have the same order.
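To compare the same channel across several files without creating numbered variables, one option is to loop over the structure array and keep per-file results in an ordinary array. The sketch below uses a small synthetic S with made-up filenames and values in place of the real imported data; only the field names Time and AccelForward are taken from the example above.

```matlab
% Synthetic stand-in for the structure array S built above (two "files").
S = struct('filename', {'run_a.mat', 'run_b.mat'});
S(1).Time = (0:0.01:0.04).';  S(1).AccelForward = [0; 1; 3; 2; 1];
S(2).Time = (0:0.01:0.03).';  S(2).AccelForward = [0; 2; 5; 4];

% One result per file, stored in a numeric vector rather than new variables.
peakAccel = zeros(1, numel(S));
for ii = 1:numel(S)
    peakAccel(ii) = max(S(ii).AccelForward);
end

% Overlay the same channel from every file on one set of axes.
figure; hold on
for ii = 1:numel(S)
    plot(S(ii).Time, S(ii).AccelForward)
end
hold off
legend({S.filename}, 'Interpreter', 'none')  % filenames contain underscores
xlabel('Time'); ylabel('AccelForward')
```

The files do not need to have the same number of samples, because each element of S holds its own column vectors.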
  2 Comments
Scooby921 on 13 May 2019
Edited: Scooby921 on 13 May 2019
As a follow-up a month later...thank you again! Worked with it a bit and learned a good deal more about working with structs. Wound up extending the script to include calling data from these initially generated structs, deriving acceleration from velocity, appending a lost data point, creating and applying filters, and loading everything back into a new struct of filtered data.
Just in case anyone looks up this question / answer and wants to see my end-result. Added notes at the end for colleagues who might use this script and may not fully understand what I've done.
%% Initialize
close all
clear
clc
%% Select and load 2wd data files into struct
[F2,P2] = uigetfile('*.mat','Select 2WD Data Files','MultiSelect','on');
if isnumeric(F2)
error('User quit')
elseif ischar(F2)
F2 = {F2};
end
D2 = struct('filename',F2);
for ii = 1:numel(F2)
Tmp2 = load(fullfile(P2,F2{ii}));
L2 = [{'Time'};Tmp2.Data_Labels(1:end-1)]; % fix "Time" column mismatch
for jj = 1:numel(L2)
D2(ii).(L2{jj}) = Tmp2.Data(:,jj);
end
end
clearvars ii jj Tmp2 L2
%% Select and load 4wd data files into struct
[F4,P4] = uigetfile('*.mat','Select 4WD Data Files','MultiSelect','on');
if isnumeric(F4)
error('User quit')
elseif ischar(F4)
F4 = {F4};
end
D4 = struct('filename',F4);
for kk = 1:numel(F4)
Tmp4 = load(fullfile(P4,F4{kk}));
L4 = [{'Time'};Tmp4.Data_Labels(1:end-1)]; % fix "Time" column mismatch
for nn = 1:numel(L4)
D4(kk).(L4{nn}) = Tmp4.Data(:,nn);
end
end
clearvars kk nn Tmp4 L4
%% Define inertial sensor filter
AccelFilt = designfilt('lowpassiir', 'PassbandFrequency', 5, 'StopbandFrequency', 25, 'PassbandRipple', 1, 'StopbandAttenuation', 40, 'SampleRate', 100, 'MatchExactly', 'passband');
%% Derive wheel accelerations from wheel speeds
WhlAcc2 = struct('filename',F2,'FLAcc',zeros,'FRAcc',zeros,'RLAcc',zeros,'RRAcc',zeros);
for qq = 1:numel(F2)
WhlAcc2(qq).FLAcc = [diff(D2(qq).FLSpeed);0];
WhlAcc2(qq).FRAcc = [diff(D2(qq).FRSpeed);0];
WhlAcc2(qq).RLAcc = [diff(D2(qq).RLSpeed);0];
WhlAcc2(qq).RRAcc = [diff(D2(qq).RRSpeed);0];
end
WhlAcc4 = struct('filename',F4,'FLAcc',zeros,'FRAcc',zeros,'RLAcc',zeros,'RRAcc',zeros);
for rr = 1:numel(F4)
WhlAcc4(rr).FLAcc = [diff(D4(rr).FLSpeed);0];
WhlAcc4(rr).FRAcc = [diff(D4(rr).FRSpeed);0];
WhlAcc4(rr).RLAcc = [diff(D4(rr).RLSpeed);0];
WhlAcc4(rr).RRAcc = [diff(D4(rr).RRSpeed);0];
end
clearvars qq rr
%% Filter data and load into new struct
D2f = struct('filename',F2,'AccelxF',zeros,'AccelyF',zeros,'YawRateF',zeros,'FLAccF',zeros,'FRAccF',zeros,'RLAccF',zeros,'RRAccF',zeros);
for nn = 1:numel(F2)
D2f(nn).AccelxF = filtfilt(AccelFilt,D2(nn).AccelForward);
D2f(nn).AccelyF = filtfilt(AccelFilt,D2(nn).AccelLateralCorr);
D2f(nn).YawRateF = filtfilt(AccelFilt,D2(nn).AngRateZCorr);
D2f(nn).FLAccF = filtfilt(AccelFilt,WhlAcc2(nn).FLAcc);
D2f(nn).FRAccF = filtfilt(AccelFilt,WhlAcc2(nn).FRAcc);
D2f(nn).RLAccF = filtfilt(AccelFilt,WhlAcc2(nn).RLAcc);
D2f(nn).RRAccF = filtfilt(AccelFilt,WhlAcc2(nn).RRAcc);
end
D4f = struct('filename',F4,'AccelxF',zeros,'AccelyF',zeros,'YawRateF',zeros,'FLAccF',zeros,'FRAccF',zeros,'RLAccF',zeros,'RRAccF',zeros);
for pp = 1:numel(F4)
D4f(pp).AccelxF = filtfilt(AccelFilt,D4(pp).AccelForward);
D4f(pp).AccelyF = filtfilt(AccelFilt,D4(pp).AccelLateralCorr);
D4f(pp).YawRateF = filtfilt(AccelFilt,D4(pp).AngRateZCorr);
D4f(pp).FLAccF = filtfilt(AccelFilt,WhlAcc4(pp).FLAcc);
D4f(pp).FRAccF = filtfilt(AccelFilt,WhlAcc4(pp).FRAcc);
D4f(pp).RLAccF = filtfilt(AccelFilt,WhlAcc4(pp).RLAcc);
D4f(pp).RRAccF = filtfilt(AccelFilt,WhlAcc4(pp).RRAcc);
end
clearvars nn pp
%% Note
% At this point all 2wd data is loaded into a struct named "D2"
% and all 4wd data is loaded into a struct named "D4".
% Filtered 2wd data for accelerations is loaded into "D2f"
% and filtered 4wd data for accelerations is loaded into "D4f".
% All file names are loaded into the first column of those structs.
% To confirm which data file is in each row use the following syntax:
% D2(r).filename where 'r' is the row in question
% D4(r).filename where 'r' is the row in question
% Numerical data can be accessed by calling the struct, row, and name
% of desired data.
% Example: Call steering wheel angle for first 2wd data file -->
% D2(1).SteeringWheelAngle
% Example: Call front right wheel speed for fifth 4wd data file -->
% D4(5).FRSpeed
% Example: Call filtered long accel for third 2wd data file -->
% D2f(3).AccelxF
% Plotting will use the same syntax for calling a variable -->
% plot(D2(2).Time,D2(2).SteeringWheelAngle)
%
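A note on the wheel-acceleration step in the script above: diff on its own returns the change in speed per sample, not per unit of time. If a true acceleration is wanted, one possible adjustment is to divide by the sample spacing taken from the Time channel. The sketch below uses synthetic data at 100 Hz (matching the SampleRate passed to designfilt) and assumes Time is in seconds; the variable names are illustrative only.

```matlab
% Synthetic stand-in for one file: speed ramps at 2 units per second, 100 Hz.
Time  = (0:0.01:0.05).';
Speed = 2 * Time;

% Sample spacing taken from the data itself, not hard-coded.
dt = diff(Time);

% Finite-difference derivative, padded with a trailing zero as in the script
% above so the result has the same length as the input.
Acc = [diff(Speed) ./ dt; 0];
```

The result is in speed units per second, so wheel speeds in km/h would give km/h per second unless converted first.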


More Answers (1)

Guillaume on 5 Apr 2019
Considering that one of the variables is time, you may be better off storing your data in a timetable rather than a structure.
The principle would be the same: use the cell array of names to name the variables instead of fields.
I'm a bit confused about one thing. If the time is the first column of the matrix, why is it the last element of the cell array? Is the array of names reversed with regard to the data columns, or does Data_Labels(1:end-1) correspond to Data(:,2:end)?
I'm assuming the time is in seconds:
filepath = 'MKZ_2WD_LevelSnowAccel.mat'; %obtained however you want, e.g. with uigetfile
filecontent = load(filepath);
signals = array2timetable(filecontent.Data(:, 2:end), 'RowTimes', seconds(filecontent.Data(:, 1)), 'VariableNames', filecontent.Data_Labels(1:end-1));
If you want to import multiple files, you can store each timetable in a cell array, or vertically concatenate them into one big timetable. For that, I'd add a column indicating which source file each row came from. The order of the variables in a table does not have to be the same when you vertically concatenate tables, so the mismatched ordering wouldn't be an issue.
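A minimal sketch of that concatenation, using plain tables with made-up values so that it does not depend on timetable support (which, as far as I know, arrived after R2014b). The variable names, the Source column, and the filenames are illustrative assumptions; the same name-based matching should apply to timetables.

```matlab
% Two small tables whose variables are in a different order (synthetic data).
t1 = array2table([1 2; 3 4],     'VariableNames', {'Speed', 'Slip'});
t2 = array2table([10 20; 30 40], 'VariableNames', {'Slip', 'Speed'});

% Tag every row with the file it came from (hypothetical filenames).
t1.Source = repmat({'file_a.mat'}, height(t1), 1);
t2.Source = repmat({'file_b.mat'}, height(t2), 1);

% Vertical concatenation matches variables by name, not by position.
T = [t1; t2];

% Rows belonging to one source file can be pulled back out by name.
fromB = T(strcmp(T.Source, 'file_b.mat'), :);
```

After concatenation T.Speed holds the Speed values from both tables even though Speed was the second column in t2.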
  6 Comments
Scooby921 on 5 Apr 2019
Since there seem to be concerns with releases and available features...I started using R2014b because that's what we use for Simulink modeling and are stuck on that release for the moment to maintain model / s-function compatibility with customers. With my data analysis likely being a stand-alone function that I or other team members are going to run separate from model development I shouldn't have a problem upgrading to R2018b or 19a. If that opens up more options and makes life easier I'll go ahead and do that. Wasn't an initial thought or concern simply because I already had a version of Matlab loaded and working on my computer.
Guillaume on 5 Apr 2019
the original question mentions "Currently using Matlab R2014b..."
That, I did indeed miss in the wall of text (and the fact that the Release was tagged, I should have looked at that).
"Yes the columns and names do match, just offset by 1 due to the time data being column 1 yet row n in the array of names"
Then, both answers account for that. The timetable or structure use the names in whichever order they come to name the matching column.
Neither timetables nor structures care about the ordering of the fields/variables when you operate on them (as long as you are using the names and not numeric indices), so it does not matter if they're not in the same order from file to file.
%tables work the same way as timetables
t1 = array2table(rand(10, 3), 'VariableNames', {'Speed', 'Slip', 'Pitch'})
t1.Slip %will return the 2nd column of the table
t2 = array2table(rand(10, 3), 'VariableNames', {'Pitch', 'Speed', 'Slip'})
t2.Slip %will return the 3rd column of the table


Release

R2014b
