Extract data from HTML Table

I have an HTML file (attached as .txt) with a table of data I would like to extract. I found the File Exchange submission htmltabletocell, but I don't think it will work for me, as my table has several repetitions of similar data blocks with the same initial string.
Ultimately, I would like to extract the data values from only a few of the selected fields below (for example, Date/Time and Diagnostic Reason only).
Any help is much appreciated.

Accepted Answer

Mathieu NOE on 5 Mar 2021
Hello Marcus,
Here is a simple code snippet:
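Filename = 'html_table.txt';
[myDate, myDiagnostic] = extract_data(Filename)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [myDate, myDiagnostic] = extract_data(Filename)
% Collect the line found 3 lines below each 'Date/Time' and 'Diagnostic' label.
    fid = fopen(Filename);
    if fid == -1
        error('Could not open file: %s', Filename);
    end
    tline = fgetl(fid);
    % initialization
    k = 0;            % current line index
    p = 0;            % number of 'Date/Time' labels found so far
    q = 0;            % number of 'Diagnostic' labels found so far
    k_parameter = 0;  % line index of the most recent 'Date/Time' label
    q_parameter = 0;  % line index of the most recent 'Diagnostic' label
    myDate = {};
    myDiagnostic = {};
    while ischar(tline)
        k = k+1; % loop over line index
        % retrieve the Date/Time value (3 lines below its label)
        if contains(tline,'Date/Time')
            k_parameter = k;
            p = p+1;
        end
        if p>0 && k == k_parameter + 3
            myDate = [myDate; cellstr(tline)];
        end
        % retrieve the Diagnostic value (3 lines below its label)
        if contains(tline,'Diagnostic')
            q_parameter = k;
            q = q+1;
        end
        if q>0 && k == q_parameter + 3
            myDiagnostic = [myDiagnostic; cellstr(tline)];
        end
        tline = fgetl(fid);
    end
    fclose(fid);
end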
This gives the following results:
myDate =
3×1 cell array
{'2021-01-20 18:17'}
{'2021-01-20 18:16'}
{'2020-12-10 22:44'}
myDiagnostic =
3×1 cell array
{'LARGE PT'}
{'LARGE PT'}
{'LG PT' }
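
The "value is three lines below its label" rule used above can also be sketched without an explicit loop. This is only an illustration under stated assumptions: the sample lines in fileLines are hypothetical (they are not the contents of the attached file), and the offset of 3 is taken from the code above. Unlike the loop version, it does not special-case a label that reappears within three lines of the previous one.

```matlab
% Hypothetical sample lines (NOT the poster's file). Each label is assumed
% to be followed three lines later by its value, as in the code above.
fileLines = { ...
    '<td>Date/Time</td>'; ...
    '</td>'; ...
    '<td>'; ...
    '2021-01-20 18:17'; ...
    '<td>Diagnostic Reason</td>'; ...
    '</td>'; ...
    '<td>'; ...
    'LARGE PT'; ...
    '<td>Date/Time</td>'; ...
    '</td>'; ...
    '<td>'; ...
    '2020-12-10 22:44'; ...
    '<td>Diagnostic Reason</td>'; ...
    '</td>'; ...
    '<td>'; ...
    'LG PT'};

offset = 3;  % assumed distance (in lines) between a label and its value

% Line indices of the values: label positions shifted by the offset
dateIdx = find(contains(fileLines, 'Date/Time'))  + offset;
diagIdx = find(contains(fileLines, 'Diagnostic')) + offset;

% Discard indices that would point past the end of the file
dateIdx = dateIdx(dateIdx <= numel(fileLines));
diagIdx = diagIdx(diagIdx <= numel(fileLines));

myDate       = fileLines(dateIdx);
myDiagnostic = fileLines(diagIdx);
```

For a real file, the lines would first have to be read into a cell array or string array, for example with readlines(Filename) in R2020b or later. If the layout of the HTML differs from the assumed three-line spacing, offset needs to be adjusted accordingly.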
