Importing table/array from website to Matlab
Hello. I am trying to import the central table from the page 'https://tennisabstract.com/reports/atp_elo_ratings.html', from the player's name through the "Grass" column, into MATLAB as an array or table. I want to read it directly from the web because the page is updated every week, but I do not know how to approach the problem and would appreciate your help.
Thank you, Erik
Accepted Answer
Guillaume
on 3 Jul 2019
First, note that importing data from HTML is always going to be very iffy. HTML is a presentation format designed to display things to humans. It's not designed for data transfer, and you're going to have to remove all the presentation cruft to get at your data.
So, the first thing you should try is contacting the website to see if they have a direct interface to the underlying database.
Bearing this in mind, the following will import your data with the current format of the website. Any change to the format of that page, even a minor one, may break the code.
%definition of patterns used to locate required information within a table row:
intpattern = '<td[^>]*>(\d+)</td>';
linkpattern = '<td[^>]*><a[^>]+>([^<]+)</a></td>';
numberpattern = '<td[^>]*>(\d+(\.\d+)?)</td>';
anypattern = '<td[^>]*>([^<]+)</td>';
emptypattern = '<td[^>]*></td>';
%read and parse html
html = webread('https://tennisabstract.com/reports/atp_elo_ratings.html');
tabledata = regexp(html, ['<tr[^>]*>', ...
intpattern, ...
linkpattern, ...
numberpattern, ...
numberpattern, ...
emptypattern, ...
numberpattern, ...
numberpattern, ...
numberpattern, ...
emptypattern, ...
anypattern, ...
numberpattern, ...
numberpattern], 'tokens');
assert(~isempty(tabledata), 'Failed to parse html according to pattern. The format of the page may have changed');
tabledata = cell2table(vertcat(tabledata{:}), 'VariableNames', {'Rank', 'Player', 'Age', 'ELO', 'Hard', 'Clay', 'Grass', 'Peak_Match', 'Peak_Age', 'Peak_ELO'});
tabledata = convertvars(tabledata, [1, 3:7, 9, 10], @str2double);
tabledata.Player = strrep(tabledata.Player, '&nbsp;', ' ');  %replace the html non-breaking space entity with a plain space
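To see what each pattern captures, it may help to run a shortened version of the expression on a single row. The row below is made up for illustration (the markup of the real page may differ), and only the two patterns for the rank and player cells are used:

```matlab
% Hypothetical table row, invented for illustration only.
row = '<tr><td align="right">1</td><td><a href="player.html">Novak&nbsp;Djokovic</a></td></tr>';

intpattern  = '<td[^>]*>(\d+)</td>';                 % captures an integer cell
linkpattern = '<td[^>]*><a[^>]+>([^<]+)</a></td>';   % captures the text of a link cell

% 'tokens' returns one cell per matched row, holding the captured groups in order.
tokens = regexp(row, ['<tr[^>]*>', intpattern, linkpattern], 'tokens');

rank   = str2double(tokens{1}{1});              % numeric rank
player = strrep(tokens{1}{2}, '&nbsp;', ' ');   % entity replaced by a plain space
```

Every captured value comes back as text, which is why the numeric columns have to be converted afterwards.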
5 Comments
Guillaume
on 3 Jul 2019
Don't use c as a variable name. It's meaningless and doesn't say anything about what it contains.
Assuming you've imported the data as tabledata:
>> elo(tabledata, 'Ivan Nedelko', 'Kevin King')
ans =
0.230209216637309
0.769790783362691
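The elo function called above is not shown in this part of the thread. A minimal sketch of what such a function might look like, assuming the standard Elo expected-score formula and a table with Player and ELO variables, is given below. The signature and the argument name ratings are guesses for illustration, not the original code; it would be saved as elo.m:

```matlab
function p = elo(ratings, player1, player2)
% ELO  Win probabilities for two players from their Elo ratings (sketch).
%   ratings is assumed to be a table with a text variable Player and a
%   numeric variable ELO. p(1) is the probability that player1 wins,
%   p(2) the probability that player2 wins.
    r1 = ratings.ELO(strcmp(ratings.Player, player1));
    r2 = ratings.ELO(strcmp(ratings.Player, player2));
    assert(isscalar(r1) && isscalar(r2), ...
        'Each player must appear exactly once in the table');
    q = 10 .^ ([r1; r2] / 400);   % rating converted to a relative strength
    p = q / sum(q);               % the two probabilities sum to 1
end
```

Returning both probabilities as a column matches the two values displayed above.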
Guillaume
on 3 Jul 2019
Note: you could calculate the odds for matches of every player against any player with:
Q = 10 .^ (tabledata.ELO / 400);
odds = Q ./ (Q + Q.');
odds(r, c) is then the probability (a value between 0 and 1) of tabledata.Player{r} winning against tabledata.Player{c}.
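As a small worked example of the matrix above, here is the same calculation on three made-up ratings. The player names and numbers are hypothetical, not taken from the website:

```matlab
% Made-up ratings for three hypothetical players (not real data).
players = {'A'; 'B'; 'C'};
ELO     = [2000; 1800; 1800];

Q    = 10 .^ (ELO / 400);
odds = Q ./ (Q + Q.');     % implicit expansion, needs R2016b or later

% Look up one pairing by name instead of by row/column index.
r = find(strcmp(players, 'A'));
c = find(strcmp(players, 'B'));
pAbeatsB = odds(r, c);     % about 0.76 for a 200-point rating gap
```

Each entry and its mirror image sum to 1, so odds(c, r) is about 0.24 here. The diagonal is always 0.5 and has no meaning, since a player does not play against themselves.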