Convert a datastore into a tall table, calculate its size using a deferred calculation, and then perform the calculation and return the result in memory.
First, create a datastore for the airlinesmall.csv
data set. Treat 'NA'
values as missing data so that they are replaced with NaN
values. Set the text format of a few columns so that they are read as a cell array of character vectors. Convert the datastore into a tall table.
T =
Mx29 tall table
Year Month DayofMonth DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime UniqueCarrier FlightNum TailNum ActualElapsedTime CRSElapsedTime AirTime ArrDelay DepDelay Origin Dest Distance TaxiIn TaxiOut Cancelled CancellationCode Diverted CarrierDelay WeatherDelay NASDelay SecurityDelay LateAircraftDelay
____ _____ __________ _________ _______ __________ _______ __________ _____________ _________ _______ _________________ ______________ _______ ________ ________ _______ _______ ________ ______ _______ _________ ________________ ________ ____________ ____________ ________ _____________ _________________
1987 10 21 3 642 630 735 727 {'PS'} 1503 {'NA'} 53 57 NaN 8 12 {'LAX'} {'SJC'} 308 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
1987 10 26 1 1021 1020 1124 1116 {'PS'} 1550 {'NA'} 63 56 NaN 8 1 {'SJC'} {'BUR'} 296 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
1987 10 23 5 2055 2035 2218 2157 {'PS'} 1589 {'NA'} 83 82 NaN 21 20 {'SAN'} {'SMF'} 480 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
1987 10 23 5 1332 1320 1431 1418 {'PS'} 1655 {'NA'} 59 58 NaN 13 12 {'BUR'} {'SJC'} 296 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
1987 10 22 4 629 630 746 742 {'PS'} 1702 {'NA'} 77 72 NaN 4 -1 {'SMF'} {'LAX'} 373 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
1987 10 28 3 1446 1343 1547 1448 {'PS'} 1729 {'NA'} 61 65 NaN 59 63 {'LAX'} {'SJC'} 308 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
1987 10 8 4 928 930 1052 1049 {'PS'} 1763 {'NA'} 84 79 NaN 3 -2 {'SAN'} {'SFO'} 447 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
1987 10 10 6 859 900 1134 1123 {'PS'} 1800 {'NA'} 155 143 NaN 11 -1 {'SEA'} {'LAX'} 954 NaN NaN 0 {'NA'} 0 NaN NaN NaN NaN NaN
: : : : : : : : : : : : : : : : : : : : : : : : : : : : :
: : : : : : : : : : : : : : : : : : : : : : : : : : : : :
The display of the tall table indicates that MATLAB® does not yet know how many rows of data are in the table.
Calculate the size of the tall table. Since calculating the size of a tall array requires a full pass through the data, MATLAB does not immediately calculate the value. Instead, like most operations with tall arrays, the result is an unevaluated tall array whose values and size are currently unknown.
s =
1x2 tall double row vector
? ?
Use the gather
function to perform the deferred calculation and return the result in memory. The result returned by size
is a trivially small 1-by-2 vector, which fits in memory.
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.8 sec
Evaluation completed in 2.1 sec
If you use gather
on an unreduced tall array, then the result might not fit in memory. If you are unsure whether the result returned by gather
can fit in memory, use gather(head(X))
or gather(tail(X))
to bring only a small portion of the calculation result into memory.