arrayDatastore

Datastore for in-memory data

Description

Use an ArrayDatastore object to manage a datastore created from in-memory data. You can create an ArrayDatastore object using the arrayDatastore function, specify its properties, and then import and process the data using object functions.

Creation

Description

arrds = arrayDatastore(A) creates a datastore arrds from array A stored in memory.

arrds = arrayDatastore(A,Name,Value) specifies additional parameters and properties for arrds using one or more name-value pair arguments. For example, specify that each call to the read function reads three rows of data by calling arrds = arrayDatastore(data,"ReadSize",3).

Input Arguments

Input array A, specified as a matrix.

Properties

ArrayDatastore properties describe the format of in-memory data in a datastore object and control how the data is read from the datastore. You can specify the value of ArrayDatastore properties using name-value pair arguments when you create the datastore object. To view or modify a property after creating the object, use dot notation.
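As a minimal sketch of the dot notation mentioned above, the following lines view and then modify the ReadSize property of an existing datastore. The variable names (A, currentSize, rows) are chosen for illustration only.

```matlab
% Create a datastore with default property values.
A = magic(4);
arrds = arrayDatastore(A);

% View a property with dot notation (the default ReadSize is 1).
currentSize = arrds.ReadSize;

% Modify the property, then read two rows in one call.
arrds.ReadSize = 2;
rows = read(arrds);   % with the default 'cell' output type, a 2-by-1 cell array
```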

Amount of data to read in a call to the read function, specified as the comma-separated pair consisting of 'ReadSize' and a positive integer. Each call to read reads a maximum of ReadSize rows. If you specify a value for 'ReadSize' that exceeds the number of rows in the input data, read returns all the rows in the datastore object.

The default value of 'ReadSize' is 1.

Data Types: double
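To illustrate that ReadSize is an upper bound rather than a fixed chunk size, the sketch below reads a 10-row column vector in chunks of at most four rows and records how many rows each call returns. The names A and chunkSizes are illustrative, and the loop relies on the hasdata function listed under Object Functions.

```matlab
% A 10-row input read in chunks of at most 4 rows.
A = (1:10)';
arrds = arrayDatastore(A,"ReadSize",4,"OutputType","same");

chunkSizes = [];
while hasdata(arrds)
    chunk = read(arrds);                    % numeric array, because OutputType is 'same'
    chunkSizes(end+1) = size(chunk,1);      %#ok<SAGROW> record rows returned by this call
end

% The final call returns only the rows that remain, so the recorded
% sizes add up to the number of rows in A.
totalRows = sum(chunkSizes);
```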

Dimension in which to read in a call to the read function, specified as the comma-separated pair consisting of 'IterationDimension' and a positive integer. For example, 'IterationDimension',2 makes read return column-oriented data from the datastore object. The default value of 'IterationDimension' is 1, which makes read return row-oriented data.

If you specify the value of the 'OutputType' property as 'same', then 'IterationDimension' must be set to a value of 1.

If you modify the value of 'IterationDimension' after creating your ArrayDatastore object, MATLAB® resets the datastore to an unread state.

Data Types: double
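A small sketch of column-oriented reading, assuming a 4-by-4 numeric input and the default 'cell' output type. The variable names are illustrative.

```matlab
% Iterate over the second dimension, so each read returns one column.
A = magic(4);
arrds = arrayDatastore(A,"IterationDimension",2);

c = read(arrds);       % cell array holding the first slice along dimension 2
firstColumn = c{1};    % the first column of A
```

Compare firstColumn with A(:,1) to confirm which slice was returned.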

Output data type, specified as the comma-separated pair consisting of 'OutputType' and one of these values:

  • 'cell' — Return the data as an n-by-1 cell array. For example, if A is a numeric array and ReadSize is 3, read returns a 3-by-1 cell array of numeric data.

  • 'same' — Return the same data type as the input array A. For example, if A is a numeric array, read returns numeric arrays.

The value of OutputType determines the data type returned by the preview, read, and readall functions.

If you modify the value of 'OutputType' after creating your ArrayDatastore object, MATLAB resets the datastore to an unread state.

Data Types: char | string
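The difference between the two output types can be sketched by reading the same matrix through two datastores that differ only in 'OutputType'. The names cellDs, sameDs, c, and m are illustrative.

```matlab
A = magic(5);

% Same input and ReadSize, different output types.
cellDs = arrayDatastore(A,"ReadSize",3,"OutputType","cell");
sameDs = arrayDatastore(A,"ReadSize",3,"OutputType","same");

c = read(cellDs);   % 3-by-1 cell array, one row of A per cell
m = read(sameDs);   % 3-by-5 numeric array, the first three rows of A
```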

Object Functions

hasdata          Determine if data is available to read
numpartitions    Number of datastore partitions
partition        Partition a datastore
preview          Preview subset of data in datastore
read             Read data in datastore
readall          Read all data in datastore
reset            Reset datastore to initial state
transform        Transform datastore
combine          Combine data from multiple datastores
shuffle          Shuffle all data in datastore
subset           Create subset of datastore or file-set
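The following sketch shows how several of these functions might be used together on a small numeric input: a hasdata/read loop, a reset, a partition, and a subset. The data and variable names are illustrative, and the exact rows assigned to each partition are not assumed here.

```matlab
% Six rows of the form [k, 10*k].
A = (1:6)' * [1 10];
arrds = arrayDatastore(A,"OutputType","same");

% Read row by row until the datastore is exhausted.
total = 0;
while hasdata(arrds)
    row = read(arrds);
    total = total + sum(row);
end

% Return to the unread state and read the first row again.
reset(arrds);
firstAgain = read(arrds);

% Split the datastore into two partitions and read each one.
part1 = readall(partition(arrds,2,1));
part2 = readall(partition(arrds,2,2));

% Select the second and fourth rows by index.
picked = readall(subset(arrds,[2 4]));
```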

Examples

Create an ArrayDatastore object from a matrix, then read all of the data in the datastore.

Create a matrix.

A = magic(10)
A = 10×10

    92    99     1     8    15    67    74    51    58    40
    98    80     7    14    16    73    55    57    64    41
     4    81    88    20    22    54    56    63    70    47
    85    87    19    21     3    60    62    69    71    28
    86    93    25     2     9    61    68    75    52    34
    17    24    76    83    90    42    49    26    33    65
    23     5    82    89    91    48    30    32    39    66
    79     6    13    95    97    29    31    38    45    72
    10    12    94    96    78    35    37    44    46    53
    11    18   100    77    84    36    43    50    27    59

Create an ArrayDatastore object from the matrix.

arrds = arrayDatastore(A)
arrds = 
  ArrayDatastore with properties:

              ReadSize: 1
    IterationDimension: 1
            OutputType: "cell"

Read all of the data in the datastore.

readall(arrds)
ans=10×1 cell array
    {[   92 99 1 8 15 67 74 51 58 40]}
    {[  98 80 7 14 16 73 55 57 64 41]}
    {[  4 81 88 20 22 54 56 63 70 47]}
    {[  85 87 19 21 3 60 62 69 71 28]}
    {[   86 93 25 2 9 61 68 75 52 34]}
    {[ 17 24 76 83 90 42 49 26 33 65]}
    {[  23 5 82 89 91 48 30 32 39 66]}
    {[  79 6 13 95 97 29 31 38 45 72]}
    {[ 10 12 94 96 78 35 37 44 46 53]}
    {[11 18 100 77 84 36 43 50 27 59]}

Return the same data types as the input array instead of returning the data as an n-by-1 cell array.

Create a table from the spreadsheet file counties.xlsx. Import all the rows from the fifth through tenth columns in the spreadsheet.

T = readtable("counties.xlsx","Range",[1,5,67,10])
T=66×6 table
          CountyName           State        StateName       Population2010    HousingUnits2010     LandArea 
    _______________________    ______    _______________    ______________    ________________    __________

    {'Fairfield County'   }    {'CT'}    {'Connecticut'}      9.1683e+05         3.6122e+05       1.6185e+09
    {'Hartford County'    }    {'CT'}    {'Connecticut'}      8.9401e+05         3.7425e+05       1.9039e+09
    {'Litchfield County'  }    {'CT'}    {'Connecticut'}      1.8993e+05              87550       2.3842e+09
    {'Middlesex County'   }    {'CT'}    {'Connecticut'}      1.6568e+05              74837       9.5649e+08
    {'New Haven County'   }    {'CT'}    {'Connecticut'}      8.6248e+05           3.62e+05       1.5657e+09
    {'New London County'  }    {'CT'}    {'Connecticut'}      2.7406e+05         1.2099e+05        1.722e+09
    {'Tolland County'     }    {'CT'}    {'Connecticut'}      1.5269e+05              57963       1.0624e+09
    {'Windham County'     }    {'CT'}    {'Connecticut'}      1.1843e+05              49073       1.3284e+09
    {'Androscoggin County'}    {'ME'}    {'Maine'      }       1.077e+05              49090       1.2119e+09
    {'Aroostook County'   }    {'ME'}    {'Maine'      }           71870              39529       1.7279e+10
    {'Cumberland County'  }    {'ME'}    {'Maine'      }      2.8167e+05         1.3866e+05       2.1633e+09
    {'Franklin County'    }    {'ME'}    {'Maine'      }           30768              21709       4.3942e+09
    {'Hancock County'     }    {'ME'}    {'Maine'      }           54418              40184         4.11e+09
    {'Kennebec County'    }    {'ME'}    {'Maine'      }      1.2215e+05              60972       2.2469e+09
    {'Knox County'        }    {'ME'}    {'Maine'      }           39736              23744       9.4569e+08
    {'Lincoln County'     }    {'ME'}    {'Maine'      }           34457              23493       1.1806e+09
      ⋮

Create a datastore from the table. Set 'OutputType' to 'same' to return the same data types as the input table.

arrds = arrayDatastore(T,"OutputType","same")
arrds = 
  ArrayDatastore with properties:

              ReadSize: 1
    IterationDimension: 1
            OutputType: "same"

Preview the data in the datastore.

preview(arrds)
ans=8×6 table
         CountyName          State        StateName       Population2010    HousingUnits2010     LandArea 
    _____________________    ______    _______________    ______________    ________________    __________

    {'Fairfield County' }    {'CT'}    {'Connecticut'}      9.1683e+05         3.6122e+05       1.6185e+09
    {'Hartford County'  }    {'CT'}    {'Connecticut'}      8.9401e+05         3.7425e+05       1.9039e+09
    {'Litchfield County'}    {'CT'}    {'Connecticut'}      1.8993e+05              87550       2.3842e+09
    {'Middlesex County' }    {'CT'}    {'Connecticut'}      1.6568e+05              74837       9.5649e+08
    {'New Haven County' }    {'CT'}    {'Connecticut'}      8.6248e+05           3.62e+05       1.5657e+09
    {'New London County'}    {'CT'}    {'Connecticut'}      2.7406e+05         1.2099e+05        1.722e+09
    {'Tolland County'   }    {'CT'}    {'Connecticut'}      1.5269e+05              57963       1.0624e+09
    {'Windham County'   }    {'CT'}    {'Connecticut'}      1.1843e+05              49073       1.3284e+09

Create a datastore for a MAT-file variable, and then read data from the file with different ReadSize values.

Load the MAT-file BostonWeatherData.mat into the workspace.

load 'BostonWeatherData.mat'

Create a datastore for the weatherData variable. Set ReadSize to 10 rows. The value of ReadSize determines how many rows of data are read from the datastore with each call to the read function. Set 'OutputType' to 'same' to return the same data types as the input array.

arrds = arrayDatastore(weatherData,"ReadSize",10,"OutputType","same")
arrds = 
  ArrayDatastore with properties:

              ReadSize: 10
    IterationDimension: 1
            OutputType: "same"

Read the data from the datastore.

data1 = read(arrds)
data1=10×3 timetable
       Time        TemperatureF    Humidity       Events   
    ___________    ____________    ________    ____________

    01-Jul-2015         72            78       Thunderstorm
    02-Jul-2015         72            60       None        
    03-Jul-2015         70            56       None        
    04-Jul-2015         67            75       None        
    05-Jul-2015         72            67       None        
    06-Jul-2015         74            69       None        
    07-Jul-2015         75            77       Rain        
    08-Jul-2015         79            68       Rain        
    09-Jul-2015         66            77       Rain        
    10-Jul-2015         69            74       Rain        

Set the ReadSize property value to 30. The second call to the read function then reads the next 30 rows from the datastore, starting after the 10 rows already read.

arrds.ReadSize = 30;

Read the data from the datastore.

data2 = read(arrds)
data2=30×3 timetable
       Time        TemperatureF    Humidity    Events
    ___________    ____________    ________    ______

    11-Jul-2015         76            49        None 
    12-Jul-2015         81            54        None 
    13-Jul-2015         72            81        None 
    14-Jul-2015         74            72        Rain 
    15-Jul-2015         75            87        Rain 
    16-Jul-2015         64            65        None 
    17-Jul-2015         68            72        None 
    18-Jul-2015         71            81        Rain 
    19-Jul-2015         81            73        Rain 
    20-Jul-2015         81            62        None 
    21-Jul-2015         76            66        None 
    22-Jul-2015         77            58        None 
    23-Jul-2015         75            52        None 
    24-Jul-2015         74            60        Rain 
    25-Jul-2015         66            81        None 
    26-Jul-2015         71            79        Rain 
      ⋮

You can select the dimension in which to read from an ArrayDatastore. For example, you can read the frames of a video whose data is stored in an ArrayDatastore by reading along the fourth dimension.

Load the video data. Create a VideoReader object from the file xylophone.mp4.

v = VideoReader('xylophone.mp4');

Read all video frames from the VideoReader object into the workspace.

allFrames = read(v);

Create a datastore from the frames you read. Set 'IterationDimension' to 4 to read the data along its fourth dimension. Set 'OutputType' to 'cell' to return the data as a cell array. Set 'ReadSize' to 4 to read four video frames in each call to the read function.

arrds = arrayDatastore(allFrames,"IterationDimension",4,"OutputType","cell","ReadSize",4)
arrds = 
  ArrayDatastore with properties:

              ReadSize: 4
    IterationDimension: 4
            OutputType: "cell"

Read and display the first four video frames in the datastore as a rectangular tiled image.

frames = read(arrds);
imout = imtile(frames);
imshow(imout)

Figure: the first four video frames displayed as a tiled image.

Tips

  • You can combine and transform ArrayDatastore objects with datastores that contain on-disk data (such as ImageDatastore and TabularTextDatastore objects) using the combine and transform functions.
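As a rough sketch of this tip, the lines below transform one ArrayDatastore and combine the result with a second one. To keep the example self-contained, both inputs are in-memory; the second ArrayDatastore stands in for an on-disk datastore such as an ImageDatastore. The data, the scaling function, and all variable names are assumptions made for illustration.

```matlab
% Illustrative data: four rows of predictors and one label per row.
X = magic(4);
y = (1:4)';

xds = arrayDatastore(X,"OutputType","same");
yds = arrayDatastore(y,"OutputType","same");

% Scale every row that is read from xds.
scale = max(X(:));
tds = transform(xds,@(rows) rows ./ scale);

% Read corresponding rows from both datastores together. A combined
% datastore horizontally concatenates the output of its underlying datastores.
cds = combine(tds,yds);
sample = read(cds);
```

Here sample holds the scaled first row of X followed by the first label.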

Introduced in R2020b