Hello,
I'd like to generate a set of indicies for my tall datastore in advance so, when I want to select rows, it will be fast.
Here is what I came up with.
I expect there is a much better way.  Please help me improve it.
FYI, the datastore will be several million rows by four columns. I ran the following test on a very small file.
Thanks
Michael
Code:
ds = datastore(fname);
ds.VariableNames = {'DATE','TICKER','FIELD','VALUE'};
ds.SelectedFormats(1:3) = {'%{MM/dd/uuuu}D','%C','%C'};
tds = tall(ds);
gDATE = unique(tds.DATE);
gTICKER = unique(tds.TICKER);
gFIELD = unique(tds.FIELD);
[uniqueDates, uniqueTickers, uniqueFields] = gather(gDATE,gTICKER,gFIELD);
for iTicker = 1:length(uniqueTickers)
    gTicker = find(tds.TICKER==uniqueTickers(iTicker));
    idxTicker{iTicker} = gather(gTicker);
end
for iField = 1:length(uniqueFields)
    gField = find(tds.FIELD==uniqueFields(iField));
    idxField{iField} = gather(gField);
end
tTickerIndex = array2table([cellstr(uniqueTickers) idxTicker.'], ...
    'VariableNames',{'TICKER','INDEX'});
tTickerIndex.TICKER= categorical(tTickerIndex.TICKER)
tFieldIndex = array2table([cellstr(uniqueFields) idxField.'], ...
    'VariableNames',{'FIELD','INDEX'})
tFieldIndex.FIELD = categorical(tFieldIndex.FIELD)
out = tds(tFieldIndex.INDEX{tFieldIndex.FIELD=='EQY_DVD_YLD_IND'}(1:10),:);
gather(out)
Output
tTickerIndex = 1×2 table
        TICKER              INDEX      
    _______________    ________________
    DFIVX US EQUITY    [34112×1 double]
    
tFieldIndex = 6×2 table
                 FIELD                       INDEX     
    ________________________________    _______________
    DAY_TO_DAY_TOT_RETURN_GROSS_DVDS    [6423×1 double]
    DAY_TO_DAY_TOT_RETURN_NET_DVDS      [6423×1 double]
    DIVIDEND_INDICATED_YIELD            [6242×1 double]
    EQY_DVD_YLD_IND                     [6242×1 double]
    EQY_DVD_YLD_IND_NET                 [6242×1 double]
    FUND_TOTAL_ASSETS                   [2540×1 double]
    
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 2: Completed in 1.1 sec
- Pass 2 of 2: Completed in 0.98 sec
Evaluation completed in 2.8 sec
ans = 10×4 table
       DATE           TICKER              FIELD         VALUE 
    __________    _______________    _______________    ______
    06/01/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1484
    06/02/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1611
    06/03/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1718
    06/06/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1654
    06/07/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND     2.159
    06/08/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1442
    06/09/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1484
    06/10/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND     2.138
    06/13/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1277
    06/14/1994    DFIVX US EQUITY    EQY_DVD_YLD_IND    2.1277