Good morning to all!
I am developing a neural network with Deep Network Designer, and for this I need a datastore. I have several hundred numeric observations in an Excel table, which I exported to a CSV file and turned into a datastore with the following code:
DataTableTrain = tabularTextDatastore("TableTrain_Borrar.csv","VariableNamingRule","preserve");
NewDataTableTrain = transform(DataTableTrain, @(x) [cellfun(@transpose,mat2cell(x{:,1:end-1},ones(1,numTrain)),...
'UniformOutput',false) , mat2cell((x{:,end}),ones(1,numTrain))]);
When I call "isShuffleable" on this datastore it returns 0, so it is not shuffleable, and I cannot work out how to create a datastore that is. I have tried the "shuffle" function, but it tells me that all the cells must be shuffleable. I think a shuffleable datastore has several advantages over one that is not, which is why I keep trying.
Thank you very much!

 Accepted Answer

A different approach would be to read the CSV file into a table and create the datastore from that table.
This datastore is subsettable and therefore shuffleable:
T = readtable("airlinesmall.csv");
ds = arrayDatastore(T,"OutputType","same")
ds =
  ArrayDatastore with properties:
              ReadSize: 1
    IterationDimension: 1
            OutputType: "same"
isSubsettable(ds)
ans = logical
1
isShuffleable(ds)
ans = logical
1
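As a minimal sketch of what those two properties allow, the snippet below shuffles and subsets an arrayDatastore built from a table. The table here is random data standing in for the CSV contents (an assumption for illustration, not the poster's file), and the variable names are hypothetical.

```matlab
% Hypothetical small table standing in for the CSV contents
T = array2table(rand(10,4));

% Same construction as in the answer: each read returns rows of the table
ds = arrayDatastore(T,"OutputType","same");

% shuffle returns a new datastore with the rows in random order
dsShuffled = shuffle(ds);

% subset keeps only the requested rows (here the first five)
dsFirstFive = subset(ds,1:5);

% Read everything back to inspect the result
shuffledRows = readall(dsShuffled);
firstFiveRows = readall(dsFirstFive);
```

Both calls return new datastores, so the original ds is left unchanged.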

7 Comments

It is shuffleable now, but the app does not read it correctly. It has to be a datastore that returns cells with two columns per observation, one for the predictors and one for the responses.
If I change the output type to "cell", everything comes back in cells, which is what I need, but the cells still have to be split into those two columns, and I don't know how to do that.
:(
Please have a look at:
There is a section on splitting the datastore into training, validation, and test sets.
That's it! I had to modify it a bit to get it to work, but I've got it now.
Can you please share more details, so others with similar problems can benefit?
Here is the code that worked for me, which gives a shuffleable datastore, with an explanation of each step:
Table = table2array(Table);
Create random indices (this step is optional).
Index = randperm(96,96);
Create the training datastore from the predictors (dsNewTrain) and their corresponding responses (dsLabelTrain). Each cell of predictors has to be transposed, and then both datastores are combined.
numTrain is the number of training observations we want in the datastore.
dsTableTrain = arrayDatastore(Table(Index(1:numTrain),1:end-1));
dsNewTrain = transform(dsTableTrain,@(x) [cellfun(@transpose,x,'UniformOutput',false)]);
dsLabelTrain = arrayDatastore(Table(Index(1:numTrain),end));
dsDatasetTrain = combine(dsNewTrain,dsLabelTrain);
Check that it is shuffleable.
isShuffleable(dsDatasetTrain)
numValid = numTrain+1;
Do the same for the validation data, which is around 20 to 30% of the total data.
dsTableValidation = arrayDatastore(Table(Index(numValid:end),1:end-1));
dsNewValidation = transform(dsTableValidation,@(x) [cellfun(@transpose,x,'UniformOutput',false)]);
dsLabelValidation = arrayDatastore(Table(Index(numValid:end),end));
dsDatasetValidation = combine(dsNewValidation,dsLabelValidation);
Check that it is shuffleable.
isShuffleable(dsDatasetValidation)
And that's it. This is the code that worked for me, and the result is shuffleable. You would just have to adapt it to your own data.
Great, thank you. It will be useful for anyone who has a similar problem.
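For readers who want to try the approach from this thread without the original CSV, here is a compact sketch of the training part only. It assumes a random 96-by-6 numeric matrix (5 predictors plus 1 response) and a hypothetical numTrain of 70; neither value comes from the poster's data.

```matlab
% Synthetic stand-in for the numeric table: 96 rows, last column is the response
Data = rand(96,6);
numTrain = 70;                          % hypothetical number of training rows
Index = randperm(size(Data,1));         % random row order

% Predictors: one row per read, wrapped in a cell (default OutputType is "cell")
dsPredictors = arrayDatastore(Data(Index(1:numTrain),1:end-1));

% Transpose each row so every observation becomes a column vector
dsPredictors = transform(dsPredictors, ...
    @(x) cellfun(@transpose,x,'UniformOutput',false));

% Responses: the last column, one value per read
dsResponses = arrayDatastore(Data(Index(1:numTrain),end));

% Each read of the combined datastore returns {predictors, response}
dsTrain = combine(dsPredictors,dsResponses);

canShuffle = isShuffleable(dsTrain);    % should be true for this construction
dsTrain = shuffle(dsTrain);             % reorders predictors and responses together
firstObservation = read(dsTrain);       % one observation as a two-column cell
```

The validation datastore can be built the same way from the remaining rows, Index(numTrain+1:end).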

Release

R2022b
