Datastore for custom files

Chong Wang on 22 Mar 2016
Commented: Walter Roberson on 21 Feb 2018
I am trying to work with Hadoop and the datastore object. However, the file to be read is binary, so until now I have used fopen and fread to read it as a binary stream. This does not work with datastore, but it should work with fileDatastore. This is what I did:
fds = fileDatastore(fname{1},'ReadFcn',@testfcn)
read(fds);
function output=testfcn(fileName)
fileID = fopen(fileName);
output=fread(fileID,'ubit1');
end
While debugging, I found that the file is actually loaded successfully, but I keep getting an error message at the end:
Warning: The following error was caught while executing
'matlab.io.datastore.splitreader.WholeFileCustomReadSplitReader' class
destructor:
No directories were removed.
> In matlab.io.datastore.SplittableDatastore/delete (line 108)
In matlab.io.datastore.FileDatastore/readall (line 11)
Error using matlab.io.datastore.FileDatastore/readall (line 21)
No directories were removed.
Does anyone know what the problem is? Thanks!

Accepted Answer

Rick Amos on 22 Mar 2016
For files on Hadoop, the File Datastore makes a temporary copy of the file on the local drive to be read by the ReadFcn. The ReadFcn here, testfcn, opens the file ID but does not close it afterwards. This prevents the File Datastore from being able to clean up the temporary copy of the file.
Closing the file ID after use should resolve this error:
function output = testfcn(fileName)
    fileID = fopen(fileName);
    output = fread(fileID, 'ubit1');
    fclose(fileID);
end
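If fread throws an error, the fclose call above is never reached and the temporary copy stays locked. A minimal sketch of a more defensive variant, assuming the same bit-wise read as the original, uses onCleanup to close the handle on any exit path. The function name readBits and the error identifier are hypothetical and not part of the original answer.

```matlab
function output = readBits(fileName)
% Hypothetical variant of testfcn: read a whole file as a column of bits.
fileID = fopen(fileName, 'r');
if fileID == -1
    % fopen returns -1 when the file cannot be opened.
    error('readBits:openFailed', 'Could not open %s.', fileName);
end
% Close the handle when this function exits, even if fread errors.
closer = onCleanup(@() fclose(fileID));
output = fread(fileID, 'ubit1');
end
```

It would be passed to the datastore the same way as before, for example fileDatastore(fname{1}, 'ReadFcn', @readBits).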
  5 Comments
SRUTHY SKARIA on 21 Feb 2018
Hi, I need some help please. I have 4 MAT-files and I need to read a cell array from each of them using imageDatastore. I was able to read the MAT-files, but I am unable to read the cell array, which contains the data I need. I used a function similar to the one above, but I am not sure how to specify fileName for all 4 MAT-files, or whether I need to specify it at all.
Walter Roberson on 21 Feb 2018
Your custom ReadFcn will be called once for each file name. You would use the complete path passed in to load the appropriate variable from the file. The output must be a single image.
I suspect that imageDatastore filters out duplicate file names. If so, you could not use the trick of listing the same file name once for each image stored in the file and having your ReadFcn count how many times it has been invoked on each file in order to know which cell index to use.
I would suggest that you pull the individual cell members out into individual files.
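One way to pull the cell members out into individual files is sketched below. It is only an illustration under assumptions: the helper name splitCellToFiles, the variable name passed as varName, the output variable name img, and the output file naming are all hypothetical, since the thread does not say how the MAT-files are laid out.

```matlab
function outFiles = splitCellToFiles(matFile, varName, outDir)
% Hypothetical helper: write each element of the cell array varName
% stored in matFile to its own MAT-file in outDir.
s = load(matFile, varName);          % load only the requested variable
c = s.(varName);
if ~exist(outDir, 'dir')
    mkdir(outDir);
end
[~, base] = fileparts(matFile);
outFiles = cell(numel(c), 1);
for k = 1:numel(c)
    img = c{k};                      % one image per output file
    outFiles{k} = fullfile(outDir, sprintf('%s_%04d.mat', base, k));
    save(outFiles{k}, 'img');
end
end
```

After running this once per source file, the resulting folder could be read with something like imageDatastore(outDir, 'FileExtensions', '.mat', 'ReadFcn', @(f) getfield(load(f, 'img'), 'img')), so that the ReadFcn returns a single image per file as described above.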
