How to write data to a binary file at a specific position?
NeuronDB
on 24 Mar 2022
Commented: Walter Roberson
on 26 Mar 2022
Hello,
Let us say that my data looks like this -
data = [1,1,1,1,1;...
2,2,2,2,2; ...
3,3,3,3,3];
I would like to write this data to a binary file so that it is stored as [1;2;3;1;2;3;1;2;3;1;2;3 ... and so on].
For a small file, I can easily do this with fwrite(fp, data(:), 'int16'). However, for a very large data file (where the data size is 100*1e10 or more), it becomes extraordinarily slow. The raw data is stored as separate files, one per row, so I can read the data row by row. So, is it possible to write data to a binary file at a specific position?
Thank you for your help!
Accepted Answer
Walter Roberson
on 25 Mar 2022
First (and this is important!) write a block of zeros that is the same number of bytes as the final array size. The writing will not work properly if you omit this step. But you do not need to create an array that size: you could loop writing out a buffer of zeros until enough had accumulated. Do not write extra data: there is no way in MATLAB of getting rid of the extra data once it is written.
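The pre-allocation step described above can be sketched as follows. This is only a minimal illustration: the sizes, the buffer length chunkElements, and the file name prealloc_demo.bin are assumptions made for the example, not values from the question.

```matlab
% Pre-allocate a file by repeatedly writing a small buffer of zeros,
% so that no array of the final size ever has to exist in memory.
% All sizes and the file name are hypothetical.
nRow = 3;
nCol = 1000;
bytesPerElement = 2;                 % int16
outName = 'prealloc_demo.bin';       % hypothetical output file

totalElements = nRow * nCol;
chunkElements = 256;                 % buffer length in elements (assumption)
zeroBuf = zeros(chunkElements, 1, 'int16');

[fid, msg] = fopen(outName, 'w');
assert(fid > 0, msg);
remaining = totalElements;
while remaining > 0
    n = min(remaining, chunkElements);       % last chunk may be shorter
    count = fwrite(fid, zeroBuf(1:n), 'int16');
    assert(count == n, 'Short write while pre-allocating.');
    remaining = remaining - n;
end
fclose(fid);

info = dir(outName);
fprintf('File size: %d bytes\n', info.bytes);
```

Trimming the last chunk to the remaining element count keeps the file at exactly the final size, which matters because extra bytes cannot be removed afterwards.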
Now, repeat for each row:
fseek to ((row number minus 1) times (bytes per element)) from the beginning of the file.
fwrite() the content of the row, making sure to use the precision argument to control how the data is written, and making sure to use the "skip" option. The value of the skip should be ((total rows minus 1) times (bytes per element)). Note that fwrite applies the skip before each value it writes, so write the first element of the row without a skip and the remaining elements with it.
Move on to the next row.
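The steps above can be sketched on the small matrix from the question. This is an illustrative sketch, not a tuned implementation; the file name interleave_demo.bin is hypothetical. Because fwrite applies the skip before each value, the first element of each row is written without a skip and the rest with it.

```matlab
% Interleave rows into one file using fseek plus the fwrite skip argument.
data = int16([1,1,1,1,1; 2,2,2,2,2; 3,3,3,3,3]);
[nRow, nCol] = size(data);
width = 2;                          % bytes per int16 element
skip  = (nRow - 1) * width;         % bytes between consecutive row elements
outName = 'interleave_demo.bin';    % hypothetical output file

[fid, msg] = fopen(outName, 'w');
assert(fid > 0, msg);
% Pre-allocate the full file with zeros (small enough here to do at once).
fwrite(fid, zeros(nRow * nCol, 1, 'int16'), 'int16');

for k = 1:nRow
    row = data(k, :);
    status = fseek(fid, (k - 1) * width, 'bof');
    assert(status == 0, 'fseek failed.');
    fwrite(fid, row(1), 'int16');             % first element, no skip
    fwrite(fid, row(2:end), 'int16', skip);   % skip precedes each value
end
fclose(fid);

% Read the file back to check the layout.
fid = fopen(outName, 'r');
result = fread(fid, Inf, '*int16');
fclose(fid);
disp(result.');
```

Reading the file back and comparing it with data(:) is a cheap way to confirm the offsets before running the same loop on large inputs.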
This will not be fast at all. Every page that is being updated will have to be read by MATLAB, and MATLAB will have to do the modification in its internal buffers and write the results out again.
It is not possible at the MATLAB level to "leave holes" that you gradually fill in. And even if it were, MATLAB would still need to do the continual read/modify/write cycle.
2 Comments
Walter Roberson
on 26 Mar 2022
In MATLAB fseek beyond the end of a file does not work, at least historically.
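Whether a seek past the end of the file succeeds can be checked through the status that fseek returns (0 on success, -1 on failure). The sketch below only shows how to test this; the file name seek_demo.bin and the offset are hypothetical, and the outcome may depend on the MATLAB release.

```matlab
% Check the return status of fseek when seeking past the end of a file.
outName = 'seek_demo.bin';                 % hypothetical file name
[fid, msg] = fopen(outName, 'w+');
assert(fid > 0, msg);
fwrite(fid, zeros(10, 1, 'int16'), 'int16');   % file now holds 20 bytes

status = fseek(fid, 1000, 'bof');          % target lies beyond the end
if status ~= 0
    fprintf('fseek failed: %s\n', ferror(fid));
else
    fprintf('fseek succeeded, position is now %d\n', ftell(fid));
end
fclose(fid);
```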
More Answers (1)
Jan
on 26 Mar 2022
Edited: Jan
on 26 Mar 2022
% Some test data storing the rows in different files:
nRow = 10;
nCol = 1e6;
for k = 1:nRow
    [fid, msg] = fopen(sprintf('file%02d.bin', k), 'W');
    assert(fid > 0, msg);
    data = randi([0, 32767], nCol, 1, 'int16');
    fwrite(fid, data, 'int16');
    fclose(fid);
end
% *** Version 1: insert data in chunks into the file:
tic
% Create the output file:
[ofid, msg] = fopen(sprintf('matrix1.bin'), 'W');
assert(ofid > 0, msg);
% Pre-allocate the output file (not really needed):
width = 2; % Bytes per element
skip = (nRow - 1) * width;
fwrite(ofid, 0, 'int16', (nRow * nCol - 1) * width);
% Loop over input files:
for k = 1:nRow
    [ifid, msg] = fopen(sprintf('file%02d.bin', k), 'r');
    assert(ifid > 0, msg);
    data = fread(ifid, Inf, '*int16');
    fclose(ifid);
    % Insert in output file in chunks. fwrite applies the skip before each
    % value, so the first element is written separately without a skip:
    fseek(ofid, (k-1) * width, 'bof');
    fwrite(ofid, data(1), 'int16');
    fseek(ofid, k * width, 'bof');
    fwrite(ofid, data(2:nCol), 'int16', skip);
end
fclose(ofid);
toc;
% *** Version 2: Join array in the memory:
tic
% Loop over input files:
data = zeros(nRow, nCol, 'int16');
for k = 1:nRow
    [ifid, msg] = fopen(sprintf('file%02d.bin', k), 'r');
    assert(ifid > 0, msg);
    data(k, :) = fread(ifid, Inf, '*int16');
    fclose(ifid);
end
% Write output file at once:
[ofid, msg] = fopen(sprintf('matrix2.bin'), 'W');
assert(ofid > 0, msg);
fwrite(ofid, data, 'int16');
fclose(ofid);
toc;
Timings on my i5, Matlab R2018b, SSD:
Elapsed time is 46.099363 seconds. % Insert on disk
Elapsed time is 0.060289 seconds. % Insert in memory
This means that joining the data in RAM is much faster than writing it to disk with skipping.
This might be different if you convert the imported data to doubles, which use 8 bytes per element instead of 2 bytes for int16. The available RAM may then be exhausted, and the computer will store the data in the much slower virtual memory.
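If the full matrix does not fit into RAM, one possible compromise is to join the data in memory one block of columns at a time and append each block to the output file sequentially. The following is only a sketch under assumptions: the block width blockCols, the file names, and the small test data are hypothetical, and all row files are kept open at once.

```matlab
% Join the row files block by block: read blockCols columns from every
% row file, assemble them in memory, and append the block to the output.
nRow = 4;
nCol = 1000;
blockCols = 256;     % columns per block (assumption; tune to available RAM)

% Create small test inputs so that the sketch runs on its own.
ref = int16(reshape(mod(0:nRow*nCol-1, 30000), nRow, nCol));
for k = 1:nRow
    fid = fopen(sprintf('blockdemo_row%02d.bin', k), 'w');
    fwrite(fid, ref(k, :), 'int16');
    fclose(fid);
end

% Open all row files once; each fread continues where the last one stopped.
ifid = zeros(1, nRow);
for k = 1:nRow
    [ifid(k), msg] = fopen(sprintf('blockdemo_row%02d.bin', k), 'r');
    assert(ifid(k) > 0, msg);
end
[ofid, msg] = fopen('blockdemo_matrix.bin', 'w');
assert(ofid > 0, msg);

for c0 = 1:blockCols:nCol
    n = min(blockCols, nCol - c0 + 1);        % last block may be narrower
    block = zeros(nRow, n, 'int16');
    for k = 1:nRow
        block(k, :) = fread(ifid(k), n, '*int16');
    end
    fwrite(ofid, block, 'int16');   % column-major order interleaves the rows
end

fclose(ofid);
for k = 1:nRow
    fclose(ifid(k));
end

% Read the result back for a check against the reference matrix.
fid = fopen('blockdemo_matrix.bin', 'r');
out = fread(fid, Inf, '*int16');
fclose(fid);
```

Because every block is appended in order, the output file is written strictly sequentially and no fseek or skip is needed; memory use is bounded by nRow times blockCols elements.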