Save -append increases mat file size

Hi, I have a MAT file that is being updated with the call save('myfile.mat', 'myvar', '-append'). Every time this call runs, my MAT file grows a lot in size, even though 'myvar' has not changed. Does anybody know how I can fix this?

6 Comments

I'm unable to reproduce your problem. Please post an example that results in the observed behavior.
a=rand(10,1);
save('tmp.mat','a')
for n=1:3
x=dir('tmp.mat');
disp(x.bytes)
save('tmp.mat','a','-append')
end
258 258 258
I am experiencing the same problem. I have a large dataset saved in a structure 'data', which is roughly 20 MB. I have that saved with a second structure 'SCdata' (16 MB) in a .mat file 'myFile', which totals ~ 70 MB. I want to be able to save the two structures independently without overwriting the other, and I often don't have both structures loaded simultaneously.
In theory, save('myFile.mat', 'data', '-append') should be perfect: overwrite the old 'data' structure while leaving 'SCdata' untouched. However, when I use '-append', the .mat file nearly doubles in size while its contents seemingly remain unchanged. This happens every time I "append" to the MAT file. Running save('myFile.mat', 'data', 'SCdata') shrinks the file back to its original 70 MB.
It seems like '-append' is not truly overwriting the existing data, which I do not understand. I have also tried overwriting the data using matfile objects, which produces the same problem. This seems to be the same problem Azucena is encountering.
Stephen23 on 18 Feb 2022
Edited: Stephen23 on 18 Feb 2022
"It seems like '-append' is not truly overwriting the existing data"
It doesn't.
The point of -append is to be fast. To achieve that it does not check and rearrange the entire file content (slow).
Jan on 18 Feb 2022
Edited: Jan on 18 Feb 2022
Which type of MAT-file are you using? v4, v6, v7, v7.3, or v7.3 with -nocompression?
If compression is enabled, it is not trivial to overwrite an existing variable, e.g. if its size changes. The files then suffer from the same fragmentation problems as files on a disk. Freeing the unused space would reduce the file size, but it costs time.
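If you are not sure which format a file was written in, one rough check is to read the descriptive text at the start of the file. The sketch below assumes a Level 5 or newer MAT-file, where the file begins with a 116-byte text header; the file name is hypothetical. As far as I know the header text starts with 'MATLAB 7.3' for -v7.3 files and with 'MATLAB 5.0' for -v6 and -v7 files, so it does not tell -v6 and -v7 apart, and -v4 files have no such text header.

```matlab
% Create a small demo file (hypothetical name) in the default format.
matFile = fullfile(tempdir, 'version_demo.mat');
a = rand(10, 1);
save(matFile, 'a');

% Read the first 116 bytes, the descriptive text field of the header.
fid = fopen(matFile, 'r');
hdr = fread(fid, 116, 'uint8=>char')';
fclose(fid);

% Show the header text and test for the HDF5-based -v7.3 format.
disp(strtrim(hdr));
isV73 = strncmp(hdr, 'MATLAB 7.3', 10);
```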
I am using version 7, compression.
I was thinking, I could first load the entire MAT file and then update only the variable I need to change and in the end save all variables without -append.
However, that doesn't seem to be the most efficient or elegant solution.
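For reference, a minimal sketch of that load, update, and rewrite approach could look like the following. The file name, variable names, and sizes are made up for the demo; the idea is to load everything into one struct and write it back with '-struct', so the variables never have to be listed by hand.

```matlab
% Demo file with two variables (hypothetical names and sizes).
matFile = fullfile(tempdir, 'myFile_demo.mat');
data    = rand(100, 1);
SCdata  = rand(50, 1);
save(matFile, 'data', 'SCdata');

% Load every variable in the file into the fields of one struct.
S = load(matFile);

% Replace only the variable that changed.
S.data = rand(200, 1);

% Rewrite the whole file: each field of S becomes a variable again.
% No '-append' is used, so no stale copy of 'data' is left behind.
save(matFile, '-struct', 'S');
```

The cost is that the whole file is read and written on every update, which is the trade-off against the speed of -append.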
When you -append to a -v7 or earlier .mat file, it doesn't update anything existing in the file: it just adds on the blob that represents the new variable. Part of the semantics of those files is that any program reading from them is responsible for scanning the content of the file, and finding the last blob with the desired variable name and using that. Earlier blobs with the same name are not even marked as deleted or as something that can potentially be released.
It is the presentation_final_final_final_last_final.doc kind of storage: you don't stop looking in your folder when you see presentation.doc or presentation_final.doc, you keep looking for the newest version and use that, leaving all the other ones where they are. Any cleanup comes later.
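To see this on your own machine, a small reproduction along these lines may help. It is only a sketch with made-up names and sizes: it appends a variable whose size changes on every pass and records the file size after each save. How much the file grows will likely depend on the MAT-file version, compression, and how the new data compares with the old.

```matlab
% Demo file (hypothetical name) with two variables, saved as -v7.
matFile = fullfile(tempdir, 'append_demo.mat');
data    = rand(1000, 1);
SCdata  = rand(1000, 1);
save(matFile, 'data', 'SCdata', '-v7');

% Record the file size before and after each append.
nAppends = 3;
sizes    = zeros(1, nAppends + 1);
info     = dir(matFile);
sizes(1) = info.bytes;

for n = 1:nAppends
    data = rand(1000 * (n + 1), 1);      % 'data' gets larger each pass
    save(matFile, 'data', '-append');
    info = dir(matFile);
    sizes(n + 1) = info.bytes;
end
disp(sizes);

% Readers still see a single 'data' and a single 'SCdata'.
vars = whos('-file', matFile);
disp({vars.name});
```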


Answers (1)

Pritesh Parmar on 15 Apr 2025
Edited: Pritesh Parmar on 17 Apr 2025
As a workaround, you could load the variables you want to preserve and then save the MAT file again with both the preserved and the updated variables. Try the following code from the File Exchange on MATLAB Central: fastSaveUpdate (Efficiently update variables in MAT-file).

Asked: on 16 Feb 2022

Edited: on 17 Apr 2025