# How can I write duplicate files that are equal to or greater than 1GB to a excel file in Matlab?

1 view (last 30 days)
Rookie Programmer on 3 May 2023
Commented: Walter Roberson on 3 May 2023
The script below detects all duplicates from a given directory. I'd like to edit this script so that only duplicates equal to or greater than 1GB are printed to an excel file.
root = 'C:\Users\19108\Documents';
dirlist = dir(fullfile(root, '**\*.*'));
filelist = dirlist(~[dirlist.isdir]);`
isunique = cell(size(filelist));
names = {filelist.name};
for i = 1:length(filelist)
temp = [];
for j = 1:length(filelist)
if (i~=j)
if isequal(name(i),name(j))
if isempty(temp)
temp = j;
else
temp = [temp,j];
end
isunique{i} = temp;
end
end
end
end
fl_T = struct2table(filelist);
merge_T = [isunique, fl_T];
merge_T.Properties.VariableNames("name") = "Dup_Row";
%writetable(merge_T,filename)

Rik on 3 May 2023
You can use the unique function instead of your double loop (which could be improved by starting the second at (i+1) instead of 1).
names = {'ab','ba','ab','ab','cd','x'};
[~,~,UniqueIndex]=unique(names)
UniqueIndex = 6×1
1 2 1 1 3 4
Now you can loop over 1:max(UniqueIndex) like this:
FileSizes = [filelist.bytes];
L_sz = FileSizes>(1024^3); % >1GB
L_duplicates = false(size(filelist));
for n=1:max(UniqueIndex)
L_n = UniqueIndex==n;
if sum(L_n)==1,continue,end % skip non-duplicate elements
L_duplicates(L_sz & L_n) = true;
end
Now you have a logical array containing the indices of all files greater than 1GB for which there exists another file with the same name.
Walter Roberson on 3 May 2023
intersect() would be more appropriate for detecting duplicates

### Categories

Find more on Spreadsheets in Help Center and File Exchange

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!