# Replicating a curve with frequency and area

V.D-C on 26 Aug 2020
Answered: Ayush Gupta on 8 Sep 2020
Hello,
I was asked to replicate the figure attached to this message.
I have a vector of glacier areas (in square kilometers), where one row equals one glacier. I want to put those areas in bins and count how many glaciers fall in each bin. However, I can't link the x-axis to the curve. I tried two different codes: one creating bins of width 0.01 km2 between 0 km2 and the maximum area, and one creating 10 bins per decade of area (10 bins between 0-1 km2, 10 bins between 1-10 km2, 10 bins between 10-100 km2 and so on).
This code makes bins of 0.01 km2 width:
% area: vector of glacier areas in km2, one element per glacier.
% It must already be in the workspace, so it is not cleared here.
min_area = min(area);
max_area = max(area);
mean_area = mean(area);
%% Create bins
% Uniform bins of width 0.01 km2 from 0 to max_area
i = 1;
for b = 0:0.01:max_area
    if b == 0
        box(i,1) = sum(area(:)==b);
    else
        box(i,1) = sum(area(:)<=b) - sum(area(:)<=b-0.01);
    end
    i = i + 1;
end
figure()
plot(box)
set(gca, 'YScale', 'log')
set(gca, 'XScale', 'log')
xt = get(gca, 'XTick');
set(gca, 'XTick',xt, 'XTickLabel',xt*0.01)
xlabel('log10(Area(km2))');
ylabel('log10(Count)');
This code makes 60 bins of varying sizes, as I described earlier in my message.
% area: vector of glacier areas in km2, already in the workspace
min_area = min(area);
max_area = max(area);
mean_area = mean(area);
i = 1;
coef = 0.01;
coef_vect = coef;
for b = 1:60
    if mod(b,11) == 0
        i = i+1;
        coef = coef * 10;
        coef_vect(i,1) = coef;
    end
    box(b,1) = sum(area(:) <= (b*coef)) - sum(area(:) <= ((b-1)*coef));
end
figure(2)
plot(box)
set(gca, 'YScale', 'log')
set(gca, 'XScale', 'log')
xt = get(gca, 'XTick');
set(gca, 'XTick',xt, 'XTickLabel',xt)
%xticklabels(coef_vect)
xlabel('log10(Area(km2))');
ylabel('log10(Count)');
The second code looks the most like the figure I am supposed to replicate, but I can't link the bin index on the x-axis to the area.
How can I make bins 1 to 10 represent 0.01-0.1 km2, bins 11 to 20 represent 0.1-1 km2, bins 21 to 30 represent 1-10 km2, and so on?
I hope I was clear enough; don't hesitate to point out anything confusing in my question.
Have a nice day!

Ayush Gupta on 8 Sep 2020
You can follow the second approach: bin log10(area) rather than the area itself, and use the histcounts function to count the number of elements that fall in each bin. The histcounts documentation has details and usage examples. The following code shows how to do it:
% Work with log10 of the areas so that bins of uniform width in log space
% correspond to decades of area
area_log = log10(area);
% Bin edges in log10(km2): one bin per decade from 10^-2 to 10^4 km2.
% Adjust the range so that it covers the minimum and maximum of the data
edges = [-2 -1 0 1 2 3 4];
% Use histcounts to count the number of occurrences in each bin
N = histcounts(area_log,edges);
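
To tie the counts back to area on the x-axis with 10 bins per decade, as the question asks, something along these lines might work. This is only a minimal sketch and not part of the original answer: the sample data and the names glacier_area, lo, hi, edges_log and centers are assumptions for illustration, and so is the choice to plot each count at the geometric center of its bin. It also assumes every area is strictly positive, since log10(0) is -Inf.

```matlab
% Hypothetical sample data standing in for the real glacier areas (km2)
rng(0);                                    % reproducible example
glacier_area = 10.^(-2 + 4*rand(5000,1));  % log-uniform between 0.01 and 100 km2

% Decades covered by the data, in log10 units
lo = floor(log10(min(glacier_area)));
hi = ceil(log10(max(glacier_area)));

% 10 bins per decade: edges spaced 0.1 apart in log10 units
edges_log = linspace(lo, hi, 10*(hi - lo) + 1);

% Count the glaciers in each bin
N = histcounts(log10(glacier_area), edges_log);

% Bin centers in log space, converted back to km2 for the x-axis
centers_log = edges_log(1:end-1) + diff(edges_log)/2;
centers = 10.^centers_log;

% Plot counts against real areas on log-log axes
figure
loglog(centers, N, '-o')
xlabel('Area (km^2)')
ylabel('Number of glaciers')
```

With this layout, bins 1 to 10 cover the first decade (10^lo to 10^(lo+1) km2), bins 11 to 20 the next, and so on, so the x-axis shows areas directly instead of bin indices. Bins that contain no glaciers cannot be drawn on a logarithmic y-axis, so gaps in the curve are expected where N is zero.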