If you want to use image processing to solve this sort of problem, that's fine, but don't start with a junk screenshot.
Back up and consider the example image:
What is the area enclosed by this curve? Is it the region bounded by the inner edge of the stroke?
... or is it the region bounded by the outer edge?
In reality, it's somewhere in between, but it's hard to say exactly where, because graphs and charts are crude visualizations of data, not accurate stores of data. The ambiguity created by the heavy lineweight isn't the only problem. What are the units? Are they even the same on both axes? Start with the data that created the plot, not a screenshot of a plot of the data.
Let's say that bridge has already been burned. We don't have the data. All we have is a fist-and-crayon interpretation of the data. What can we do? We have two options: manually transcribe the curve, or try to process the small raster image directly.
Normally, I like to just transcribe the curve manually (see this and this). Once we have smooth xy data, we can convert it to a logical mask at any scale and apply image processing methods if we choose to process it that way. The workflow here: trace the curve over the screenshot in a vector editor, draw a rectangle coincident with the plot box to act as a calibration reference, and save the result as an SVG. The rectangle geometry can then be scraped from the file, and the curve vertices imported with loadsvg() from the File Exchange.

fname = 'traced.svg'; % the hand-traced SVG (filename assumed)
str = fileread(fname);
% scrape the geometry of the rect marking the plot box
str = regexp(str,'((?<=<rect)(.*?)(?=\/>))','match');
pbx = regexp(str,'((?<=x=")(.*?)(?="))','match');
pby = regexp(str,'((?<=y=")(.*?)(?="))','match');
pbw = regexp(str,'((?<=width=")(.*?)(?="))','match');
pbh = regexp(str,'((?<=height=")(.*?)(?="))','match');
pbrect = [str2double(pbx{1}{1}) str2double(pby{1}{1}) ...
    str2double(pbw{1}{1}) str2double(pbh{1}{1})];

coarseness = 0.1; % loadsvg() vertex density parameter (value assumed)
S = loadsvg(fname,coarseness,false);
x = S{1}(:,1); y = S{1}(:,2); % assume the curve is the first path in the file

% rescale the vertices from px to data units, using the plot box as reference
xrange = [0 30]; yrange = [0 10]; % axis limits of the source plot (values assumed)
x = xrange(1) + diff(xrange)*(x-pbrect(1))/pbrect(3);
y = yrange(1) + diff(yrange)*(pbrect(4) - (y-pbrect(2)))/pbrect(4);

% center and rasterize the polygon at a chosen scale (px per data unit)
scale = 100;
cen = (min([x y],[],1) + max([x y],[],1))/2; % center of the bounding box
sz = scale*ceil(max(abs([x y] - cen),[],1))*2;
xy = scale*([x y] - cen) + sz/2;
mask = poly2mask(xy(:,1),xy(:,2),sz(2),sz(1));
S = regionprops(mask,'area','circularity','extent','solidity')
Area_dataunits = S.Area/scale^2
S =

  struct with fields:

           Area: 528978
    Circularity: 0.7572
       Solidity: 0.9005
         Extent: 0.6021

Area_dataunits =

   52.8978
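Since we already have the calibrated polygon vertices, we can cross-check the mask-based figure with polyarea() and skip rasterization entirely. Assuming the trace forms a simple closed polygon, the two estimates should agree to within the rasterization error:

% cross-check: area straight from the calibrated vertices, no mask involved
Area_poly = polyarea(x,y) % already in data units, since x and y were calibrated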
Okay, so that works. What about direct processing? We'll still need to manually get the calibration information. Other than that, it's fairly succinct, so long as the image is forgiving.
inpict = imread('image.png');
sz = size(inpict);
imshow(inpict)
% calibrate px per data unit from two known ticks on each axis
x0 = [0 30]; % data values at the two x-axis reference points (values assumed)
fprintf('select two points on the x-axis ruler at %d and %d\n',x0(1),x0(2))
[x,~] = ginput(2);
scalex = abs(diff(x))/range(x0); % px per data unit in x
y0 = [0 10]; % data values at the two y-axis reference points (values assumed)
fprintf('select two points on the y-axis ruler at %d and %d\n',y0(1),y0(2))
[~,y] = ginput(2);
scaley = abs(diff(y))/range(y0); % px per data unit in y
% resize the image so that pixels are square in data units
newy = scalex/scaley*sz(1);
inpict = imresize(inpict,[newy sz(2)]);
scale = scalex; % px per data unit after resizing
% ... or equivalently, resize the width instead:
% newx = scaley/scalex*sz(2); inpict = imresize(inpict,[sz(1) newx]); scale = scaley;
% isolate the curve by hue/saturation thresholding (a reddish, saturated stroke)
hsvpict = rgb2hsv(inpict);
H = hsvpict(:,:,1); S = hsvpict(:,:,2);
mask = (H >= 0.83 | H <= 0.16) & (S >= 0.6);
mask = bwareafilt(mask,1);   % keep only the largest blob
mask = imfill(mask,'holes'); % fill the outline to get the enclosed region
S = regionprops(mask,'area','circularity','extent','solidity')
Area_dataunits = S.Area/scale^2
S =

  struct with fields:

           Area: 20341
    Circularity: 0.7540
       Solidity: 0.9010
         Extent: 0.6037

Area_dataunits =

   54.4627
So that works as well. There are some things to note, though. We can expect the computed area to be larger in the second example, since we're working at a much lower resolution. In this case, the area estimate is about 3% larger (54.46 versus 52.90).
The first example allows us to operate at any scale, whereas the second example is limited by the resolution of the original image. The first mask is 1200x1000 and is tight-fit to the blob, whereas the second mask is 319x374, and only a small portion of that (200x160) is devoted to the blob. Even disregarding obvious differences caused by the masking process alone, the excess area contributed by the pixels in the blob boundary is more significant when the relative size of pixels is large (i.e. when the resolution is low). Whether that's significant enough to make up the 3%, I'm not going to guess.
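If we want a rough sense of how large that boundary effect is, we can estimate each mask's perimeter from its reported circularity (circularity = 4*pi*area/perimeter^2) and assume roughly half a pixel of excess area per unit of perimeter. This is only a back-of-envelope sketch under that half-pixel assumption:

% perimeter recovered from circularity: P = sqrt(4*pi*A/C)
P_lo = sqrt(4*pi*20341/0.7540);  % low-res mask, ~582 px
P_hi = sqrt(4*pi*528978/0.7572); % high-res mask, ~2963 px
% assumed excess of ~P/2 px^2, as a fraction of the measured area
excess_lo = (P_lo/2)/20341       % roughly 1.4%
excess_hi = (P_hi/2)/528978      % roughly 0.3%

By that estimate, the boundary pixels plausibly account for around a percent of the gap between the two results, but not all of it.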