Determine area inside a polygon on a map

11 views (last 30 days)
SZee
SZee on 4 Sep 2017
Edited: Cedric on 12 Sep 2017
I have a 2x2.5 degree grid overlaying the United States. Each grid has a certain value. I want to determine the value for each state in the U.S. by weighing each grid box by how much of a certain state is inside each grid
For example, if I was looking at Texas, I can see that there are some grids that are 100% inside Texas, but there are many other grids that are only partly inside Texas. I would like to weigh those grids by the area covered. I need to do this for every state.
Mapping code:
figure('units','normalized','outerposition',[0 0 1 1], 'visible','on');
ax = usamap('conus');
states = shaperead('usastatelo', 'UseGeoCoords', true,...
'Selector',...
{@(name) ~any(strcmp(name,{'Alaska','Hawaii'})), 'Name'});
geoshow(ax, states,'FaceColor', [0.8 0.8 0.8], 'EdgeColor', [0 0 0])
framem off; gridm off; mlabel off; plabel off
hold on
pcolorm(lat,wrapTo360(lon),data(:,:,day))
hold on
geoshow(ax, states,'FaceColor', 'none', 'EdgeColor', [0 0 0])
colormap(flipud(hot))
h = colorbar('location', 'SouthOutside');
caxis([20 100])
Longitude file (with values denoting the edges of the box) 25x1:
233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500 233.7500
236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500 236.2500
238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500 238.7500
241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500 241.2500
243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500 243.7500
246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500 246.2500
248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500 248.7500
251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500 251.2500
253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500 253.7500
256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500 256.2500
258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500 258.7500
261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500 261.2500
263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500 263.7500
266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500 266.2500
268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500 268.7500
271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500 271.2500
273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500 273.7500
276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500 276.2500
278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500 278.7500
281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500 281.2500
283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500 283.7500
286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500 286.2500
288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500 288.7500
291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500 291.2500
293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500 293.7500
Latitude file (with values denoting the edges of the box) 25x14:
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
23 25 27 29 31 33 35 37 39 41 43 45 47 49
  2 Comments
José-Luis
José-Luis on 4 Sep 2017
Edited: José-Luis on 6 Sep 2017
Looks you already have the data as shapefiles. This is easier with most (all) GIS software. I'd perform the required operations there first and then import to Matlab if you really need to.
Is there a compelling reason for you to want to do this in Matlab? Because mapping is arguably nicer and I'd postulate that it is objectively less of a headache if you do it with actual mapping tools.
SZee
SZee on 4 Sep 2017
I don't have the data as shapefiles. I just have a matrix of values for the data in the 2x2.5 grid form. The shapefiles for the states are in MATLAB I guess, but I'm not sure how to use it. What mapping tools would make this easier? I have very limited knowledge of how to use GIS and don't currently have the software.

Sign in to comment.

Accepted Answer

Cedric
Cedric on 5 Sep 2017
Edited: Cedric on 12 Sep 2017
---[ PART 1 ]-------------------------------------------------------------------------------------------------------------------------------------------
I took 10 minutes and built quickly what you need for approach #1 mentioned in my comment (moved below). I do not guarantee that this is correct, you'll have to understand and check.
File Grid_20x25.zip contains the shapefile for the grid (that you may not need actually).
File regrid.mat is a MAT-File that contains geostructs for the states (MATLAB "high" resolution version) and for the grid (corresp. to shapefile), and intersect parameters: polygon IDs of grid cells and states, and surfaces areas.
loaded = load( 'regrid.mat' ) ;
regrid = loaded.regrid ;
Using the content of the intersect field, you can easily compute regridding parameters based on surface area (using e.g. ACCUMARRAY). Note that accumulating values from regrid.intersect.surfaceArea by grid polygon ID will not always sum up to the area of grid cells, because some grid cells do not fully overlap states (or not at all). If you need the area of grid cells (e.g. for computing a loss), it is available in regrid.grid.surfaceArea.
Finally, all areas are in m^2 and summing up all surface areas from the intersect (as a quick check) gives a number close to what Wikipedia provides for contiguous USA.
PS: surfaces areas were computed by projecting the intersect and the grid using the World/Behrmann projected CS.
Original comment:
I don't have time to answer, but here is a series of relevant points, quickly:
  • Your gridded data is a raster. While the function pcolorm can display it on a grid, you should understand that as a set of points (raster "pixels"/cells centers) with a single value per point. You don't have the geometry of the grid available as a set of polygons (for performing intersections for example).
  • Several options are available if you want to summarize the gridded data set on polygons. I don't know well the latest versions of the Mapping toolbox, but I don't think that it provides any relatively direct way to do it, so you will likely need to use a GIS.
  • A first option consists in building the grid geometry as a set of polygons, intersecting it with states, and using the output to "distribute" the data appropriately.
  • You can build the set of polygons for the grid in MATLAB by building an appropriate geostruct and exporting it to shapefile with SHAPEWRITE. You can also build it in Quantum GIS or ArcGIS if you have a license. In ArcGIS, one way to achieve that is to create a fishnet and to convert it to polygon (in ArcGIS 10, I think that you can directly create polygons).
  • In both cases (MATLAB or ArcGIS fishnet), you will have to define a coordinate system for the polygons (in your case a Geographic one, e.g. WGS1984). This must be done in the GIS, because MATLAB doesn't export the projection file (there are side ways but ..). If you do it in ArcGIS, use Data Management Tools/Projections and Transformations/Define projection.
  • Then you need to work in a GIS (MATLAB possible but much longer afaik). Assuming you have the grid loaded in the GIS, you need to either load a shapefile of US states (e.g. from the ESRI data set) or export it from MATLAB and load it (in this case you need to define the coordinate system as well).
  • Once both are loaded, you can perform an intersect (in ArcGIS Analysis Tools/Overlay/Intersect, keeping FIDs only), project it to an equal area coordinate system (e.g. Behrmann, this time using Data Management Tools/Projections and Transformations/Project for effectively projecting to the new equal areas projected coordinate system), compute shape areas of polygons from the intersect (if you save the output of the intersect operation to a file geodatabase, it is computed automatically (field Shape_Area)) and export the table of attributes (to DBF or text or shapefile, or whatever format you want to read with MATLAB).
  • Then you load it in MATLAB; it has one "row" per polygon of the intersect, with fields that identify grid cells, states, and define the surface area.
  • Using this table, you can regrid your data as appropriate, weighted e.g. by surface area. Using ACCUMARRAY can be useful for doing this (all aggregations). Careful with FIDs though, they start at 0 (and not 1), so you have to add 1 before you can use them as indices for ACCUMARRAY.
  • Also be careful to account properly for the nature of the data. Extensive and intensive data are not summarized the same way.
  • A second option consists in performing a zonal statistics, and again I don't know any easy way to do it in MATLAB (unless the Mapping toolbox can convert polygons to raster, in which case implementing the zonal statistics in MATLAB would trivial using ACCUMARRAY and indices defined by a raster of polygon IDs, whose underlying geometry matches the geometry of the raster of data).
  • If you have a license for it, the Zonal Statistics as a table tool from the Spatial Analyst Tools/Zonal toolbox (you have to enable it in Customize/Extensions) of ArcGIS is a good way to go.
  • Now the problem is that your raster is rather low resolution compared to states. In fact, many states won't encompass a single raster cell center. In such case, you have to resample the raster and make it higher resolution. ArcGIS has tools for this, but again you have to think about the nature of your data and resample appropriately (if each pixel/cell is divided in 10x10=100 pixels/cells, resampling a temperature means giving each of these pixels the temperature (intensive) of the original pixel so the average over the area of the original pixel gives the original temperature, but resampling a mass (extensive) of e.g. pollutant means giving each pixel the original mass divided by 100 so the sum over the area of the original pixel gives the value).
  • Once you have a resampled raster, you can perform a zonal statistics using states polygon IDs (FID or OBJECTID or whatever other unique ID) as zone ID.
A few extra notes:
  • I generally prefer working with the first approach, because it requires a one-shot effort for generating a table of contact surfaces areas between two geometries, and then I can "regrid" anything from one geometry to the other.
  • It also allows to deal with special cases a little better. To illustrate, if the raster is population counts for example, resampling and performing a zonal statistics induces losses for cells that are overlapping ocean coastal zones. This is something that can be avoided when we work with the geometry of the intersect.
---[ PART 2 ]-------------------------------------------------------------------------------------------------------------------------------------------
If you don't understand, study a simple case: two grids that partly overlap, A in blue made of two cells A1 and A2, and B in green made of three cells B1, B2, and B3.
The size of the cells is indicated, so you can compute the surface area of each cell and the surface are of the overlaps. As the overlap is not perfect, part of grid B has no counterpart in A.
The intersection between these two grid is given in black. This is what you get when you perform an intersect in a GIS for example.
Now you have to study a few basic examples.
  • Say you are regridding a mass (extensive) of something associated with grid A, e.g. 600kg associated with cell A1 and 480kg associated with cell A2. How would you do this?
  • How would you do it if these were temperatures (intensive)? E.g. 90F in A1 and 120F in A2?
  • How about regridding from B to A now? Is there a loss, how to account for it?
Once you understand how to perform these operations using basic, scalar formulas, you may realize that you can generalize the approach and express it in a matrix way. Masses/temperatures in A1 and A2 can be expressed as a vector of masses in A, and the same for masses/temperatures in B, so you have
m_A : 2x1, m_B : 3x1 (masses, extensive)
t_A : 2x1, t_B : 3x1 (temperatures, intensive)
and the passage from one to the other can be achieved through matrix multiplication. If you went through the examples above, you understand that these matrices depend on the nature (extensive, intensive) of the data that you are regridding, so:
m_B = R_AB_ext * m_A, m_A = R_BA_ext * m_B
t_B = R_AB_int * t_A, t_A = R_BA_int * t_B
where
R_AB_ext : 3x2, R_BA_ext : 2x3
R_AB_int : 3x2, R_BA_int : 2x3
are regridding matrices. This formalism is clear, elegant, and simple.
What I did in PART 1 was to build the intersect for you with a GIS and to store in the MAT file all the parameters/information that you need for building these regridding matrices.

More Answers (0)

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!