Skip to content

Commit c8018ef

Browse files
authored
Merge pull request #25 from justinkadi/2025-04-arctic
Updating geopandas chapter
2 parents 65e0802 + 70fd0a8 commit c8018ef

File tree

1 file changed

+21
-16
lines changed

1 file changed

+21
-16
lines changed

sections/geopandas.qmd

Lines changed: 21 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -86,21 +86,26 @@ The upper left panel of the figure above shows some satellite imagery data. Thes
8686
This is all great, and the array of values is a lot of information, but there are some key items that are missing. This array isn't imaginary, it represents a physical space on this earth, so where is all of that contextual information? The answer is in the `rasterio` profile object. This object contains all of the metadata needed to interpret the raster array. Here is what our `ships_meta` contains:
8787

8888
```
89-
'driver': 'GTiff',
90-
'dtype': 'float32',
91-
'nodata': -3.3999999521443642e+38,
92-
'width': 3087,
93-
'height': 2308,
94-
'count': 1,
95-
'crs': CRS.from_epsg(3338),
96-
'transform': Affine(999.7994153462766, 0.0, -2550153.29233849, 0.0, -999.9687691991521, 2711703.104608573),
97-
'tiled': False,
98-
'compress': 'lzw',
99-
'interleave': 'band'}
89+
{'blockxsize': 3087,
90+
'blockysize': 1,
91+
'compress': 'lzw',
92+
'count': 1,
93+
'crs': CRS.from_wkt('PROJCS["unnamed",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101004,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Albers_Conic_Equal_Area"],PARAMETER["latitude_of_center",50],PARAMETER["longitude_of_center",-154],PARAMETER["standard_parallel_1",55],PARAMETER["standard_parallel_2",65],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]]'),
94+
'driver': 'GTiff',
95+
'dtype': 'float32',
96+
'height': 2308,
97+
'interleave': 'band',
98+
'nodata': -3.3999999521443642e+38,
99+
'tiled': False,
100+
'transform': Affine(999.7994153462766, 0.0, -2550153.29233849,
101+
0.0, -999.9687691991521, 2711703.104608573),
102+
'width': 3087}
100103
```
101104

102105
This object gives us critical information, like the CRS of the data, the no data value, and the transform. The transform is what allows us to move from image pixel (row, column) coordinates to and from geographic/projected (x, y) coordinates. The transform and the CRS are critically important, and related. If the CRS are instructions for how the coordinates can be represented in space and on a flat surface (in the case of projected coordinate systems), then the transform describes how to locate the raster array positions in the correct coordinates given by the CRS.
103106

107+
In the Introduction section of this chapter, we mentioned that the CRS of this raster file would be Alaska Albers with an EPSG code of 3338. Along with this CRS being represented by an EPSG code, it can also be represented by a WKT (Well-Known-Text) format, which is what see in the 'crs' section of `ships_meta`. A CRS can be created from the WKT information just like how a CRS can be created from an EPSG code. In this [description of the Alaska Albers CRS](https://epsg.io/3338), we can see the WKT defined at the bottom.
108+
104109
Note that since the array and the profile are in separate objects it is easy to lose track of one of them, accidentally overwrite it, etc. Try to adopt a naming convention that works for you because they usually need to work together in geospatial operations.
105110

106111
## Pre-processing vector data
@@ -169,12 +174,12 @@ comm.plot(figsize=(9,9))
169174
This plot doesn't look so good. Turns out, these data are in WGS 84 (EPSG 4326), as opposed to Alaska Albers (EPSG 3338), which is what our raster data are in. To make pretty plots, and allow our raster data and vector data to be analyzed together, we'll need to reproject the vector data into 3338. To to this, we'll use the `to_crs` method on our `comm` object, and specify as an argument the projection we want to transform to.
170175

171176
```{python}
172-
comm_3338 = comm.to_crs("EPSG:3338")
177+
comm_3338 = comm.to_crs(ships_meta["crs])
173178
174179
comm_3338.plot()
175180
```
176181

177-
Much better!
182+
Much better! Additionally, we could have reprojected the data using `comm_3338 = comm.to_crs("EPSG:3338")`.
178183

179184
## Crop data to area of interest
180185

@@ -189,10 +194,10 @@ coord_box = box(-159.5, 55, -144.5, 62)
189194
190195
coord_box_df = gpd.GeoDataFrame(
191196
crs = 'EPSG:4326',
192-
geometry = [coord_box]).to_crs("EPSG:3338")
197+
geometry = [coord_box]).to_crs(ships_meta["crs"])
193198
```
194199

195-
Now, we can read in raster again cropped to bounding box. We use the `mask` function from `rasterio.mask`. Note that we apply this to the connection to the raster file (`with rasterio.open(...)`), then update the metadata associated with the raster, because the `mask` function requires as its first `dataset` argument a dataset object opened in `r` mode.
200+
Now, we can read in the raster again, but cropped to bounding box. We use the `mask` function from `rasterio.mask`. Note that we apply this to the connection to the raster file (`with rasterio.open(...)`), then update the metadata associated with the raster, because the `mask` function requires as its first `dataset` argument a dataset object opened in `r` mode.
196201

197202

198203
```{python}
@@ -219,7 +224,7 @@ shipc_meta.update({"driver": "GTiff",
219224
"compress": "lzw"})
220225
```
221226

222-
Now we'll do a similar task with the vector data. Tin this case, we use a spatial join. The join will be an inner join, and select only rows from the left side (our fishing districts) that are **within** the right side (our bounding box). I chose this method as opposed to a clipping type operation because it ensures that we don't end up with any open polygons at the boundaries of our box, which could cause problems for us down the road.
227+
Now we'll do a similar task with the vector data. In this case, we use a spatial join. The join will be an inner join, and select only rows from the left side (our fishing districts) that are **within** the right side (our bounding box). I chose this method as opposed to a clipping type operation because it ensures that we don't end up with any open polygons at the boundaries of our box, which could cause problems for us down the road.
223228

224229
```{python}
225230
comm_clip = gpd.sjoin(comm_3338,

0 commit comments

Comments
 (0)