-
Notifications
You must be signed in to change notification settings - Fork 20
/
interactive-maps.Rmd
444 lines (349 loc) · 20.9 KB
/
interactive-maps.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
# Interactive maps
```{r, echo=FALSE}
detachAllPackages()
#library(webshot)
#library(htmlwidgets)
```
For this chapter you'll need the following files, which are available for download [here](https://github.com/jacobkap/crimebythenumbers/tree/master/data): san_francisco_marijuana_geocoded.csv and sf_neighborhoods_suicide.rda.
While maps of data are useful, their ability to show incident-level information is quite limited. They tend to show broad trends - where crime happened in a city - rather than provide information about specific crime incidents. While broad trends are important, there are significant drawbacks about being unable to get important information about an incident without having to check the data. An interactive map bridges this gap by showing trends while allowing you to zoom into individual incidents and see information about each incident.
For this lesson we will be using data on every marijuana dispensary in San Francisco that has an active dispensary license as of late September 2019. The file is called "san_francisco_marijuana_geocoded.csv".
When downloaded from California's Bureau of Cannabis Control ([here](https://aca5.accela.com/bcc/customization/bcc/cap/licenseSearch.aspx) if you're interested) the data contains the address of each dispensary but does not have coordinates. Without coordinates we are unable to map points, meaning we need to geocode them. Geocoding is the process of taking an address and getting the longitude and latitude of that address for mapping. For this lesson I've already geocoded the data, and we'll learn how to do so in Chapter \@ref(geocoding).
```{r}
library(readr)
marijuana <- read_csv("data/san_francisco_marijuana_geocoded.csv")
marijuana <- as.data.frame(marijuana)
```
## Why do interactive graphs matter?
### Understanding your data
The most important thing to learn from this book is that understanding your data is crucial to good research. Making interactive maps is a very useful way to better understand your data as you can immediately see geographic patterns and quickly look at characteristics of those incidents to understand them.
In this lesson we will make a map of each marijuana dispensary in San Francisco that lets you click on the dispensary and see some information about it. If we see a cluster of dispensaries, we can click on each one to see if they are similar - for example, if owned by the same person. Though it is possible to find these patterns just looking at the data, it is easier to be able to see a geographic pattern and immediately look at information about each incident.
### Police departments use them
Interactive maps are popular in large police departments, such as Philadelphia and New York City. They allow easy understanding of geographic patterns in the data and, importantly, allow such access to people who do not have the technical skills necessary to interact with the data itself. If nothing else, learning interactive maps may help you with a future job.
## Making the interactive map
As usual, let's take a look at the top 6 rows of the data.
```{r}
head(marijuana)
```
This data has information about the type of license, who the owner is, and where the dispensary is (as an address and as coordinates). We'll be making a map showing every dispensary in the city and make it so when you click a dot it'll make a popup showing information about that dispensary.
We will use the package `leaflet` for our interactive map. `leaflet` produces maps similar to Google Maps with circles (or any icon we choose) for each value we add to the map. It allows you to zoom in, scroll around, and provides context to each incident that isn't available on a static map.
```{r, eval = FALSE}
install.packages("leaflet")
```
```{r}
library(leaflet)
```
To make a `leaflet` map we need to run the function `leaflet()` and add a tile to the map. We can just use the default tile which doesn't need an input. If you're interested in other tiles, please see this [website](https://leaflet-extras.github.io/leaflet-providers/preview/).
We will use a standard tile from Open Street Maps. This tile gives street names and highlights important features such as parks and large stores which provides useful contexts for looking at the data.
```{r, eval = FALSE}
leaflet() %>%
addTiles()
```
```{r, echo = FALSE, eval = FALSE}
m <- leaflet() %>%
addTiles()
saveWidget(m, "leaflet_map.html", selfcontained = FALSE)
webshot("leaflet_map.html",
file = "images/leaflet_map1.png",
cliprect = "viewport")
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_map1.png")
```
When you run the above code it shows a world map (copied several times). Zoom into it, and it'll start showing relevant features of wherever you're looking.
Note the `%>%` between the `leaflet()` function and the `addTiles()` function. `leaflet` is one of the packages in R where we can use pipes.
To add the points to the graph we use the function `addMarkers()`, which has two parameters, `lng` and `lat`. For both parameters we put the column in which the longitude and latitude are, respectively.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addMarkers(lng = marijuana$long,
lat = marijuana$lat)
```
```{r, echo = FALSE, eval = FALSE}
m <- leaflet() %>%
addTiles() %>%
addMarkers(lng = marijuana$long,
lat = marijuana$lat)
saveWidget(m, "leaflet_map.html", selfcontained = FALSE)
webshot("leaflet_map.html",
file = "images/leaflet_map2.png",
cliprect = "viewport")
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_map2.png")
```
It now adds an icon indicating where every dispensary in our data is. You can zoom in and scroll around to see more about where the dispensaries are. There are only a few dozen locations in the data so the popups overlapping a bit doesn't affect our map too much. If we had more - such as crime data with millions of offenses - it would make it very hard to read. To change the icons to circles we can change the function `addMarkers()` to `addCircleMarkers()`, keeping the rest of the code the same.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat)
```
```{r, echo = FALSE, eval = FALSE}
m <- leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat)
saveWidget(m, "leaflet_map.html", selfcontained = FALSE)
webshot("leaflet_map.html",
file = "images/leaflet_map3.png",
cliprect = "viewport")
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_map3.png")
```
This makes the icon into circles, which take up less space than icons. To adjust the size of our icons we use the `radius` parameter in `addMarkers()` or `addCircleMarkers()`. The larger the radius, the larger the icons.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5)
```
```{r, echo = FALSE, eval = FALSE}
m <- leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5)
saveWidget(m, "leaflet_map.html", selfcontained = FALSE)
webshot("leaflet_map.html",
file = "images/leaflet_map4.png",
cliprect = "viewport")
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_map4.png")
```
Setting the `radius` option to 5 shrinks the size of the icon a lot. In your own maps you'll have to fiddle with this option to get it to look the way you want. Let's move on to adding information about each icon when clicked upon.
## Adding popup information
The parameter `popup` in the `addMarkers()` or `addCircleMarkers()` functions lets you input a character value (if not already a character value it will convert it to one) and that will be shown as a popup when you click on the icon. Let's start simple here by inputting the business owner column in our data and then build it up to a more complicated popup.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5,
popup = marijuana$Business_Owner)
```
```{r, echo = FALSE}
knitr::include_graphics("images/interactive_popup1.PNG")
```
Try clicking around and you'll see that the owner of the dispensary you clicked on appears over the dot. If you're reading the print version of this book you won't, of course, be able to click on the map. We usually want to have a title indicating what the value in the popup means. We can do this by using the `paste()` function to combine text explaining the value with the value itself. Let's add the words "Business Owner:" before the business owner column.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5,
popup = paste("Business Owner:",
marijuana$Business_Owner))
```
```{r, echo = FALSE}
knitr::include_graphics("images/interactive_popup2.PNG")
```
We don't have too much information in the data, but let's add the address and license number to the popup by adding them to the `paste()` function we're using.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5,
popup = paste("Business Owner:",
marijuana$Business_Owner,
"Address:",
marijuana$Premise_Address,
"License:",
marijuana$License_Number))
```
```{r, echo = FALSE}
knitr::include_graphics("images/interactive_popup3.PNG")
```
Just adding the location text makes it try to print out everything on one line, which is hard to read. If we add the text `<br>` where we want a line break, it will make one. `<br>` is the HTML tag for line-break, which is why it works making a new line in this case.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5,
popup = paste("Business Owner:",
marijuana$Business_Owner,
"<br>",
"Address:",
marijuana$Premise_Address,
"<br>",
"License:",
marijuana$License_Number))
```
```{r, echo = FALSE}
knitr::include_graphics("images/interactive_popup4.PNG")
```
## Dealing with too many markers
In our case with only 33 rows of data, turning the markers to circles solves our visibility issue. In cases with many more rows of data, this doesn't always work. A solution for this is to cluster the data into groups where the dots only show if you zoom in.
If we add the code `clusterOptions = markerClusterOptions()` to our `addCircleMarkers()` it will cluster for us.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5,
popup = paste("Business Owner:",
marijuana$Business_Owner,
"<br>",
"Address:",
marijuana$Premise_Address,
"<br>",
"License:",
marijuana$License_Number),
clusterOptions = markerClusterOptions())
```
```{r, echo = FALSE, eval = FALSE}
m <- leaflet() %>%
addTiles() %>%
addCircleMarkers(lng = marijuana$long,
lat = marijuana$lat,
radius = 5,
popup = paste("Business Owner:",
marijuana$Business_Owner,
"<br>",
"Address:",
marijuana$Premise_Address,
"<br>",
"License:",
marijuana$License_Number),
clusterOptions = markerClusterOptions())
saveWidget(m, "leaflet_map.html", selfcontained = FALSE)
webshot("leaflet_map.html",
file = "images/leaflet_map_temp.png",
cliprect = "viewport")
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_map_temp.png")
```
Locations close to each other are grouped together in fairly arbitrary groupings, and we can see how large each grouping is by moving our cursor over the circle. Click on a circle or zoom in and it will show smaller groupings at lower levels of aggregation. Keep clicking or zooming in, and it will eventually show each location as its own circle.
This method is very useful for dealing with huge amounts of data as it avoids overflowing the map with too many icons at one time. A downside, however, is that the clusters are created arbitrarily meaning that important context, such as neighborhood, can be lost.
## Interactive choropleth maps
In Chapter \@ref(choropleth-maps) we worked on choropleth maps which are maps with shaded regions, such as states colored by which political party won them in an election. Here we will make interactive choropleth maps where you can click on a shaded region and see information about that region. We'll make the same map as before - neighborhoods shaded by the number of suicides.
Let's load the San Francisco suicides-by-neighborhood data that we made earlier. We'll also want to project it to the standard longitude and latitude projection, otherwise our map won't work right.
```{r}
library(sf)
load("data/sf_neighborhoods_suicide.rda")
sf_neighborhoods_suicide <- st_transform(sf_neighborhoods_suicide,
"+proj=longlat +datum=WGS84")
```
We'll begin the `leaflet` map similar to before but use the function `addPolygons()`, and our input here is the geometry column of *sf_neighborhoods_suicide*.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addPolygons(data = sf_neighborhoods_suicide$geometry)
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_choropleth1.png")
```
It made a map with thick blue lines indicating each neighborhood. Let's change the appearance of the graph a bit before making a popup or shading the neighborhoods The parameter `color` in `addPolygons()` changes the color of the lines - let's change it to black. The lines are also very thick, blurring into each other and making the neighborhoods hard to see. We can change the `weight` parameter to alter the size of these lines - smaller values are thinner lines. Let's try setting this to 1.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addPolygons(data = sf_neighborhoods_suicide$geometry,
color = "black",
weight = 1)
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_choropleth2.png")
```
That looks better and we can clearly distinguish each neighborhood now.
As we did earlier, we can add the popup text directly to the function which makes the geographic shapes, in this case `addPolygons()`. Let's add the *nhood* column value - the name of that neighborhood - and the number of suicides that occurred in that neighborhood. As before, when we click on a neighborhood a popup appears with the output we specified.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addPolygons(data = sf_neighborhoods_suicide$geometry,
col = "black",
weight = 1,
popup = paste0("Neighborhood: ",
sf_neighborhoods_suicide$nhood,
"<br>",
"Number of Suicides: ",
sf_neighborhoods_suicide$number_suicides))
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_choropleth3.png")
```
For these types of maps we generally want to shade each polygon to indicate how frequently the event occurred in the polygon. We'll use the function `colorNumeric()`, which takes a lot of the work out of the process of coloring in the map. This function takes two inputs, first a color palette, which we can get from the site [Color Brewer.](http://colorbrewer2.org/#type=sequential&scheme=OrRd&n=3) Let's use the fourth bar in the Sequential page, which is light orange to red. If you look in the section with each HEX value it says that the palette is "3-class OrRd." The "3-class" just means we selected 3 colors, the "OrRd" is the part we want. That will tell `colorNumeric()` to make the palette using these colors. The second parameter is the column for our numeric variable, *number_suicides*.
We will save the output of `colorNumeric("OrRd", sf_neighborhoods_suicide$number_suicides)` as a new object, which we'll call *pal* for convenience since it is a palette of colors. Then inside of `addPolygons()` we'll set the parameter `fillColor` to `pal(sf_neighborhoods_suicide$number_suicides)`, running this function on the column. What this really does is determine which color every neighborhood should be based on the value in the *number_suicides* column.
```{r, eval = FALSE}
pal <- colorNumeric("OrRd", sf_neighborhoods_suicide$number_suicides)
leaflet() %>%
addTiles() %>%
addPolygons(data = sf_neighborhoods_suicide$geometry,
col = "black",
weight = 1,
popup = paste0("Neighborhood: ",
sf_neighborhoods_suicide$nhood,
"<br>",
"Number of Suicides: ",
sf_neighborhoods_suicide$number_suicides),
fillColor = pal(sf_neighborhoods_suicide$number_suicides))
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_choropleth4.png")
```
Since the neighborhoods are transparent, it is hard to distinguish which color is shown. We can make each neighborhood a solid color by setting the parameter `fillOpacity` inside of `addPolygons()` to 1.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addPolygons(data = sf_neighborhoods_suicide$geometry,
col = "black",
weight = 1,
popup = paste0("Neighborhood: ",
sf_neighborhoods_suicide$nhood,
"<br>",
"Number of Suicides: ",
sf_neighborhoods_suicide$number_suicides),
fillColor = pal(sf_neighborhoods_suicide$number_suicides),
fillOpacity = 1)
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_choropleth5.png")
```
To add a legend to this we use the function `addLegend()`, which takes three parameters. `pal` asks which color palette we are using - we want it to be the exact same as we use to color the neighborhoods, so we'll use the *pal* object we made. The `values` parameter is used for which column our numeric values are from, in our case the *number_suicides* column so we'll input that. Finally `opacity` determines how transparent the legend will be. As each neighborhood is set to not be transparent at all, we'll also set this to 1 to be consistent.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addPolygons(data = sf_neighborhoods_suicide$geometry,
col = "black",
weight = 1,
popup = paste0("Neighborhood: ",
sf_neighborhoods_suicide$nhood,
"<br>",
"Number of Suicides: ",
sf_neighborhoods_suicide$number_suicides),
fillColor = pal(sf_neighborhoods_suicide$number_suicides),
fillOpacity = 1) %>%
addLegend(pal = pal,
values = sf_neighborhoods_suicide$number_suicides,
opacity = 1)
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_choropleth6.png")
```
Finally, we can add a title to the legend using the `title` parameter inside of `addLegend()`.
```{r, eval = FALSE}
leaflet() %>%
addTiles() %>%
addPolygons(data = sf_neighborhoods_suicide$geometry,
col = "black",
weight = 1,
popup = paste0("Neighborhood: ",
sf_neighborhoods_suicide$nhood,
"<br>",
"Number of Suicides: ",
sf_neighborhoods_suicide$number_suicides),
fillColor = pal(sf_neighborhoods_suicide$number_suicides),
fillOpacity = 1) %>%
addLegend(pal = pal,
values = sf_neighborhoods_suicide$number_suicides,
opacity = 1,
title = "Suicides") %>%
addProviderTiles(providers$CartoDB.Positron)
```
```{r, echo = FALSE}
knitr::include_graphics("images/leaflet_choropleth7.png")
```