Open
Labels: documentation, enhancement
Description
Väestörekisterikeskus (the Finnish Population Register Centre) publishes an annual dataset containing all buildings in Finland. The data is a zipped, semicolon-delimited file with an .OPT extension and about 3.6 million rows. It can be read and processed in R (slowly) with the following code:
# 2019
library(dplyr)
library(sp)
library(sf)
# Download the zipped address file to a temporary location and extract it
tmpfile <- tempfile()
tmpdir <- tempdir()
download.file("https://www.avoindata.fi/data/dataset/cf9208dc-63a9-44a2-9312-bbd2c3952596/resource/ae13f168-e835-4412-8661-355ea6c4c468/download/suomi_osoitteet_2019-05-15.zip",
              destfile = tmpfile)
unzip(zipfile = tmpfile,
      exdir = tmpdir)
# The file is semicolon-delimited and has no header row
opt <- read.csv(file.path(tmpdir, "Suomi_osoitteet_2019-05-15.OPT"),
                sep = ";",
                stringsAsFactors = FALSE,
                header = FALSE)
names(opt) <- c("rakennustu", "sijaintiku",
                "sijaintima", "rakennusty",
                "CoordY", "CoordX",
                "osoitenume", "katunimi_f",
                "katunimi_s", "katunumero",
                "postinumer", "vaalipiirikoodi",
                "vaalipiirinimi", "tyhja",
                "idx", "date")
if (FALSE) { # subsetting just to make conversions faster
  opt_orig <- as_tibble(opt)
  opt <- sample_n(opt_orig, size = 2000)
}
opt$katunimi_f <- iconv(opt$katunimi_f, from = "windows-1252", to = "UTF-8")
opt$katunimi_s <- iconv(opt$katunimi_s, from = "windows-1252", to = "UTF-8")
opt$katunumero <- iconv(opt$katunumero, from = "windows-1252", to = "UTF-8")
opt$vaalipiirinimi <- iconv(opt$vaalipiirinimi, from = "windows-1252", to = "UTF-8")
# Build a spatial points object; the coordinates are in ETRS-TM35FIN (EPSG:3067)
sp.data <- SpatialPointsDataFrame(opt[, c("CoordX", "CoordY")],
                                  opt,
                                  proj4string = CRS(SRS_string = "EPSG:3067"))
# Project the spatial data to lat/lon
# sp.data <- spTransform(sp.data, CRS("+proj=longlat +datum=WGS84"))
shape <- st_as_sf(sp.data)
head(st_coordinates(shape)) # quick check; printing every row would flood the console
# shape %>% select(rakennustu) %>% plot()
saveRDS(shape, file = "./sf19_buildings.RDS")
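The round trip through sp could probably be skipped, since sf can build the point geometry straight from the coordinate columns. The following is only a minimal sketch on a made-up two-row stand-in for `opt` (the names `opt_demo`, `shape_demo` and `shape_wgs84` and all coordinate values are assumptions for illustration); whether it is noticeably faster on the full 3.6 million rows has not been tested here.

```r
library(sf)

# Toy stand-in for the `opt` data frame; the values are made up.
opt_demo <- data.frame(
  rakennustu = c("A1", "B2"),
  CoordY = c(6672000, 6822000),
  CoordX = c(385000, 327000),
  stringsAsFactors = FALSE
)

# Build the sf object directly from the coordinate columns.
# remove = FALSE keeps CoordX/CoordY as ordinary columns, which matches
# what the SpatialPointsDataFrame route above produces.
shape_demo <- st_as_sf(opt_demo,
                       coords = c("CoordX", "CoordY"),
                       crs = 3067,
                       remove = FALSE)

# Optional: reproject to WGS84 longitude/latitude
shape_wgs84 <- st_transform(shape_demo, 4326)
```

On the real data, the same call with `opt` in place of `opt_demo` would stand in for the `SpatialPointsDataFrame()` and `st_as_sf(sp.data)` steps.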
Any ideas on how to incorporate this into geofi? It is useful, for instance, when geocoding sensitive addresses.
However, this would require storage, because the data should be preprocessed. Do you think this is suitable data for geofi, and should we create a data repository such as geofi_data?
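To make the geocoding use case concrete, here is a rough sketch of how a local lookup against the preprocessed table might work, so that addresses never leave the machine. `geocode_local()` is a hypothetical helper, not an existing geofi function, and the `buildings` and `addresses` tables contain made-up values. The sketch also assumes that `read.csv()` parsed `postinumer` as an integer, in which case the leading zeros need to be restored before matching.

```r
# Toy stand-in for the preprocessed building table; every value is made up.
buildings <- data.frame(
  katunimi_f = c("Mannerheimintie", "Mannerheimintie", "Aleksanterinkatu"),
  katunumero = c("1", "2", "5"),
  postinumer = c(100L, 100L, 100L), # integer, as read.csv() would parse "00100"
  CoordX = c(385700, 385650, 386100),
  CoordY = c(6672200, 6672300, 6672000),
  stringsAsFactors = FALSE
)

# Restore five-digit postal codes with leading zeros
buildings$postinumer <- formatC(buildings$postinumer, width = 5, flag = "0")

# Hypothetical helper: join addresses to building coordinates locally.
# Addresses without a match get NA coordinates.
geocode_local <- function(addresses, buildings) {
  key <- c("katunimi_f", "katunumero", "postinumer")
  merge(addresses,
        buildings[c(key, "CoordX", "CoordY")],
        by = key,
        all.x = TRUE,
        sort = FALSE)
}

addresses <- data.frame(
  katunimi_f = c("Aleksanterinkatu", "Tuntematonkatu"),
  katunumero = c("5", "9"),
  postinumer = c("00100", "00100"),
  stringsAsFactors = FALSE
)

result <- geocode_local(addresses, buildings)
```

An exact join like this returns one row per matching building, so an address shared by several buildings would appear more than once. Real addresses would also need some normalisation (case, whitespace, staircase letters) before matching.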