class: center, middle, inverse, title-slide # Introduction to geocomputation with R ##
### Jakub Nowosad
(
https://nowosad.github.io
) ### CinDay RUG, 2018-05-22 --- ### Geocomputation - **GIS** - A **G**eographic **I**nformation **S**ystem - analysis, manipulation and visualization of geographical data (Longley, Goodchild, Maguire, and Rhind, 2015) - GIS software - GRASS GIS, SAGA GIS, QGIS, ArcGIS (commercial) <table> <thead> <tr> <th style="text-align:left;"> Attribute </th> <th style="text-align:left;"> GIS </th> <th style="text-align:left;"> Geocomputation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Home disciplines </td> <td style="text-align:left;"> Geography </td> <td style="text-align:left;"> Geography, Computing, Statistics </td> </tr> <tr> <td style="text-align:left;"> Software focus </td> <td style="text-align:left;"> Graphical User Interface </td> <td style="text-align:left;"> Code </td> </tr> <tr> <td style="text-align:left;"> Reproduciblility </td> <td style="text-align:left;"> Minimal </td> <td style="text-align:left;"> Maximal </td> </tr> </tbody> </table> - Geographic data is represented by **spatial coordinates** - There are many **Coordinate Reference Systems (CRS)**, which defines how the spatial elements of the data relate to the surface of the Earth (or other bodies) - There are two main models for representing geographic data - the **vector** and the **raster** data model --- ### Spatial R history - Spatial packages already available in the S language since the 1990s (Bivand and Gebhardt, 2000) - By 2000, modifications of these became R packages for point pattern analysis, geostatistics, exploratory - spatial data analysis and spatial econometrics - R-GIS bridges (Bivand, 2000) - Bivand (2003) proposed a spatial data class system for R which eventually led to the packages **rgdal** (first released in 2003; Bivand, Keitt, and Rowlingson, 2018) and **sp** (first released in 2005; Bivand, Pebesma, and Gomez-Rubio, 2013) - 2008: *Applied Spatial Data Analysis with R* (Bivand, Pebesma, and Gomez-Rubio, 2013) - 2010: **raster** package (Hijmans, 2017) - 2011: **rgeos** package (Bivand and Rundel, 2017) - 2014-15: Spatial visualization - **tmap** (Tennekes, 2018), **rasterVis** (Lamigueiro, 2014), **leaflet** (Cheng, Karambelkar, and Xie, 2018) - 2016-17: **sf** - simple features for R (Pebesma, 2018a) - 2017-18: **stars** - spatiotemporal tidy arrays for R (Pebesma, 2018b) --- ### Spatial R ![](figs/vecras-1.png)<!-- --> - **sf** and **sp** are the most important R packages to handle vector data; **sf** is a successor of **sp** but its still evolving. Moreover, many other R packages depend on the functions and classes for the **sp** package - **raster** is an extension of spatial data classes to work with rasters - It is also easy to connect R with a GIS software - GRASS GIS (**rgrass7**), SAGA GIS (**RSAGA**), QGIS (**RQGIS** and **qgisremote**), and ArcGIS (**arcgisbinding**) --- ### The **sf** package The **sf** package in an R implementation of [Simple Features](https://en.wikipedia.org/wiki/Simple_Features). This package incorporates: - A relatively new spatial data class system in R - Functions for reading and writing data - Tools for spatial operations on vectors Most of the functions in this package start with a prefix `st_`. ```r devtools::install_github("r-spatial/sf") # development version ``` or ```r install.packages("sf") # stable version ``` You need a recent version of the GDAL, GEOS, Proj.4, and UDUNITS libraries installed for this to work on Mac and Linux. More information on that at https://github.com/r-spatial/sf. --- ### How to put this data on the map? - I've got some non-spatial data - How can I **add spatial information to my data**? - How to visualize my locations on a map? ```r library(readr) my_df = read_csv("breweries.csv") ``` <table> <thead> <tr> <th style="text-align:left;"> brewery </th> <th style="text-align:right;"> no_of_beers </th> <th style="text-align:right;"> revenue </th> <th style="text-align:right;"> profit </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Listermann Brewing Company </td> <td style="text-align:right;"> 999 </td> <td style="text-align:right;"> 7986 </td> <td style="text-align:right;"> 513 </td> </tr> <tr> <td style="text-align:left;"> Rhinegeist Brewery </td> <td style="text-align:right;"> 5155 </td> <td style="text-align:right;"> 4651 </td> <td style="text-align:right;"> 651 </td> </tr> <tr> <td style="text-align:left;"> The Woodburn Brewery </td> <td style="text-align:right;"> 3111 </td> <td style="text-align:right;"> 5156 </td> <td style="text-align:right;"> 879 </td> </tr> <tr> <td style="text-align:left;"> Urban Artifact </td> <td style="text-align:right;"> 710 </td> <td style="text-align:right;"> 8847 </td> <td style="text-align:right;"> 634 </td> </tr> </tbody> </table> > Note: All the data presented here (except the names) are artificial! --- ### Spatial data - Usually, spatial files are stored in special data formats, such as ESRI Shapefile (.shp) or **GeoPackage (.gpkg)** - Vector data can be represented by areas (**polygons**), **lines**, or **points** ```r library(sf) breweries_sf1 = st_read("breweries_sf1.gpkg", quiet = TRUE) breweries_sf1 ``` ``` ## Simple feature collection with 2 features and 1 field ## geometry type: POINT ## dimension: XY ## bbox: xmin: 1327662 ymin: -529531.6 xmax: 1331125 ymax: -525580 ## epsg (SRID): NA ## proj4string: +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs ## name geom ## 1 Listermann Brewing Company POINT (1331125 -525580) ## 2 Rhinegeist Brewery POINT (1327662 -529531.6) ``` --- ### Geocode - Only two of my locations exist in my spatial file - How can I get coordinates of my missing breweries? - The answer is **geocoding** - converting names/addresses into spatial coordinates ```r library(tibble) library(ggmap) breweries_df2 = data_frame(name = c("The Woodburn Brewery", "Urban Artifact")) breweries_df2 = mutate_geocode(breweries_df2, name) ``` ```r breweries_df2 ``` ``` ## # A tibble: 2 x 3 ## name lon lat ## <chr> <dbl> <dbl> ## 1 The Woodburn Brewery -84.5 39.1 ## 2 Urban Artifact -84.5 39.2 ``` - Also try the **opencage** package - https://github.com/ropensci/opencage --- ### Convert to spatial object ```r breweries_df2 ``` ``` ## # A tibble: 2 x 3 ## name lon lat ## <chr> <dbl> <dbl> ## 1 The Woodburn Brewery -84.5 39.1 ## 2 Urban Artifact -84.5 39.2 ``` ```r breweries_sf2 = st_as_sf(breweries_df2, coords = c("lon", "lat"), crs = 4326) breweries_sf2 ``` ``` ## Simple feature collection with 2 features and 1 field ## geometry type: POINT ## dimension: XY ## bbox: xmin: -84.542 ymin: 39.12895 xmax: -84.47681 ymax: 39.16068 ## epsg (SRID): 4326 ## proj4string: +proj=longlat +datum=WGS84 +no_defs ## # A tibble: 2 x 2 ## name geometry ## <chr> <POINT [°]> ## 1 The Woodburn Brewery (-84.47681 39.12895) ## 2 Urban Artifact (-84.542 39.16068) ``` --- ### Unify and combine - Transform to the same spatial projection (`st_transform`) and unify the columns names (`rename`): ```r breweries_sf1 = st_transform(breweries_sf1, crs = st_crs(breweries_sf2)) breweries_sf1 = dplyr::rename(breweries_sf1, geometry = geom) ``` - Bind the spatial datasets (`rbind`): ```r breweries_sf = rbind(breweries_sf1, breweries_sf2) breweries_sf ``` ``` ## Simple feature collection with 4 features and 1 field ## geometry type: POINT ## dimension: XY ## bbox: xmin: -84.542 ymin: 39.11727 xmax: -84.4724 ymax: 39.16068 ## epsg (SRID): 4326 ## proj4string: +proj=longlat +datum=WGS84 +no_defs ## name geometry ## 1 Listermann Brewing Company POINT (-84.4724 39.14628) ## 2 Rhinegeist Brewery POINT (-84.5201 39.11727) ## 3 The Woodburn Brewery POINT (-84.47681 39.12895) ## 4 Urban Artifact POINT (-84.542 39.16068) ``` --- ### Join - Join the spatial and non-spatial datasets: ```r library(dplyr) my_breweries = left_join(breweries_sf, my_df, by = c("name" = "brewery")) my_breweries ``` ``` ## Simple feature collection with 4 features and 4 fields ## geometry type: POINT ## dimension: XY ## bbox: xmin: -84.542 ymin: 39.11727 xmax: -84.4724 ymax: 39.16068 ## epsg (SRID): 4326 ## proj4string: +proj=longlat +datum=WGS84 +no_defs ## name no_of_beers revenue profit ## 1 Listermann Brewing Company 999 7986 513 ## 2 Rhinegeist Brewery 5155 4651 651 ## 3 The Woodburn Brewery 3111 5156 879 ## 4 Urban Artifact 710 8847 634 ## geometry ## 1 POINT (-84.4724 39.14628) ## 2 POINT (-84.5201 39.11727) ## 3 POINT (-84.47681 39.12895) ## 4 POINT (-84.542 39.16068) ``` --- ### Plot - There are several ways to visualize spatial data in R - as a static map (*plot()*, **ggplot2**, **tmap**) and interactive one (**leaflet**, **mapview**, **tmap**) ```r library(mapview) mapview(my_breweries) ```
--- class: middle ![](figs/geocompr.png) The online version of the book is at http://geocompr.robinlovelace.net/ and its source code at https://github.com/robinlovelace/geocompr. We encourage contributions on any part of the book, including: - Improvements to the text, e.g. clarifying unclear sentences, fixing typos (see guidance from [Yihui Xie](https://yihui.name/en/2013/06/fix-typo-in-documentation/)) - Changes to the code, e.g. to do things in a more efficient way - Suggestions on content (see the projects [issue tracker](https://github.com/Robinlovelace/geocompr/issues) Please see [our-style.md](https://github.com/Robinlovelace/geocompr/blob/master/our-style.md) for the book's style. --- class: middle # Thanks!
jakub_nowosad
https://geocompr.robinlovelace.net/