Preface
Who this book is for
This book is for people who want to analyze, visualize and model geographic data with open source software. It is based on R, a statistical programming language that has powerful data processing, visualization and geospatial capabilities. The book covers a wide range of topics and will be of interest to people from many different backgrounds, especially:
- People who have learned spatial analysis skills using a desktop Geographic Information System (GIS), such as QGIS, ArcGIS, GRASS GIS or SAGA, who want access to a powerful (geo)statistical and visualization programming language and the benefits of a command line approach (Sherman 2008):
  “With the advent of ‘modern’ GIS software, most people want to point and click their way through life. That’s good, but there is a tremendous amount of flexibility and power waiting for you with the command line.”
- Graduate students and researchers from fields specializing in geographic data including Geography, Remote Sensing, Planning, GIS and Spatial Data Science
- Academics and post-graduate students working with geographic data (in fields such as Geology, Regional Science, Biology and Ecology, Agricultural Sciences, Archaeology, Epidemiology, Transport Modeling, and broadly defined Data Science) who require the power and flexibility of R for their research
- Applied researchers and analysts in public, private or third-sector organizations who need the reproducibility, speed and flexibility of a command line language such as R in applications dealing with spatial data as diverse as Urban and Transport Planning, Logistics, Geo-marketing (store location analysis) and Emergency Planning
The book is designed for intermediate-to-advanced R users interested in geocomputation and R beginners who have prior experience with geographic data. If you are new to both R and geographic data, do not be discouraged: we provide links to further materials and describe the nature of spatial data from a beginner’s perspective in Chapter 2 and in links provided below.
How to read this book
The book is divided into three parts:
- Part I: Foundations, aimed at getting you up-to-speed with geographic data in R.
- Part II: Advanced techniques, including spatial data visualization, bridges to GIS software, programming with spatial data, and statistical learning.
- Part III: Applications to real-world problems, including transportation, geomarketing and ecological modeling.
The chapters get harder from one part to the next. We recommend reading all chapters in Part I in order before tackling the more advanced topics in Part II and Part III. The chapters in Part II and Part III benefit slightly from being read in order, but can be read independently if you are interested in a specific topic. A major barrier to geographical analysis in R is its steep learning curve. The chapters in Part I aim to address this by providing reproducible code on simple datasets that should ease the process of getting started.
An important aspect of the book from a teaching/learning perspective is the exercises at the end of each chapter. Completing these will develop your skills and equip you with the confidence needed to tackle a range of geospatial problems. Solutions to the exercises can be found in an online booklet that accompanies Geocomputation with R, hosted at r.geocompx.org/solutions. To learn how this booklet was created, and how to update solutions in files such as _01-ex.Rmd, see our blog post on Geocomputation with R solutions. More blog posts and examples can be found at geocompx.org.
Impatient readers are welcome to dive straight into the practical examples, starting in Chapter 2. However, we recommend reading about the wider context of Geocomputation with R in Chapter 1 first. If you are new to R, we also recommend learning more about the language before attempting to run the code chunks provided in each chapter (unless you’re reading the book for an understanding of the concepts). Fortunately for beginners, R has a supportive community that has developed a wealth of resources that can help. We particularly recommend three tutorials: R for Data Science (Grolemund and Wickham 2016), Efficient R Programming (Gillespie and Lovelace 2016), and An introduction to R (R Core Team 2021).
Why R?
Although R has a steep learning curve, the command line approach advocated in this book can quickly pay off. As you’ll learn in subsequent chapters, R is an effective tool for tackling a wide range of geographic data challenges. We expect that, with practice, R will become the program of choice in your geospatial toolbox for many applications. Typing and executing commands at the command line is, in many cases, faster than pointing-and-clicking around the graphical user interface (GUI) of a desktop GIS. For some applications such as Spatial Statistics and modeling, R may be the only realistic way to get the work done.
As outlined in Section 1.3, there are many reasons for using R for geocomputation:
- Compared with other languages, R is well suited to the interactive use required in many geographic data analysis workflows.
- R excels in the rapidly growing fields of Data Science (which includes data carpentry, statistical learning techniques and data visualization) and Big Data (via efficient interfaces to databases and distributed computing systems).
Furthermore, R enables a reproducible workflow: sharing scripts underlying your analysis will allow others to build on your work.
To ensure reproducibility in this book, we have made its source code available at github.com/geocompx/geocompr.
There you will find script files in the code/ folder that generate figures: when code generating a figure is not provided in the main text of the book, the name of the script file that generated it is provided in the caption (see for example the caption for Figure 13.2).
Other languages such as Python, Java and C++ can be used for geocomputation. There are excellent resources for learning geocomputation without R, as discussed in Section 1.4. None of these provide the unique combination of package ecosystem, statistical capabilities, and visualization options offered by the R community. Furthermore, by teaching how to use one language (R) in depth, this book will equip you with the concepts and confidence needed to do geocomputation in other languages.
Real-world impact
Geocomputation with R will equip you with knowledge and skills to tackle a wide range of issues, including those with scientific, societal and environmental implications, manifested in geographic data. As described in Section 1.1, geocomputation is not only about using computers to process geographic data; it is also about real-world impact. The wider context and motivations underlying this book are covered in Chapter 1.
Acknowledgments
Many thanks to everyone who contributed directly and indirectly via the code hosting and collaboration site GitHub, including the following people who contributed directly via pull requests: prosoitos, tibbles-and-tribbles, florisvdh, babayoshihiko, katygregg, Lvulis, rsbivand, iod-ine, KiranmayiV, cuixueqin, defuneste, smkerr, zmbc, marcosci, darrellcarvalho, dcooley, FlorentBedecarratsNM, erstearns, appelmar, MikeJohnPage, eyesofbambi, krystof236, nickbearman, tylerlittlefield, sdesabbata, howardbaik, edzer, pat-s, giocomai, KHwong12, LaurieLBaker, eblondel, MarHer90, mdsumner, ahmohil, richfitz, VLucet, wdearden, yihui, adambhouston, chihinl, cshancock, e-clin, ec-nebi, gregor-d, jasongrahn, p-kono, pokyah, schuetzingit, tim-salabim, tszberkowitz, vlarmet, ateucher, annakrystalli, andtheWings, kant, gavinsimpson, Himanshuteli, yutannihilation, jimr1603, jbixon13, jkennedyie, olyerickson, yvkschaefer, katiejolly, kwhkim, layik, mpaulacaldas, mtennekes, mvl22, and ganes1410.
Thanks to Marco Sciaini who created the front cover image for the first edition and to Benjamin Nowak who created the cover image for the second edition.
See code/frontcover.R and code/frontcover2.R for the reproducible code that generated these visualizations.
Dozens more people contributed online, by raising and commenting on issues, and by providing feedback via social media.
The #geocompr and geocompx hashtags will live on!
We would like to thank John Kimmel and Lara Spieker from CRC Press and Taylor & Francis for taking our ideas from an early book plan into production via four rounds of peer review for each edition. The reviewers deserve special mention here: their detailed feedback and expertise substantially improved the book’s structure and content.
We thank Patrick Schratz and Alexander Brenning from the University of Jena for fruitful discussions on and contributions to Chapters 12 and 15. We thank Emmanuel Blondel from the Food and Agriculture Organization of the United Nations for expert contributions to the section on web services; Michael Sumner for critical contributions to many areas of the book, especially the discussion of algorithms in Chapter 11; Tim Appelhans, David Cooley and Kiranmayi Vadlamudi for key contributions to the visualization chapter (Chapter 9); Marius Appel for his contributions to Chapter 10; and Katy Gregg, who proofread every chapter and greatly improved the readability of the book.
Countless others could be mentioned who contributed in myriad ways. The final thank you is for all the software developers who make geocomputation with R possible. Special thanks go to Edzer Pebesma (who created the sf package), Robert Hijmans (who created terra) and Roger Bivand (who laid the foundations for much R-spatial software), who have made high-performance geographic computing possible in R.