Simple Features for R

I’m probably late to the game, but since I’ve been working a lot with R lately, I finally got around to giving the Simple Features for R package (sf) a proper shot. And boy should I have done that earlier. If you use R with spatial data and haven’t checked it out yet, please do. Here’s a brief list of my favorite features (pun intended):

  • Much faster reading and writing of data
  • No more clumsy working with attribute data – instead of mylayer@data$attribute, just go straight to mylayer$attribute
  • If you work with PostGIS a lot, you’ll feel right at home with the spatial operators (and they automatically use spatial indexes!)
  • Mapping with ggplot2 is much more intuitive than before, using the geom_sf function.
  • The automatic faceted plots by attribute when you use the plain plot function are also pretty cool.
  • If you need to run functions that don’t work on simple feature collections, it is super easy to just convert them to a data frame (my_df <- as.data.frame(my_sf)), run the function, and convert them back (my_sf <- st_as_sf(my_df)) – geometries, CRS, etc. are picked up correctly automatically.
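To make the points about attribute access and the data frame round trip a bit more concrete, here is a minimal sketch. The three points, their coordinates, and the column names are made up for illustration; the only things taken from sf itself are st_as_sf and st_crs, plus base R's as.data.frame.

```r
library(sf)

# Hypothetical example data: three points with one attribute (assumed values).
pts <- data.frame(
  name  = c("a", "b", "c"),
  value = c(10, 20, 30),
  lon   = c(-119.85, -119.70, -119.60),
  lat   = c(34.41, 34.42, 34.44)
)

# Build a simple feature collection in WGS 84 (EPSG:4326).
my_sf <- st_as_sf(pts, coords = c("lon", "lat"), crs = 4326)

# Attributes are ordinary columns, so there is no @data slot to go through.
mean_value <- mean(my_sf$value)

# Round trip: drop the sf class, work on a plain data frame, convert back.
my_df <- as.data.frame(my_sf)
my_df$value_scaled <- my_df$value / max(my_df$value)
my_sf2 <- st_as_sf(my_df)

# The geometry column carries the CRS, so it should survive the round trip.
same_crs <- st_crs(my_sf2) == st_crs(my_sf)
```

The round trip works because as.data.frame keeps the geometry column in place, and st_as_sf picks up that column (and the CRS stored with it) again. If the function you run in between drops or reorders rows, it is worth checking that the geometries still line up with the attributes afterwards.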

I'm sure I'm missing more great stuff, but this is just a first impression after a day of work with sf. Overall, working with spatial data in R feels much more natural with sf, with less extra code and special cases than before. Kudos to Edzer and the other contributors for this one!

Our GIS is too small

Very nice paper by Mark Gahegan. Here’s the abstract:

GIScience and GISystems have been successful in tackling many geographical problems over the last 30 years. But technologies and associated theory can become limiting if they end up defining how we see the world and what we believe are worthy and tractable research problems. This paper explores some of the limitations currently impacting GISystems and GIScience from the perspective of technology and community, contrasting GIScience with other informatics communities and their practices. It explores several themes: (i) GIScience and the informatics revolution; (ii) the lack of a community-owned innovation platform for GIScience research; (iii) the computational limitations imposed by desktop computing and the inability to scale up analysis; (iv) the continued failure to support the temporal dimension, and especially dynamic processes and models with feedbacks; (v) the challenge of embracing a wider and more heterogeneous view of geographical representation and analysis; and (vi) the urgent need to foster an active software development community to redress some of these shortcomings. A brief discussion then summarizes the issues and suggests that GIScience needs to work harder as a community to become more relevant to the broader geographic field and meet a bigger set of representation, analysis, and modelling needs.

Good food for thought at the beginning of the year, even though I do not agree with all of his points. There is currently a lot of progress being made concerning some of the problems he mentions (such as GeoMesa addressing scalability, or the NSF funding the Geospatial Software Institute, to name just two examples). I also don’t think (or at least I hope it isn’t the case) that it is a prevalent position in our field that if it doesn’t fit on a desktop computer, it is some other community that should deal with it.

One point he raises about software platforms really resonates with me, though, since this is something I have been thinking about a lot recently:

Personally, this has driven me to use R, Python, and PostGIS for almost any kind of work, but I’m wondering if that is a viable solution for everyone. Or are the GISystems he talks about more like classical, GUI-driven GIS that can be used without programming skills?

Cards Against Humanity’s Pulse of the Nation →

For the fifth day of Cards Against Humanity Saves America, we used your money to fund one year of monthly public opinion polls. We’ll ask the American people about their social and political views, what they think of the president, and their pee-pee habits.

In fact, we secretly started polling three months ago. What a delightful surprise!

To conduct our polls in a scientifically rigorous manner, we’ve partnered with Survey Sampling International — a professional research firm — to contact a nationally representative sample of the American public. For the first three polls, we interrupted people’s dinners on both their cell phones and landlines, and a total of about 3,000 adults didn’t hang up immediately. We examined the data for statistically significant correlations, and boy did we find some stuff.

Hilarious. I think I’m going to use their data in class some time. Too bad it doesn’t include respondents’ location.

R for Data Science →

I recently came across O’Reilly’s R for Data Science by Hadley Wickham and Garrett Grolemund. From skimming some of the chapters, it is a very easily digestible intro to R, and it also goes into topics such as cleaning up data (something most books suggest happens automagically). It doesn’t go very deep into the statistical capabilities of R, though.

Anyway, it turns out they actually have the full book online for free at http://r4ds.had.co.nz.

AI for Earth Grant from Microsoft

I have been awarded a grant from Microsoft as part of its AI for Earth program. The grant will be used to develop high-resolution spatialized population projections, which will take population projections from the shared socioeconomic pathways and use a geosimulation approach to distribute the projected populations on a map. The resulting maps can then be used to assess the number of people who will be directly affected by climate change.

AI for Earth is a Microsoft program aimed at empowering people and organizations to solve global environmental challenges by increasing access to AI tools and educational opportunities, while accelerating innovation. Via the Azure for Research AI for Earth award program, Microsoft provides selected researchers and organizations access to its cloud and AI computing resources to accelerate, improve and expand work on climate change, agriculture, biodiversity and/or water challenges.

I am among the first grant recipients of AI for Earth, which launched in July 2017. The grant process was competitive and selective, and the grant was awarded in recognition of the potential of the work and the power of AI to accelerate progress. To date, Microsoft has distributed more than 35 grants to qualifying researchers and organizations around the world. Microsoft just announced its intent to put $50 million over 5 years into the program, making grant-making and educational trainings possible at a much larger scale.

New Paper: A Geoprivacy Manifesto

I have a new paper out in Transactions in GIS, together with Grant McKenzie. A Geoprivacy Manifesto has taken us quite a while to write. The initial idea came up after our workshop on Geoprivacy at ACM SIGSPATIAL 2014 (!), so I’m really glad this one is finally out. Here’s the abstract:

As location-enabled technologies are becoming ubiquitous, our location is being shared with an ever-growing number of external services. Issues revolving around location privacy—or geoprivacy—therefore concern the vast majority of the population, largely without knowing how the underlying technologies work and what can be inferred from an individual’s location (especially if recorded over longer periods of time). Research, on the other hand, has largely treated this topic from isolated standpoints, most prominently from the technological and ethical points of view. This article therefore reflects upon the current state of geoprivacy from a broader perspective. It integrates technological, ethical, legal, and educational aspects and clarifies how they interact and shape how we deal with the corresponding technology, both individually and as a society. It does so in the form of a manifesto, consisting of 21 theses that summarize the main arguments made in the article. These theses argue that location information is different from other kinds of personal information and, in combination, show why geoprivacy (and privacy in general) needs to be protected and should not become a mere illusion. The fictional couple of Jane and Tom is used as a running example to illustrate how common it has become to share our location information, and how it can be used—both for good and for worse.

[DOI:10.1111/tgis.12305 / Preprint PDF]

GeoNotebook →

GeoNotebook is an application that provides client/server environment with interactive visualization and analysis capabilities using Jupyter, GeoJS and other open source tools.

I use Jupyter notebooks all the time when I write Python code, so I definitely need to give GeoNotebook a shot.