Category Archives: Papers

Paper accepted for AGILE 2018: Animation as a Visual Indicator of Positional Uncertainty in Geographic Information

We have a full paper accepted for AGILE 2018, which summarizes the results of our little online experiment about visualizations of uncertainty:

Here’s the preprint, and the abstract:

Effectively communicating the uncertainty that is inherent in any kind of geographic information remains a challenge. This paper investigates the efficacy of animation as a visual variable to represent positional uncertainty in a web mapping context. More specifically, two different kinds of animation (a ‘bouncing’ and a ‘rubberband’ effect) have been compared to two static visual variables (symbol size and transparency), as well as different combinations of those variables in an online experiment with 163 participants. The participants’ task was to identify the most and least uncertain point objects in a series of web maps. The results indicate that the use of animation to represent uncertainty imposes a learning step on the participants, which is reflected in longer response times. However, once the participants got used to the animations, they were both more consistent and slightly faster in solving the tasks, especially when the animation was combined with a second visual variable. According to the test results, animation is also particularly well suited to represent positional uncertainty, as more participants interpreted the animated visualizations correctly, compared to the static visualizations using symbol size and transparency. Somewhat contradictory to those results, the participants showed a clear preference for those static visualizations.

Our GIS is too small

Very nice paper by Mark Gahegan. Here’s the abstract:

GIScience and GISystems have been successful in tackling many geographical problems over the last 30 years. But technologies and associated theory can become limiting if they end up defining how we see the world and what we believe are worthy and tractable research problems. This paper explores some of the limitations currently impacting GISystems and GIScience from the perspective of technology and community, contrasting GIScience with other informatics communities and their practices. It explores several themes: (i) GIScience and the informatics revolution; (ii) the lack of a community-owned innovation platform for GIScience research; (iii) the computational limitations imposed by desktop computing and the inability to scale up analysis; (iv) the continued failure to support the temporal dimension, and especially dynamic processes and models with feedbacks; (v) the challenge of embracing a wider and more heterogeneous view of geographical representation and analysis; and (vi) the urgent need to foster an active software development community to redress some of these shortcomings. A brief discussion then summarizes the issues and suggests that GIScience needs to work harder as a community to become more relevant to the broader geographic field and meet a bigger set of representation, analysis, and modelling needs.

Good food for thought at the beginning of the year, even though I do not agree with all of his points. There is currently a lot of progress being made concerning some of the problems he mentions (such as GeoMesa addressing scalability, or the NSF funding the Geospatial Software Institute, to name just two examples). I also don’t think (or at least I hope not) that it is a prevalent position in our field that if it doesn’t fit on a desktop computer, it is some other community that should deal with it.

One point he raises about software platforms really resonates with me, though, since this is something I have been thinking about a lot recently:

Personally, this has driven me to use R, Python, and PostGIS for almost any kind of work, but I’m wondering whether that is a viable solution for everyone. Or are the GISystems he talks about more like classical, GUI-driven GIS that can be used without programming skills?

New Paper: A Geoprivacy Manifesto

I have a new paper out in Transactions in GIS, together with Grant McKenzie. A Geoprivacy Manifesto has taken us quite a while to write: the initial idea came up after our workshop on Geoprivacy at ACM SIGSPATIAL 2014 (!), so I’m really glad this one is finally out. Here’s the abstract:

As location-enabled technologies are becoming ubiquitous, our location is being shared with an ever-growing number of external services. Issues revolving around location privacy—or geoprivacy—therefore concern the vast majority of the population, largely without knowing how the underlying technologies work and what can be inferred from an individual’s location (especially if recorded over longer periods of time). Research, on the other hand, has largely treated this topic from isolated standpoints, most prominently from the technological and ethical points of view. This article therefore reflects upon the current state of geoprivacy from a broader perspective. It integrates technological, ethical, legal, and educational aspects and clarifies how they interact and shape how we deal with the corresponding technology, both individually and as a society. It does so in the form of a manifesto, consisting of 21 theses that summarize the main arguments made in the article. These theses argue that location information is different from other kinds of personal information and, in combination, show why geoprivacy (and privacy in general) needs to be protected and should not become a mere illusion. The fictional couple of Jane and Tom is used as a running example to illustrate how common it has become to share our location information, and how it can be used—both for good and for worse.

[DOI:10.1111/tgis.12305 / Preprint PDF]

New paper out in Transactions in GIS: Extracting central places from the link structure in Wikipedia

  • Carsten Keßler (2017) Extracting Central Places from the Link Structure in Wikipedia. Transactions in GIS 21(3):488–502.

Abstract: Explicit information about places is captured in an increasing number of geospatial datasets. This article presents evidence that relationships between places can also be captured implicitly. It demonstrates that the hierarchy of central places in Germany is reflected in the link structure of the German language edition of Wikipedia. The official upper and middle centers declared, based on German spatial laws, are used as a reference dataset. The characteristics of the link structure around their Wikipedia pages, which link to each other or mention each other, and how often, are used to develop a bottom-up method for extracting central places from Wikipedia. The method relies solely on the structure and number of links and mentions between the corresponding Wikipedia pages; no spatial information is used in the extraction process. The output of this method shows significant overlap with the official central place structure, especially for the upper centers. The results indicate that real-world relationships are in fact reflected in the link structure on the web in the case of Wikipedia.

The published version is available from the TGIS website, and a preprint PDF is available right here. I’ll also present this at the ESRI User Conference in San Diego next month.

While we’re at it: IJGIS has also published a brief book review online that I wrote about Glen Hart and Catherine Dolbear’s Linked data: a geographic perspective.

See you at AGILE 2017 in Wageningen

I’ll be presenting a short paper at AGILE in Wageningen next week that outlines some of the stuff I’ve been working on with Peter Marcotullio:

The presentation is scheduled for Wednesday at 12:00PM in the SOCIETAL-1 session in room 4.


Hierarchical Prism Trees for Scalable Time Geographic Analysis

We have a full paper accepted for GIScience 2016:

Carson J. Q. Farmer and Carsten Keßler (2016) Hierarchical Prism Trees for Scalable Time Geographic Analysis. Full paper accepted for GIScience 2016, September 27–30, 2016, Montreal, Canada.

Abstract: As location-aware applications and location-based services continue to increase in popularity, data sources describing a range of dynamic processes occurring in near real-time over multiple spatial and temporal scales are becoming the norm. At the same time, existing frameworks useful for understanding these dynamic spatio-temporal data, such as time geography, are unable to scale to the high volume, velocity, and variety of these emerging data sources. In this paper, we introduce a computational framework that turns time geography into a scalable analysis tool that can handle large and rapidly changing datasets. The Hierarchical Prism Tree (HPT) is a dynamic data structure for fast queries on spatio-temporal objects based on time geographic principles and theories, which takes advantage of recent advances in moving object databases and computer graphics. We demonstrate the utility of our proposed HPT using two common time geography tasks (finding similar trajectories and mapping potential space-time interactions), taking advantage of open data on space-time vehicle emissions from the EnviroCar platform.

New Paper in Journal of Web Semantics

Carsten Keßler and Carson J. Q. Farmer (2015) Querying and integrating spatial–temporal information on the Web of Data via time geography. Journal of Web Semantics, in press. DOI: 10.1016/j.websem.2015.09.005

Abstract: The Web of Data is a rapidly growing collection of datasets from a wide range of domains, many of which have spatial–temporal aspects. Hägerstrand’s time geography has proven useful for thinking about and understanding the movements and spatial–temporal constraints of humans. In this paper, we explore time geography as a means of querying and integrating multiple spatial–temporal data sources. We formalize the concept of the space–time prism as an ontology design pattern to use as a framework for understanding and representing constraints and interactions between entities in space and time. We build on a formalization of space–time prisms and apply it in the context of the Web of Data, making it usable across multiple domains and topics. We demonstrate the utility of this approach through two use cases from the domains of environmental monitoring and cultural heritage, showing how space–time prisms enable spatial–temporal and semantic reasoning directly on distributed data sources.
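For readers who have not come across space–time prisms before, here is a minimal sketch of the basic idea in Python. It is not the ontology design pattern from the paper; it only shows the classic reachability test, assuming planar coordinates, straight-line travel, and a single maximum speed. The function name `in_prism` and all parameter names are made up for illustration.

```python
from math import hypot


def in_prism(point, t, origin, t_start, dest, t_end, v_max):
    """Check whether location `point` at time `t` lies inside a space-time prism.

    The prism is anchored at `origin` (at `t_start`) and `dest` (at `t_end`).
    Assumptions: planar (x, y) coordinates, straight-line travel, and a
    constant maximum speed `v_max` in distance units per time unit.
    """
    if not (t_start <= t <= t_end):
        return False
    # The point must be reachable from the origin in the time elapsed so far ...
    dist_from_origin = hypot(point[0] - origin[0], point[1] - origin[1])
    # ... and the destination must still be reachable in the remaining time.
    dist_to_dest = hypot(dest[0] - point[0], dest[1] - point[1])
    return (dist_from_origin <= v_max * (t - t_start)
            and dist_to_dest <= v_max * (t_end - t))


if __name__ == "__main__":
    # Travel from (0, 0) at t=0 to (10, 0) at t=10 with at most 1 unit per time step.
    print(in_prism((5, 0), 5, (0, 0), 0, (10, 0), 10, 1.0))
    # A detour to (5, 3) is not possible at that speed.
    print(in_prism((5, 3), 5, (0, 0), 0, (10, 0), 10, 1.0))
```

With a higher maximum speed the prism widens, so the same detour can become feasible.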

Papers accepted for AGILE and AAAI Spring Symposium

I have two new papers accepted, one for AGILE 2015 in Lisbon, and one for the AAAI Spring Symposium 2015 on Structured Data for Humanitarian Technologies at Stanford. The latter was a collaboration with Tim Clark and Hemant Purohit. Find the preliminary citations below; click the title for a preprint PDF:

  • Carsten Keßler (forthcoming 2015) Central Places in Wikipedia. Accepted for the 18th AGILE Conference on Geographic Information Science: Geographic Information Science as an enabler of smarter cities and communities. June 9–12, 2015. Lisbon, Portugal.

Abstract: Central Place Theory explains the number and locations of cities, towns, and villages based on principles of market areas, transportation, and socio-political interactions between settlements. It assumes a hexagonal segmentation of space, where every central place is surrounded by six lower-order settlements in its range, to which it caters its goods and services. In reality, this ideal hexagonal model is often skewed based on varying population densities, locations of natural features and resources, and other factors. In this paper, we propose an approach that extracts the structure around a central place and its range from the link structure on the Web. Using a corpus of georeferenced documents from the English language edition of Wikipedia, we combine weighted links between places and semantic annotations to compute the convex hull of a central place, marking its range. We compare the results obtained to the structures predicted by Central Place Theory, demonstrating that the Web and its hyperlink structure can indeed be used to infer spatial structures in the real world. We demonstrate our approach for the four largest metropolitan areas in the United States, namely New York City, Los Angeles, Chicago, and Houston.
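To give an idea of the convex hull step mentioned in the abstract, here is a small, self-contained sketch in Python. It uses Andrew's monotone chain algorithm on made-up planar points; the link weighting and semantic annotations from the paper are not reproduced, and treating the coordinates as planar is a simplifying assumption.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull of 2D points in counter-clockwise order.

    Andrew's monotone chain algorithm; collinear points on the hull
    boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


if __name__ == "__main__":
    # Hypothetical locations of places linked to one central place.
    linked_places = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
    print(convex_hull(linked_places))
```

The interior point (1, 1) does not end up on the hull, so the resulting polygon only marks the outer extent of the linked places.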

Abstract: Given the rise of humanitarian crises in the recent years, and adoption of multiple data sharing platforms in offline and online environments, it is increasingly challenging to collect, organize, clean, integrate, and analyze data in the humanitarian domain. On the other side, computer science has built efficient technologies to store, integrate and analyze structured data, however, their role in the humanitarian domain is yet to be shown. We present a case of how structured data technology, specifically Linked Open Data from the Semantic Web area, can be applied for information interoperability in the humanitarian domain. We present the domain-specific challenges, description of the technology adoption via an example of real world adoption of the Humanitarian Exchange Language (HXL) ontology, and describe the lessons from that to build the case of why, how and which components of technologies can be effective for information organization and interoperability in the humanitarian domain.

GeoPrivacy’14 Proceedings

The proceedings of our 1st ACM SIGSPATIAL International Workshop on Privacy in Geographic Information Collection and Analysis have been published online:

We had a great workshop last week, with interesting discussions that showed how pressing the issue of privacy in geographic information is, and that there is still a lot to do. We are already planning some follow-up activities, so stay posted.

New paper out in JMIR Medical Informatics

We have a new paper out in JMIR Medical Informatics, an open access journal that focusses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and ehealth infrastructures:

Binyam Tilahun, Tomi Kauppinen, Carsten Keßler and Fleur Fritz (2014) Design and Development of a Linked Open Data-Based Health Information Representation and Visualization System: Potentials and Preliminary Evaluation. JMIR Medical Informatics 2(2):e31

Here’s the abstract:

Background: Healthcare organizations around the world are challenged by pressures to reduce cost, improve coordination and outcome, and provide more with less. This requires effective planning and evidence-based practice by generating important information from available data. Thus, flexible and user-friendly ways to represent, query, and visualize health data becomes increasingly important. International organizations such as the World Health Organization (WHO) regularly publish vital data on priority health topics that can be utilized for public health policy and health service development. However, the data in most portals is displayed in either Excel or PDF formats, which makes information discovery and reuse difficult. Linked Open Data (LOD)—a new Semantic Web set of best practice of standards to publish and link heterogeneous data—can be applied to the representation and management of public level health data to alleviate such challenges. However, the technologies behind building LOD systems and their effectiveness for health data are yet to be assessed.

Objective: The objective of this study is to evaluate whether Linked Data technologies are potential options for health information representation, visualization, and retrieval systems development and to identify the available tools and methodologies to build Linked Data-based health information systems.

Methods: We used the Resource Description Framework (RDF) for data representation, Fuseki triple store for data storage, and Sgvizler for information visualization. Additionally, we integrated SPARQL query interface for interacting with the data. We primarily use the WHO health observatory dataset to test the system. All the data were represented using RDF and interlinked with other related datasets on the Web of Data using Silk—a link discovery framework for Web of Data. A preliminary usability assessment was conducted following the System Usability Scale (SUS) method.

Results: We developed an LOD-based health information representation, querying, and visualization system by using Linked Data tools. We imported more than 20,000 HIV-related data elements on mortality, prevalence, incidence, and related variables, which are freely available from the WHO global health observatory database. Additionally, we automatically linked 5312 data elements from DBpedia, Bio2RDF, and LinkedCT using the Silk framework. The system users can retrieve and visualize health information according to their interests. For users who are not familiar with SPARQL queries, we integrated a Linked Data search engine interface to search and browse the data. We used the system to represent and store the data, facilitating flexible queries and different kinds of visualizations. The preliminary user evaluation score by public health data managers and users was 82 on the SUS usability measurement scale. The need to write queries in the interface was the main reported difficulty of LOD-based systems to the end user.

Conclusions: The system introduced in this article shows that current LOD technologies are a promising alternative to represent heterogeneous health data in a flexible and reusable manner so that they can serve intelligent queries, and ultimately support decision-making. However, the development of advanced text-based search engines is necessary to increase its usability especially for nontechnical users. Further research with large datasets is recommended in the future to unfold the potential of Linked Data and Semantic Web for future health information systems development.
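As a side note on the SUS score reported in the abstract: the System Usability Scale is a ten-item questionnaire answered on a 1–5 scale, and the standard scoring maps the answers to a value between 0 and 100. The following Python sketch shows that standard scoring for a single respondent. The function name and the example answers are made up; the study's own data is not reproduced here.

```python
def sus_score(responses):
    """Compute the System Usability Scale score for one respondent.

    `responses` holds the ten answers (1 to 5) in questionnaire order.
    Odd-numbered items are positively worded and contribute (answer - 1);
    even-numbered items are negatively worded and contribute (5 - answer).
    The sum of contributions is scaled by 2.5 to the 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("answers must be between 1 and 5")
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


if __name__ == "__main__":
    # Hypothetical answers of a fairly satisfied participant.
    print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 2]))
```

A study-level score is typically the mean of the individual scores across all participants.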