The Racial Dot Map: One Dot Per Person for the Entire U.S.


Very cool map by Dustin Cable at the University of Virginia:

This map is an American snapshot; it provides an accessible visualization of geographic distribution, population density, and racial diversity of the American people in every neighborhood in the entire country. The map displays 308,745,538 dots, one for each person residing in the United States at the location they were counted during the 2010 Census. Each dot is color-coded by the individual’s race and ethnicity. The map is presented in both black and white and full color versions. In the color version, each dot is color-coded by race.

[Via Citylab]

Dumbsmart systems


Here’s the submission system of a computer science (!) journal that thinks it is more accurate to have me manually type in my email address rather than fill it in automatically. Probably because of all the typos introduced by copying and pasting. D’oh.

Measuring continental drift with your phone →

So, you could imagine instead of leaving your phone to do nothing overnight you could instead leave it to record 8 hours of drift data. We’d anonymize it and record drift information just for the nearest 100 mile square or something so we don’t know where your house is. Then we could aggregate that data with other phones across the world and see if we get something that looks accurate out of it.

I have no idea whether that works, but it sure is a damn cool idea. Go to m.opendrift.org on your phone to participate.

Bayes’ Theorem explained with Lego →

What’s a good blog on probability without a post on Bayes’ Theorem? Bayes’ Theorem is one of those mathematical ideas that is simultaneously simple and demanding. Its fundamental aim is to formalize how information about one event can give us understanding of another. Let’s start with the formula and some lego, then see where it takes us.

There should be more explanations of mathematical ideas that involve Lego.
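
For reference (this is just the general statement, not anything specific to the Lego setup in the linked post), Bayes’ Theorem reads:

P(A | B) = P(B | A) · P(A) / P(B)

In words: the probability of A once B has been observed, expressed through the reverse conditional P(B | A) and the base rates P(A) and P(B).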

Developing and testing SPARQL queries with cURL

There are tons of online SPARQL editors out there, but they often lack some specific functionality. The most common shortcoming is that they cannot query an endpoint you may have running on your own machine during development. The SPARQL forms that come with most triple stores, however, are very bare-bones, to say the least. Plus, I don’t really like going back and forth in a web browser when I’m working on a piece of code.

What I do instead is write the query in a text editor with syntax highlighting and then shoot it over to the endpoint via cURL on the command line:

curl -i -H "Accept: text/csv" --data-urlencode query@query.sparql http://example.com/sparql

This will take the file query.sparql, send its contents to http://example.com/sparql (with the query parameter name being query), and show the results as comma-separated values. Obviously, this is no magic; I just keep forgetting the exact parameters, so I thought I might as well document them here.
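
To make the round trip concrete, here is a minimal sketch of what query.sparql could contain. The prefix and the pattern are generic placeholders rather than anything tied to a particular dataset; the query simply lists ten resources with their labels:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# return up to ten resources that have an rdfs:label
SELECT ?s ?label WHERE {
  ?s rdfs:label ?label .
} LIMIT 10

With the Accept: text/csv header, the endpoint answers with a CSV header row named after the selected variables (s,label), followed by one row per result.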

If you are using Sublime Text as your text editor, there is also the Sublime SPARQL Runner package. It does exactly the same thing and opens the results in a new text file right in Sublime. I’ve only tested the package briefly, but it seems to do what it says on the tin.

Papers accepted for AGILE and AAAI Spring Symposium

I have two new papers accepted: one for AGILE 2015 in Lisbon, and one for the AAAI Spring Symposium 2015 on Structured Data for Humanitarian Technologies at Stanford. The latter was a collaboration with Tim Clark and Hemant Purohit. Find the preliminary citations below; click the title for a preprint PDF:

  • Carsten Keßler (forthcoming 2015) Central Places in Wikipedia. Accepted for the 18th AGILE Conference on Geographic Information Science: Geographic Information Science as an Enabler of Smarter Cities and Communities. June 9–12, 2015, Lisbon, Portugal

Abstract: Central Place Theory explains the number and locations of cities, towns, and villages based on principles of market areas, transportation, and socio-political interactions between settlements. It assumes a hexagonal segmentation of space, where every central place is surrounded by six lower-order settlements in its range, to which it caters its goods and services. In reality, this ideal hexagonal model is often skewed based on varying population densities, locations of natural features and resources, and other factors. In this paper, we propose an approach that extracts the structure around a central place and its range from the link structure on the Web. Using a corpus of georeferenced documents from the English language edition of Wikipedia, we combine weighted links between places and semantic annotations to compute the convex hull of a central place, marking its range. We compare the results obtained to the structures predicted by Central Place Theory, demonstrating that the Web and its hyperlink structure can indeed be used to infer spatial structures in the real world. We demonstrate our approach for the four largest metropolitan areas in the United States, namely New York City, Los Angeles, Chicago, and Houston.

Abstract: Given the rise of humanitarian crises in recent years and the adoption of multiple data sharing platforms in offline and online environments, it is increasingly challenging to collect, organize, clean, integrate, and analyze data in the humanitarian domain. On the other hand, computer science has built efficient technologies to store, integrate, and analyze structured data; however, their role in the humanitarian domain is yet to be shown. We present a case of how structured data technology, specifically Linked Open Data from the Semantic Web area, can be applied for information interoperability in the humanitarian domain. We present the domain-specific challenges, a description of the technology adoption via the real-world example of the Humanitarian Exchange Language (HXL) ontology, and the lessons learned from it, building the case for why, how, and which components of these technologies can be effective for information organization and interoperability in the humanitarian domain.