Humanitarian GIS and Crowdsourced Geographic Data
As Crawford and Finn (2014) discuss, one of the primary sources of uncertainty in crowdsourced social media data is how representative volunteered geographic information (VGI) is of different populations in different geographic and disaster contexts. This involves two kinds of uncertainty. The first is statistical: analyzing VGI often means working with noisy, heterogeneous data, of which only a small portion is explicitly geotagged (e.g. ~1% in the case of Twitter), and not everyone is able to produce data in the first place because of limited access to the internet or smartphones, language barriers, and similar obstacles. The second is epistemic: individuals’ motivations and intentions for producing this data (e.g. tweeting about a disaster) vary and are often unknown to researchers.
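The statistical side of this problem can be illustrated with a small calculation. The sketch below is only an illustration: the two groups, their sizes, and their access rates are hypothetical numbers invented for this example and do not come from Crawford and Finn (2014). The only figure taken from the discussion above is the rough 1% geotag rate. It also assumes, for simplicity, one post per person who is able to post.

```python
# Hypothetical numbers chosen only for illustration; they are not real data.
GEOTAG_RATE = 0.01  # roughly 1% of posts explicitly geotagged, as noted above

# group name -> (people affected, assumed share of the group able to post)
groups = {
    "urban, connected": (60_000, 0.50),
    "rural, limited access": (40_000, 0.05),
}


def representation(groups, geotag_rate):
    """Return each group's share of the population and of the geotagged posts."""
    total_people = sum(people for people, _ in groups.values())
    # Expected number of geotagged posts per group (one post per poster assumed).
    tagged = {
        name: people * access * geotag_rate
        for name, (people, access) in groups.items()
    }
    total_tagged = sum(tagged.values())
    return {
        name: (groups[name][0] / total_people, tagged[name] / total_tagged)
        for name in groups
    }


shares = representation(groups, GEOTAG_RATE)
for name, (pop_share, sample_share) in shares.items():
    print(f"{name}: {pop_share:.0%} of population, "
          f"{sample_share:.0%} of geotagged posts")
```

With these made-up inputs, the rural group makes up 40% of the affected population but only about 6% of the geotagged sample. Note that a uniform geotag rate shrinks the sample without changing these shares; in this sketch the skew comes entirely from the assumed difference in access.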
These issues of uncertainty raise specific ethical questions. As Crawford and Finn note, researchers and humanitarian groups need to acknowledge and investigate the ways in which social media data and VGI are skewed representations of a population. Addressing this requires two efforts: broadening data collection, and thinking critically about it, to reduce the biases inherent in these data; and critically examining which conclusions can reasonably be drawn from the data and which end uses are ethical. The second effort could complement Crawford and Finn’s concerns about local access to the products of data analysis. Maintaining local control over the access, analysis, and use of VGI could make those products more accessible to the people who produced the raw data. It could also reduce semantic heterogeneity, because the producers, analysts, and users of the data would be less likely to have incompatible cultural backgrounds and understandings of the data. With regard to privacy, efforts to restrict or change the use of data could increase uncertainty in other areas of the analysis. For example, seeking informed consent to use people’s tweets or text messages would likely reduce the usable sample size drastically and further skew the population that the data represents.
References: Crawford, K., and M. Finn. 2014. The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters. GeoJournal 80 (4):491–502. DOI:10.1007/s10708-014-9597-z
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.