Twitter location (sometimes) matters: Exploring the relationship between georeferenced tweet content and nearby feature classes
Keywords: correlation between location and content, mobile microblogging, natural language processing, data mining, Twitter, OpenStreetMap
In this paper, we investigate whether microblogging texts (tweets) produced on mobile devices are related to the geographical locations where they were posted. For this purpose, we correlate tweet topics with areas, using classified points of interest from OpenStreetMap as validation points. We adopted the classification and geolocation of these points and correlated them with tweet content by means of manual classification and of supervised and unsupervised machine learning approaches. The evaluation showed that the manual classification approach yielded the highest quality, followed by the supervised method, whereas the unsupervised classification was of low quality. We found that the degree to which tweet content is related to nearby points of interest depends upon topic (that is, upon the OpenStreetMap category). A more general synthesis with prior research leads to the conclusion that the strength of the relationship between tweets and their geographic origin also depends upon geographic scale (where smaller-scale correlations are more significant than those of larger scale).
Copyright (c) 2014 Stefan Hahmann, Ross Purves, Dirk Burghardt
This work is licensed under a Creative Commons Attribution 4.0 International License.
Articles in JOSIS are licensed under a Creative Commons Attribution 3.0 License.