Temporally relevant parallel top-k spatial keyword search

Authors

  • Suprio Ray, Faculty of Computer Science, University of New Brunswick, Fredericton, Canada
  • Bradford Nickerson, Faculty of Computer Science, University of New Brunswick, Fredericton, Canada

DOI:

https://doi.org/10.5311/JOSIS.2022.24.199

Keywords:

spatial keyword search, spatio-textual, I/O-efficient indexing, top-k, temporal relevance

Abstract

New spatio-textual indexing methods are needed to support efficient search and update of the massive amounts of spatially referenced text being generated. Location-based services using geo-tagged documents provide valuable ranked recommendations about nearby restaurants, services, sales, emergency events, and visitor attractions. Consequently, top-k spatial keyword search queries (TkSKQ) have received a lot of attention from the research community. Several spatio-textual indexes have been proposed to efficiently support TkSKQ. Some of these indexes support updates based on live document streams, but the ranking schemes they employ do not simultaneously incorporate temporal relevance, textual similarity, and spatial proximity. Moreover, existing approaches have limited or no capability to exploit parallelism in document ingestion and query execution. We present a parallel spatio-textual index, Pastri, to address these issues. Pastri can be updated incrementally over real-time spatio-textual document streams. To support temporally relevant ranking of continuously generated document streams, we propose a dynamic ranking scheme. Our approach retrieves the top-k documents that are most temporally relevant at the time of query execution. We implemented Pastri and integrated it within a system with a persistent document store and several thread pools to exploit parallelism at various levels. Experimental evaluation involving real-world datasets and synthetic datasets (that we created) demonstrates that our system is able to sustain high document update throughput. Furthermore, Pastri's TkSKQ search is one to two orders of magnitude faster than that of other spatio-textual indexes.
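
To make the idea of a ranking that combines temporal relevance, textual similarity, and spatial proximity concrete, the following is a minimal illustrative sketch. It is not Pastri's ranking scheme or index, which the abstract does not specify. The `Doc` type, the equal weights, the Jaccard text score, the linear distance falloff, and the exponential half-life decay are all assumptions chosen for illustration, and the brute-force scan stands in for the index.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical geo-tagged document (not a type from the paper)."""
    doc_id: str
    x: float
    y: float
    terms: frozenset
    timestamp: float  # creation time in seconds


def score(doc, qx, qy, q_terms, now,
          max_dist=10.0, half_life=3600.0, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine three assumed components, each in [0, 1], into one score."""
    # Spatial proximity: linear falloff to 0 at max_dist (assumption).
    dist = math.hypot(doc.x - qx, doc.y - qy)
    spatial = max(0.0, 1.0 - dist / max_dist)
    # Textual similarity: Jaccard overlap of term sets (assumption).
    union = doc.terms | q_terms
    textual = len(doc.terms & q_terms) / len(union) if union else 0.0
    # Temporal relevance: halves every half_life seconds of age (assumption).
    age = max(0.0, now - doc.timestamp)
    temporal = 0.5 ** (age / half_life)
    w_s, w_t, w_r = weights
    return w_s * spatial + w_t * textual + w_r * temporal


def top_k(docs, qx, qy, q_terms, now, k):
    """Brute-force top-k over documents sharing at least one query term."""
    q_terms = frozenset(q_terms)
    candidates = (d for d in docs if d.terms & q_terms)
    return heapq.nlargest(
        k, candidates, key=lambda d: score(d, qx, qy, q_terms, now))


docs = [
    Doc("a", 0.0, 0.0, frozenset({"pizza"}), 0.0),
    Doc("b", 0.0, 0.0, frozenset({"pizza"}), 3600.0),
    Doc("c", 0.0, 0.0, frozenset({"sushi"}), 3600.0),
]
result = [d.doc_id for d in top_k(docs, 0.0, 0.0, {"pizza"}, now=3600.0, k=2)]
```

Because the score depends on `now`, the same documents can rank differently at different query times, which is the property the abstract calls temporal relevance at query execution. In this sketch the newer document "b" ranks ahead of the otherwise identical older document "a".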


Published

2022-06-20

Section

Research Articles