Queries and Analysis Tasks on Semantically Rich Spatial Data
This dissertation, "Queries and Analysis Tasks on Semantically Rich Spatial Data" by Jieming, Shi, 石杰明, was obtained from The University of Hong Kong (Pokfulam, Hong Kong) and is being sold pursuant to Creative Commons: Attribution 3.0 Hong Kong License. The content of this dissertation has not been altered in any way. We have altered the formatting to make the dissertation easier to print and read. All rights not granted by the above license are retained by the author.

Abstract: Semantically rich spatial data are big and ubiquitous, raising challenges with respect to their effective and efficient querying and analysis. In particular, traditional spatial analysis and querying methods are not readily applicable due to the increased data complexity. Toward addressing these challenges and supporting real-life applications that manage such data, this thesis proposes and studies three problems on the querying and analysis of (i) geo-social network data, (ii) spatio-textual data, and (iii) spatial RDF data.

First, we study the problem of Density-based Clustering of Places in Geo-Social networks (DCPGS). Current spatial clustering models disregard information about the people who are related to the clustered places. We extend the density-based clustering paradigm to places in geo-social networks, considering both the spatial information between places and the social relationships between the users who visit them. After formally defining our model and the distance measure it relies on, we present efficient index-based algorithms for its implementation. We evaluate the effectiveness of our model via a case study and two quantitative measures, called social entropy and community score, which indicate that geo-social clusters have special properties and cannot be found by applying simple spatial clustering approaches. The efficiency of our algorithms is also evaluated experimentally.
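To make the idea of geo-social clustering concrete, the following is a minimal Python sketch, not the dissertation's algorithm. It assumes a hypothetical distance that mixes a normalized Euclidean distance with a simple social distance (the share of visitors with no matching visitor or friend at the other place), and it runs a plain, unindexed DBSCAN-style procedure over that distance. The function names, the weight `w`, the cutoff `max_d`, and the toy data are all assumptions made for illustration.

```python
from math import hypot, inf

def social_distance(a, b, visitors, friends):
    """Assumed measure: 1 minus the fraction of visitors of a or b who are
    'linked' to the other place (visited it, or have a friend who did)."""
    ua, ub = visitors[a], visitors[b]
    union = ua | ub
    if not union:
        return 1.0

    def linked(user, others):
        return user in others or bool(friends.get(user, set()) & others)

    contributing = {u for u in ua if linked(u, ub)} | {u for u in ub if linked(u, ua)}
    return 1.0 - len(contributing) / len(union)

def geo_social_distance(a, b, coords, visitors, friends, max_d, w=0.5):
    """Assumed combination: weighted sum of social and normalized spatial distance.
    Places farther apart than max_d are treated as infinitely distant."""
    (ax, ay), (bx, by) = coords[a], coords[b]
    spatial = hypot(ax - bx, ay - by)
    if spatial > max_d:
        return inf
    return w * social_distance(a, b, visitors, friends) + (1 - w) * spatial / max_d

def dbscan(items, dist, eps, min_pts):
    """Plain DBSCAN with a brute-force neighborhood scan (no index).
    Returns a dict mapping each item to a cluster id, or -1 for noise."""
    NOISE = -1
    labels = {}
    cluster = 0

    def neighborhood(p):
        return [q for q in items if q == p or dist(p, q) <= eps]

    for p in items:
        if p in labels:
            continue
        neighbors = neighborhood(p)
        if len(neighbors) < min_pts:
            labels[p] = NOISE
            continue
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels.get(q) == NOISE:
                labels[q] = cluster      # noise reachable from a core point becomes a border point
            if q in labels:
                continue
            labels[q] = cluster
            q_neighbors = neighborhood(q)
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)  # q is a core point, so keep expanding
        cluster += 1
    return labels

# Toy data (hypothetical): three nearby places; only A and B share socially connected visitors.
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0)}
visitors = {"A": {"u1", "u2"}, "B": {"u2", "u3"}, "C": {"u9"}}
friends = {"u1": {"u3"}, "u3": {"u1"}}

def make_dist(w):
    return lambda a, b: geo_social_distance(a, b, coords, visitors, friends, max_d=10.0, w=w)

labels = dbscan(list(coords), make_dist(0.5), eps=0.3, min_pts=2)
spatial_only = dbscan(list(coords), make_dist(0.0), eps=0.3, min_pts=2)
print(labels)        # geo-social: A and B cluster together, C is noise
print(spatial_only)  # purely spatial: all three places fall into one cluster
```

In this toy setting, setting `w` to 0 reduces the distance to its spatial part and merges all three places, while a positive `w` separates the place whose visitors have no social ties to the others. That contrast is only meant to illustrate the abstract's claim that purely spatial clustering can miss geo-social structure.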
Next, we study the modeling and evaluation of a Spatio-Textual Skyline (STS) query, in which the skyline points are selected based not only on their distances to a set of query locations, but also on their relevance to a set of query keywords. STS is especially relevant to modern applications, where points of interest are typically augmented with textual descriptions. We investigate three models for integrating textual relevance into the spatial skyline. Among them, the STD model, which combines spatial distance with textual relevance in a derived dimensional space, is the most effective. STD computes a skyline that satisfies the intent of STS and has a small, easy-to-interpret size. We propose an IR-tree based algorithm for computing STD-based skylines. The effectiveness of our STD model and the efficiency of the algorithm are evaluated experimentally.

Finally, we propose the problem of top-k relevant Semantic Place retrieval (kSP) on spatial RDF data, which finds applications in domains such as journalism, health, business, and tourism. Traditionally, RDF data is accessed through structured query languages, e.g., SPARQL. This requires users to understand both the language and the RDF schema. Recent research on keyword search over RDF data aims at reducing such requirements, but still ignores the spatial dimension of RDF data. Our kSP query seeks RDF subgraphs that are rooted at spatial entities close to the query location and that contain a set of query keywords. Compared to existing work, kSP queries are independent of structured query languages and are location-aware. We devise a basic method for processing kSP queries. Two pruning approaches and a preprocessing technique are proposed to further improve efficiency. Experiments on real datasets demonstrate the superior and robust performance of our proposals compared to the basic method.

Subjects: Spatial analysis (Statistics)