|| In recent years, the geographic information system (GIS) databases develop quickly and play a significant role in many applications. Many of these applications allow users to find objects with keywords and spatial information at the same time. Most researches in the spatial keyword queries only consider the exact match between the database and query with the textual information. Since users may not know how to spell the exact keyword, they make a query with the approximate-keyword, instead of the exact keyword. Therefore, how to process the approximate-keyword query in the spatial database becomes an important research topic. Alsubaiee et al. have proposed the Location-Based-Approximate-Keyword-tree (LBAK-tree) index structure which is to augment a tree-based spatial index with approximate-string indexes such as a gram-based index. However, the LBAK-tree index structure is the R*-tree based index structure. The nodes of the R*-tree have to be split and be reinserted when they get full. Due to this condition, it can not index the spatial attribute and the textual attribute at the same time. It stores the keywords in the nodes after the R*-tree is already built. Based on the R*-tree, it has to search all the children in a node to insert a new item and answer a query. Moreover, after they find the needed keywords by using the approximate index, they probe the nodes by checking the intersection of the similar keyword sets and the keywords stored in the nodes. However, the higher level the node is, the larger the number of keywords stored in the node is. It takes long time to check the intersections. And the LBAK-tree checks all the intersections even if there exits one of the intersections which is already an empty set. Therefore, in this thesis, we propose the Nine-Area-Approximate-Keyword-tree (NAAK-tree) index structure to process the spatial approximate-keyword query. We do not have to partition the space to construct the spatial index. We do not have to reinsert the children when split the nodes, so we can deal with the keywords at the same time. We can use the spatial number to find out the nodes that satisfy the spatial condition of the query. And we augment the NAAK-tree with signatures to speed up the query of the textual condition. We use the union of the bit strings of each keyword in a node to represent them in the node. Therefore, we can efficiently filter out the nodes that there is no keyword corresponding to the query by checking|
the signatures just one time without checking all the keywords stored in the nodes. Based on our NAAK-tree, if there exits one empty set in the similar keywords sets, we do not check all the similar keywords sets. From our simulation results, we show that the NAAK-tree is more efficient than the LBAK-tree to build the index and answer the spatial approximate-keyword query.