With advances in information technology, engineering consulting firms have gradually digitized their documents and line-drawing images. The resulting digital libraries greatly facilitate document retrieval. However, engineers still face a challenging issue: searching for and retrieving line-drawing images in a digital library. When a library holds only a small number of line-drawing images, engineers can browse thumbnails to locate relevant ones. As the number of images grows, this manual browsing becomes time-consuming and frustrating.
In response to the need for efficient and effective retrieval of line-drawing images, this thesis aims to develop a line-drawing image retrieval system. Typically, a line-drawing image within an engineering document is accompanied by surrounding text that describes or illustrates it. Such surrounding text provides important information for automatically indexing the line-drawing image. With the extracted index terms (or keywords), line-drawing images can be retrieved using a traditional information retrieval technique.
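The indexing idea described above can be illustrated with a minimal sketch. It assumes a simple inverted index and term-overlap ranking, which may differ from the technique actually used in the thesis; the stopword list and the names `extract_keywords`, `build_index`, and `search` are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative stopword list, not the one used in the thesis.
STOPWORDS = {"the", "a", "an", "of", "in", "for", "and", "is", "to", "with"}


def extract_keywords(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(surrounding_text_by_image):
    """Map each keyword to the set of image ids whose surrounding text has it."""
    index = defaultdict(set)
    for image_id, text in surrounding_text_by_image.items():
        for keyword in extract_keywords(text):
            index[keyword].add(image_id)
    return index


def search(index, query):
    """Rank image ids by the number of distinct query keywords they match."""
    scores = defaultdict(int)
    for keyword in set(extract_keywords(query)):
        for image_id in index.get(keyword, ()):
            scores[image_id] += 1
    # Highest score first; ties broken by image id for a stable order.
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))


# Hypothetical surrounding text for two line-drawing images.
docs = {
    "fig1": "Cross section of the bridge pier foundation",
    "fig2": "Layout of the drainage pipe network",
}
index = build_index(docs)
results = search(index, "bridge pier")
```

In this sketch an image is never analyzed visually; it is retrieved only through the words that surround it in the document.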
Specifically, in this study, we propose a line-drawing image retrieval system based on surrounding text. We develop four models for defining the surrounding text boundaries of line-drawing images. Furthermore, two information retrieval techniques (one with and one without query expansion) are implemented and evaluated. According to our empirical evaluations, the surrounding text boundary model that combines the image caption with three sentences (the preceding, image-anchoring, and succeeding sentences) yields the best retrieval effectiveness, as measured by recall and precision rates.
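As a rough sketch of the best-performing boundary model named above, the following code assembles the caption with the preceding, image-anchoring, and succeeding sentences, and computes precision and recall for a result list. It rests on several assumptions not stated in the abstract: the sentence splitter is a naive one, the index of the anchoring sentence is supplied by the caller, and the function names are hypothetical. The other three boundary models are not described here and are not shown.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def surrounding_text(sentences, anchor_idx, caption):
    """Caption plus the preceding, anchoring, and succeeding sentences.

    anchor_idx is the position of the image-anchoring sentence; how that
    sentence is located is an assumption left to the caller.
    """
    start = max(0, anchor_idx - 1)
    end = min(len(sentences), anchor_idx + 2)
    return " ".join([caption] + sentences[start:end])


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document text and caption.
body = (
    "The pier rests on piles. Figure 1 shows the pier section. "
    "Loads are listed in Table 2. Costs follow."
)
sentences = split_sentences(body)
window = surrounding_text(sentences, 1, "Figure 1: Pier section")
precision, recall = precision_recall(["a", "b"], ["a", "c", "d"])
```

Widening or narrowing the window in `surrounding_text` is the kind of variation that separate boundary models would capture, with recall and precision used to compare them.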