Title page for etd-0802102-142205



URN etd-0802102-142205
Author Chin-Sheng Yang
Author's Email Address Not public.
Statistics This thesis has been viewed 5331 times and downloaded 7356 times.
Department Information Management
Year 2001
Semester 2
Degree Master
Type of Document
Language English
Title Investigations of Term Expansion on Text Mining Techniques
Date of Defense 2001-07-25
Page Count 64
Keyword
  • Term Association
  • Word Mismatch
  • Text Mining
  • Event Detection
  • Term Expansion
  • Document Clustering
  • Text Categorization
Abstract Recent advances in computer and network technologies have contributed significantly to global connectivity and caused the volume of online textual documents to grow extremely rapidly. The rapid accumulation of textual documents on the Web or within an organization requires effective document management techniques, ranging from information retrieval and information filtering to text mining. The word mismatch problem is a challenging issue for document management research. Word mismatch has been extensively investigated in information retrieval (IR) research through the use of term expansion (or, more specifically, query expansion). However, a review of the text mining literature suggests that the word mismatch problem has seldom been addressed by text mining techniques. Thus, this thesis investigates the use of term expansion in three text mining techniques: text categorization, document clustering, and event detection. Accordingly, we developed term expansion extensions to these three techniques. The empirical evaluation results showed that term expansion increased categorization effectiveness when correlation coefficient feature selection was employed. With respect to document clustering, the techniques extended with term expansion achieved clustering effectiveness comparable to that of existing techniques and were superior on the clustering specificity measure. Finally, the use of term expansion to support event detection degraded detection effectiveness compared with the traditional event detection technique.
    Advisory Committee
  • Fu-ren Lin - chair
  • none - co-chair
  • Chih-Ping Wei - advisor
    Files
  • etd-0802102-142205.pdf
  • Access: in-campus access immediately; off-campus access after one year
    Date of Submission 2002-08-02
