Thesis/Dissertation etd-0803119-022850: Detailed Information



Name: 金淙傑 (Chung-Chieh Chin)  E-mail: not public
Department: 資訊管理學系研究所 (Department of Information Management)
Degree: Master  Graduation term: academic year 107, 2nd semester
Title (Chinese): 文字探勘工作流程設計平台之研究
Title (English): The research on designing text mining workflow platform
Files
  • etd-0803119-022850.pdf
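  • This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
    Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.
    Thesis access permissions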

    Print copy: public after 3 years (released 2022-09-03)

    Electronic copy: user-defined permissions: public on campus after 3 years, off campus after 3 years

    Language/pages: Chinese/55
    Statistics: this thesis has been viewed 5364 times and downloaded 0 times
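    Abstract (translated from Chinese): In recent years, advances in information technology have produced large volumes of textual data such as documents, e-mails, online news, and online forum posts. The desire to uncover the value contained in these data has made text mining one of the most in-demand fields. This study aims to design and implement a text mining workflow platform (TMWP) that can quickly build and execute text analysis workflows and verify their validity and executability. We propose a text mining process model that defines the tasks in the system. We also build an ontology for workflow validation, so that the validity of a workflow can be checked more quickly and simply. Finally, we evaluate the system's performance and accuracy through a user study. The experiment shows that the system received better user feedback than conventional text analysis in the R language.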
    Abstract (English): In recent years, with the advancement of information technology, a huge number of documents, e-mails, online news articles, and social media posts have been produced. Because these textual data contain hidden information worth discovering, text mining has become increasingly important. In this thesis, we aim to design and implement a Text Mining Workflow Platform (TMWP), which can quickly create and execute text analysis workflows and can validate and verify them. We propose a text mining process model that defines the tasks in the system. We also build an ontology for workflow validation, which makes verifying the validity of workflows faster and easier. Finally, we conduct a user study to evaluate our system. The experiment shows that, compared with using the R language for text analysis, our system achieves better performance and receives better user feedback.
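    Keywords (Chinese)
  • 使用者研究 (User Study)
  • 自然語言處理 (Natural Language Processing)
  • 工作流程驗證 (Workflow Validation)
  • 文字探勘 (Text Mining)
  • 服務工程 (Service Engineering)
  • 科學工作流程 (Scientific Workflows)
  • Keywords (English)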
  • Service Engineering
  • Scientific workflows
  • User Study
  • Natural Language Processing
  • Workflow Validation
  • Text Mining
  • Table of contents
    Thesis Approval Form ... i
    Abstract (Chinese) ... ii
    Abstract ... iii
    Table of Contents ... iv
    Table of Figures ... vi
    List of Tables ... vii
    Chapter 1 - Introduction ... 1
    1.1 Background and Motivation ... 1
    1.2 Thesis Organization ... 2
    Chapter 2 - Related Work ... 3
    2.1 Scientific and Data Analytics workflow ... 3
    2.2 Text mining workflow ... 5
    2.3 Text mining process model ... 6
    2.4 Workflow Validation ... 7
    Chapter 3 - Text Mining Process Model Design ... 9
    3.1 A Process Model for Text Mining Workflow ... 9
    3.2 Validation Ontology ... 17
    Chapter 4 - Platform Development ... 23
    4.1 Requirement engineering ... 23
    4.2 The text mining workflow platform ... 25
    4.3 Workflow engine ... 26
    4.4 Execution engine ... 27
    4.5 DataObject Retrieval Engine ... 27
    4.6 Workflow validation Engine ... 28
    Chapter 5 - Platform Evaluation ... 29
    5.1 Subject Description ... 29
    5.2 Experiment Design ... 30
    5.3 Result Evaluation ... 30
    Chapter 6 - Conclusion ... 38
    REFERENCES ... 40
    Appendix 1: Experiment Task Detail ... 43
    Appendix 2: Questionnaire Content ... 46
    Oral defense committee
  • 康藝晃 - Convener
  • 洪澤權 - Committee member
  • 簡士鎰 - Committee member
  • 黃三益 - Advisor
  • Oral defense date: 2019-07-22  Submission date: 2019-09-03


