Thesis access permission: user-defined release date
Available:
Campus: available
Off-campus: available
Title |
構建文字探勘工作流程系統之研究 On the Construction of a Text Mining Workflow System |
||
Department |
|||
Year, semester |
Language |
||
Degree |
Number of pages |
50 |
|
Author |
|||
Advisor |
|||
Convenor |
|||
Advisory Committee |
|||
Date of Exam |
2018-07-23 |
Date of Submission |
2018-08-01 |
Keywords |
Natural Language Processing, Process Management, Service Engineering, Text Mining, Scientific Workflows |
||
Statistics |
This thesis has been browsed 6123 times and downloaded 191 times. |
Abstract |
Recent years have seen growing enthusiasm for analyzing textual data, which ranges from documents, emails, web pages, and system logs to user-generated content. The volume of such data is large, and analyzing it can surface information that would otherwise stay hidden. Text mining is an enabling technology for value co-creation activities because it makes it possible to understand and engage customers quickly. In this thesis, we report our experience in constructing a text mining workflow system (TMWS), which allows users to create tasks incrementally and prototypically, visually thread text mining tasks into a workflow, validate the workflow, and verify its data. To provide clear semantics and enable validation, we propose a process model specific to text mining and describe how to implement a system prototype based on it. Finally, we show how a social listening system can be built quickly using the proposed TMWS. |
Table of Contents |
Thesis Approval Form
Abstract (Chinese)
ABSTRACT
CHAPTER 1 - INTRODUCTION
CHAPTER 2 - RELATED WORK
  2.1 Scientific Workflow
  2.2 Data Analytics Workflow
  2.3 Workflow for Text Mining
CHAPTER 3 - TEXT MINING PROCESS MODEL DESIGN
  3.1 A Process Model for Text Mining
  3.2 Text Mining Tasks
CHAPTER 4 - PROTOTYPE DEVELOPMENT
  4.1 Architecture of the TM Workflow System Prototype
  4.2 TM Workflow Designer
  4.3 TM Workflow Engine
  4.4 TM Task Executor
  4.5 TM Data Verification
  4.6 TM Workflow Validation
CHAPTER 5 - A CASE STUDY: THE CONSTRUCTION OF SOCIAL LISTENING SYSTEM
CHAPTER 6 - CONCLUSION
CHAPTER 7 - REFERENCE |
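Fulltext |
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law. |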
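Printed copies |
Availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk at the Office of Library and Information Services. We apologize for any inconvenience. Available: available |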