Thesis/dissertation etd-0701118-023859: detailed record
Title page for etd-0701118-023859
On the Construction of a Text Mining Workflow System
Keywords: Natural Language Processing, Process Management, Service Engineering, Text Mining, Scientific Workflows
This thesis/dissertation has been viewed 6064 times and downloaded 188 times.
Recent years have seen growing enthusiasm for analyzing textual data, which ranges from documents, emails, web pages, and system logs to user-generated content. This is an analysis domain with a large amount of data, from which people can uncover latent information. Text mining is an enabling technology for value co-creation activities, as it allows for quickly understanding and engaging customers. In this thesis, we report our experience in constructing a text mining workflow system (TMWS), which allows users to create tasks incrementally and prototypically, visually thread text mining tasks into a workflow, validate the workflow, and verify its data. To provide clear semantics and enable validation, we propose a text-mining-specific process model and describe how to implement a system prototype based on it. Finally, we show how a social listening platform can be built quickly using the proposed TMWS.
Table of Contents
Thesis Approval Form
Abstract (Chinese)
ABSTRACT
CHAPTER 1 - INTRODUCTION
CHAPTER 2 - RELATED WORK
2.1 Scientific Workflow
2.2 Data Analytics Workflow
2.3 Workflow for Text Mining
CHAPTER 3 - TEXT MINING PROCESS MODEL DESIGN
3.1 A Process Model for Text Mining
3.2 Text Mining Tasks
CHAPTER 4 - PROTOTYPE DEVELOPMENT
4.1 Architecture of the TM Workflow System Prototype
4.2 TM Workflow Designer
4.3 TM Workflow Engine
4.4 TM Task Executor
4.5 TM Data Verification
4.6 TM Workflow Validation
CHAPTER 5 - A CASE STUDY: THE CONSTRUCTION OF SOCIAL LISTENING SYSTEM
CHAPTER 6 - CONCLUSION
CHAPTER 7 - REFERENCE
Fulltext
Thesis access permission: user-defined release date
Available:
On campus: available
Off campus: available

Printed copies
Available: available
