國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,應用決策樹演算法於點膠製程瑕疵之預測與決策法則之建立,Using Decision Tree Algorithm for the Detection and Decision Rule Construction of Defect on Dispensing Process

論文名稱 Title	應用決策樹演算法於點膠製程瑕疵之預測與決策法則之建立 Using Decision Tree Algorithm for the Detection and Decision Rule Construction of Defect on Dispensing Process
系所名稱 Department	資訊管理學系 Department of Information Management
畢業學年期 Year, semester	106 學年度第 2 學期 The spring semester of Academic Year 106	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	63
研究生 Author	許永錚 Yung-Cheng Hsu
指導教授 Advisor	黃三益 San-Yih Hwang
召集委員 Convenor	徐士傑 Shih-Chieh Hsu
口試委員 Advisory Committee	許惠媚 Hui-Mei Hsu
口試日期 Date of Exam	2018-04-16	繳交日期 Date of Submission	2018-05-29
關鍵字 Keywords	資料探勘、羅吉斯迴歸、隨機森林、決策樹、C5.0 C5.0, Decision tree, Random forest, Data mining, Logistic regression
統計 Statistics	本論文已被瀏覽 6238 次，被下載 4 次 The thesis/dissertation has been browsed 6238 times, has been downloaded 4 times.

中文摘要
近年來由於資訊科技發達，大數據的相關分析研究在各個產業領域的應用廣泛，為各個產業帶來不同的新思維在於產業創新應用，IC封裝測試常使用的表面黏著技術(SMT)-點膠製程，因許多非預期的因素影響，容易導致瑕疵產品的發生，且是在作業後，透過產品的檢查、測試才會發現瑕疵的現象。瑕疵除了造成的低產出之外、生產材料的成本消耗以及發生後續要花費人力以及時間進行瑕疵分析，對於製造業來說是相當昂貴的成本支出。本研究欲透過資料探勘技術，建立一瑕疵預測模型進而產生瑕疵的決策法則，作為瑕疵分析的參考依據，透過決策法則的內容作為作業檢查結果判定的參考依據，判斷檢查結果的作業狀態是否適合進行生產，適當的做出決策以避免/減少瑕疵發生的機會。本研究透過移除正常數據以凸顯異常數據，以羅吉斯迴歸檢定異常數據欄位與瑕疵種類的關係顯著性，以隨機森林以及決策樹C5.0建立分類模型，並且將潛在規則發掘出來。研究結果顯示資料集的異常數據建立模型的平均預測正確率98.5%，平均敏感度76.33%，平均特異度99.86%，萃取出的規則與專家已知的經驗值相符。
Abstract
In recent years, advances in information technology have led to the wide application of big data analysis across industries, bringing new thinking to industrial innovation. The dispensing process of surface mount technology (SMT), commonly used in IC packaging and testing, is prone to producing defective products because of many unexpected factors, and the defects are discovered only afterwards, through product inspection and testing. Besides lowering yield, defects consume production materials and require labor and time for follow-up defect analysis, which is a considerable cost for manufacturers. This study uses data mining techniques to build a defect prediction model and derive decision rules for defects. The rules serve as a reference for defect analysis and for judging operational inspection results, that is, for deciding whether the inspected operating status is suitable for production, so that timely decisions can avoid or reduce defects. Normal data are removed to highlight abnormal data; logistic regression is used to test the significance of the relationship between the abnormal data fields and the defect types; and random forest and the C5.0 decision tree are used to build classification models and uncover potential rules. The models built on the abnormal data achieve an average prediction accuracy of 98.5%, an average sensitivity of 76.33%, and an average specificity of 99.86%. The extracted rules are consistent with the experience already known to domain experts.
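The three figures reported in the abstract (accuracy, sensitivity, specificity) are standard confusion-matrix measures. The sketch below shows how they are usually computed for a binary defect/normal classifier. It is not the thesis's evaluation code: the labels `"defect"` and `"normal"`, the helper names, and the toy data are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # defective units predicted as defective
    fn: int  # defective units predicted as normal (missed defects)
    tn: int  # normal units predicted as normal
    fp: int  # normal units predicted as defective (false alarms)


def count_outcomes(actual, predicted, positive="defect"):
    """Tally the four confusion-matrix cells for one positive class."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fn=fn, tn=tn, fp=fp)


def accuracy(c):
    """Share of all units classified correctly."""
    total = c.tp + c.fn + c.tn + c.fp
    if total == 0:
        raise ValueError("no samples")
    return (c.tp + c.tn) / total


def sensitivity(c):
    """Share of truly defective units that the model flags."""
    if c.tp + c.fn == 0:
        raise ValueError("no positive samples")
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Share of truly normal units that the model passes."""
    if c.tn + c.fp == 0:
        raise ValueError("no negative samples")
    return c.tn / (c.tn + c.fp)


# Toy data (hypothetical): 4 defective and 16 normal units.
actual = ["defect"] * 4 + ["normal"] * 16
predicted = (["defect"] * 3 + ["normal"]          # one missed defect
             + ["normal"] * 15 + ["defect"])      # one false alarm

counts = count_outcomes(actual, predicted)
print(f"accuracy={accuracy(counts):.4f}")
print(f"sensitivity={sensitivity(counts):.4f}")
print(f"specificity={specificity(counts):.4f}")
```

When defects are rare, accuracy and specificity are dominated by the normal class, so sensitivity is the measure that shows how many defects are actually caught. This is why the three figures are worth reading together.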

目次 Table of Contents
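Thesis Approval Form
Acknowledgements
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Structure
Chapter 2 Literature Review
2.1 Data Mining
2.3 Data Mining Techniques
2.3.1 Logistic Regression
2.3.2 Decision Trees
2.3.2.1 Decision Tree Splitting Criteria
2.3.3 Ensemble Methods / Random Forest
2.4 Related Research on Applying Decision Trees to Fault Troubleshooting
Chapter 3 Research Method
3.1 Research Framework / Description
3.2 Dispensing Process Description / Defect Definition
3.2.1 Dispensing Process Description
3.2.2 Dispensing Process Defect Definition
3.2.3 Expert Interviews
3.3 Data Preprocessing
3.3.1 Data Description
3.3.2 Data Statistics
3.3.3 Data Preprocessing
3.3.4 Description of Defect Field Features
3.3.5 Extraction of Defect Feature Fields
3.4 Analysis of Important Fields with Logistic Regression
3.4.1 P-Value
3.5 Analysis of Important Fields with the C5.0 Decision Tree
3.6 Analysis of Field Importance with Random Forest
Chapter 4 Experimental Results
4.1 Classification Model Construction and Evaluation Methods
4.2 Model Evaluation Methods
4.3 Prediction Results
4.4 Summary Table of Decision Tree Rules (trial=10)
4.5 Interpretation of Decision Tree Rules
Chapter 5 Conclusion
5.1 Research Conclusions
5.2 Research Limitations
References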


電子全文 Fulltext
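This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
論文使用權限 Thesis access permission：自定論文開放時間 user define
開放時間 Available：校內 Campus：已公開 available；校外 Off-campus：已公開 available
etd-0419118-080930.pdf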
紙本論文 Printed copies
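Availability information for printed theses is relatively complete from academic year 102 onward. To look up the availability of printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 available：已公開 available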

