國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,基於深度規則森林的可解釋表徵學習演算法,Interpretable representation learning based on Deep Rule Forests

論文名稱 Title	基於深度規則森林的可解釋表徵學習演算法 Interpretable representation learning based on Deep Rule Forests
系所名稱 Department	資訊管理學系 Department of Information Management
畢業學年期 Year, semester	106 學年度第 2 學期 The spring semester of Academic Year 106	語文別 Language	英文 English
學位類別 Degree	碩士 Master	頁數 Number of pages	59
研究生 Author	郭博文 Bo-Wen Kuo
指導教授 Advisor	康藝晃 Yihuang, Kang
召集委員 Convenor	林耕霈 Keng-Pei Lin
口試委員 Advisory Committee	李珮如 Pei-Ju, Lee
口試日期 Date of Exam	2018-07-20	繳交日期 Date of Submission	2018-08-28
關鍵字 Keywords	規則學習、隨機森林、表徵學習、解釋性、深度規則森林 Rule Learning, Random Forest, Representation Learning, Interpretability, Deep Rule Forest
統計 Statistics	本論文已被瀏覽 6094 次，被下載 68 次 The thesis/dissertation has been browsed 6094 times, has been downloaded 68 times.

中文摘要
以樹為基礎的方法之精神在於學習規則，有很多的機器學習演算法皆以樹為基礎，一些較複雜的樹學習器可能提供更高準確的預測模型，卻也犧牲了模型的解釋性。表徵學習的精神在於從表面的資料中萃取抽象的概念，深度神經網路是最熱門的表徵學習方法，然而，不可究責的表徵學習特性一直以來是深度神經網路的缺點。在本篇論文中，我們提出了一種方法名為深度規則森林，能夠在層疊結構中藉由隨機森林學習區域的表徵，學習到的區域表徵能夠搭配其他機器學習演算法，我們以深度規則森林習得的區域表徵來訓練 CART 決策樹，並發現預測率有時甚至會高於整體學習的方法。
Abstract
The spirit of tree-based methods is to learn rules. A large number of machine learning techniques are tree-based. More complicated tree learners may result in higher predictive models, but may sacrifice for model interpretability. On the other hand, the spirit of representation learning is to extract abstractive concepts from manifestations of the data. For instance, Deep Neural networks (DNNs) is the most popular method in representation learning. However, unaccountable feature representation is the shortcoming of DNNs. In this paper, we proposed an approach, Deep Rule Forest (DRF), to learn region representations based on random forest in the deep layer-wise structures. The learned interpretable rules region representations combine other machine learning algorithms. We trained CART which learned from DRF region representations, and found that the prediction accuracies sometime are better than ensemble learning methods.

目次 Table of Contents
[論文審定書+i] [中文摘要+ii] [Abstract+iii] [Table of content+iv] [1. Introduction+1] [2. Background Review+3] [2.1. Tree-based methods+3] [2.2. Representation learning+9] [2.3. Forward thinking+11] [2.4. Explainable A+14] [2.5. Information processing+15] [3. Building Deep Rule Forest+19] [3.1. Forward forest structur+19] [3.2. Growing stage and pruning stage+22] [3.3. Interpretabilit+24] [3.4. Hyper-parameters+26] [4.Experiment+28] [4.1. Prediction accuracy+30] [4.2. Information bottleneck principle+31] [4.3. Influence of hyper-parameters+34] [4.4. Backtracking rules+38] [5. Conclusion+43] [6. Reference+45] [7. Appendix A+48]

參考文獻 References
Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends® in Machine Learning, 2(1), 1–127. Bengio, Y., Courville, A., & Vincent, P. (2012). Representation Learning: A Review and New Perspectives. ArXiv:1206.5538 [Cs]. Retrieved from http://arxiv.org/abs/1206.5538 Bolton, A., Huang, A., Guez, A., Silver, D., Hassabis, D., Hui, F., … Chen, Y. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354. https://doi.org/10.1038/nature24270 Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140. https://doi.org/10.1007/BF00058655 Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324 Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and Regression Trees. Taylor & Francis. Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). New York, NY, USA: ACM. https://doi.org/10.1145/2939672.2939785 Chen, T., He, T., & Benesty, M. (2015). Xgboost: extreme gradient boosting. R Package Version 0.4-2, 1–4. Cover, T. M., & Thomas, J. A. (2012). Elements of information theory. John Wiley & Sons. Dietterich, T. G. (2002). Ensemble learning. The Handbook of Brain Theory and Neural Networks, 2, 110–125. Freund, Y., Schapire, R., & Abe, N. (1999). A short introduction to boosting. Journal-Japanese Society For Artificial Intelligence, 14(771–780), 1612. Fürnkranz, J., Gamberger, D., & Lavrač, N. (2012). Foundations of rule learning. Springer Science & Business Media. Goodman, B., & Flaxman, S. (2016). European Union regulations on algorithmic decision-making and a" right to explanation". ArXiv Preprint ArXiv:1606.08813. Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554. Hyvärinen, A., Karhunen, J., & Oja, E. (2004). Independent component analysis (Vol. 46). John Wiley & Sons. Kohavi, R., John, G., Long, R., Manley, D., & Pfleger, K. (1994). MLC++: A machine learning library in C++. In Proceedings Sixth International Conference on Tools with Artificial Intelligence. TAI 94 (pp. 740–743). IEEE. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097–1105). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf Kuhn, M., Weston, S., Coulter, N., & Quinlan, R. (2014). C50: C5. 0 decision trees and rule-based models. R Package Version 0.1. 0-21, URL Http://CRAN. R-Project. Org/Package C, 50. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436. Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788. Lending Club Statistics \| LendingClub. (n.d.). Retrieved July 17, 2018, from https://www.lendingclub.com/info/download-data.action Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22. Lichman, M. (2013). UCI Machine Learning Repository. Retrieved November 22, 2017, from https://archive.ics.uci.edu/ml/index.php Lipton, Z. C. (2016). The mythos of model interpretability. ArXiv Preprint ArXiv:1606.03490. Miller, K., Hettinger, C., Humpherys, J., Jarvis, T., & Kartchner, D. (2017). Forward Thinking: Building Deep Random Forests. ArXiv:1705.07366 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1705.07366 Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106. https://doi.org/10.1007/BF00116251 Quinlan, J. Ross. (2014). C4. 5: programs for machine learning. Elsevier. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. ArXiv:1602.04938 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1602.04938 Samek, W., Wiegand, T., & Müller, K.-R. (2017). Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models. ArXiv:1708.08296 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1708.08296 Schapire, R. E. (2013). Explaining AdaBoost. In Empirical Inference (pp. 37–52). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41136-6_5 Su, G., Wei, D., Varshney, K. R., & Malioutov, D. M. (2016). Interpretable Two-level Boolean Rule Learning for Classification. ArXiv:1606.05798 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1606.05798 Therneau, T., Atkinson, B., & Ripley, B. (2015). rpart: Recursive Partitioning and Regression Trees. R package version 4.1–10. Tishby, N., & Zaslavsky, N. (2015). Deep learning and the information bottleneck principle. In Information Theory Workshop (ITW), 2015 IEEE (pp. 1–5). IEEE. Wold, S., Esbensen, K., & Geladi, P. (1987). Principal component analysis. Chemometrics and Intelligent Laboratory Systems, 2(1–3), 37–52. Wright, M. N., & Ziegler, A. (2015). Ranger: a fast implementation of random forests for high dimensional data in C++ and R. ArXiv Preprint ArXiv:1508.04409. Zhou, Z.-H., & Feng, J. (2017). Deep Forest: Towards An Alternative to Deep Neural Networks. ArXiv:1702.08835 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1702.08835

電子全文 Fulltext
本電子全文僅授權使用者為學術研究之目的，進行個人非營利性質之檢索、閱讀、列印。請遵守中華民國著作權法之相關規定，切勿任意重製、散佈、改作、轉貼、播送，以免觸法。論文使用權限 Thesis access permission：自定論文開放時間 user define 開放時間 Available：校內 Campus：已公開 available 校外 Off-campus：已公開 available etd-0727118-134901.pdf
紙本論文 Printed copies
紙本論文的公開資訊在102學年度以後相對較為完整。如果需要查詢101學年度以前的紙本論文公開資訊，請聯繫圖資處紙本論文服務櫃台。如有不便之處敬請見諒。開放時間 available 已公開 available

QR Code

國立中山大學圖書與資訊處 │ 諮詢服務：2452 論文審查小組 │ 服務信箱 │ 系統開發維運：圖資處知識創新組

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2452 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS