國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,基於輪廓分析識別內部威脅,Identifying Insider Threats Based on Profiling

論文名稱 Title	基於輪廓分析識別內部威脅 Identifying Insider Threats Based on Profiling
系所名稱 Department	資訊管理學系 Department of Information Management
畢業學年期 Year, semester	108 學年度第 1 學期 The fall semester of Academic Year 108	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	94
研究生 Author	黃脩閔 Hsiu-Min Huang
指導教授 Advisor	陳嘉玫 Chia-Mei Chen
召集委員 Convenor	鄭伯炤 Bo-Chao Cheng
口試委員 Advisory Committee	江明朝, 賴谷鑫, 康藝晃 Ming-Chao Chiang; Gu Hsin Lai; Yi-Huang Kang
口試日期 Date of Exam	2020-01-13	繳交日期 Date of Submission	2020-02-07
關鍵字 Keywords	偏差行為、內部威脅、輪廓分析、機器學習、文字探勘 Deviant behavior, Insider Threat, Profiling, Machine Learning, Text Mining

中文摘要
近年來網際網路的快速普及，許多企業組織已經將文件檔案電子化與業務轉到網際網路上來進行。過去企業組織與人們普遍認為駭客對網路世代的資訊安全來說是最大的威脅，但近幾年來內部威脅的實際案例發生的頻率已經越來越多，對於內部人員帶來的威脅也應當有同樣的重視。比起駭客等外部因素，內部威脅這種由內部人員（組織員工、契約關係的廠商等）進行的惡意行為，由於過去企業組織認為與員工跟合作廠商之間是相互信任的關係，因此在應對內部產生威脅的能力會相對較弱，在現今社會惡意攻擊類型的改變趨勢中就會陷於劣勢。本研究詳細整理關於內部威脅與輪廓分析的相關文獻，對內部威脅與輪廓分析特質與元素進行研究。導入輪廓分析（Profiling）對使用者進行建立行為輪廓（Profile）進行資料處理，並使用TFIDF來處理具有文字內容的資料增加行為輪廓特徵，使用單類支持向量機（One-Class Support Vector Machine，OCSVM）、隨機森林（Random Forest）、多重感知器（Multilayer perceptron，MLP）等機器學習的方法對資料進行分類訓練，以此偵測使用者偏離習慣的行為辨別內部威脅。
Abstract
In recent years, the rapid spread of the Internet has led many organizations to digitize their documents and move their business operations online. Organizations and the public have long regarded hackers as the greatest threat to information security in the Internet age. However, real-world insider threat incidents have become increasingly frequent, and threats from internal personnel deserve equal attention. An insider threat is malicious behavior carried out by insiders such as employees or contracted vendors. Because organizations have traditionally assumed a relationship of mutual trust with their employees and partner vendors, they are comparatively weak at responding to threats that arise internally, which leaves them at a disadvantage as the types of malicious attacks change. This study reviews the literature on insider threats and profiling and examines the characteristics and elements of both. We apply profiling to build a behavior profile for each user during data processing, and use TF-IDF to process textual data and add features to the profile. We then train One-Class Support Vector Machine (OCSVM), Random Forest, and Multilayer Perceptron (MLP) classifiers on the data to detect behavior that deviates from a user's habits and thereby identify insider threats.
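The two feature-building ideas named in the abstract, a per-user behavior profile and TF-IDF weighting of text, can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the feature names, the sample data, the z-score form of "deviation from habit", and the `min_std` floor are all assumptions made here, and the classifiers themselves (OCSVM, Random Forest, MLP) are not shown.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def build_profile(history):
    """Per-feature (mean, std) over one user's past daily feature vectors."""
    columns = list(zip(*history))
    return [(mean(col), pstdev(col)) for col in columns]


def deviation(profile, day, min_std=1.0):
    """Express one day as z-scores relative to the user's own profile.

    min_std is an assumed floor so that a feature that never varied
    (for example, zero USB connections every day) still yields a score.
    """
    return [
        (value - mu) / max(sigma, min_std)
        for value, (mu, sigma) in zip(day, profile)
    ]


def tfidf(documents):
    """Plain TF-IDF: tf = count / doc length, idf = log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(tokenized)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical daily features for one user: [logons, usb_connects, emails_sent]
history = [[2, 0, 10], [3, 0, 12], [2, 0, 11], [3, 0, 11]]
profile = build_profile(history)
today = [2, 4, 30]
z = deviation(profile, today)

# Hypothetical email texts; terms shared by every message get weight 0.
emails = [
    "quarterly report attached",
    "report draft attached",
    "resume attached for job application",
]
weights = tfidf(emails)
```

In a pipeline like the one the abstract describes, the deviation scores and the TF-IDF weights would be concatenated into one feature vector per user per day and passed to a classifier. Profiling each user against their own history means the same raw count can be normal for one user and anomalous for another.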

目次 Table of Contents
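Thesis Approval Form
Thesis Authorization Form
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
Chapter 2 Literature Review
2.1 Insider Threats
2.2 Profiling
2.3 One-Class Support Vector Machine
2.4 Random Forest
2.5 Multilayer Perceptron
Chapter 3 Research Method
3.1 Data Preprocessing
3.2 Profile Analysis and Construction
3.3 Behavior Profile Data Processing
3.3.1 Numerical Data Processing
3.3.2 Text Data Processing
3.4 Detection Models
3.4.1 OCSVM Detection Model
3.4.2 Random Forest
3.4.3 Multilayer Perceptron Detection Model
3.4.4 SMOTE Sample Synthesis
3.5.5 Sliding Window
Chapter 4 System Evaluation
4.1 Experiment 1: Profile Features and Detection Model Parameters
4.1.1 Experiment 1-1: Detection Model Performance
4.1.2 Experiment 1-2: Balancing Data with SMOTE
4.1.3 Experiment 1-3: Comparison of Feature Sets A and B
4.1.4 Experiment 1-4: Performance With and Without Profile
4.1.5 Experiment 1-5: Performance of Text Mining Features
4.1.6 Experiment 1-6: Evaluation of the Profile Feature and Detection Model Parameter Experiments
4.2 Experiment 2: Comparison with Prior Literature
4.3 Experiment 3: Detection Model Performance Evaluation
4.3.1 Experiment 3-1: Time Series (Sliding Window)
4.3.2 Experiment 3-2: Per-User Evaluation
Chapter 5 Conclusion and Future Work
References
Appendix 1

List of Figures
Figure 2-1. Flowchart of motive and execution for crimes led by a lone individual [8]
Figure 2-2. Flowchart of motive and execution for crimes with a ringleader and accomplices [8]
Figure 2-3. OCSVM diagram [45, 50]
Figure 2-4. Random Forest diagram [4]
Figure 2-5. Multilayer Perceptron diagram [23]
Figure 3-1. System architecture
Figure 3-2. Text data processing flowchart
Figure 3-3. Sliding window mechanism [9]

List of Tables
Table 2-1. Summary of insider threat cases
Table 2-2. Summary of insider threat techniques
Table 2-3. Summary of methods and systems in the insider threat literature
Table 3-1. Description of feature data
Table 4-1. Overview of experiments
Table 4-2. Sub-experiments of Experiment 1
Table 4-3. CMU-CERT dataset information
Table 4-4. CMU-CERT dataset scenarios
Table 4-5. Dataset content files
Table 4-6. OCSVM detection performance on normal data: Feature Set A + Profile
Table 4-7. OCSVM detection performance on anomalous data: Feature Set A + Profile
Table 4-9. Random Forest detection performance on anomalous data: Feature Set A + Profile
Table 4-10. Multilayer Perceptron detection performance on anomalous data: Feature Set A + Profile
Table 4-11. Random Forest and Multilayer Perceptron on balanced data: per data record
Table 4-12. Random Forest and Multilayer Perceptron on balanced data: Scenario 1
Table 4-13. Random Forest and Multilayer Perceptron on balanced data: Scenario 2
Table 4-14. Random Forest and Multilayer Perceptron on balanced data: Scenario 3
Table 4-15. OCSVM: Feature Set A + Profile vs. Feature Set B + Profile
Table 4-16. Random Forest and Multilayer Perceptron, Feature Set A + Profile vs. Feature Set B + Profile: per data record
Table 4-17. Random Forest and Multilayer Perceptron, Feature Set A + Profile vs. Feature Set B + Profile: Scenario 1
Table 4-18. Random Forest and Multilayer Perceptron, Feature Set A + Profile vs. Feature Set B + Profile: Scenario 2
Table 4-19. Random Forest and Multilayer Perceptron, Feature Set A + Profile vs. Feature Set B + Profile: Scenario 3
Table 4-20. OCSVM: Feature Set B vs. Feature Set B + Profile
Table 4-21. Random Forest and Multilayer Perceptron, Feature Set B vs. Feature Set B + Profile: per data record
Table 4-22. Random Forest and Multilayer Perceptron, Feature Set B vs. Feature Set B + Profile: Scenario 1
Table 4-23. Random Forest and Multilayer Perceptron, Feature Set B vs. Feature Set B + Profile: Scenario 2
Table 4-24. Random Forest and Multilayer Perceptron, Feature Set B vs. Feature Set B + Profile: Scenario 3
Table 4-25. OCSVM: Feature Set B + Profile vs. Feature Set C + Profile
Table 4-26. Random Forest and Multilayer Perceptron, Feature Set B + Profile vs. Feature Set C + Profile: per data record
Table 4-27. Random Forest and Multilayer Perceptron, Feature Set B + Profile vs. Feature Set C + Profile: Scenario 1
Table 4-28. Random Forest and Multilayer Perceptron, Feature Set B + Profile vs. Feature Set C + Profile: Scenario 2
Table 4-29. Random Forest and Multilayer Perceptron, Feature Set B + Profile vs. Feature Set C + Profile: Scenario 3
Table 4-30. OCSVM evaluation of profile features and detection model parameters: per data record
Table 4-31. OCSVM evaluation of profile features and detection model parameters: Scenario 1
Table 4-32. OCSVM evaluation of profile features and detection model parameters: Scenario 2
Table 4-33. OCSVM evaluation of profile features and detection model parameters: Scenario 3
Table 4-34. Random Forest evaluation of profile features and detection model parameters: per data record
Table 4-35. Random Forest evaluation of profile features and detection model parameters: Scenario 1
Table 4-36. Random Forest evaluation of profile features and detection model parameters: Scenario 2
Table 4-37. Random Forest evaluation of profile features and detection model parameters: Scenario 3
Table 4-38. Multilayer Perceptron evaluation of profile features and detection model parameters: per data record
Table 4-39. Multilayer Perceptron evaluation of profile features and detection model parameters: Scenario 1
Table 4-40. Multilayer Perceptron evaluation of profile features and detection model parameters: Scenario 2
Table 4-41. Multilayer Perceptron evaluation of profile features and detection model parameters: Scenario 3
Table 4-42. Comparison of this study with the literature: Scenario 1
Table 4-43. Comparison of this study with the literature: Scenario 2
Table 4-44. Comparison of this study with the literature: Scenario 3
Table 4-45. Window sizes and experiment threshold parameters
Table 4-46. Random Forest, per data record + Sliding Window: summary of parameter experiment results
Table 4-47. Random Forest, Scenario 1 (S1) + Sliding Window: summary of parameter experiment results
Table 4-48. Random Forest, Scenario 2 (S2) + Sliding Window: summary of parameter experiment results
Table 4-49. Random Forest, Scenario 3 (S3) + Sliding Window: summary of parameter experiment results
Table 4-50. Per-user results on all data
Table 4-51. Per-user results for Scenario 1 (S1)
Table 4-52. Per-user results for Scenario 2 (S2)
Table 4-53. Per-user results for Scenario 3 (S3)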

參考文獻 References
[1] "Insider Threat Test Dataset." Carnegie Mellon University. https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=508099 (accessed Nov 16, 2018).
[2] "企業內部威脅案例：被炒魷魚的網絡工程師，造成花旗銀行北美90%網絡癱瘓." Freebuf. https://read01.com/zh-tw/ke7eaQ.html#.W-p3-JMks2y (accessed Nov 13, 2018).
[3] M. Ahmed, A. N. Mahmood, and J. Hu, "A survey of network anomaly detection techniques," Journal of Network and Computer Applications, vol. 60, pp. 19-31, 2016.
[4] August. "整合模型之隨機森林(二)." https://zhuanlan.zhihu.com/p/38484624 (accessed 2020-01-18).
[5] A. Azaria, A. Richardson, S. Kraus, and V. Subrahmanian, "Behavioral analysis of insider threat: A survey and bootstrapped prediction in imbalanced data," IEEE Transactions on Computational Social Systems, vol. 1, no. 2, pp. 135-155, 2014.
[6] O. Brdiczka, J. Liu, B. Price, J. Shen, A. Patil, R. Chow, E. Bart, and N. Ducheneaut, "Proactive insider threat detection through graph learning and psychological context," in 2012 IEEE Symposium on Security and Privacy Workshops, 2012: IEEE, pp. 142-149.
[7] N. Cao, C. Shi, S. Lin, J. Lu, Y.-R. Lin, and C.-Y. Lin, "Targetvue: Visual analysis of anomalous user behaviors in online communication systems," IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 280-289, 2015.
[8] D. M. Cappelli, A. P. Moore, and R. F. Trzeciak, The CERT guide to insider threats: how to prevent, detect, and respond to information technology crimes (Theft, Sabotage, Fraud). Addison-Wesley, 2012.
[9] P. Chattopadhyay, L. Wang, and Y.-P. Tan, "Scenario-based insider threat detection from cyber activities," IEEE Transactions on Computational Social Systems, no. 99, pp. 1-16, 2018.
[10] T. Cruz, L. Rosa, J. Proença, L. Maglaras, M. Aubigny, L. Lev, J. Jiang, and P. Simoes, "A cybersecurity detection framework for supervisory control and data acquisition systems," IEEE Transactions on Industrial Informatics, vol. 12, no. 6, pp. 2236-2246, 2016.
[11] M. Dahmane and S. Foucher, "Combating Insider Threats by User Profiling from Activity Logging Data," in 2018 1st International Conference on Data Intelligence and Security (ICDIS), 2018: IEEE, pp. 194-199.
[12] Y. Dong, B. Du, and L. Zhang, "Target detection based on random forest metric learning," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 4, pp. 1830-1838, 2015.
[13] J. Esmaily, R. Moradinezhad, and J. Ghasemi, "Intrusion detection system based on multi-layer perceptron neural networks and decision tree," in 2015 7th Conference on Information and Knowledge Technology (IKT), 2015: IEEE, pp. 1-5.
[14] N. Farnaaz and M. Jabbar, "Random forest modeling for network intrusion detection system," Procedia Computer Science, vol. 89, pp. 213-217, 2016.
[15] Z. Ferdousi and A. Maeda, "Anomaly Detection Using Unsupervised Profiling Method in Time Series Data," in ADBIS Research Communications, 2006.
[16] G. Fernandes Jr, J. J. Rodrigues, and M. L. Proenca Jr, "Autonomous profile-based anomaly detection system using principal component analysis and flow analysis," Applied Soft Computing, vol. 34, pp. 513-525, 2015.
[17] G. Gavai, K. Sricharan, D. Gunning, R. Rolleston, J. Hanley, and M. Singhal, "Detecting insider threat from enterprise social and online activity data," in Proceedings of the 7th ACM CCS International Workshop on Managing Insider Security Threats, 2015: ACM, pp. 13-20.
[18] M. Govindarajan and R. Chandrasekaran, "Intrusion detection using neural based hybrid classification methods," Computer Networks, vol. 55, no. 8, pp. 1662-1671, 2011.
[19] D. Haidar and M. M. Gaber, "Adaptive One-Class Ensemble-based Anomaly Detection: An Application to Insider Threats," in 2018 International Joint Conference on Neural Networks (IJCNN), 2018: IEEE, pp. 1-9.
[20] D. Haidar and M. M. Gaber, "Data Stream Clustering for Real-Time Anomaly Detection: An Application to Insider Threats," in Clustering Methods for Big Data Analytics: Springer, 2019, pp. 115-144.
[21] D. Haidar and M. M. Gaber, "Outlier detection in random subspaces over data streams: an approach for insider threat detection," Expert Update, vol. 17, no. 1, pp. 1-16, 2017.
[22] M. A. M. Hasan, M. Nasser, B. Pal, and S. Ahmad, "Support vector machine and random forest modeling for intrusion detection system (IDS)," Journal of Intelligent Learning Systems and Applications, vol. 6, no. 01, p. 45, 2014.
[23] T. Huang. "機器學習- 神經網路(多層感知機 Multilayer perceptron, MLP)運作方式." https://medium.com/@chih.sheng.huang821/%E6%A9%9F%E5%99%A8%E5%AD%B8%E7%BF%92-%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF-%E5%A4%9A%E5%B1%A4%E6%84%9F%E7%9F%A5%E6%A9%9F-multilayer-perceptron-mlp-%E9%81%8B%E4%BD%9C%E6%96%B9%E5%BC%8F-f0e108e8b9af (accessed 2020-01-18).
[24] Insider Threat Integrated Process Team, "DoD Insider Threat Mitigation: Final Report of the Insider Threat Integrated Process Team," Department of Defense, Apr. 24, 2000.
[25] J. Jiang, J. Chen, K.-K. R. Choo, K. Liu, C. Liu, M. Yu, and P. Mohapatra, "Prediction and Detection of Malicious Insiders' Motivation Based on Sentiment Profile on Webpages and Emails," in MILCOM 2018-2018 IEEE Military Communications Conference (MILCOM), 2018: IEEE, pp. 1-6.
[26] W. Jiang, Y. Tian, W. Liu, and W. Liu, "An Insider Threat Detection Method Based on User Behavior Analysis," in International Conference on Intelligent Information Processing, 2018: Springer, pp. 421-429.
[27] T. Karagiannis, K. Papagiannaki, N. Taft, and M. Faloutsos, "Profiling the end host," in International Conference on Passive and Active Network Measurement, 2007: Springer, pp. 186-196.
[28] D.-W. Kim, S.-S. Hong, and M.-M. Han, "A study on Classification of Insider threat using Markov Chain Model," TIIS, vol. 12, no. 4, pp. 1887-1898, 2018.
[29] P. A. Legg, "Visualizing the insider threat: challenges and tools for identifying malicious user activity," in 2015 IEEE Symposium on Visualization for Cyber Security (VizSec), 2015: IEEE, pp. 1-7.
[30] P. A. Legg, O. Buckley, M. Goldsmith, and S. Creese, "Automated insider threat detection system using user and role-based profile assessment," IEEE Systems Journal, vol. 11, no. 2, pp. 503-512, 2017.
[31] P. A. Legg, O. Buckley, M. Goldsmith, and S. Creese, "Caught in the act of an insider attack: detection and assessment of insider threat," in 2015 IEEE International Symposium on Technologies for Homeland Security (HST), 2015: IEEE, pp. 1-6.
[32] Y. Li, T. Zhang, Y. Y. Ma, and C. Zhou, "Anomaly detection of user behavior for database security audit based on OCSVM," in 2016 3rd International Conference on Information Science and Control Engineering (ICISCE), 2016: IEEE, pp. 214-219.
[33] L. Lin, S. Zhong, C. Jia, and K. Chen, "Insider Threat Detection Based on Deep Belief Network Feature Representation," in 2017 International Conference on Green Informatics (ICGI), 2017: IEEE, pp. 54-59.
[34] G. B. Magklaras and S. Furnell, "Insider threat prediction tool: Evaluating the probability of IT misuse," Computers & Security, vol. 21, no. 1, pp. 62-73, 2002.
[35] L. A. Maglaras and J. Jiang, "Intrusion detection in SCADA systems using machine learning techniques," in 2014 Science and Information Conference, 2014: IEEE, pp. 626-631.
[36] F. Meng, F. Lou, Y. Fu, and Z. Tian, "Deep learning based attribute classification insider threat detection for data security," in 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC), 2018: IEEE, pp. 576-581.
[37] M. Moradi and M. Zulkernine, "A neural network based system for intrusion detection and classification of attacks," in Proceedings of the IEEE International Conference on Advances in Intelligent Systems-Theory and Applications, 2004, pp. 15-18.
[38] K. Muandet and B. Schölkopf, "One-class support measure machines for group anomaly detection," arXiv preprint arXiv:1303.0309, 2013.
[39] J. Nowak, M. Korytkowski, R. Nowicki, R. Scherer, and A. Siwocha, "Random forests for profiling computer network users," in International Conference on Artificial Intelligence and Soft Computing, 2018: Springer, pp. 734-739.
[40] A. Parres-Peredo, I. Piza-Davila, and F. Cervantes, "Towards a user network profiling for internal security using top-K rankings similarity measures," in 2017 40th International Conference on Telecommunications and Signal Processing (TSP), 2017: IEEE, pp. 16-19.
[41] P. Parveen, Z. R. Weger, B. Thuraisingham, K. Hamlen, and L. Khan, "Supervised learning for insider threat detection using stream mining," in 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence, 2011: IEEE, pp. 1032-1039.
[42] T. Rashid, I. Agrafiotis, and J. R. Nurse, "A new take on detecting insider threats: exploring the use of hidden markov models," in Proceedings of the 8th ACM CCS International Workshop on Managing Insider Security Threats, 2016: ACM, pp. 47-56.
[43] R. Trzeciak, "Understanding and Protecting Against Multiple Faces of Insider Threats," presented at the CSO Perspectives on Cyber Security Conference, 2013.
[44] A. Tuor, S. Kaplan, B. Hutchinson, N. Nichols, and S. Robinson, "Deep learning for unsupervised insider threat detection in structured cybersecurity data streams," in Workshops at the Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[45] R. Vlasveld. "Introduction to One-class Support Vector Machines." http://rvlasveld.github.io/blog/2013/07/12/introduction-to-one-class-support-vector-machines/ (accessed 1/8, 2019).
[46] C. Wood. "Insider threat examples: 7 insiders who breached security." https://www.csoonline.com/article/3263799/security/insider-threat-examples-7-insiders-who-breached-security.html#slide1 (accessed Nov 13, 2018).
[47] H. Yao, Y. Liu, and C. Fang, "An abnormal network traffic detection algorithm based on big data analysis," International Journal of Computers, Communications & Control, vol. 11, no. 4, 2016.
[48] F. Yuan, Y. Cao, Y. Shang, Y. Liu, J. Tan, and B. Fang, "Insider threat detection with deep neural network," in International Conference on Computational Science, 2018: Springer, pp. 43-54.
[49] R. Zhang, S. Zhang, Y. Lan, and J. Jiang, "Network anomaly detection using one class support vector machine," in Proceedings of the International MultiConference of Engineers and Computer Scientists, 2008, vol. 1.
[50] 吴定海, 张培林, 任国全, and 陈, "基于支持向量的单类分类方法综述," 计算机工程, vol. 37, no. 5, pp. 187-189, 2011.
[51] 李建興. "不付贖金就公開個資！GDPR反成勒索攻擊Ransomhack的威脅武器." iThome. https://www.ithome.com.tw/news/124136 (accessed Nov 16, 2018).
[52] 邱瑩青. "資訊安全的最大威脅-人員安全." https://www.informationsecurity.com.tw/article/article_detail.aspx?aid=672 (accessed Nov 13, 2018).

電子全文 Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: user-defined release date
Available: Campus: available for download from 2025-02-07; Off-campus: available for download from 2025-02-07
紙本論文 Printed copies
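Public information on printed copies is relatively complete from academic year 102 onward. To look up public information on printed copies from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: 2025-02-07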

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2452 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS