National Sun Yat-sen University, thesis/dissertation, Automatic Attack Scenario Reconstruction with Large Language Models: Threat Intelligence Extraction Strategies and Techniques

Title	Automatic Attack Scenario Reconstruction with Large Language Models: Threat Intelligence Extraction Strategies and Techniques (original title: 基於大型語言模型之自動化攻擊情境重構：威脅情資提取策略與技術)
Department	Master Program in Information Security, Department of Computer Science and Engineering
Year, semester	Fall semester of Academic Year 113	Language	Chinese
Degree	Master	Number of pages	75
Author	Che-Wei Lin (林哲緯)
Advisor	Fan, Chun-I (范俊逸)
Convenor	Wang, Chih-Hung (王智弘)
Advisory Committee	Hsu, Ruei-Hau (徐瑞壕); Tsai, Chun-Wei (蔡崇煒); Lin, Chih-Hsueh (林志學)
Date of Exam	2025-01-24	Date of Submission	2025-01-24
Keywords	Large Language Models, Prompt Engineering, Attack Automation, Threat Intelligence, Threat Reconstruction
Statistics	The thesis/dissertation has been browsed 91 times and downloaded 0 times.

Abstract
With the rapid development of artificial intelligence and automation technologies, attackers are using these emerging technologies to refine their techniques and speed up their attacks. At the same time, security researchers can use the same technologies to understand and analyze attackers' Tactics, Techniques, and Procedures (TTPs) more effectively. Recently, Large Language Models (LLMs) have been increasingly applied in the information security domain to assist security professionals with vulnerability identification, classification, and penetration testing. However, public research on reproducing the entire cyber kill chain remains relatively scarce because of commercial interests and technical barriers. This research proposes a system architecture that divides attack reproduction into three stages: threat intelligence organization, scenario construction, and attack script generation. The system is built with open-source software and integrates an LLM with virtual machine provisioning, so that users can reproduce the attack process recorded in threat intelligence through a visual web interface on a one-stop platform and gain a deeper understanding of how an attack unfolded. With the aid of an LLM, this research successfully uses the system to reproduce attack scenarios for Common Vulnerabilities and Exposures (CVE) entries and the attack techniques of Advanced Persistent Threat (APT) groups, and it significantly shortens the previously complicated process and the time required to reproduce security attacks. The results show that the system can improve the efficiency of security researchers and provides substantial benefits for security threat analysis.
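The three stages named in the abstract can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions, not the thesis's implementation: every class, function, and field name here is hypothetical, the LLM is replaced by a stub function (`fake_llm`), and no virtual machines are actually provisioned.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ThreatIntel:
    """Stage 1 output: facts pulled from a threat report (hypothetical schema)."""
    identifier: str
    affected_software: List[str]
    techniques: List[str]


@dataclass
class Scenario:
    """Stage 2 output: names of machines to provision (hypothetical schema)."""
    machines: List[str] = field(default_factory=list)


def organize_intel(report: str, ask_llm: Callable[[str], str]) -> ThreatIntel:
    # Assumption: the LLM answers with three semicolon-separated fields.
    # A real system would need much stricter validation of the answer.
    answer = ask_llm("Extract 'id; software list; technique list' from:\n" + report)
    ident, software, techniques = [part.strip() for part in answer.split(";")]
    return ThreatIntel(
        identifier=ident,
        affected_software=[s.strip() for s in software.split(",")],
        techniques=[t.strip() for t in techniques.split(",")],
    )


def build_scenario(intel: ThreatIntel) -> Scenario:
    # One attacker machine plus one victim machine per affected software item.
    victims = [f"victim-{name}" for name in intel.affected_software]
    return Scenario(machines=["attacker"] + victims)


def generate_attack_script(intel: ThreatIntel, scenario: Scenario,
                           ask_llm: Callable[[str], str]) -> str:
    # Stage 3: ask the LLM for a script targeting the victim machines.
    targets = ", ".join(scenario.machines[1:])
    return ask_llm(f"Write a test script for {intel.identifier} against {targets}")


def fake_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call; it returns canned text only.
    if prompt.startswith("Extract"):
        return "CVE-2024-4577; php-cgi, xampp; argument injection"
    return "# placeholder script for: " + prompt


intel = organize_intel("(threat report text)", fake_llm)
scenario = build_scenario(intel)
script = generate_attack_script(intel, scenario, fake_llm)
```

Passing the LLM in as a plain callable keeps each stage testable without network access; swapping `fake_llm` for a real client would be the only change needed at the call sites.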

Table of Contents
Thesis Approval Form
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Contributions
  1.2 Thesis Organization
Chapter 2 Related Work
  2.1 Threat Intelligence as the Basis for Attack Identification
  2.2 Attack Simulation and Reproduction
  2.3 Using Large Language Models to Assist Penetration Testing
  2.4 Potential of Large Language Models in Vulnerability Detection and Identification
Chapter 3 Background
  3.1 Integrating Large Language Models
    3.1.1 Analysis of Large Language Models
    3.1.2 Interfacing with the OpenAI API
  3.2 Building the Virtualized Experimental Environment
    3.2.1 Virtual Machine Platforms
    3.2.2 Virtualized Network Architecture
Chapter 4 System Design
  4.1 System Architecture
  4.2 Integrating the System with Large Language Models
  4.3 Threat Intelligence Organization
  4.4 Scenario Construction
    4.4.1 Virtual Machine Construction
    4.4.2 Automated Software Installation
  4.5 Automated Attack Script Generation
Chapter 5 Experimental Results
  5.1 CVE-2024-4577 Simulation
    5.1.1 Attack Reproduction
    5.1.2 Summary
  5.2 APT TAG-70 Simulation
    5.2.1 Attack Reproduction
    5.2.2 Summary
  5.3 Discussion and Comparison
Chapter 6 Conclusion and Future Work
References
Appendix A System Environment Configuration Files

Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: user-defined availability date. Campus: available for download from 2030-01-24. Off-campus: available for download from 2030-01-24.
Printed copies
Available 2030-01-24. Public information on printed copies is relatively complete from academic year 102 onward; for printed theses from academic year 101 or earlier, contact the printed thesis service desk of the Office of Library and Information Services.

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2453 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS