Malware is arguably the most important topic in internet security. As malware grows faster than ever, defensive methods must evolve with it. Unfortunately, IT experts can usually begin to deal with an attack only after new malware has already invaded the system. The usual response is to collect evidence first, then analyze that evidence to find a solution, and finally harden the system against future malware attacks.
In this paper, we propose a malware analysis system that accurately clusters new malware. We extract significant features from each malware sample: for source code files, we extract syntax strings as features; for binary files, we convert the binary file to an image and extract a matrix vector from the image as the feature. We then apply two clustering algorithms, advanced incremental clustering and extended 1-NN, to cluster the malware samples. Finally, the system produces a detailed report about the malware family relationships. We conduct four experiments to verify the system. We compare the performance and accuracy of the two clustering algorithms, and verify the system's maturity under a random sample analysis order. We also compare our system with Virustotal.com and Avira software, and the results confirm that our system clusters more efficiently.
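The binary-file path described above (binary to image, image to feature vector, then nearest-neighbour clustering) can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the function names, the fixed image width, the intensity-histogram feature, and the distance threshold are all assumptions made for illustration, and the plain incremental 1-NN rule shown here stands in for the advanced incremental clustering and extended 1-NN algorithms, whose details are not given in this abstract.

```python
import math


def bytes_to_image(data, width=16):
    """Lay raw bytes out row by row as a grayscale matrix (values 0-255).

    The width is a hypothetical choice; the last row is zero-padded.
    """
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


def image_to_vector(image, bins=8):
    """Summarize the image as a normalized intensity histogram.

    This is a stand-in feature, not the matrix vector used by the system.
    """
    hist = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            hist[pixel * bins // 256] += 1
            total += 1
    if total == 0:
        return [0.0] * bins
    return [count / total for count in hist]


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def incremental_1nn(vectors, threshold=0.5):
    """Cluster vectors one at a time, in arrival order.

    Each vector takes the label of its nearest already-seen vector if that
    neighbour is within the (assumed) threshold; otherwise it opens a new
    cluster.
    """
    labels = []
    for i, vector in enumerate(vectors):
        best_label, best_dist = None, float("inf")
        for j in range(i):
            d = distance(vector, vectors[j])
            if d < best_dist:
                best_label, best_dist = labels[j], d
        if best_label is not None and best_dist <= threshold:
            labels.append(best_label)
        else:
            labels.append(max(labels, default=-1) + 1)
    return labels


# Three synthetic "binaries": two with similar byte content, one different.
samples = [bytes([0x10] * 64), bytes([0x12] * 64), bytes([0xF0] * 64)]
features = [image_to_vector(bytes_to_image(s)) for s in samples]
print(incremental_1nn(features))  # [0, 0, 1]
```

Because samples are labeled in arrival order, a rule like this can be sensitive to the order in which samples are analyzed, which is presumably why the experiments check the system under a random analysis order.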