Researchers Propose MME Framework for Enhanced API Sequence-Based Malware Detection

admin

2 years ago

A recently proposed framework, MME, aims to improve deep learning models that analyze API sequences to detect Windows malware. It targets a known weakness of such detectors: as malware variants evolve, models trained on older samples become less effective over time.

The MME framework, proposed by a group of researchers, leverages API knowledge graphs and system resource encodings to improve existing detectors. By utilizing contrastive learning, MME is able to capture similar malicious semantics in evolved malware samples, thereby enhancing the overall accuracy of detection.

Experimental results have shown that the MME framework leads to a 13.10% reduction in false positive rates and an 8.47% improvement in F1-Score over a five-year dataset compared to Regular Text-CNN. This demonstrates the effectiveness of MME in significantly enhancing the performance of malware detection models.
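For readers unfamiliar with the two metrics quoted above, the following minimal sketch shows how false positive rate and F1-Score are computed from confusion-matrix counts. The function names and the example counts are illustrative assumptions, not taken from the researchers' code, and the article does not say whether the reported changes are absolute percentage points or relative improvements.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """Share of benign samples that were wrongly flagged as malware."""
    return fp / (fp + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in count form."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts, chosen only to illustrate the formulas.
print(false_positive_rate(fp=5, tn=95))
print(f1_score(tp=8, fp=2, fn=2))
```

A lower false positive rate means fewer benign programs are flagged, while a higher F1-Score means precision and recall improve together.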

MME also reduces model maintenance costs. With a labeling budget of just 1% of samples per month, it achieves an 11.16% decrease in false positives and a 6.44% increase in F1-Score, making it a cost-effective way to sustain detection accuracy over the long term.

The MME framework introduces two key innovations to enhance API sequence-based Windows malware detection models. Firstly, it includes a sophisticated API embedding method that combines API knowledge graphs for semantic representation and feature hash embedding for system resource encoding. Secondly, it incorporates a contrastive learning strategy that improves the model’s ability to recognize similar malicious behaviors across evolving samples.
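To make the idea of feature hash embedding for system resources more concrete, here is a minimal sketch of one way a resource path could be hashed level by level. It is an illustration under stated assumptions, not the researchers' implementation: the bucket count, the maximum depth, the choice of MD5 as a stable hash, and all function names are hypothetical.

```python
import hashlib
from typing import List

NUM_BUCKETS = 1024  # assumed size of the hashed vocabulary


def stable_bucket(token: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a string to a bucket index that is stable across runs."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def hierarchical_prefixes(path: str, max_depth: int = 4) -> List[str]:
    """Split a resource path into cumulative prefixes, one per level."""
    parts = [p for p in path.lower().replace("/", "\\").split("\\") if p]
    depth = min(len(parts), max_depth)
    return ["\\".join(parts[:d]) for d in range(1, depth + 1)]


def encode_resource(path: str) -> List[int]:
    """Hash every prefix so related paths share their leading indices."""
    return [stable_bucket(prefix) for prefix in hierarchical_prefixes(path)]


a = encode_resource(r"C:\Windows\System32\a.dll")
b = encode_resource(r"C:\Windows\System32\b.dll")
print(a[:3] == b[:3])  # the shared directory levels hash identically
```

Because each index would select a row in a fixed-size embedding table, a previously unseen file name still receives a representation, and files in the same directory stay close through their shared prefix indices.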

When applied to LSTM and Text-CNN models on a dataset of 76K Windows PE samples collected from 2017 to 2021, MME reduced false negative rates and cut the human labeling effort needed to keep the models current. The approach is more stable against malware evolution, slowing model aging and improving long-term detection accuracy without altering the original model structure.

The MME framework augments API sequence-based Windows malware detection models by targeting the evolving nature of malware families. Broken down further, it rests on three elements: a knowledge graph that places semantically related APIs close together, a hierarchical system resource encoding based on feature hashing, and a contrastive learning strategy that steers the model toward API fragments that persist across malware generations.
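The contrastive learning element can be illustrated with the classic pairwise margin-based contrastive loss, shown below as a minimal sketch. The article does not specify which contrastive objective MME uses, so this particular loss, the margin value, and the function names are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_loss(u: Sequence[float], v: Sequence[float],
              same: bool, margin: float = 1.0) -> float:
    """Pull same-label pairs together; push others at least `margin` apart."""
    d = euclidean_distance(u, v)
    if same:
        return d ** 2                      # penalize any separation
    return max(0.0, margin - d) ** 2       # penalize only pairs closer than margin


# Hypothetical embeddings of an older sample and an evolved variant.
old_variant = [0.9, 0.1]
new_variant = [0.8, 0.2]
print(pair_loss(old_variant, new_variant, same=True))
```

Minimizing a loss of this kind across pairs drawn from the same family encourages an evolved sample to land near its predecessors in embedding space, which is the behavior the article attributes to MME's contrastive strategy.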

Integrating MME into LSTM and Text-CNN models extends their working life and reduces false negative rates. In maintenance scenarios, MME cuts the human annotation effort required by up to 94.42% without compromising performance.

Overall, the MME-enhanced models need just 1% of each month's data to be labeled to achieve high F1 scores and low false negative rates, making them a cost-effective way to counter the impact of malware evolution. Reduced analyst involvement and improved detection precision make MME a useful tool for sustaining the long-term operation of malware detectors.
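As a rough illustration of what a 1% monthly labeling budget could look like in practice, the sketch below picks the samples a detector is least sure about and hands only those to an analyst. The article does not describe how samples are chosen for labeling, so the uncertainty-based selection rule, the function name, and the scores are all assumptions.

```python
from typing import List, Sequence


def select_for_labeling(scores: Sequence[float],
                        budget_fraction: float = 0.01) -> List[int]:
    """Return indices of the samples whose scores are closest to 0.5.

    `scores` are assumed to be malware probabilities in [0, 1] for one
    month of new samples; at least one sample is always selected.
    """
    budget = max(1, round(len(scores) * budget_fraction))
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i] - 0.5))
    return sorted(ranked[:budget])


# Hypothetical month of 200 samples with two borderline predictions.
monthly_scores = [0.0] * 200
monthly_scores[7] = 0.49
monthly_scores[42] = 0.55
print(select_for_labeling(monthly_scores))
```

With 200 samples and a 1% budget, only two samples go to an analyst each month; the labeled samples would then be used to update the model.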