Zero-Day Malware Detection
Abstract
The phrase "zero-day malicious software" (malware) describes malware that exploits a recently identified or still unidentified software vulnerability. The main goal of this work is to improve the detection of such zero-day malware by learning effectively from plausibly generated data. Hardware-supported Malware Detection (HMD), which applies Machine Learning (ML) techniques to Hardware Performance Counter (HPC) data, has proven effective at detecting malware at the microarchitecture level of processors, and it avoids the high complexity of traditional software-based detection techniques. We begin our study by reviewing current ML-based HMDs that use information from built-in HPC registers. We next investigate the suitability of several common ML classifiers for zero-day malware detection on novel data streams in the actual operation of Internet of Things (IoT) devices, and show that they are unable to identify unknown malware signatures with a high detection rate. Finally, to overcome the difficulty of run-time zero-day malware detection, we propose an ensemble learning-based method that improves the performance of conventional malware detectors even when it uses only a few microarchitectural features recorded at run-time by current HPCs. The experimental results show that the proposed method, which uses only the top 4 microarchitectural features to detect zero-day malware, achieves a 92% F-measure and a 95% true positive rate (TPR) with only a 2% false positive rate (FPR) when AdaBoost ensemble learning is applied to a Random Forest classifier used as the regular (base) classifier.
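For readers less familiar with the reported metrics, the following minimal sketch shows how the TPR, the false positive rate, and the F-measure are derived from a detector's confusion counts. The class name `ConfusionCounts`, the function `detection_metrics`, and the counts in the usage lines are hypothetical and chosen only for illustration; they are not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for a malware detector's confusion counts."""
    tp: int  # malware samples flagged as malware
    fp: int  # benign samples flagged as malware
    tn: int  # benign samples passed as benign
    fn: int  # malware samples missed by the detector


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: ConfusionCounts) -> dict:
    """Compute TPR (recall), FPR, precision, and F-measure (F1)."""
    tpr = _ratio(c.tp, c.tp + c.fn)        # share of malware that is detected
    fpr = _ratio(c.fp, c.fp + c.tn)        # share of benign flagged as malware
    precision = _ratio(c.tp, c.tp + c.fp)  # share of alarms that are malware
    f_measure = _ratio(2 * precision * tpr, precision + tpr)
    return {"tpr": tpr, "fpr": fpr, "precision": precision, "f_measure": f_measure}


if __name__ == "__main__":
    # Illustrative counts only (assumed balanced test set of 100 + 100 samples).
    example = ConfusionCounts(tp=95, fp=2, tn=98, fn=5)
    for name, value in detection_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Note that the F-measure depends on precision, and therefore on the ratio of malware to benign samples in the test set, so the same TPR and false positive rate can correspond to different F-measure values under different class balances.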