Hardware and software for distributed and supercomputer systems
Research Article
Building robust malware detection through conditional Generative Adversarial Network-based data augmentation
Elshan Baghirov
Institute of Information Technology, Baku, Azerbaijan
elsenbagirov1995@gmail.com
Abstract. Malware detection is essential in cybersecurity, yet its accuracy is often compromised by class imbalance and limited labeled data. This study leverages conditional Generative Adversarial Networks (cGANs) to generate synthetic malware samples, addressing these challenges by augmenting the minority class.
The cGAN model generates realistic malware samples conditioned on class labels, balancing the dataset without altering the benign class. Applied to the CICMalDroid2020 dataset, the augmented data is used to train a LightGBM model, leading to improved detection accuracy, particularly for underrepresented malware classes.
The results demonstrate the efficacy of cGANs as a robust data augmentation tool, enhancing the performance and reliability of machine learning-based malware detection systems.
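The balancing step summarized above can be sketched as a per-class quota: how many synthetic samples a label-conditioned generator would be asked to produce for each malware class while the benign class is left untouched. This is a minimal sketch under stated assumptions, not the article's implementation. The function name `synthetic_quota`, the `keep_fixed` parameter, the rule of raising each augmented class to the size of the largest class, and the class counts in the example are all hypothetical and are not taken from the article or from CICMalDroid2020.

```python
from collections import Counter


def synthetic_quota(labels, keep_fixed=("benign",), target=None):
    """Return the number of synthetic samples to request per class.

    labels     : iterable of class labels, one per real sample
    keep_fixed : classes that must not be augmented (assumed: benign)
    target     : desired size for every augmented class; if None, the
                 size of the largest class is used (an assumption)
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())

    quota = {}
    for cls, n in counts.items():
        if cls in keep_fixed:
            # Fixed classes receive no synthetic samples.
            quota[cls] = 0
        else:
            # Never request a negative number of samples.
            quota[cls] = max(0, target - n)
    return quota


# Hypothetical class counts, for illustration only.
labels = (
    ["benign"] * 50
    + ["adware"] * 40
    + ["banking"] * 100
    + ["sms"] * 70
)
quota = synthetic_quota(labels)
for cls, n in sorted(quota.items()):
    print(f"{cls}: generate {n} synthetic samples")
```

Each nonzero quota would then be passed, together with its class label, to the conditional generator, and the real and synthetic samples combined before training the classifier.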
Keywords: malware detection, Generative Adversarial Networks, machine learning, cybersecurity, data augmentation
MSC-2020 97P30; 97P20, 97R40
For citation: Elshan Baghirov. Building robust malware detection through conditional Generative Adversarial Network-based data augmentation. Program Systems: Theory and Applications, 2024, 15:4, pp. 97–110. https://psta.psiras.ru/2024/4_97-110.
Full text of article (PDF): https://psta.psiras.ru/read/psta2024_4_97-110.pdf.
The article was submitted 05.12.2024; approved after reviewing 07.12.2024; accepted for publication 07.12.2024; published online 10.12.2024.