Enhancing Network Security: A Study on Classification Models for Intrusion Detection Systems

Mahmood, Abeer Abd Alhameed; Hadi, Azhar A.; Al-Masoody, Wasan Hashim

doi:https://doi.org/10.24138/jcomss-2024-0064

Published online: May 13, 2025

Abstract

Computer users face a constant influx of internet packets, ranging from legitimate traffic to packets sent by malicious entities. With the exponential growth in user numbers and evolving attack types, traditional countermeasures are becoming ineffective. Artificial intelligence (AI) techniques offer a promising way to address these challenges. This study uses AI methods to develop nine classification models with supervised machine learning classifiers. The authors implement several machine learning models, including bagging, multi-layer perceptron, logistic regression, extreme gradient boosting, and random forest. To train the classifiers, the authors use three datasets with varying train-test ratios: KDD99 (the Knowledge Discovery in Databases 1999 dataset, used for network intrusion detection research), UNSW-NB15 (a dataset capturing contemporary network attack patterns, generated at the University of New South Wales), and CICIDS2017 (the Canadian Institute for Cybersecurity Intrusion Detection System dataset, containing modern attack scenarios). Accuracy and F1 score are used to evaluate the models' performance. The extreme gradient boosting classifier exhibits the highest performance across all three datasets, especially with an 80% feature reduction. Various oversampling and undersampling techniques are used to balance the datasets and improve false-negative rates. Performance metrics improve across all three datasets, including the accuracy of extreme gradient boosting. The meta-ensemble learning model outperforms decision trees, random forests, and extreme gradient boosting at sub-multiclass classification, and it outperforms logistic regression and multi-layer perceptron at multiclass classification. Two hidden layers achieved the highest accuracy for binary classification on the KDD99 dataset. Multiclass classification presents challenges in identifying minority classes, but performance improves with additional hidden layers. Random forest outperforms the other classifiers in accuracy, which is consistent with the simulation results.
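
The abstract evaluates models with accuracy and F1 score and balances the datasets by resampling. The following is a minimal Python sketch of why both steps matter on imbalanced traffic. It is not the authors' implementation. The toy labels, the always-benign predictor, and the helper names (`accuracy`, `f1_score`, `random_oversample`) are assumptions made for illustration, and random oversampling is only one possible balancing technique, since the abstract does not name the ones used.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes
    have as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (assumed): 95 benign flows (0) and 5 attacks (1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that labels every flow as benign.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)  # high, although every attack is missed
f1 = f1_score(y_true, y_pred)   # zero for the attack class

# Balance the classes before training.
rows = [[i] for i in range(100)]
bal_rows, bal_labels = random_oversample(rows, y_true)
balanced_counts = Counter(bal_labels)
```

In this toy case the degenerate classifier reaches 95% accuracy while its F1 score for the attack class is 0, which is why accuracy alone can hide a poor false-negative rate on imbalanced intrusion data.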

Keywords

Intrusion Detection Systems (IDS), Machine Learning, Balanced Dataset, Network Security

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.