Abstract:
With the rapid expansion and widespread adoption of the Internet of Things (IoT), maintaining secure connections among active devices is challenging. Because IoT devices are limited in power and storage, they cannot perform complex tasks, which leaves them vulnerable to many types of attacks. The volume of data generated daily also makes anomalous behavior hard to detect. Machine learning (ML) algorithms, however, have proven successful at extracting complex patterns from big data, which has led to their active application in IoT. In this paper, we perform a comprehensive analysis covering 4 ML algorithms and 3 neural networks (NNs), and propose a pipeline that analyzes how data reduction (loss) influences the performance of these algorithms. We use random undersampling as the data reduction technique to simulate reduced network traffic data, and the pipeline investigates several degrees of data loss. The results show that models trained on the original data distribution reach accuracy approaching 100%. Among the classic ML algorithms, XGBoost performs best. Among the deep learning models, the 2-layered NN provides excellent results and has sufficient depth for practical application. When the models are trained on the undersampled data, by contrast, performance decreases, most notably for the NNs. The most prominent change occurs in the 4-layered NN: the model trained on the original dataset detects attacks with a success rate of 93.53%, whereas the model trained on the maximally reduced data reaches only 39.39%.
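The data reduction step described above can be illustrated with a minimal sketch. It assumes that random undersampling means keeping a uniformly random subset of rows without replacement; the abstract does not say whether the sampling is applied per class or over the whole dataset. The function name `random_undersample`, the `keep_fraction` parameter, the synthetic data, and the three reduction levels are all hypothetical and are not taken from the paper.

```python
import random


def random_undersample(rows, labels, keep_fraction, seed=0):
    """Keep a uniformly random subset of samples, drawn without replacement.

    Assumption: reduction is applied to the whole dataset, not per class.
    """
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    if not rows:
        raise ValueError("dataset must not be empty")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    n_keep = max(1, round(keep_fraction * len(rows)))
    rng = random.Random(seed)
    # Sort the sampled indices so the surviving rows keep their original order.
    kept = sorted(rng.sample(range(len(rows)), n_keep))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Synthetic stand-in for traffic features and attack labels (not real data).
gen = random.Random(42)
rows = [[gen.gauss(0.0, 1.0) for _ in range(5)] for _ in range(1000)]
labels = [gen.randint(0, 1) for _ in range(1000)]

# Hypothetical degrees of data loss; the levels used in the paper are not stated here.
reduced = {}
for keep in (1.0, 0.5, 0.1):
    sub_rows, sub_labels = random_undersample(rows, labels, keep)
    reduced[keep] = (sub_rows, sub_labels)
    print(f"keep={keep:.0%}: {len(sub_rows)} samples")
```

Each reduced subset would then be used to train the same set of models, so that the resulting scores can be compared across degrees of data loss.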