Modern computer threats and intrusion attacks are far more complex than those seen in the past. In IoT devices, these threats and attacks are even more prominent, because the heterogeneous and distributed nature of the devices makes conventional intrusion detection methodologies hard to deploy. [1] discusses the principal cyber threats to IoT devices, such as denial of service, malware-based attacks, data breaches and weakening parameters, lists the IoT security issues identified by the Open Web Application Security Project (OWASP), and highlights a few past attacks on IoT systems. Detecting these threats requires new tools that capture the essence of their behavior rather than looking for fixed signatures in the attacks. Anomaly detection algorithms, which learn the normal behavior of a system and raise alerts on abnormalities, with or without prior knowledge of the system model and without knowledge of the characteristics of the attack, can be key to handling such complexity.
Anomaly detection is important because anomalies in data translate into significant (and often critical) actionable information in a wide variety of application domains.
This paper discusses the use of machine learning based network traffic anomaly detection to address the challenges of securing devices and detecting network intrusions.
Anomaly detection [2] [3] is the “identification of items, events or observations which do not conform to an expected pattern or other items in a dataset”. Some examples of where anomaly detection is useful are provided below:
Generally, there are two approaches to the detection problem [4]. Most commercial intrusion detection systems employ some kind of signature-matching algorithm to detect malicious activity. Such systems have very low false alarm rates and work very well when the corresponding attack signatures are present, but they have a potential drawback: missing signatures inevitably lead to undetected attacks. One common counter-measure taken by Trojans, for example, is to change their code signatures dynamically at run time or during replication, which makes signature-based tracking ineffective. Here the second approach, termed anomaly detection, becomes more useful. Anomaly detection maps normal behavior to a baseline profile and tries to detect deviations from it. One way to create a baseline profile is supervised learning, in which a model is trained on past data instances whose intrusions have been labeled in advance. [5] discusses the various machine learning techniques, and the related classifiers and algorithms, that can be used in intrusion detection systems.
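The idea of a baseline profile can be illustrated with a minimal sketch. It is not taken from [4] or [5]: it assumes a single hypothetical feature (packets per second), a profile made of the mean and standard deviation of traffic known to be normal, and an arbitrary threshold of three standard deviations. A real system would use many features and a learned model.

```python
import statistics


def fit_baseline(normal_samples):
    """Learn a baseline profile (mean, standard deviation) from normal traffic."""
    mean = statistics.fmean(normal_samples)
    stdev = statistics.stdev(normal_samples)
    return mean, stdev


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose distance from the baseline mean exceeds
    `threshold` standard deviations (an assumed, tunable cut-off)."""
    mean, stdev = baseline
    if stdev == 0:
        # Degenerate profile: any change from the constant value is a deviation.
        return value != mean
    return abs(value - mean) / stdev > threshold


# Hypothetical packets-per-second readings from a device under normal load.
normal_pps = [98, 102, 101, 97, 100, 103, 99, 100, 96, 104]
baseline = fit_baseline(normal_pps)

# New observations: only the burst of 480 packets per second is far from the profile.
observed = [101, 95, 480, 100]
flags = [is_anomalous(v, baseline) for v in observed]
print(flags)
```

No attack signature is involved here: the burst is flagged only because it deviates from what was learned as normal, which is why this approach can catch attacks whose signatures are unknown, at the cost of possible false alarms on unusual but benign traffic.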
Anomaly based detection systems rely on artificial intelligence (AI) and machine learning (ML) to detect anomalies. The idea behind AI and ML is to make a machine capable of learning by itself and of distinguishing between normal and abnormal behavior on the system. The process of teaching a machine takes different forms: supervised, unsupervised and reinforcement learning.
Regardless of the means of teaching, a machine needs to be trained before it can predict. Several ML algorithms have been used in intrusion detection. Most intrusion detection systems use a combination of algorithms to cluster sample data into groups, label the groups, and then train a classifier to distinguish between them. A great deal of past research has studied intrusion detection systems built with various machine learning techniques. These studies range from single classifier solutions, where one classifier algorithm is used to create the machine learning model, to hybrid classifiers, where more than one machine learning algorithm is combined to improve the performance of the intrusion detection system. [5] describes the various machine learning techniques, classifiers and algorithms, surveys the existing studies on machine learning based intrusion detection systems with single, hybrid and ensemble classifier designs, and provides a statistical comparison of the classifier algorithms used in that research. Tables IV and V in [5] statistically compare the research on intrusion detection models that use single and hybrid classifiers, covering the algorithms used and the accuracy results.
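The cluster, label and classify sequence described above can be sketched as follows. This is only an illustration under stated assumptions, not a design from [5]: k-means and k-nearest-neighbours are stand-in algorithm choices, the two features (connections per minute, distinct destination ports) and their values are hypothetical, and the labeling rule simply assumes that the most populous cluster is normal traffic.

```python
import math
from collections import Counter


def kmeans(points, k=2, iterations=20):
    """Plain k-means with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments


def label_clusters(assignments):
    """Assumption: the most populous cluster is normal, every other is anomalous."""
    normal_cluster = Counter(assignments).most_common(1)[0][0]
    return ["normal" if a == normal_cluster else "anomalous" for a in assignments]


def knn_classify(sample, points, labels, k=3):
    """Classify a new sample by majority vote among its k nearest labeled points."""
    nearest = sorted(zip(points, labels), key=lambda pl: math.dist(sample, pl[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical (connections per minute, distinct destination ports) samples.
traffic = [(10, 2), (12, 3), (11, 2), (9, 1), (13, 3), (10, 3),
           (95, 40), (100, 45), (98, 42)]

labels = label_clusters(kmeans(traffic, k=2))   # steps 1 and 2: cluster, then label
print(knn_classify((11, 3), traffic, labels))   # step 3: classify unseen traffic
print(knn_classify((97, 41), traffic, labels))
```

The clustering step removes the need for hand-labeled training data, while the classifier gives a fast decision for each new sample. The labeling rule is the fragile part: if attack traffic dominates the training sample, the assumption that the largest cluster is normal no longer holds.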
Anomaly based intrusion detection systems are computationally intensive. They require a lot of processing power and memory to work quickly, especially when the system performs intrusion detection in real time. Using anomaly based detection in IoT is more challenging than using it in non-IoT networks for several reasons.
Research has been conducted around these challenges, and it showcases the use of the anomaly based approach to detect anomalies in IoT. The following are a few of the IoT based anomaly detection solutions:
HSC's key areas of focus are: