On Entropy in Network Traffic Anomaly Detection
1  Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Campus Guadalajara


Different systems have been continuously developed to ensure the integrity, availability, and confidentiality of networks. An important approach is the anomaly-based network intrusion detection system (A-NIDS). In this paper, we provide a structured and comprehensive overview of the research on entropy-based A-NIDS, with the intention of giving researchers a quick introduction to the essential aspects of this topic. To this end, a general architecture of entropy-based A-NIDS is described and its main components are discussed. We also highlight some open issues in entropy-based network traffic anomaly detection.

Keywords: network traffic anomaly detection; entropy; generalized entropies; network security; A-NIDS; mutual information; KL divergence; conditional entropy
Comments on this paper
Francisco Chinesta
On Entropy in Network Traffic Anomaly Detection
Dear colleagues,

I very much appreciated your work.

You write

Among the algorithms used to reduce the number of features in network traffic anomaly detection
are: PCA [11], Mutual Information and linear correlation [12], decision tree [13], and maximum entropy

But among all these options, could you explain the reasons for your choice?
Jayro Santiago
Dear Francisco Chinesta,
Thanks for reading our work.

The authors do not mention the criteria used to choose the method for the selection/reduction of variables.

However, as is well known, PCA is the most widely used dimensionality reduction technique. The C4.5 algorithm, which uses entropy, ranks among the top algorithms in data mining, and the concepts of KL divergence
and mutual information can be applied to evaluate any arbitrary dependency between random variables.
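
To illustrate the point about arbitrary dependencies, here is a minimal sketch in Python. It is not taken from the paper; the function names and the toy data are my own assumptions. It compares the empirical mutual information with the linear (Pearson) correlation for a variable Y that depends on X only nonlinearly (Y = X^2).

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def pearson(xs, ys):
    """Sample Pearson (linear) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical toy data: Y is fully determined by X, but not linearly.
xs = [-1, 0, 1] * 4
ys = [x * x for x in xs]

print("Pearson correlation:", pearson(xs, ys))
print("Mutual information (bits):", mutual_information(xs, ys))
```

On this toy data the linear correlation vanishes while the mutual information is clearly positive, which is the kind of dependency a purely linear measure would miss.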

The C4.5 algorithm is less complex and has a detection rate of 90.86% on the MIT-DARPA database.

For n instances and p features, the complexity of PCA is O(np^2 + p^3): O(np^2) for the covariance matrix computation and O(p^3) for its eigenvalue decomposition. Mutual information (MI) has a complexity of O(p^2n^2).
At each level i of the tree, the C4.5 algorithm must examine the p-i remaining features for every instance still at that level in order to calculate the information gain, which gives a complexity of O(np^2).
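
As a rough sketch of the per-level work described above (my own illustration, not the authors' implementation), the following Python code computes the information gain of each candidate feature at a single node. The feature names, values, and labels are hypothetical. Note that C4.5 itself normalises this quantity into a gain ratio; only the plain information gain is shown here.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the node on one feature."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[feature]].append(label)
    n = len(labels)
    # Weighted entropy of the child nodes after the split
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy connection records (assumed feature names and values).
rows = [
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "udp", "flag": "SF"},
]
labels = ["attack", "attack", "normal", "normal"]

# One pass over all instances for every remaining feature at this node.
gains = {f: information_gain(rows, labels, f) for f in rows[0]}
best = max(gains, key=gains.get)
print(gains, "-> split on", best)
```

Each call to information_gain touches every instance at the node once, and it is repeated for every remaining feature, which is where the per-level cost comes from.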

In the maximum entropy method, KL divergence is used for feature selection. The complexity of KL divergence is O(n^2p^2).
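
For readers less familiar with the KL divergence, a minimal Python sketch follows. It only illustrates the divergence between two empirical distributions, not the maximum entropy feature selection procedure itself. The port counts are hypothetical, and the additive smoothing (parameter alpha) is an assumption I added so that no probability is zero.

```python
import math
from collections import Counter

def kl_divergence(p_counts, q_counts, alpha=1.0):
    """D_KL(P || Q) in bits between two count tables.

    Additive smoothing (alpha) is an assumption made here to avoid
    division by zero when a value is absent from one of the tables.
    """
    support = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + alpha * len(support)
    q_total = sum(q_counts.values()) + alpha * len(support)
    d = 0.0
    for k in support:
        p = (p_counts.get(k, 0) + alpha) / p_total
        q = (q_counts.get(k, 0) + alpha) / q_total
        d += p * math.log2(p / q)
    return d

# Hypothetical destination-port counts: a baseline and an observed window.
baseline = Counter({80: 50, 443: 40, 53: 10})
window = Counter({80: 5, 443: 5, 53: 90})

print("D(window || baseline):", kl_divergence(window, baseline))
print("D(baseline || baseline):", kl_divergence(baseline, baseline))
```

The divergence is zero when the two distributions coincide and grows as the observed window departs from the baseline. It is not symmetric, so the order of the arguments matters.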

For DoS attacks from the MIT-DARPA database, the detection rates are: 85.81% using MI, 87.84% using linear correlation, and 90.86% using a decision tree.