System Behavior Characterization via Information Content Clustering of System Logs


Adetokunbo Makanju
A. Nur Zincir-Heywood
Evangelos E. Milios

Author Addresses: 

Faculty of Computer Science
Dalhousie University
6050 University Ave.
PO Box 15000
Halifax, Nova Scotia, Canada
B3H 4R2


{makanju, zincir, eem}


Self-awareness, the ability to be informed about the system's internal state, is an important attribute for any system to have before it is capable of self-management. This holds irrespective of which of the self-* properties of self-management in autonomic systems we choose to achieve. A system needs a continuous stream of real-time data to analyze in order to be aware of its internal state. To this end, previous approaches have utilized system performance metrics and system log data as a means of characterizing system behaviour and internal state.

In this work, we propose a scheme that uses entropy-based information content to group spatio-temporal partitions of system log data into conceptual clusters. We evaluate our method using cluster cohesion, cluster separation and cluster conceptual purity as metrics on High Performance Cluster (HPC) system log data. The results show that our proposed method not only produces well-formed clusters but also clusters that can be mapped to different kinds of alert behavior with a high degree of confidence. These results provide evidence that clusters produced by the proposed method characterize the different behaviors of the system and hence capture information about internal state. They therefore have value for the enhancement of self-awareness.

The ability to differentiate among types of behavior (both normal and abnormal) is also valuable for self-monitoring and fault detection, as deviations from types of normal behavior could be indicative of a fault.

Tech Report Number: 
Report Date: May 5, 2011
PDF: CS-2011-02.pdf (1.42 MB)