An Analysis of Clustering Objectives for Feature Selection Applied to Encrypted Traffic Identification

Authors: 

Carlos Bacquet
Nur A. Zincir-Heywood
Malcolm I. Heywood

Author Addresses: 

Faculty of Computer Science
Dalhousie University
6050 University Ave.
PO Box 15000
Halifax, Nova Scotia, Canada
B3H 4R2

Abstract: 

This work explores the use of clustering objectives in a Multi-Objective Genetic Algorithm (MOGA) for both feature selection and cluster count optimization, applied to flow-based encrypted traffic identification. We first explore whether a MOGA based on clustering objectives can match the performance of a gold standard model (i.e., one based on classification objectives). We then explore the performance gain, if any, from applying a logarithmic transformation to the data before running the MOGA. Results show that a MOGA trained with clustering objectives can closely reproduce the behavior of the gold standard model, not only in terms of the selected features but also in terms of the achieved detection rate and false positive rate, which are above 90% and below 1%, respectively. On the other hand, no gain was observed from applying the logarithmic transformation to the data.
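
As a rough illustration of two quantities mentioned in the abstract, the sketch below applies a logarithmic transformation to flow feature values and computes a detection rate and false positive rate from binary labels. The abstract does not specify the exact transformation, so log(1 + x) is an assumption here, and the function names and label encoding (1 for the target class, 0 otherwise) are hypothetical rather than taken from the report.

```python
import math


def log_transform(rows):
    """Apply log(1 + x) to each feature value (assumed non-negative).

    log(1 + x) is an assumed choice; the report may use a different form.
    """
    return [[math.log1p(value) for value in row] for row in rows]


def detection_and_false_positive_rate(y_true, y_pred):
    """Return (detection rate, false positive rate) for binary labels.

    Labels are assumed to be 1 for the target class and 0 otherwise.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate
```

Under these assumptions, the same evaluation could be run on raw and log-transformed features to compare the two settings the abstract describes.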

Tech Report Number: 
CS-2010-01
Report Date: 
February 15, 2010
Attachment: 
CS-2010-01.pdf (675.16 KB)