Model Selection For Social Networks Using Graphlets

Authors: 

Matthew S. Hurshman
J. Janssen
Nauzer Kalyaniwalla

Author Addresses: 

Matt Hurshman: mhursh@mathstat.dal.ca
J. Janssen: janssen@mathstat.dal.ca
Nauzer Kalyaniwalla: nauzerk@cs.dal.ca

Faculty of Computer Science
Dalhousie University
6050 University Ave.
PO Box 15000
Halifax, Nova Scotia, Canada
B3H 4R2

Abstract: 

Several network models have been proposed to explain the link structure observed in online social networks. Such models are usually justified by showing that they can repli- cate certain observed features of real networks, such as a power law degree distribution, a high degree of clustering, and the small world property. This paper addresses the problem of choosing the model that best fits a given real world network. We implement a model selec- tion method based on un-supervised learning. An alternating decision tree is trained using synthetic graphs generated according to each of the models under consideration. Graphs are represented as feature vectors incorporating the frequency counts of all connected subgraphs on 3 and 4 vertices as well as features describing the degree distribution and small world prop- erty. We show that the subgraph counts alone are sufficient in separating the training data. For the test data, we take four Facebook graphs from various American Universities. Our results show that models incorporating some form of the preferential attachment mechanism tend to perform the best.

Tech Report Number: 
CS-2011-07
Report Date: 
November 16, 2011
AttachmentSize
PDF icon CS-2011-07.pdf1.87 MB