Spectral Clustering with Links and Attributes

Neville, J., M. Adler and D. Jensen (2004). Spectral clustering with links and attributes. University of Massachusetts Amherst, Technical Report 04-42.

Abstract
If relational data contain communities—groups of inter-related items with similar attribute values—a clustering technique that considers attribute information and the structure of relations simultaneously should produce more meaningful clusters than those produced by considering attributes alone. We investigate this hypothesis in the context of a spectral graph partitioning technique, considering a number of hybrid similarity metrics that combine both sources of information. Through simulation, we find that two of the hybrid metrics achieve superior performance over a wide range of data characteristics. We analyze the spectral decomposition algorithm from a statistical perspective and show that the successful hybrid metrics exaggerate the separation between cluster similarity values, at the expense of increased variance. We cluster several relational datasets using the best hybrid metric and show that the resulting clusters exhibit significant community structure, and that they significantly improve performance in a related classification task.
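As a rough illustration of the idea in the abstract, the sketch below combines attribute similarity with link structure in a single similarity matrix and then applies a spectral bipartition. It is a minimal sketch under stated assumptions, not the report's method: the function names `hybrid_similarity` and `spectral_bipartition`, the down-weighting factor `alpha`, the particular way links and attributes are combined, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def hybrid_similarity(attrs, adj, alpha=0.1):
    """Hypothetical hybrid metric (an assumption, not taken from the report):
    fraction of matching attribute values, multiplied by 1.0 for linked
    pairs and by alpha for unlinked pairs."""
    attrs = np.asarray(attrs)
    adj = np.asarray(adj, dtype=float)
    # Pairwise attribute similarity in [0, 1]
    match = (attrs[:, None, :] == attrs[None, :, :]).mean(axis=2)
    weight = np.where(adj > 0, 1.0, alpha)
    sim = match * weight
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    return sim

def spectral_bipartition(sim):
    """Split items in two using the second-smallest eigenvector of the
    symmetric normalized Laplacian built from the similarity matrix."""
    deg = sim.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(sim)) - d_inv_sqrt[:, None] * sim * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues returned in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)

# Toy data (made up): two triangles joined by one link between items 2 and 3.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
attrs = [[0, 0], [0, 0], [0, 1], [1, 0], [1, 1], [1, 1]]

labels = spectral_bipartition(hybrid_similarity(attrs, adj))
print(labels)
```

On this toy input, items 0 to 2 should receive one label and items 3 to 5 the other; which group is labeled 0 or 1 depends on the arbitrary sign of the eigenvector.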
Text
A PDF version of this paper is available.
