Spectral Clustering with Links and Attributes
Neville, J., M. Adler and D. Jensen (2004). Spectral clustering with links and attributes. University of Massachusetts Amherst, Technical Report 04-42.
- Abstract
- If relational data contain communities—groups of inter-related
items with similar attribute values—a clustering technique
that considers attribute information and the structure of
relations simultaneously should produce more meaningful
clusters than those produced by considering attributes alone.
We investigate this hypothesis in the context of a spectral
graph partitioning technique, considering a number of hybrid
similarity metrics that combine both sources of information.
Through simulation, we find that two of the hybrid
metrics achieve superior performance over a wide range of
data characteristics. We analyze the spectral decomposition
algorithm from a statistical perspective and show that the
successful hybrid metrics exaggerate the separation between
cluster similarity values, at the expense of increased variance.
We cluster several relational datasets using the best
hybrid metric and show that the resulting clusters exhibit
significant community structure, and that they significantly
improve performance in a related classification task.
- Text
- A PDF version of this paper is available.
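
As an illustration of the idea in the abstract, the sketch below builds a hybrid similarity matrix from links and attributes and splits the graph with a spectral bipartition. It is a minimal sketch under stated assumptions, not the paper's method: the abstract does not define the hybrid metrics, so the weighting rule in `hybrid_similarity` (full weight for linked items with matching attributes, reduced weight for linked items with differing attributes) and the names `hybrid_similarity`, `spectral_bipartition`, `match_weight`, and `mismatch_weight` are all hypothetical. The partitioning step uses the second eigenvector of the normalized similarity matrix, found by power iteration, and assumes a connected graph in which every node has at least one link.

```python
import math


def hybrid_similarity(edges, attrs, match_weight=1.0, mismatch_weight=0.1):
    """Weight each link by attribute agreement.

    Illustrative assumption only: the abstract does not specify this rule.
    """
    n = len(attrs)
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w = match_weight if attrs[i] == attrs[j] else mismatch_weight
        W[i][j] = W[j][i] = w
    return W


def spectral_bipartition(W, iters=500):
    """Split nodes into two groups by the sign of the second eigenvector
    of the normalized similarity matrix D^{-1/2} W D^{-1/2}."""
    n = len(W)
    deg = [sum(row) for row in W]           # assumes every degree is positive
    s = [math.sqrt(d) for d in deg]

    # B = (I + D^{-1/2} W D^{-1/2}) / 2 has eigenvalues in [0, 1],
    # so power iteration converges to its largest remaining eigenvector.
    B = [[(W[i][j] / (s[i] * s[j]) + (1.0 if i == j else 0.0)) / 2.0
          for j in range(n)] for i in range(n)]

    # The top eigenvector of B is proportional to sqrt(degree).
    norm_s = math.sqrt(sum(v * v for v in s))
    top = [v / norm_s for v in s]

    x = [float(i) for i in range(n)]        # deterministic starting vector
    for _ in range(iters):
        # Remove the component along the top eigenvector (deflation).
        dot = sum(a * b for a, b in zip(x, top))
        x = [a - dot * b for a, b in zip(x, top)]
        # One power-iteration step, then renormalize.
        x = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]

    # Rescale by D^{-1/2} and split on sign.
    return [0 if x[i] / s[i] < 0 else 1 for i in range(n)]


# Toy data: two triangles joined by one link, with attributes that
# agree inside each triangle and differ across the joining link.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
attrs = ["a", "a", "a", "b", "b", "b"]
labels = spectral_bipartition(hybrid_similarity(edges, attrs))
print(labels)
```

On this toy graph the two triangles should land in different groups; which group receives label 0 depends on the sign of the eigenvector and carries no meaning. Setting `mismatch_weight` equal to `match_weight` reduces the sketch to clustering on links alone, which gives a simple baseline for comparing against the hybrid weighting.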