Document Type

Technical Report

Publication Date

2005-01-04

Filename

WUCSE-2005-2.pdf

Technical Report Number

WUCSE-2005-2

Abstract

Given a matrix of values in which the rows correspond to objects and the columns correspond to features of the objects, rearrangement clustering is the problem of rearranging the rows of the matrix such that the sum of the similarities between adjacent rows is maximized. Referred to by various names and reinvented several times, this clustering technique has been extensively used in many fields over the last three decades. In this paper, we point out two critical pitfalls that have been previously overlooked. The first pitfall is deleterious when rearrangement clustering is applied to objects that form natural clusters. The second concerns a similarity metric that is commonly used. We present an algorithm that overcomes these pitfalls. This algorithm is based on a variation of the Traveling Salesman Problem. It offers an extra benefit as it automatically determines cluster boundaries. Using this algorithm, we optimally solve four benchmark problems and a 2,467-gene expression data clustering problem. As expected, our new algorithm identifies better clusters than those found by previous approaches in all five cases. Overall, our results demonstrate the benefits of rectifying the pitfalls and exemplify the usefulness of this clustering technique. Our code is available at our websites.
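
The objective stated in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it searches all row orderings by brute force, which is only feasible for a handful of rows, and the similarity measure (negative squared Euclidean distance) and the names similarity, adjacency_score, and best_order are assumptions chosen for illustration.

```python
from itertools import permutations


def similarity(a, b):
    # Assumed measure (not from the report): negative squared Euclidean
    # distance, so rows that are closer together score higher.
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def adjacency_score(rows, order):
    # Sum of similarities between each pair of adjacent rows in the ordering.
    return sum(similarity(rows[i], rows[j]) for i, j in zip(order, order[1:]))


def best_order(rows):
    # Exhaustive search over all row orderings; factorial cost, toy sizes only.
    return max(permutations(range(len(rows))),
               key=lambda order: adjacency_score(rows, order))


# Four objects (rows) described by two features (columns).
rows = [[0, 0], [10, 10], [1, 0], [10, 11]]
order = best_order(rows)
print(order, adjacency_score(rows, order))
```

In this toy input the best ordering places rows 0 and 2 next to each other and rows 1 and 3 next to each other. For realistic problem sizes an exhaustive search is impractical, which is consistent with the abstract's use of a Traveling Salesman Problem formulation.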

Comments

Permanent URL: http://dx.doi.org/10.7936/K79G5K4K

Recommended Citation

Climer, Sharlee and Zhang, Weixiong, "Rearrangement Clustering: Pitfalls, Remedies, and Applications" Report Number: WUCSE-2005-2 (2005). All Computer Science and Engineering Research.
https://openscholarship.wustl.edu/cse_research/937

DOI

https://doi.org/10.7936/K79G5K4K
