"Rearrangement Clustering: Pitfalls, Remedies, and Applications" by Sharlee Climer and Weixiong Zhang

All Computer Science and Engineering Research

Title

Rearrangement Clustering: Pitfalls, Remedies, and Applications

Authors

Sharlee Climer, Washington University in St. Louis
Weixiong Zhang, Washington University in St. Louis

Document Type

Technical Report

Publication Date

2005-01-04

Filename

WUCSE-2005-2.pdf

DOI:

10.7936/K79G5K4K

Technical Report Number

WUCSE-2005-2

Abstract

Given a matrix of values in which the rows correspond to objects and the columns correspond to features of the objects, rearrangement clustering is the problem of rearranging the rows of the matrix such that the sum of the similarities between adjacent rows is maximized. Referred to by various names and reinvented several times, this clustering technique has been extensively used in many fields over the last three decades. In this paper, we point out two critical pitfalls that have been previously overlooked. The first pitfall is deleterious when rearrangement clustering is applied to objects that form natural clusters. The second concerns a similarity metric that is commonly used. We present an algorithm that overcomes these pitfalls. This algorithm is based on a variation of the Traveling Salesman Problem. It offers an extra benefit as it automatically determines cluster boundaries. Using this algorithm, we optimally solve four benchmark problems and a 2,467-gene expression data clustering problem. As expected, our new algorithm identifies better clusters than those found by previous approaches in all five cases. Overall, our results demonstrate the benefits of rectifying the pitfalls and exemplify the usefulness of this clustering technique. Our code is available at our websites.
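The problem statement in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a brute-force search over row orderings, usable only for a handful of rows. It assumes squared Euclidean distance as the dissimilarity between rows, so maximizing adjacent similarity becomes minimizing the summed distance between adjacent rows. The names `adjacent_cost`, `best_order`, and the toy matrix `rows` are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def adjacent_cost(matrix, order):
    """Sum of squared Euclidean distances between consecutive rows in `order`.

    Assumption: squared Euclidean distance stands in for (negative) similarity.
    """
    total = 0
    for a, b in zip(order, order[1:]):
        total += sum((x - y) ** 2 for x, y in zip(matrix[a], matrix[b]))
    return total


def best_order(matrix):
    """Try every row ordering and return the one with the lowest adjacent cost.

    Exhaustive search grows factorially with the number of rows, so this is
    only a demonstration of the objective, not a practical solver.
    """
    n = len(matrix)
    return min(permutations(range(n)), key=lambda order: adjacent_cost(matrix, order))


# Toy data: rows 0 and 2 are close to each other, as are rows 1 and 3.
rows = [
    [0, 1],
    [50, 52],
    [2, 0],
    [51, 50],
]

order = best_order(rows)
print(order)                       # (0, 2, 3, 1)
print(adjacent_cost(rows, order))  # 4911
```

In the resulting ordering, similar rows sit next to each other and the single large jump (between rows 2 and 3) falls where the two groups meet. Finding the best ordering amounts to finding a cheapest path that visits every row exactly once, which is why the problem can be cast as a variant of the Traveling Salesman Problem.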

Comments

Permanent URL: http://dx.doi.org/10.7936/K79G5K4K

Recommended Citation

Climer, Sharlee and Zhang, Weixiong, "Rearrangement Clustering: Pitfalls, Remedies, and Applications" Report Number: WUCSE-2005-2 (2005). All Computer Science and Engineering Research.
https://openscholarship.wustl.edu/cse_research/937
