Document Type

Technical Report


Computer Science and Engineering

Publication Date






Technical Report Number



High-performance document clustering systems enable similar documents to self-organize automatically into groups. In the past, the large amount of computation time needed to cluster documents prevented practical use of such systems on large document collections. A full hardware implementation of K-means clustering has been designed and implemented in reconfigurable hardware that rapidly clusters half a million documents. Documents and concepts are represented as vectors with 4000 dimensions. The circuit was implemented in Field Programmable Gate Array (FPGA) logic and computes four cosine distance metrics in parallel to cluster document vectors. The effect of an integer approximation of the cosine theta distance metric was investigated. Experiments measured the effect of using different numeric representations for the concept vectors. Compared with a full K-means implementation in software, carefully chosen integer representations yielded clustering results that were nearly identical to those obtained using full floating-point representations. The hardware was synthesized and run on the Field Programmable Port Extender (FPX) platform. This implementation on the Virtex-E 2000 FPGA ran 26 times faster than algorithmically equivalent software running on a 3.60 GHz Intel Xeon. The same architecture was scaled to implement a faster and larger design for the Xilinx Virtex-4 LX200. This larger implementation outperformed the equivalent software version on the same Xeon by a factor of 328.
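
The comparison between integer and floating-point concept vectors described in the abstract can be illustrated with a small software sketch. This is not the report's FPGA circuit or its exact numeric format. The function names (`cosine`, `quantize`, `assign`, `update`, `agreement`) are hypothetical, and the quantization scheme (linear scaling of each non-negative concept vector to unsigned integers of a chosen bit width) is an assumption made only for illustration.

```python
import math

def cosine(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def quantize(vec, bits=8):
    """Assumed scheme: scale a non-negative vector to integers in [0, 2**bits - 1]."""
    peak = max(vec)
    if peak == 0:
        return [0] * len(vec)
    scale = (2 ** bits - 1) / peak
    return [int(round(x * scale)) for x in vec]

def assign(docs, concepts):
    """K-means assignment step: index of the most similar concept per document."""
    return [max(range(len(concepts)), key=lambda k: cosine(d, concepts[k]))
            for d in docs]

def update(docs, labels, concepts):
    """K-means update step: each concept becomes the mean of its assigned documents."""
    updated = []
    for k, old in enumerate(concepts):
        members = [d for d, label in zip(docs, labels) if label == k]
        if not members:
            updated.append(old)  # keep an empty cluster's concept unchanged
        else:
            updated.append([sum(col) / len(members) for col in zip(*members)])
    return updated

def agreement(docs, concepts, bits=8):
    """Fraction of documents given the same cluster by float and integer concepts."""
    labels_float = assign(docs, concepts)
    labels_int = assign(docs, [quantize(c, bits) for c in concepts])
    same = sum(a == b for a, b in zip(labels_float, labels_int))
    return same / len(docs)
```

Calling `agreement(docs, concepts, bits=8)` with a list of document vectors and a list of concept vectors gives a rough measure of how often the integer approximation changes a cluster assignment. Varying `bits` mimics, in software, the kind of numeric-representation experiment the abstract refers to.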


Permanent URL: