Technical Report Number: WUCSE-2003-4
This thesis presents a high-speed search system, based on a Field Programmable Gate Array (FPGA), that is intended to perform simple data mining operations on the data streaming from an off-the-shelf hard drive. The system includes the search engine itself and a device that snoops the traffic on an ATAPI/IDE peripheral bus, capturing data transmitted by a hard drive attached to the bus and forwarding that data to the search engine. The search engine, which is an adaptation of the Smith-Waterman local sequence alignment algorithm, can process search data at a rate of 100 MB/sec, with a query string up to 38 bytes long. The motivation for developing this system is to move most of the processing burden in data mining applications from the CPU to a level closer to the hard drive, while at the same time achieving search throughput gains by taking advantage of the massive parallelism possible in FPGA-based implementations. To demonstrate the magnitude of the performance gain that is possible, this thesis also includes the results of simple performance tests that compare this system to traditional, CPU-based search applications like the UNIX tool "grep." The search engine and related components were developed and implemented on the Field Programmable Port Extender (FPX), an FPGA-based component of the Washington University Gigabit Switch (WUGS).
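As an illustration of the kind of matching the abstract refers to, the following is a minimal software sketch of Smith-Waterman local alignment scoring applied to a byte stream with a short query. It is not the thesis's hardware design: the function name `stream_search`, the scoring values (`match`, `mismatch`, `gap`), and the `threshold` parameter are all assumptions chosen for illustration.

```python
def stream_search(query: bytes, stream, threshold: int,
                  match: int = 1, mismatch: int = -1, gap: int = -1):
    """Yield (position, score) for each stream byte at which the best
    local alignment score ending at that byte reaches `threshold`.

    Scoring values are illustrative assumptions, not taken from the thesis.
    """
    m = len(query)
    prev = [0] * (m + 1)  # score column for the previous stream byte
    for pos, byte in enumerate(stream):
        cur = [0] * (m + 1)  # score column for the current stream byte
        best = 0
        for i in range(1, m + 1):
            s = match if query[i - 1] == byte else mismatch
            cur[i] = max(
                0,                # local alignment: scores never go negative
                prev[i - 1] + s,  # extend alignment with a match or mismatch
                prev[i] + gap,    # extra stream byte (gap in the query)
                cur[i - 1] + gap, # skipped query byte (gap in the stream)
            )
            best = max(best, cur[i])
        if best >= threshold:
            yield pos, best
        prev = cur


hits = list(stream_search(b"grep", b"xxgrepxx", threshold=4))
```

Only one column of scores per query byte is kept, so memory use depends on the query length rather than on the amount of data streamed. The inner loop over query positions is the part that a hardware implementation could plausibly spread across parallel cells, although the abstract does not describe how the thesis does this.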
West, Benjamin M., "An FPGA-Based High-Speed Search Engine for Off-the-Shelf Hard Drives" Report Number: WUCSE-2003-4 (2003). All Computer Science and Engineering Research.