Author's School

School of Engineering & Applied Science

Author's Department/Program

Electrical and Systems Engineering


English (en)

Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)

Chair and Committee

Joseph O'Sullivan


This thesis consists of four parts, three of which study issues related to theories and applications of biometric systems, and one which focuses on clustering. We establish an information theoretic framework and the fundamental trade-off between utility of biometric systems and security of biometric systems. The utility includes person identification and secret binding, while template protection, privacy, and secrecy leakage are security issues addressed. A general model of biometric systems is proposed, in which secret binding and the use of passwords are incorporated. The system model captures major biometric system designs including biometric cryptosystems, cancelable biometrics, secret binding and secret generating systems, and salt biometric systems. In addition to attacks at the database, information leakage from communication links between sensor modules and databases is considered. A general information theoretic rate outer bound is derived for characterizing and comparing the fundamental capacity, and security risks and benefits of different system designs. We establish connections between linear codes to biometric systems, so that one can directly use a vast literature of coding theories of various noise and source random processes to achieve good performance in biometric systems. We develop two biometrics based on laser Doppler vibrometry: LDV) signals and electrocardiogram: ECG) signals. For both cases, changes in statistics of biometric traits of the same individual is the major challenge which obstructs many methods from producing satisfactory results. We propose a ii robust feature selection method that specifically accounts for changes in statistics. The method yields the best results both in LDV and ECG biometrics in terms of equal error rates in authentication scenarios. Finally, we address a different kind of learning problem from data called clustering. Instead of having a set of training data with true labels known as in identification problems, we study the problem of grouping data points without labels given, and its application to computational stemmatology. Since the problem itself has no "true" answer, the problem is in general ill-posed unless some regularization or norm is set to define the quality of a partition. We propose the use of minimum description length: MDL) principle for graphical based clustering. In the MDL framework, each data partitioning is viewed as a description of the data points, and the description that minimizes the total amount of bits to describe the data points and the model itself is considered the best model. We show that in synthesized data the MDL clustering works well and fits natural intuition of how data should be clustered. Furthermore, we developed a computational stemmatology method based on MDL, which achieves the best performance level in a large dataset.


Permanent URL: