Computer Science and Engineering
Date of Award
Doctor of Philosophy (PhD)
Chair and Committee
Kilian Q Weinberger
The past decade has witnessed how the field of machine learning has established itself as a necessary component in several multi-billion-dollar industries. The real-world industrial setting introduces an interesting new problem to machine learning research: computational resources must be budgeted and cost must be strictly accounted for during test-time. A typical problem is that if an application consumes x additional units of cost during test-time, but will improve accuracy by y percent, should the additional x resources be allocated? The core of this problem is a trade-off between accuracy and cost. In this thesis, we examine components of test-time cost, and develop different strategies to manage this trade-off.
We first investigate test-time cost and discover that it typically consists of two parts: feature extraction cost and classifier evaluation cost. The former reflects the computational efforts of transforming data instances to feature vectors, and could be highly variable when features are heterogeneous. The latter reflects the effort of evaluating a classifier, which could be substantial, in particular nonparametric algorithms. We then propose three strategies to explicitly trade-off accuracy and the two components of test-time cost during classifier training.
To budget the feature extraction cost, we first introduce two algorithms: GreedyMiser and Anytime Representation Learning (AFR). GreedyMiser employs a strategy that incorporates the extraction cost information during classifier training to explicitly minimize the test-time cost. AFR extends GreedyMiser to learn a cost-sensitive feature representation rather than a classifier, and turns traditional Support Vector Machines (SVM) into test- time cost-sensitive anytime classifiers. GreedyMiser and AFR are evaluated on two real-world data sets from two different application domains, and both achieve record performance.
We then introduce Cost Sensitive Tree of Classifiers (CSTC) and Cost Sensitive Cascade of Classifiers (CSCC), which share a common strategy that trades-off the accuracy and the amortized test-time cost. CSTC introduces a tree structure and directs test inputs along different tree traversal paths, each is optimized for a specific sub-partition of the input space, extracting different, specialized subsets of features. CSCC extends CSTC and builds a linear cascade, instead of a tree, to cope with class-imbalanced binary classification tasks. Since both CSTC and CSCC extract different features for different inputs, the amortized test-time cost is greatly reduced while maintaining high accuracy. Both approaches out-perform the current state-of-the-art on real-world data sets.
To trade-off accuracy and high classifier evaluation cost of nonparametric classifiers, we propose a model compression strategy and develop Compressed Vector Machines (CVM). CVM focuses on the nonparametric kernel Support Vector Machines (SVM), whose test-time evaluation cost is typically substantial when learned from large training sets. CVM is a post-processing algorithm which compresses the learned SVM model by reducing and optimizing support vectors. On several benchmark data sets, CVM maintains high test accuracy while reducing the test-time evaluation cost by several orders of magnitude.
Xu, Zhixiang, "Supervised Machine Learning Under Test-Time Resource Constraints: A Trade-off Between Accuracy and Cost" (2014). All Theses and Dissertations (ETDs). 1370.