Abstract

The sheer demand for machine learning in fields as varied as healthcare, web-search ranking, factory automation, collision prediction, spam filtering, and many others frequently outpaces the intended use-case of machine learning models. In fact, a growing number of companies hire machine learning researchers to rectify this very problem: to tailor existing models, or design new state-of-the-art ones, for the setting at hand.

However, we can generalize a large set of the machine learning problems encountered in practical settings into three categories: cost, space, and privacy. The first category (cost) considers problems that need to balance the accuracy of a machine learning model with the cost required to evaluate it. These include problems in web search, where results need to be delivered to a user in under a second and be as accurate as possible. The second category (space) collects problems that require running machine learning algorithms on low-memory computing devices. For instance, in search-and-rescue operations we may opt to use many small unmanned aerial vehicles (UAVs) equipped with machine learning algorithms for object detection to find a desired search target. These algorithms should be small enough to fit within the physical memory limits of the UAV (and be energy efficient) while still detecting objects reliably. The third category (privacy) considers problems where one wishes to run machine learning algorithms on sensitive data. It has been shown that seemingly innocuous analyses of such data can be exploited to reveal information that individuals would prefer to keep private. Thus, nearly any algorithm that runs on patient or economic data falls under this set of problems.

We devise solutions for each of these problem categories, including (i) a fast tree-based model that explicitly trades off accuracy against model evaluation time, (ii) a compression method for the k-nearest neighbor classifier, and (iii) a private causal inference algorithm that protects sensitive data.

Committee Chair

Kilian Q. Weinberger, Roman Garnett

Committee Members

Sanmay Das, Ben Moseley, Robert Pless, Fei Sha

Comments

Permanent URL: https://doi.org/10.7936/K7XS5TSP

Degree

Doctor of Philosophy (PhD)

Author's Department

Computer Science & Engineering

Author's School

McKelvey School of Engineering

Document Type

Dissertation

Date of Award

Summer 8-2016

Language

English (en)

DOI

https://doi.org/10.7936/K7XS5TSP

Recommended Citation

Kusner, Matt J., "Learning in the Real World: Constraints on Cost, Space, and Privacy" (2016). McKelvey School of Engineering Graduate Student Theses & Dissertations. 305.

The definitive version is available at https://doi.org/10.7936/K7XS5TSP

Included in

Computer Engineering Commons

McKelvey School of Engineering Graduate Student Theses & Dissertations

Learning in the Real World: Constraints on Cost, Space, and Privacy
