This item is under embargo and not available online per the author's request. For access information, please visit

Date of Award

Spring 5-15-2019

Author's School

Graduate School of Arts and Sciences

Author's Department


Degree Name

Doctor of Philosophy (PhD)

Degree Type



Decision and choice theory is a topic of interest in both econometrics and microeconomic theory. We contribute to decision theory in both contexts: the theory of model selection in econometrics, and the theory of rational decision in microeconomics.

There is a long-standing theoretical interest in model selection. More recently, research on sparse estimators, a class of estimation methods that select and estimate important parameters simultaneously, has become the central focus of model selection. These methods are especially relevant when the problem is high-dimensional. Theoretically, sparse methods perform well when the true data generating process (DGP) is assumed to have a low-dimensional structure. Empirically, however, a sparse estimator can be outperformed by dense estimators when this assumption fails. In Chapter 1, we propose a test of sparsity for linear regression models. Our null hypothesis is that the number of non-zero parameters does not exceed a small preset fraction of the total number of parameters; it can be interpreted as a family of Bayesian prior distributions under which each parameter equals zero with large probability. For the alternative, we consider the case where all parameters are nonzero and of order $1/\sqrt{p}$, where $p$ is the number of parameters. Formally, the alternative is a normal prior distribution: the maximum entropy prior with mean zero and variance determined by the ANOVA identity. We derive a test statistic using the theory of robust statistics. This statistic is minimax-optimal when the design matrix is orthogonal and can be used with general design matrices as a conservative test.
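The two hypotheses can be contrasted in a small numerical sketch (a hypothetical illustration, not the chapter's test; the prior parameters below are made up): under the null, coefficients follow a spike-and-slab prior that is exactly zero with large probability; under the alternative, every coefficient is drawn from a normal prior with variance of order $1/p$, so both priors carry the same total signal variance.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 1000              # number of regression parameters (illustrative)
sparse_frac = 0.02    # preset fraction of nonzeros allowed under the null (made up)
signal_var = 1.0      # total signal variance, held fixed across hypotheses

# Null: spike-and-slab prior -- each parameter is zero with large probability.
nonzero = rng.random(p) < sparse_frac
slab_sd = np.sqrt(signal_var / (p * sparse_frac))
beta_null = np.where(nonzero, rng.normal(0.0, slab_sd, p), 0.0)

# Alternative: dense normal prior with every parameter of order 1/sqrt(p),
# i.e. variance signal_var / p.
beta_alt = rng.normal(0.0, np.sqrt(signal_var / p), size=p)

# Both priors put the same expected total variance into the signal, so a test
# must separate them through the shape of the coefficient distribution
# (many exact zeros vs. many tiny nonzeros), not through its overall scale.
print(nonzero.mean(), np.sum(beta_null**2), np.sum(beta_alt**2))
```

Because the two priors are matched in total variance, any power against the dense alternative has to come from distributional shape, which is what motivates a carefully constructed test statistic rather than a simple variance comparison.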

Sometimes there is a natural ordering that ranks the parameters by importance. Typical examples are the representation of a function by a series expansion or the estimation of a spectrum by a long autoregressive process. Chapters 2 and 3 are devoted to analysis under this framework. In Chapter 2, we adapt concepts from the classical Hájek-Blackwell-Le Cam theory to develop a theory of asymptotically optimal estimation of the parameters. In many of these cases, maximum likelihood estimators do not exist, and hence there is no canonical candidate for a good estimator. We define suitable loss functions for the estimation error, which allow us to uniquely characterize certain estimators. In estimation procedures, it is common to assume higher-order differentiability or smoothness conditions on the parameters. We construct simple prior distributions that force the parameters to obey these smoothness conditions. We then show that the class of shrunken sieve estimators, in which the sieve estimator is multiplied by a matrix that shrinks the estimates toward zero, analogous to ridge regression or Bayesian estimators in a linear model, is asymptotically efficient.
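The shrinkage idea can be illustrated with a rough sketch (the cosine sieve, the polynomially decaying true coefficients, and the ridge-style shrinkage weights below are all assumed for illustration; none of this is the chapter's actual construction): a plain sieve estimate is multiplied by a diagonal matrix that damps its higher-order components.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 200, 25                 # sample size and sieve dimension (illustrative)
x = np.linspace(0.0, 1.0, n)

# Cosine sieve: parameters are naturally ordered by importance (frequency).
Phi = np.column_stack(
    [np.ones(n)] + [np.sqrt(2) * np.cos(np.pi * k * x) for k in range(1, K)]
)

# A smooth truth: series coefficients decay quickly in k.
theta = np.array([1.0] + [2.0 * k ** -2.5 for k in range(1, K)])
y = Phi @ theta + rng.normal(0.0, 0.5, n)

theta_sieve = np.linalg.lstsq(Phi, y, rcond=None)[0]   # plain sieve estimator

# Shrunken sieve estimator: multiply by a diagonal matrix that pulls the
# higher-order estimates toward zero, analogous to ridge regression.
lam = 0.02                                             # made-up tuning constant
shrink = 1.0 / (1.0 + lam * np.arange(K) ** 2)
theta_shrunk = shrink * theta_sieve

def fit_mse(est):
    # In-sample mean squared error of the fitted function against the truth.
    return np.mean((Phi @ est - Phi @ theta) ** 2)

print(fit_mse(theta_sieve), fit_mse(theta_shrunk))
```

With these arbitrary choices, the shrunken fit typically has smaller risk than the raw sieve fit, because the variance saved on the poorly estimated high-order coefficients outweighs the small bias introduced on the smooth, fast-decaying truth.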

In Chapter 3, we show that, in linear models with increasing dimension, the estimator resulting from optimizing Akaike's Information Criterion is asymptotically equivalent to certain Bayesian estimators. The family of prior distributions that generates our estimators is normal, is defined on the space of all sequences, and is characterized by an exponential decay of the variance of the higher-order components of the parameter.

The last two chapters are devoted to decision theory in microeconomics. In contrast to decision theory in econometrics, where the loss (utility) function is predefined, the focus in microeconomics is to recover a well-defined preference (utility). A well-defined, or rational, preference is one that satisfies certain consistency axioms, the most notable of which is arguably transitivity. The most studied transitivity axiom in the stochastic choice literature is strong stochastic transitivity (SST). However, individual choice data often violate SST while conforming to moderate stochastic transitivity (MST). Chapter 4 analyzes this axiom and its relevance to recovering the underlying preference. Our first theorem shows that a binary choice rule satisfies a slightly stronger version of the MST postulate, which we call MST+, if and only if it can be represented by a moderate utility model (MUM). In the MUM, choice probabilities are a function of the utility difference divided by a distance metric, which determines the degree of comparability of the options. Our second theorem introduces the moderate expected utility model (MEM) and shows how its parameters can be identified from choice data over lotteries.
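A minimal sketch of a moderate utility model, with made-up utilities, positions, and a logistic link (the chapter's representation theorem does not depend on these particular choices), shows how MST can hold while SST fails:

```python
import math

# Made-up primitives for a moderate utility model (MUM).
u = {"a": 1.0, "b": 0.6, "c": 0.0}       # utilities (hypothetical)
pos = {"a": 0.0, "b": 2.0, "c": 3.0}     # positions inducing a distance (hypothetical)

def d(x, y):
    # Distance metric governing how comparable two options are.
    return abs(pos[x] - pos[y])

def F(t):
    # Any strictly increasing CDF works; logistic chosen for concreteness.
    return 1.0 / (1.0 + math.exp(-t))

def p(x, y):
    # MUM choice rule: utility difference scaled by the options' distance.
    return F((u[x] - u[y]) / d(x, y))

# Moderate stochastic transitivity: p(a,c) >= min(p(a,b), p(b,c)).
assert p("a", "b") >= 0.5 and p("b", "c") >= 0.5
assert p("a", "c") >= min(p("a", "b"), p("b", "c"))

# ...yet strong stochastic transitivity (p(a,c) >= max of the two) fails here,
# matching the pattern often observed in individual choice data.
assert p("a", "c") < max(p("a", "b"), p("b", "c"))
```

Here a and c are far apart, so they are hard to compare and the choice between them stays close to 1/2 even though their utility gap is the largest, which is exactly the mechanism that lets a MUM satisfy MST while violating SST.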

Sometimes choice data fail even the weakest form of transitivity and violate other classical axioms such as the independence of irrelevant alternatives. The main source of such observations is contextual choice. Chapter 5 is devoted to rationalizing such choice behavior. We build a choice model with a fixed underlying utility function and explain contextual choices with a novel information friction: the agent's perception of the options is affected by attribute-specific noise. Under this friction, the agent obtains useful information when additional options are introduced. The agent therefore chooses contextually, exhibiting intransitivity, joint-separate evaluation reversals, and the attraction, compromise, similarity, and phantom decoy effects. Nonetheless, because the noise is attribute-specific and common across alternatives, the agent chooses perfectly rationally whenever one option clearly dominates another.
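The dominance claim in the last sentence can be sketched in a toy version (additive perceived values and the specific numbers below are assumptions; the chapter's perception model is richer): because the noise term for each attribute is shared by all alternatives, every within-attribute comparison survives the noise, so a dominant option is always recognized as dominant.

```python
import numpy as np

rng = np.random.default_rng(2)

x = np.array([3.0, 4.0])    # attribute values of option x (made up)
y = np.array([2.0, 1.0])    # x dominates y on every attribute

dominance_detected = 0
trials = 1000
for _ in range(trials):
    eps = rng.normal(0.0, 5.0, size=2)      # one large noise draw per attribute,
    perceived_x = x + eps                   # shared by both alternatives
    perceived_y = y + eps
    # The common shock cancels attribute by attribute, so the perceived
    # within-attribute ranking is exactly the true ranking.
    dominance_detected += np.all(perceived_x > perceived_y)

print(dominance_detected / trials)  # 1.0: dominance survives the noise in every trial
```

When neither option dominates, the attribute-level shocks no longer cancel out of the overall evaluation, which is where the contextual behavior in the chapter arises.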


Language

English (en)

Chair and Committee

Werner Ploberger

Committee Members

Siddhartha Chib, George-Levi Gayle, Paulo Natenzon, Jonathan Weinstein


Permanent URL:

Available for download on Sunday, April 18, 2021