This item is under embargo and not available online per the author's request. For access information, please visit


Date of Award

Summer 8-15-2021

Author's School

Graduate School of Arts and Sciences

Author's Department


Degree Name

Doctor of Philosophy (PhD)

Degree Type



Over the last decade, algorithmic trading has become one of the most significant developments in electronic securities markets. Several types of problems and practices have been studied, such as optimal execution, market making, statistical arbitrage, and latency arbitrage. Among these, high-frequency market making plays a crucial role: it provides substantial liquidity to the market, which makes trading and investing cheaper for other market participants, and it also creates sizable profits for high-frequency market makers (HFMs) from the large number of round-trip executions involved. In this thesis, we discuss two approaches to solving the high-frequency market making problem within the stochastic optimal control framework: a conventional model-based method and a more modern model-free reinforcement learning (RL) technique.

We start with a stochastic control problem whose objective is to maximize the HFM’s end-of-day “profit”, defined as the sum of her intraday trading cash flow and the terminal value of her inventory, valued at the market price at the terminal time and net of an end-of-day inventory liquidation cost. We extend the existing literature on market making models in three novel ways: (1) allowing random demand to model the variability of demand in the market, (2) allowing buy and sell market orders to arrive simultaneously between actions, and (3) incorporating an innovative general forecast of future price changes into the market making pricing policy. Our results show that the enhanced model is extremely flexible and that the corresponding analytical optimal policy achieves substantially improved performance. This is validated by a first-of-its-kind empirical study that uses historical transaction data from a real-world stock market.
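As a rough illustration of the end-of-day "profit" described above, the following Python sketch adds up the intraday cash flow, marks the remaining inventory at the terminal price, and subtracts a liquidation cost. The `Fill` record, the function name, and in particular the quadratic form of the liquidation cost are assumptions made for this example only; the abstract does not specify the cost function used in the thesis.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Fill:
    """One execution against the market maker's quotes (hypothetical record type)."""
    price: float      # execution price per share
    signed_qty: int   # +n if the market maker bought n shares, -n if she sold n


def end_of_day_profit(fills: Iterable[Fill],
                      terminal_price: float,
                      liquidation_penalty: float = 0.0) -> float:
    """Cash flow + terminal inventory value - liquidation cost.

    The liquidation cost is ASSUMED to be quadratic in the terminal
    inventory (liquidation_penalty * inventory**2); this is an
    illustrative choice, not the thesis's specification.
    """
    cash = 0.0
    inventory = 0
    for fill in fills:
        cash -= fill.price * fill.signed_qty   # buying spends cash, selling earns it
        inventory += fill.signed_qty
    terminal_value = inventory * terminal_price
    liquidation_cost = liquidation_penalty * inventory ** 2
    return cash + terminal_value - liquidation_cost


# Usage: buy 100 shares at 99.5, sell 60 at 100.5, mark the rest at 100.0.
example_fills = [Fill(99.5, 100), Fill(100.5, -60)]
example_profit = end_of_day_profit(example_fills, terminal_price=100.0,
                                   liquidation_penalty=0.01)
```

With these hypothetical numbers, the remaining 40 shares are valued at the terminal price and then charged the assumed quadratic cost, which shows how a large end-of-day inventory reduces the objective even when the round trips themselves are profitable.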

For our second approach, we adopt model-free RL algorithms to solve the market making problem. By suitably incorporating the end-of-day inventory liquidation cost into the sequence of immediate rewards, our RL method manages to control the terminal inventory level. Several aspects of our study distinguish it from the existing literature on market making with RL: (1) we design the immediate reward so that it starts penalizing large inventory near the terminal time, which helps accelerate the learning process; (2) we impose less manual intervention on inventory control during trading, and our RL agent learns to control the inventory level only through the observed rewards; (3) we carefully check the stability of RL across algorithms, parameter configurations, and random seeds, which is often neglected in other studies; and (4) we examine in detail the influence of immediate rewards and state variables (especially the price change variable), which provides key ideas for improving future applications of RL techniques to market making.
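One possible way to read point (1) is sketched below in Python: the immediate reward equals the step's trading profit for most of the day, and an inventory penalty is phased in only as the terminal time approaches. The function name, the linear ramp, the quadratic penalty, and all parameter names are hypothetical choices for illustration; the abstract does not state the reward actually used in the thesis.

```python
def shaped_reward(step_pnl: float,
                  inventory: int,
                  t: float,
                  horizon: float,
                  penalty_start: float,
                  penalty_weight: float) -> float:
    """Immediate reward with an inventory penalty that switches on late in the day.

    ASSUMPTIONS (illustrative only): the penalty is quadratic in inventory
    and is scaled by a ramp that is 0 up to `penalty_start` and rises
    linearly to 1 at `horizon`.
    """
    if not 0 <= penalty_start <= horizon:
        raise ValueError("penalty_start must lie in [0, horizon]")
    if not 0 <= t <= horizon:
        raise ValueError("t must lie in [0, horizon]")
    if t <= penalty_start:
        ramp = 0.0   # no inventory penalty early in the trading day
    else:
        ramp = (t - penalty_start) / (horizon - penalty_start)
    return step_pnl - penalty_weight * ramp * inventory ** 2


# Usage: same step profit and inventory, evaluated at three times of day.
early = shaped_reward(1.0, 50, t=50, horizon=100, penalty_start=80, penalty_weight=0.01)
late = shaped_reward(1.0, 50, t=90, horizon=100, penalty_start=80, penalty_weight=0.01)
final = shaped_reward(1.0, 50, t=100, horizon=100, penalty_start=80, penalty_weight=0.01)
```

Under this sketch the agent receives inventory-related feedback before the episode ends rather than as a single terminal charge, which is one plausible reason such shaping could speed up learning.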

Finally, we propose two constrained linear demand models that take into account the upper and lower bounds of market demand, and we derive two alternative sub-optimal solutions by imposing extra constraints on the policy (i.e., myopic or time-invariant). The analysis offers new insight into how the size of the HFM’s limit orders (LOs) affects the optimal prices and can serve as a reference for choosing the volume of LOs. We also solve the problem numerically using RL methods in a simulated environment and find that the performance of RL is competitive with that of the sub-optimal analytical solutions.


English (en)

Chair and Committee

Jose E. Figueroa-Lopez

Committee Members

Agostino Capponi, Jimin Ding, Renato Feres, Nan Lin,

Available for download on Thursday, July 28, 2022