Abstract

While reinforcement learning has been a vital component of artificial intelligence and machine learning, many open questions remain about its implementation, in both minds and machines, and about how to improve it. Among these are i) the contribution of non-neuronal cell types to reinforcement learning, and ii) information-seeking behavior during reinforcement learning. This thesis addresses both topics: in the first chapter, we examine the role of astrocytes in reinforcement learning, and in the second, we investigate human information seeking during reinforcement learning.

Neurons in the human and animal brain are known to support value-based decision-making and reinforcement learning [222, 131, 7]. To date, however, the role of astrocytes in reinforcement learning has not been thoroughly studied. Here, we trained mice on a bandit task and attenuated astrocyte activity in the ventral striatum (VS), dorsolateral striatum (DLS), and dorsomedial striatum (DMS). Mice whose VS astrocytes were targeted showed decreased bandit performance and increased win-stay behavior. Through computational modeling, we showed that these patterns could be explained by increased decision randomness, and that they could be recapitulated in deep neural network simulations by attenuating input sharing across units.

In the second chapter, we investigate human information seeking during reinforcement learning. Information seeking (IS) refers to the intrinsic motivation to obtain information and is a critical aspect of human and animal behavior. While IS has been studied in the context of decision making, to the best of our knowledge it had not been thoroughly studied in the context of reinforcement learning. We found that humans pay monetary costs to obtain information during reinforcement learning.
Moreover, their IS increases with reward uncertainty, decreases with learning, and correlates negatively with bandit performance. Furthermore, extensive reinforcement learning modeling revealed that IS correlates with the degree of random exploration, and that access to early information during learning increases the speed of learning in both humans and artificial neural networks. Overall, our work demonstrates, for the first time, a specific algorithmic relation between non-neuronal cell activity and reinforcement learning, and characterizes human information seeking during reinforcement learning.
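The computational modeling described above relates behavior to decision randomness and random exploration. A standard way to capture this is a Q-learning agent that chooses via a softmax rule, where the inverse temperature controls randomness. The sketch below is illustrative only and is not the thesis's actual model; the function names, parameters, and reward probabilities are assumptions for the example.

```python
import math
import random


def softmax(qs, beta):
    """Map Q-values to choice probabilities; beta is the inverse temperature.

    Lower beta -> flatter probabilities -> more decision randomness.
    """
    exps = [math.exp(beta * q) for q in qs]
    total = sum(exps)
    return [e / total for e in exps]


def run_bandit(beta, alpha=0.1, n_trials=1000, p_reward=(0.8, 0.2), seed=0):
    """Q-learning agent on a two-armed bandit (illustrative parameters).

    Returns the fraction of trials on which the better arm (arm 0) was chosen.
    """
    rng = random.Random(seed)
    qs = [0.0, 0.0]
    best_choices = 0
    for _ in range(n_trials):
        probs = softmax(qs, beta)
        arm = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < p_reward[arm] else 0.0
        # Delta-rule update toward the observed reward.
        qs[arm] += alpha * (reward - qs[arm])
        best_choices += (arm == 0)
    return best_choices / n_trials
```

In this toy setting, lowering `beta` (i.e., increasing decision randomness) reduces the fraction of best-arm choices, mirroring the kind of performance decrease the abstract attributes to increased randomness.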

Committee Chair

Ilya Monosov

Committee Members

Gaia Tavoni; Michael Frank; Naoki Hiratani; ShiNung Ching

Degree

Doctor of Philosophy (PhD)

Author's Department

Electrical & Systems Engineering

Author's School

McKelvey School of Engineering

Document Type

Dissertation

Date of Award

12-19-2025

Language

English (en)
