Abstract

Big or small, many decisions involve a tradeoff between exploration and exploitation: whether to take advantage of what we know to be good, or to take a chance on something new. Recent research suggests we make this choice via a combination of stochasticity and directedness, and that the directedness involves prioritizing lesser-seen options. Through a series of multi-armed bandit experiments, we extend this conceptualization of directed exploration to incorporate opportunistic choice in dynamic decision contexts. Participants chose between two bandits and, unlike in prior work, did not explore undersampled bandits more when more future trials remained, even though learning about those options carried greater strategic value. Crucially, however, on each trial they selected one bandit by multiplying two randomly chosen numbers and the other by adding them. We found that people seized on this context to explore opportunistically: they were more likely to pass on the multiplication bandit for a hard problem when more subsequent trials remained. The results echo recent machine learning work on "opportunism," but in humans, and suggest that directed exploration reflects not just whether, but also when, to explore.

Committee Chair

Wouter Kool

Committee Members

Todd Braver, Leonard Green

Degree

Master of Arts (AM/MA)

Author's Department

Psychology

Author's School

Graduate School of Arts and Sciences

Document Type

Thesis

Date of Award

12-2023

Language

English (en)

Author's ORCID

https://orcid.org/0000-0002-6451-4663

Subject Area

Psychology Commons
