Abstract
Confidence, the “feeling of knowing” that accompanies every cognitive process, plays a critical role in human reinforcement learning, yet its computational bases in learning contexts have only recently begun to be studied. Prior work has distinguished between value confidence (certainty in value estimates) and decision confidence (certainty that a choice is correct), but how these two forms of confidence are computed, and how they interact, have not been directly tested.
Here we combine two experiments with previously published datasets to test competing computational hypotheses. We find that value confidence is best explained by a Bayesian computation reflecting the precision of value estimates, and that it adaptively guides behaviour, reducing exploration and promoting exploitation as certainty increases. In contrast, decision confidence departs from Bayesian predictions, particularly on error trials. A hybrid model integrating the Bayesian probability of being correct with the precision of the decision variable better accounts for decision confidence.
Moreover, the relative weights assigned to these two sources of information characterize individual differences in confidence reporting and predict both task and metacognitive performance: the more Bayesian the participant, the better the performance. Together, these results reveal a unified computational mechanism through which distinct forms of confidence shape learning and choice in uncertain environments.