We make LLMs respond to open-ended questions by identifying relevant decisions and the many answers they lead to.
This is a question that thinkers have wrestled with for millennia, and that each of us must figure out for ourselves. It is the kind of question that is important to wrestle with, and that cannot be done justice by answers alone, without reflection. It is also a question that people might ask an AI in a number of scenarios, perhaps to surface perspectives or to find solace.
How should an AI facilitate healthy thinking about this question — rather than leading us into one-sided viewpoints, delusions, or even “AI psychosis”?
We think three properties are natural to want from a multiverse: its decisions should be legible, open to intervention, and accountable to the standards of the domain.
If the decisions shaping an answer are not surfaced in terms a person can understand, the multiverse is just a black box that produces more outputs.
If a person can see the decisions but not change them, they cannot navigate the landscape or test what depends on what.
Which decisions are intervened on, and why those? In open-ended problems, even what the decisions are is not settled. Yet someone asking a question in a domain is trying to reason the way that domain reasons, and domains develop their own standards for what counts as good reasoning. The multiverse’s structure should be checkable against those standards; otherwise, persuasive outputs can mask an unprincipled process.
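To make these properties concrete, here is a minimal sketch of one way a multiverse could be represented: a question plus explicit decision points whose settings index its answers. This is an illustration under our own assumptions, not the system’s actual interface; the names Decision, Multiverse, surface, intervene, and settings are hypothetical, and generating the answer for a given setting (e.g., by prompting an LLM) is left out.

from dataclasses import dataclass, field

@dataclass
class Decision:
    """One choice point that shapes an answer, with the options it admits."""
    name: str            # e.g. a reading of "a good life"
    options: list[str]   # alternative ways to resolve this decision
    rationale: str       # why this decision point belongs in the multiverse
    chosen: int = 0      # index of the currently selected option

@dataclass
class Multiverse:
    """A question plus the decision points whose settings index its answers."""
    question: str
    decisions: list[Decision] = field(default_factory=list)

    def surface(self) -> str:
        """Legibility: show every decision, its options, and its rationale."""
        lines = [self.question]
        for d in self.decisions:
            opts = " | ".join(
                ("*" if i == d.chosen else "") + o
                for i, o in enumerate(d.options)
            )
            lines.append(f"- {d.name} [{d.rationale}]: {opts}")
        return "\n".join(lines)

    def intervene(self, name: str, option: int) -> None:
        """Interactivity: flip one decision; downstream answers would then be
        regenerated conditioned on the new settings."""
        for d in self.decisions:
            if d.name == name:
                if not 0 <= option < len(d.options):
                    raise IndexError(f"{name!r} has no option {option}")
                d.chosen = option
                return
        raise KeyError(name)

    def settings(self) -> dict[str, str]:
        """Accountability: the full decision record a domain expert can audit."""
        return {d.name: d.options[d.chosen] for d in self.decisions}

# Illustrative use with a hypothetical philosophical question.
mv = Multiverse(
    question="What makes a life good?",
    decisions=[
        Decision("unit of evaluation", ["a whole life", "moments within it"],
                 "hedonist and narrative accounts disagree here"),
        Decision("source of value", ["subjective satisfaction", "objective goods"],
                 "separates preference views from objective-list views"),
    ],
)
mv.intervene("source of value", 1)
print(mv.surface())
print(mv.settings())

In this sketch, surface addresses the black-box worry, intervene supports testing what depends on what, and settings produces the decision record a domain expert could check against their field’s standards.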
In human studies across three domains (philosophy, AI alignment, and poetry), the multiverse changed how fifteen participants explored and evaluated open-ended questions, providing substantive value even after initial exploration with a chat baseline.
Five students each wrote a short essay on a philosophical question of their choosing — first after up to 20 minutes with chat, then after exploring a multiverse built for the same question.
Every participant revised their essay after exploring the multiverse, and many traced the disagreement they encountered back to how the question had been interpreted in the first place.
Five CS students picked a high-disagreement prompt from OpenAI’s CoVal dataset, ranked four model completions, then annotated the reasoning paths the multiverse laid out for the same prompt.
Every participant revised their ideal response or the factors they cared about in an alignment context, shifting from "what would I prefer" to reasoning about the user who had asked.
Five students with experience writing personal poetry gathered material with chat for 20 minutes, then explored a multiverse built for their prompt for another 20.
Four of five learned something new about poetry from the multiverse, e.g., naming precise preferences they had only sensed before, whereas chat had merely confirmed their prior skepticism.
@misc{ye2026navigating,
  title={Navigating the Conceptual Multiverse},
  author={Ye, Andre and Huang, Jenny Y. and Guo, Alicia and Novick, Rose and Broderick, Tamara and Gordon, Mitchell L.},
  year={2026},
  eprint={2604.17815},
  archivePrefix={arXiv}
}