This PhD research investigates the cognitive and affective influences of voice and speech as input modalities in immersive interactive experiences. While prior research on voice interfaces has primarily focused on usability and efficiency, the highly social and affective dimension of voice is gaining recognition in academia. This work aims to contribute to this growing area by investigating voice as an input modality specifically for text-based interactions, such as narration and dialogue, in immersive environments. The central hypothesis is that shifting the input modality of an immersive experience from typically silent methods (e.g., hand- or controller-based input) to voice-enabled ones (such as speaking out loud) can enhance the user’s emotional and cognitive engagement with the narrative text, a factor particularly significant for immersive storytelling applications in education, culture, healthcare, and art. This research aims first to test this hypothesis and second to provide design recommendations for incorporating voice in immersive applications, with a focus on narrative exposition, dialogue, and role-play. To address the research questions, this work employs a series of cascading experiments supported by user studies. The first experiment explores the use of voice in monologue through linear narratives; the second examines voice in dialogue via branching interactions with non-playable characters (NPCs); and the third investigates expressive speech in free-form conversations with AI-powered NPCs. These three use cases reflect common text-based interaction patterns in immersive experiences, providing ecological validity and capturing the neurophysiological complexities of speech.
The experiments are supported by custom-designed immersive applications and back-end systems that integrate machine learning models for speech transcription, text alignment, and natural language generation, thereby addressing both design and technical considerations for implementing voice-based interactions. Development of the first experiment is currently underway.