
Multimodal Interaction in Augmented and Virtual Reality

This project explores multimodal interaction in immersive environments, focusing on the problem of target disambiguation when selecting an object in 3D. Scientists have created an interactive 3D environment as a test bed and have used it in a variety of augmented reality (AR) and virtual reality (VR) scenarios.


Users of 3D immersive environments often face selection problems such as imprecise pointing at a distance, selection of occluded or hidden objects, and recognition errors (e.g., speech recognition errors). The goal here is to reduce selection errors by considering multiple input sources and compensating for errors in some of these sources with the results of others. For example, if one object (e.g., a "chair") is occluded by another object (e.g., a "desk"), simple ray-based selection will fail to select the "chair" since it cannot be seen. However, if the user can specify, for example, that the object of interest is a "chair" and that it is "behind the desk," then speech can help disambiguate the pointing gesture and yield the correct result.

In a multimodal environment, each input source (e.g., spoken language) has its associated uncertainties. In this system, these uncertainties are represented as an n-best list for each input source, with a corresponding probability for each hypothesis that expresses how certain the prediction is. In addition to spoken language, the sources in this system include 3D gestures and a set of visibility and spatiality perceptors that use the SenseShapes approach.
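
As a rough illustration of this representation, an n-best list can be sketched as a ranked list of hypotheses, each carrying a probability. This is only a minimal sketch: the names `Hypothesis` and `make_nbest` and the example scores are hypothetical and are not taken from the system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One candidate interpretation from a single input source (hypothetical type)."""
    label: str
    prob: float


def make_nbest(raw_scores):
    """Turn raw recognizer scores into an n-best list.

    The result is sorted from most to least likely, and the
    probabilities are rescaled so that they sum to 1.
    """
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    ranked = sorted(raw_scores.items(), key=lambda item: item[1], reverse=True)
    return [Hypothesis(label, score / total) for label, score in ranked]


# Made-up scores for a spoken word, for illustration only.
speech_nbest = make_nbest({"stair": 3.0, "chair": 5.0, "hair": 2.0})

for hyp in speech_nbest:
    print(f"{hyp.label}: {hyp.prob:.2f}")
```

Keeping the full ranked list, rather than only the top hypothesis, is what allows a later stage to recover when a recognizer's first choice is wrong.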

This multimodal system fuses symbolic and statistical information from these sources and employs mutual disambiguation of these modalities to improve the decision-making process. Thus, it is possible (and probable) that the top choice of each recognizer will not always be the selected one, and that the choice that provides the best fit across all available inputs will more likely be selected. User studies conducted with the system demonstrate that such mutual disambiguation corrections account for over 45% of the successful 3D multimodal interpretations.
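
The idea that the best overall fit can differ from every recognizer's top choice can be shown with a small sketch. It assumes, purely for illustration, that each source assigns a probability to each candidate object and that the sources are combined by multiplying those probabilities; the article does not specify the actual fusion rule, and the function `fuse`, the object names, and the numbers are all hypothetical.

```python
def fuse(sources, floor=1e-6):
    """Combine per-source probabilities by a simple product (an assumed rule).

    `sources` is a list of dicts mapping candidate object ids to
    probabilities. A candidate missing from a source gets the small
    `floor` probability instead of zero, so one source cannot veto it.
    Returns (object id, normalized score) pairs, best first.
    """
    candidates = set().union(*(source.keys() for source in sources))
    scores = {}
    for obj in candidates:
        score = 1.0
        for source in sources:
            score *= source.get(obj, floor)
        scores[obj] = score
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(obj, score / total) for obj, score in ranked]


# The pointing ray hits the occluding desk first, so gesture prefers it.
gesture = {"desk_1": 0.6, "chair_1": 0.3, "lamp_1": 0.1}
# The speech recognizer slightly prefers a misrecognized word.
speech = {"lamp_1": 0.45, "chair_1": 0.40, "desk_1": 0.15}

fused = fuse([gesture, speech])
best_object = fused[0][0]
print(best_object)  # chair_1, although neither source ranked it first
```

Here the gesture source ranks the desk first and the speech source ranks the lamp first, yet the chair wins because it is the only candidate that both sources consider plausible.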