VR/AR Input Is Hard… Still



In late February 2015, Palmer Luckey stated in an /r/oculus Reddit post,

‘Don’t get too hyped on the possibility of seeing anything at GDC. VR input is hard – in some ways, tracking hands well enough to maintain a sense of proprioceptive presence is even more technically challenging than getting perfect head tracking.
We will show something if and when we get it working well, but we have to avoid showing off prototypes that are not on a clear path to being shipped at the same or higher quality level. Throwing together very expensive or impossible to manufacture prototypes for internal R&D is one thing, using them to publicly set expectations around the near future is another.
Not naming anything specific here, but the history of technology is littered with the corpses of companies that over promised and under delivered by shipping real products with real limitations that were glossed over in promotional materials. Oculus can’t afford to do that.’

He was referring to the fact that Oculus had not, at that time, shown any type of hand input to accompany its upcoming consumer Rift headset. A few days later, at GDC 2015, HTC unveiled its Vive development kit, which included two Lighthouse-tracked controllers. It was almost as if the timing was an intentional stab at Luckey’s comments regarding input.

Three months later, during a press conference, Oculus announced their ‘Touch’ controllers.

Luckey’s comment this time was, ‘You’ve heard us say that input is hard, but we got it right.’

The Rift shipped in early 2016 with an emphasis on gamepad input or a non-tracked ‘clicker’ remote. The Touch controllers will ship in December 2016, completing the Rift input solution.

Sony’s PlayStation VR had, from the beginning, adapted its existing ‘PlayStation Move’ controllers, along with the ‘PlayStation Camera’, as its VR input method of choice. These pieces of tech were already five years and a console generation behind, but somehow provided adequate input for a VR console headset.

The Samsung Gear VR had always relied on either the built-in touchpad, a Bluetooth gamepad, or some level of gaze-based input to navigate menus and interact with the wide assortment of bite-sized, mobile-friendly VR experiences. With no definitive standard of input, users struggled to decide whether gaze, tap, button A, or something else was the right way to interact with mobile VR.

In the early days of this generation of VR, the Leap Motion input device was attempting to become the right input solution. Giving us hands without controllers or gloves, Leap Motion had promise, but struggled with field of view, consistency, and occlusion. Some developers supported it, but not many.

Let’s look at generation one of Google Cardboard. A magnet on the side of the phone-holding contraption would generate a ‘click’, assuming that your phone was compatible and the developer decided to utilize that input method. Again, gaze-based input and gamepads slide in as alternate solutions that not every user understands or enjoys. Generation two of Google Cardboard eliminates the magnet clicker, and instead allows a screen tap through a cleverly placed flap inside the origami gizmo. ‘Cardboard knockoffs’, or third-party phone-holding HMDs, go in every direction possible, preventing a consistent input method for developers to embrace.

Google now has Daydream, which comes with a tracked remote controller that is being pushed as the new standard of input for mobile, phone-based HMDs. Google stresses that it does not want developers to use gaze-based input, and wants them to use the new remote as a ‘laser pointer’ for experiences instead. Whether developers will embrace this or not is still up in the air. With only a small number of phones compatible with Daydream at launch, it will be interesting to see whether developers upgrade older Cardboard experiences, or neglect these old titles while they wait for a unified input solution.


How about AR? HoloLens has the ‘air tap’ and the ‘bloom’ inputs. The ‘air tap’ basically gives you a mouse-click input once you have aimed your gaze-based reticle at the desired icon or button. The ‘bloom’ is your Windows button, and will always bring you back to the main menu. Microsoft was also smart enough to include a non-tracked clicker, which emulates the ‘air tap’, for those folks who cannot seem to figure it out (which is a large number of people, believe it or not). HoloLens can also use voice input, again assuming that the developer has taken the time to build that in.

To prevent myself from writing a short novel here, I am going to skip over anything that includes haptics, IMU-tracked gloves/suits, wearable ring joysticks, hamster balls, unreleased hand controllers, or circular treadmills.

What do we really want? Do we want to hold something in our hands, like a wand or a clicker? Do we want it to be tracked in 3D space? Do we want gloves or just tracked hands/fingers? Do we stick with the comfort of the gamepad?

Unfortunately, input IS still hard, mostly because we do not have a unified and fully functional solution. We are iterating right now, and might be getting closer, but we do not have a true solution yet. As a developer, it is difficult to decide which input methods to adopt, since the VR customer base is still being defined. Gamers love their gamepads, but are they willing to adopt new controllers simply because they are tracked in space? Will new VR users and average consumers expect full gloves/suits, due to the promise of 90s Hollywood movies?

These questions need to be addressed, and nobody can claim to have the ultimate VR/AR input solution until then.

The simplest solution may not be the correct one.