Developing software professionally since 1998. Focused on AR/VR through the complete 3D pipeline. Also experienced in full-stack product development including native-mobile, web, server ops, analytics and AI.
Updated Jul 10, 2019
This is called Augmented Reality, or AR. The HoloLens is a head-mounted display (HMD) with stereoscopic “waveguide” displays that project light onto lenses in front of your eyes.
One of the major features of the HoloLens is its use of Kinect-style depth cameras built into the headset, which let it perform simultaneous localization and mapping (SLAM) to detect exactly where you are in a real, non-virtual room. Unlike Cardboard, Daydream, or the Oculus Go, the HoloLens uses SLAM to give you all six degrees of freedom (6-DoF) instead of just three (3-DoF). 6-DoF means you can not only rotate in pitch, yaw, and roll, but also translate: move forward, up, and right. Applications can also use this 3D room data to place objects on stationary surfaces like a desk or wall.
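To make the 3-DoF versus 6-DoF distinction concrete, here is a minimal Python sketch. The type and field names are our own choosing for illustration, not any SDK's: a 3-DoF pose is orientation only, while a 6-DoF pose adds a position in the room.

```python
from dataclasses import dataclass

@dataclass
class Pose3DoF:
    # Orientation only: rotation around each axis, in degrees.
    pitch: float  # look up/down
    yaw: float    # turn left/right
    roll: float   # tilt head sideways

@dataclass
class Pose6DoF(Pose3DoF):
    # SLAM adds a position in the room, in meters.
    x: float  # right
    y: float  # up
    z: float  # forward

# A 3-DoF device can only tell you're looking 30 degrees to the left...
seated = Pose3DoF(pitch=0.0, yaw=-30.0, roll=0.0)
# ...a 6-DoF device also knows you walked 2 meters forward.
walked = Pose6DoF(pitch=0.0, yaw=-30.0, roll=0.0, x=0.0, y=1.7, z=2.0)
```

The extra translation fields are what let an app keep a hologram pinned to your desk while you walk around it.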
The HoloLens can detect basic hand gestures like pinch, bloom, and “air-tap”, and it comes with a small single-button controller to make air-tapping easier. The HoloLens 2 has much better hand recognition that lets you do more “Minority Report”-style hand tracking with stretching-type gestures.
The $3000 HoloLens is self-contained and does not require a PC. VR HMDs like the $480 HTC Vive, $1000 Valve Index, or $400 Oculus Rift all require a fairly beefy PC to run; that PC might cost you $1500 for a moderately spec’d machine. Additionally, a VR headset that requires a PC feels less mobile, because the user is tethered by a cable. The trade-off is that the PC produces far better graphics. The $2300 Magic Leap One is a self-contained direct competitor to the HoloLens, and the 2019 Oculus Quest is also self-contained at about $400.
Why use a HoloLens when the Oculus Quest is so much cheaper? The major difference: the Quest is VR and the HoloLens is AR. In VR the entire world is blacked out. In AR light from the real world is visible, but “augmented” with extra images. Some VR headsets have cameras that “pass through” video of the real world, but it feels unnatural and quickly strains your eyes. AR devices like the HoloLens are generally considered more appropriate for work, and VR devices are generally considered more appropriate for play.
In VR, your friends like to sneak up behind you because they know you can’t see them. In AR you are free to look anywhere with your own eyes, with only a piece of glass between the virtual world and the real one. When you’re trying to work, you don’t want to worry about people sneaking up on you. Sneaking up on your friends is funny at first, but it gets old quickly; partly because of this, some people feel safer in AR.
| Headset | Type | Tracking | Requires | Price |
|---|---|---|---|---|
| Valve Index | VR | 6-DoF | PC required | $1000 |
| HTC Vive | VR | 6-DoF | PC required | $480 |
| Oculus Rift S | VR | 6-DoF | PC required | $400 |
| Daydream View | VR | 3-DoF | Phone required | $60 |
| Google Cardboard | VR | 3-DoF | Phone required | $10 |
One UI idea that our clients love, myself included, is the UI in Minority Report. Tom Cruise’s character is waving his hands around dragging pictures and things. It’s really cool. The HoloLens will track your hand, but not nearly as well as in the movie. In fact it’s quite poor, so poor that the included Bluetooth button is nearly essential for 5-minute AR experiences; otherwise you’ll need to spend a minute training people how to air-tap. The dragging and scrolling gestures are not smooth, and most of the time lose tracking in the middle of the gesture. Imagine trying to drag a file with your mouse from one window to another, using a touchpad, except the system cancels your drag halfway across the screen most of the time; that is what dragging on the HoloLens is like. The HoloLens 2 claims to have better hand interactions; we’ve seen some videos of this, and they look cool, but we aren’t convinced they will be as seamless as the movies. So we don’t recommend relying on hand gestures for manipulating or creating things in 3D on the HoloLens.
When putting on the HoloLens, one of the first things you will notice is its extremely limited field of view. According to news reports of the unreleased HoloLens 2, this issue is still not fully resolved, even though the HoloLens 2 is claimed to have twice the field of view of the original. One could imagine an AR application where you are surrounded by a 360-degree circle of monitors; in fact you could even look up and down and have a whole sphere of monitors around you. However, this experience may be jarring on the actual HoloLens hardware: because the field of view is so narrow, you would have to constantly move your head. There are valid technical reasons why the field of view is so small, including the state of the art in waveguide optics, power consumption, and pixel density. The Magic Leap is also criticized for having a small field of view.
To get a physical sense of the field of view, take your phone and hold it approximately one hand-length in front of your face. I have an iPhone XS Max, which is fairly large, but any 4–6″ phone will work. Imagine your phone is a piece of glass: there are huge sections of your field of view, outside the area of the phone, where no holograms would be drawn.
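The phone analogy can be checked with a couple of lines of trigonometry. This sketch uses rough, illustrative numbers (a roughly 7 cm-wide phone held about 20 cm from the eye), not exact device specs:

```python
import math

def angular_fov_deg(width_m: float, distance_m: float) -> float:
    """Angle subtended by a flat panel of the given width,
    held at the given distance from the eye."""
    return math.degrees(2 * math.atan(width_m / (2 * distance_m)))

# A ~7 cm-wide phone ~20 cm from your face subtends roughly 20 degrees
# horizontally -- the same ballpark as the original HoloLens's
# often-cited ~30-degree horizontal field of view.
print(round(angular_fov_deg(0.07, 0.20), 1))  # ~19.9
```

For comparison, human horizontal vision spans well over 180 degrees, which is why the hologram window feels so small.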
We had lots of issues mirroring the HoloLens screen on a conference room TV. Embarrassingly, it was least stable on Microsoft’s own campus using their guest WiFi. The casting functionality is really important for client demos: it lets the team narrate the experience the HoloLens user is having and answer questions about the visuals the user might be seeing. In one conference room we were able to successfully cast to the TV. However, in four other conference rooms at Microsoft, casting would either end in an error before it began or not show up in the list of casting locations. In our own office, where our machines are not locked down by group-policy admins, we had nearly zero trouble getting casting to work. Our best workaround is to hard-wire everything: use the debugging console at localhost:10080 to view a test render, then plug the laptop showing that render into the TV via HDMI. Unfortunately, with this wired solution the user is tethered to a computer and cannot walk around freely.
As mentioned before, the clicking gesture “air-tap” is not as simple as poking the air in front of you. Basic 3D cursor navigation, including gaze-hover-to-click, was a bit more stable. We found we had to implement multiple navigation methods as redundancies, with an in-person guide falling back to the next layer whenever the user ran into difficulty.
We didn’t expect the Mixed Reality ToolKit (MRTK), the official Unity library for the HoloLens, to have so many errors and warnings on import. In the HoloLens community it’s well known that some of these errors can be ignored. After finally getting through all the steps just to get the MRTK to load, we saw frame rates of an unacceptable 7–8 frames per second in its demonstration example scenes. AR and VR require 60 fps; lower rates are widely known to cause VR sickness. You don’t want your users to be physically sick because they use your app.
Over a short amount of time, you’ll notice many discarded holograms around your room, especially in difficult-to-reach places like behind monitors or under tables. These are difficult to close, and there is no setting to clean up the room. In fact, if you ask Cortana, “Hey Cortana, close all holograms,” she will reply, “Sorry, I can’t do this right now. Check back again after future updates.” She has been giving this reply for the three years since the HoloLens was released.
HoloLens is quite capable of understanding verbal commands. If you program a few verbs into the voice recognizer, it’s as simple as registering a callback function to find out which commands were recognized.
In fact, using a voice command like “reset” is often better and faster than pointing your head at a reset button and air-tapping it. This is a surprising contrast to using your phone or computer where tapping a button is much easier than speaking a voice command.
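The register-a-callback pattern described above looks roughly like the following. This is an illustrative Python sketch of the pattern only, with class and method names we made up; on the actual HoloLens you would wire keywords to callbacks through Unity's speech APIs in C#.

```python
# Illustrative sketch of keyword -> callback voice command handling.
# VoiceRecognizer and its methods are hypothetical names, not a real API.
class VoiceRecognizer:
    def __init__(self):
        self._handlers = {}

    def register(self, phrase, callback):
        """Register a callback for a spoken phrase (case-insensitive)."""
        self._handlers[phrase.lower()] = callback

    def on_phrase_recognized(self, phrase):
        """Called by the (hypothetical) speech engine with its best match."""
        handler = self._handlers.get(phrase.lower())
        if handler:
            handler()

events = []
recognizer = VoiceRecognizer()
recognizer.register("reset", lambda: events.append("scene reset"))
recognizer.register("close all", lambda: events.append("holograms closed"))

recognizer.on_phrase_recognized("Reset")
print(events)  # -> ['scene reset']
```

The appeal on a headset is exactly what the paragraph above describes: saying “reset” is one step, while gazing at a button and air-tapping is two fiddly ones.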
The wall detection and SLAM features mentioned at the beginning are also very sophisticated. Anchors are 3D points in real physical space; placing one is like putting a post-it note on something in a room. You could place anchors in various rooms of a sprawling office, and the HoloLens will have little trouble telling your application exactly where the user is relative to the room, down to the centimeter, and what angle they’re facing.
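Once the device reports the user's pose in the same world frame as an anchor, finding where the user stands relative to that anchor is simple vector math. A minimal sketch, with invented example coordinates in meters:

```python
import math

def relative_to_anchor(anchor_xyz, user_xyz):
    """User's offset from the anchor, plus straight-line distance."""
    dx, dy, dz = (u - a for u, a in zip(user_xyz, anchor_xyz))
    return (dx, dy, dz), math.sqrt(dx * dx + dy * dy + dz * dz)

desk_anchor = (2.0, 0.0, 3.0)  # a post-it-note point on the desk
user = (2.0, 0.0, 1.5)         # user standing 1.5 m in front of it

offset, distance = relative_to_anchor(desk_anchor, user)
print(offset, round(distance, 2))  # (0.0, 0.0, -1.5) 1.5
```

The headset does the hard part (keeping the world frame stable via SLAM); the application mostly consumes offsets like these.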
In our experience, HoloLens development goes best if you focus on quick visual wins right away rather than on 3D object tracking. We spent a few months working with tools like Vuforia to try to detect objects’ 3D orientation, only to see the effort fall flat due to the limited processing power of the HoloLens. 3D object tracking doesn’t even work well on high-powered phones with good GPUs. Instead, focus on quick polish like DOTween animations and cool 3D models for people to look at.
We’ve seen a few HoloLens apps that seem like they were ported from VR, because they try to draw background elements like the ground, sky, and mountains. This does not look good in AR, and it emphasizes the small field of view. Instead, when designing, imagine the background is transparent. Even better, take a picture of your desk from your head elevation, reduce the brightness by about 10%, and use that as a background. The HoloLens’s waveguides cannot make things darker; they can only add light. So the visor uses a sunglasses-like tint to first make everything slightly less bright. If you are familiar with Photoshop, the HoloLens waveguide behaves like the “Screen” blend mode, not the “Multiply” blend mode. In fact, the darker the walls and furniture of your demo room, the better.
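The Screen-versus-Multiply point is easy to verify numerically. With channel values in [0, 1], Screen can only lighten the base layer, which is exactly how an additive waveguide display behaves:

```python
# Photoshop-style blend modes, with channel values in [0, 1].
def screen(base, overlay):
    """Screen blend: result is never darker than either input."""
    return 1 - (1 - base) * (1 - overlay)

def multiply(base, overlay):
    """Multiply blend: result is never lighter than either input."""
    return base * overlay

wall = 0.6      # a mid-gray wall seen through the visor
hologram = 0.5  # a mid-gray hologram pixel

print(round(screen(wall, hologram), 2))    # 0.8 -- lightens the wall
print(round(multiply(wall, hologram), 2))  # 0.3 -- would darken; a waveguide can't
# A black hologram pixel (0.0) leaves the wall unchanged:
print(round(screen(wall, 0.0), 2))         # 0.6 -- black renders as transparent
```

That last line is the practical takeaway: black in your app reads as transparent on the device, and dark rooms make the lightening effect stand out most.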