Apple betting on window-based AR

Apple's ARKit for Augmented Reality

Well, well, well, look who wants to join and play! Apple’s developer conference WWDC just started, and it’s finally time to see some in-house augmented reality development from Apple hit the stage! I haven’t even had the time to sort all my AWE conference notes, check all the videos or talk about Ori’s keynote call to push superheroes out into the world! Hm, guess I can’t resist, but need to write up the Apple news today:

One more thing… AR

So, Apple talks about their own big-brother speaker for your living room, some other hardware, iOS updates and so on, but then we finally get to learn about Apple’s plans to jump into AR! Pokémon serves as the well-known example for the masses once again, but this time built on the new “ARKit” by Apple: their new SDK toolset for developers that brings AR…

…to your phone or tablet. Yep. No AR goggles (yet), but a frame, a window to hold. As I discussed last week, this was highly expected: Apple AR will be seen through windows for the next few years, too. Apple won’t spearhead the glasses approach.

The presentation of this new toolkit is nicely done, and it feels as if AR had never been seen before. Craig Federighi is really excited – “you guys are actually in the shot here” – so much so that one could think people at Apple were only thinking about VR lately and are surprised to see a camera feed in the same scene. He claims that so many fake videos have been around and that Apple is now finally showing “something for real”. (Nice line, but honestly, there have been others before. But let’s focus.) Obviously Apple is good at marketing and knows their tech well. They have been investing in this a lot, and now we can see the first public piece: in the demo we see how the RGB camera of the tablet finds the plain wooden surface of a table and how he can easily add a coffee cup, a vase or a lamp to it. The objects are nicely rendered (as expected in 2017) and have fun little details like steam rising from the coffee. The demo shown is a developer demo snippet and shows how to move the objects around – and how they influence each other regarding lighting and shadows. The lamp causes the cup to cast a shadow on the real table, and the shadows update accordingly as the objects move. In the demo section one could try it out and get a closer look – I’ve edited the short clip below to sum this up. Next, we see a pretty awesome Unreal-rendered “Wingnut AR” demo showing some gaming content in AR on the table. Let’s take a look now! Scrub to 1:25:29 in the linked video below or jump to the YouTube page to start at the right time code directly.
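For developers wondering what the “add a coffee cup to the table” moment roughly translates to: placing an object on a detected surface is essentially a hit test against the recognized plane. The sketch below is my own hypothetical helper (function name and tap handling are illustrative, not from the keynote or the beta), just to show the idea with ARKit and SceneKit:

```swift
import ARKit
import SceneKit

// Hypothetical helper (not from the keynote demo): drop a virtual object
// onto a detected real-world surface at a tapped screen point.
func placeObject(_ objectNode: SCNNode, at screenPoint: CGPoint, in sceneView: ARSCNView) {
    // Ask ARKit whether the tapped point lies on a plane it has already detected.
    let results = sceneView.hitTest(screenPoint, types: .existingPlaneUsingExtent)
    guard let hit = results.first else { return }

    // The hit result carries the position on the real table in world coordinates.
    let position = hit.worldTransform.columns.3
    objectNode.position = SCNVector3(position.x, position.y, position.z)
    sceneView.scene.rootNode.addChildNode(objectNode)
}
```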

The demos show pretty stable tracking (under the prepared demo conditions). Apple states that the mobile sensors (gyro, etc.) support the great visual software part using the RGB camera. They talk about “fast stable motion tracking”, and as far as the demo shows, this can be given a “thumbs up”. The starting point seems to be the plane estimation that registers a surface to place objects on. They don’t talk about the “basic boundaries” in detail – how is a surface registered? Does it have clear borders? In the Unreal demo we briefly see a character fall off the scenery into darkness, but maybe this works only in the prepped demo context. Would it work at home? Can the system register more than one surface? Or is it (today) limited to augmenting stuff on a single height level? We don’t learn about this, and the demo (I would have done the same) avoids these questions. But let’s find out about this below when looking at the SDK.

Apple seems pretty happy about the real-time light calculation that gives the scene a more realistic look. They talk about “ambient light estimation”, but in the demo we only see the shadows of the cup and vase moving relative to the (also virtual) lamp. This is out-of-the-box functionality of any 3D graphics engine. But it seems they plan way bigger things, actually considering the real-world light, hue, white balance or other details to better integrate AR objects. Metaio (now part of Apple and probably leading this development) showed some of these concepts during their 2014 conference in Munich (see my video from back then), using the secondary (user-facing) camera to estimate the real-world lighting situation. I would have been more pleased if Apple had shown some more on this, too. After all, it’s the developer conference, not the consumer marketing event. Why don’t they switch off the lights or use a changing spotlight with a real reference object on the table?

Federighi briefly talks about scale estimation, support for Unity, Unreal and SceneKit for rendering, and that developers will get Xcode app templates to get started quickly. With so many existing iOS devices out in the market, they claim to have become “the largest AR platform in the world” overnight. I don’t know the numbers, but agreed: the phone will stay the AR platform of everybody’s (= consumer big-time market) choice these days. No doubt about that. But also no innovation by Apple seen today.

The Unreal Engine demo afterwards shows some more details on tracking stability (going closer, moving faster – it really looks rock solid to me! Well done!) and how good the rendering quality and performance can be. No real interaction concept is shown, though – what is the advantage of playing this in AR? Also, the presentation felt a bit uninspired – reading from the teleprompter in a monotone voice. Let’s get more excited, shall we? Or won’t we? Maybe we are not so excited, since it has all been seen before? Even the fun Lego demo reminds us of the really cool Lego Digital Box by metaio.

A look at the ARKit SDK

The toolkit’s documentation is now also available online, so I had planned to spend hours there last night. I have to admit it’s quite slim as of today (good, I got some more sleep), but it gives developers a good initial overview. We learn a thing or two:

First, multiple planes are possible. The world detection might be (today) more limited than on a Tango or HoloLens device, as the system focuses on close-to-horizontal surfaces. The documentation says: “If you enable horizontal plane detection […] notifies you […] whenever its analysis of captured video images detects an area that appears to be a flat surface”, and mentions “orientations of a detected plane with respect to gravity”. Further, it seems that surfaces are rectangular areas, since “the estimated width and length of the detected plane” can be read as attributes.
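Translated into code, plane detection then boils down to something like the sketch below. It is based on the class and delegate names from the online documentation; the shipping beta may still differ in details, so treat it as an assumption rather than verified code:

```swift
import ARKit
import SceneKit

class PlaneSpotter: NSObject, ARSCNViewDelegate {
    let sceneView = ARSCNView()

    func start() {
        // Enable horizontal plane detection, as described in the documentation.
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = .horizontal
        sceneView.delegate = self
        sceneView.session.run(configuration)
    }

    // ARKit calls this whenever its analysis of the captured video images
    // detects an area that appears to be a flat, roughly horizontal surface.
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard let plane = anchor as? ARPlaneAnchor else { return }
        // The anchor exposes the estimated width and length of the rectangular area.
        print("Detected plane of size \(plane.extent.x) x \(plane.extent.z) metres")
    }
}
```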

Second, the lighting estimation seems to include only one value to use: “var ambientIntensity: CGFloat”, which returns the estimated intensity in lumens of ambient light throughout the currently recognized scene. No light direction for cast shadows or other info so far. But it’s obviously a solid start to help with better integration.
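In practice that single value would probably just be fed into the renderer’s light, roughly like this (a minimal sketch, assuming the documented ARFrame.lightEstimate API and a SceneKit ambient light; not tested against the beta):

```swift
import ARKit
import SceneKit

// Feed ARKit's single ambient-intensity estimate into a SceneKit light so
// virtual objects roughly match the brightness of the real room.
func updateLighting(from sceneView: ARSCNView, ambientLight: SCNLight) {
    guard let estimate = sceneView.session.currentFrame?.lightEstimate else { return }
    // ambientIntensity is documented in lumens; a value around 1000 means neutral lighting.
    ambientLight.intensity = estimate.ambientIntensity
}
```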

They don’t talk about other aspects of world recognition. For example, there is no reconstruction listed that would allow assumed geometry to be used for occlusions. But, well, let’s hit F5 in our browsers during the next weeks to see what’s coming. Relying on ambient light only and on stable 2D surfaces as world anchors feels like a play-it-safe decision, which allows for less nerdy fun stuff today but will probably give the best and most stable user experience.

AR in the fall?

Speaking about what’s next. What’s next? Apple made a move that, to me, was overdue. I don’t want to ruin it for third-party developers creating great AR toolkits, but it was inevitable. While a third-party SDK has the huge advantage of taking care of cross-platform support, it is obvious that companies like Apple or Google want to squeeze the best out of their devices by coding better low-level features into their systems (like ARKit or Tango). The announcement during WWDC felt more like: “Ah, yeah, finally! Now, please, can we play with it until you release something worthy of it in the fall?” Maybe we will see the iPhone 8 shipping a tri-cam setup like Tango – or is the dual-camera setup enough for more world scanning?

I definitely want to see more possibilities to include the real world, be it lighting conditions, reflections or object recognition and room awareness (for walls, floors and movable objects)… AR is just more fun and useful if you really integrate it into your world and allow easier interaction. Real interaction. Not only walking around a hologram. The Unreal demo surely was only meant to show off rendering capabilities, but what do I do with it? Where is the advantage over a VR game (with possibly added positional tracking for my device)? AR only wins if it plays to this advantage: to seamlessly integrate into our lives and our real-world vision, our current situation, and to enable natural interaction.

Guess now it’s wait and see (and code and develop) with the SDK until we see some consumer update in November. This week it was a geeky developer event, but we will only know if it all prevails once it hits the stores for all consumers. The race is on. While Microsoft claims the phone will be dead soon (but does not show a consumer alternative just yet), Google sure could step up and push some more Tango devices out there to take the lead during the summer.

So, … let’s enjoy the sunny days waiting for more AR to arrive in 2017!