Building with RealityKit for the First Time

Last year Apple announced RealityKit at their annual developer conference, WWDC. RealityKit is a new augmented reality framework for building AR apps on iOS and iPadOS. I was shocked by the announcement because ARKit, the first Apple AR framework, was announced all of two years prior at WWDC 2017. Was Apple already abandoning the ARKit I know and love? Did RealityKit offer a new and improved way to build for augmented reality?

The answer is: ...maybe?

RealityKit is definitely a V1, rough-around-the-edges framework – but it works in conjunction with ARKit rather than trying to replace it. I think Apple is looking toward an AR future where we don an iPhone, AirPods, Apple Watch, and their forthcoming AR device to interact with apps overlaid on the real world instead of tapping apps on a screen. To accelerate the arrival of such a future, RealityKit is Apple’s attempt to build a new type of framework from the ground up: a framework that deemphasizes building for screens and emphasizes building for realities.

That AR future and that revolutionary framework aren’t here quite yet. There are many things missing from RealityKit that you can do with SceneKit, Apple’s old and neglected 3D engine that many AR apps are built atop right now. RealityKit can be frustrating in its incompleteness and delightful in its simplicity.

RealityKit is Apple’s take on a Swift-only, beginner-friendly way to build things for augmented reality. Here’s what I built and my first impressions of RealityKit.

The Project

After hearing iOS 13 added body tracking in augmented reality apps, I instantly knew I had to give it a try. I just needed a quick and fun demo to build. I was likely listening to some heavy guitars and that’s all it took to decide on: AR Guitar.

It checked all the boxes:

The code is open-source and released under the MIT license, so you can do whatever the hell you want with it. Audio is needed for the full experience (check it out with audio on Twitter), but here it is in all its muted glory.

The app first looks for a human body and overlays markers on the head, hips, and hands when a body is found. Then, using ARKit’s body tracking, it watches for the left hand to be placed as if holding a guitar’s neck – in this case all that means is a left hand some distance above the hips. Moving the left hand farther up the neck of the imaginary guitar prepares different sounds. Those sounds are played when a “strum” action is detected, approximated by the right hand moving below a height threshold.
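Here’s a minimal sketch of that control logic in plain Swift. To be clear, this is not the actual AR Guitar code: the type names, the threshold values, and the idea of splitting the neck into equal-height zones are all assumptions I’m making for illustration. It only takes three heights per frame (hips, left hand, right hand) and decides which sound is prepared and whether a strum just happened.

```swift
// Hypothetical types and tuning values for illustration only.
// Heights are in meters along the vertical axis of whatever
// coordinate space the body joints are reported in.
struct BodySample {
    var hipsY: Float
    var leftHandY: Float
    var rightHandY: Float
}

struct AirGuitar {
    // Assumed tuning values, not taken from the AR Guitar source.
    var neckMinHeight: Float = 0.10   // left hand must be this far above the hips
    var neckZoneHeight: Float = 0.15  // height of each zone along the imaginary neck
    var zoneCount: Int = 3            // number of distinct sounds
    var strumOffset: Float = 0.05     // strum threshold sits this far above the hips

    // Remembers where the right hand was on the previous frame so a strum
    // fires once per downward crossing instead of on every frame.
    private var rightHandWasAbove = false

    /// Index of the sound the left hand is selecting, or nil if the left
    /// hand is not high enough to count as holding the neck.
    func preparedSound(for sample: BodySample) -> Int? {
        let lift = sample.leftHandY - sample.hipsY
        guard lift >= neckMinHeight else { return nil }
        let zone = Int((lift - neckMinHeight) / neckZoneHeight)
        return min(zone, zoneCount - 1)
    }

    /// Feed one sample per frame. Returns a sound index only on the frame
    /// where the right hand crosses from above the threshold to below it.
    mutating func update(with sample: BodySample) -> Int? {
        let threshold = sample.hipsY + strumOffset
        let isAbove = sample.rightHandY > threshold
        defer { rightHandWasAbove = isAbove }
        guard rightHandWasAbove, !isAbove else { return nil }
        return preparedSound(for: sample)
    }
}

// Usage: raise the right hand, then bring it down to strum.
var guitar = AirGuitar()
_ = guitar.update(with: BodySample(hipsY: 1.0, leftHandY: 1.3, rightHandY: 1.4))
if let sound = guitar.update(with: BodySample(hipsY: 1.0, leftHandY: 1.3, rightHandY: 0.9)) {
    print("Strum detected, play sound \(sound)")
}
```

In a real app those heights would presumably come from the joints of the body ARKit is tracking, and the thresholds would need tuning by hand, which is exactly the kind of iteration that makes testing painful.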

Hopefully the simple code can help others explore body-based controls. Developing AR Guitar gave me basic insight into building with RealityKit.

Reality Composer = Keynote for AR

My first thought when using RealityKit and Reality Composer (Apple’s standalone universal app for visually creating RealityKit experiences) was: This is easy.

Reality Composer is a visual editor that lets you build augmented reality scenes, interactions, and models without code. Whereas Keynote streamlines the process of creating enticing presentations, Reality Composer facilitates creating simple yet powerful AR experiences. Hence: Keynote for AR. You can quickly and visually bring in a 3D model, attach it to whichever horizontal plane ARKit detects, and then make the model respond when tapped on-screen.

I don’t believe Reality Composer is ideal for building complex apps, games, demos, or prototypes. I’ll go into technical specifics and roadblocks I came across when building a real-time game in a future post. Just like RealityKit, it’s also early days for Reality Composer. A future version may open up the doors for additional functionality and integration with the code of more intricate apps.

Testing AR Sucks

I complained about testing AR products in 2017. And again in 2018. And 2019. When building a traditional mobile product, you can usually load a simulated phone on your desktop and test things there. You don’t need to own all the differently sized iPhones; you just need to launch a simulated iPhone to test your software. That doesn’t work for AR apps that need to scan the space around you using a camera. The simulated phone on my desktop doesn’t have a camera and can’t see the space around me. Apple added the ability to record and replay an AR session last year, but that’s only a half-step toward solving a problem that needs a robust solution. Apple should look to Snapchat’s Lens Studio for inspiration because Snap’s approach to testing is years ahead of Apple’s.

If testing AR apps is hard, testing AR apps that utilize body tracking is harder and testing multiplayer AR apps may be hardest. The current state of iOS body tracking capabilities is mediocre at best. The estimated skeleton jumps around, body tracking is very susceptible to total failure in low-light situations, and multiple times body tracking thought I was facing away from the camera instead of toward the camera (that last one may be a bug in my code). On top of those systemic shortcomings, ARKit body tracking frequently requires feet to be in-frame before recognizing a body.

If you’re iteratively making minor adjustments to an app that uses body movements for controls, how would you test such an app? Recording and replaying sessions is out because a recorded session doesn’t allow for fine-tuning body-based controls. You’re sitting at a desk, but you need to get your feet in-frame for ARKit to recognize your body. And you also need to see what your app is displaying, so mirroring the device may be necessary since the phone’s rear camera is pointed at you.

My idiotic solution is to put my phone on a tripod next to my desk, point the phone’s rear camera at me, and mirror the phone’s display to my desktop using QuickTime. The idiocy comes in when I need to get my feet in-frame so ARKit recognizes me as a human. I scoot my chair back and do some weird seated crunch exercise to bring my legs up with my arms thrown above my head to get all my limbs in-frame. It’s very dignified. And it can also lead to exuberant flailing, leaning back too far, and toppling over in a desk chair. Yesterday I injured myself debugging AR Guitar. I dread testing some AR apps. Testing hurts me.

Restrictive or Early Release?

Even without getting into the technical details, I find it interesting to contrast the capabilities of RealityKit and Reality Composer with those you get when building with SceneKit. Such a contrast raises the question of whether the Reality-based tools are purposely restrictive or whether their limits are merely the consequence of both being V1 releases.

We will be able to extrapolate in four months at WWDC 2020. My expectation for later this year is that Reality Composer will receive additional bundled 3D models and minor, if any, feature updates. There are competing products like Adobe Aero, but if Reality Composer’s primary purpose is to serve as a free, bundled, no-code, intuitive, beginner-friendly tool for creating AR experiences, then today’s version is a surprisingly effective “Keynote for AR” that doesn’t need many additions.

I expect RealityKit to receive a lot of love at WWDC with notable new features, additional mode capabilities, improved physics performance, and all the corresponding new APIs. I expect these because I expect Apple’s AR development leadership to continue. Replacing SceneKit with a powerful AR-focused framework is necessary if Apple wants to own a good chunk of a fantastical AR future.

However, I’m an AR junkie without knowledge of whatever Apple’s cooking up for AR devices. My vision of an AR future may be unrealistic and incompatible with Apple’s master plan. I don’t expect RealityKit to remain restrictive in AR capabilities, but if many of those restrictions are still in place when V2 hits this summer, that indicates a different path forward. To me, a restrained RealityKit V2 indicates a specific, subservient role for Apple’s AR devices. Think more Apple Watch and less “next iPhone.” Even if their AR device (which I expect in 2022 at the earliest) takes a supporting role in the Apple ecosystem, I look forward to hacking it to do more than it’s supposed to.


Want more? Read my last post about developing an augmented reality browser.

Want to build some awesome AR thing? Reach out!