By Greg Lindsay, senior fellow, ASU Threatcasting Lab; 2022-2023 urban tech fellow, Cornell Tech.
In April 2024, Meta’s own pair of AR glasses arrived three years ahead of schedule, albeit in a form no one would have predicted only a year earlier. The first generation of Ray-Ban smart glasses was little more than a curio when announced in 2021. Packing cameras, microphones, and speakers into the Wayfarer’s iconic acetate frames, the $299 glasses were seen internally as a stepping stone to CEO Mark Zuckerberg’s “holy grail” of fully immersive AR lenses by 2027. Then generative AI happened.
This spring, the glasses received a major upgrade courtesy of Meta AI — a multimodal AI assistant capable of analyzing images and answering questions aloud. Using the wake phrase “Hey Meta…,” wearers can ask the glasses to identify famous landmarks, translate languages, and name plants and animals, among other uses. Early reviewers were charmed by its whimsical responses and forgiving of its frequent mistakes and hallucinations.
“I used to think that AR glasses wouldn’t really be a mainstream product until we had full holographic displays,” Zuckerberg told Meta investors a week later. “But now it seems pretty clear that there’s also a meaningful market for fashionable AI glasses without a display.”
Forget the metaverse. The advent of large language models (LLMs) such as OpenAI’s GPT-4o and Meta’s Llama 3, coupled with a new generation of affordable devices such as Meta’s Wayfarers or Brilliant Labs’ competing $349 frames, has spun XR in a new direction. Call it AIR, for artificially intelligent reality. Unlike AR’s emphasis on visuals, AIR is focused on making the world legible to machines and facilitating both AI-to-human and AI-to-AI interactions. Meta may have been first, but it’s not alone, with Google’s Project Astra, Apple, and OpenAI close behind in marrying AI to wearables.
Meta’s larger ambitions for its glasses can be seen in Project Aria, a parallel effort by Meta Reality Labs Research to develop AIR capabilities and use cases. Its stated aim of capturing “egocentric” multimodal data would at first blush appear to have more in common with the heads-up displays of the Apple Vision Pro or Google Glass than true AR. But Meta’s larger goal of creating spatially intelligent, personalized AI assistants requires capturing high-resolution scans of users’ environments — data that can later be used to create digital twins or even a visual positioning system (VPS) of its own.
The project’s trailer depicts using the glasses in urban settings for wayfinding and recommendations. “It’s not just the turn-by-turn direction you’re getting from your phone, but it’s really navigating you to anything,” intones the project’s technical lead, intimating how Facebook’s treasure trove of “likes” and personal data might be extended into physical space, steering wearers to places and products of interest.
There they will discover the second incarnation of AIR: AI assistants embodied as AR personas. The first may be Wol, a mixed-reality owl created by the studio Liquid City in conjunction with Niantic and Inworld, which supplied its voice and underlying AI. Users summon Wol from his home in an animated redwood forest using WebXR on their smartphone or Meta Quest headset; the owl then perches on the nearest surface and regales the viewer with forest facts and stories. While Wol is a prototype teaching tool, he doesn’t follow an obvious script. Asked without prompting for his favorite karaoke song, Wol replied, “I like the Eagles — they’re a hoot.”
Wol is the brainchild of Liquid City founder Keiichi Matsuda, a filmmaker and developer best known for his 2016 short film Hyper-reality, still a dystopian touchstone for fully realized urban AR. (Niantic’s John Hanke has repeatedly referred to the film as a warning, going so far as to bring Matsuda in as the company’s conscience of sorts.) His latest film, Agents, imagines a near future in which AI manifestations like Wol are commonplace, each possessing its own personality, features, and functions ranging from navigation to real-time matchmaking.
Matsuda explained his cosmology of Agents — and of AIR in general — in a short essay titled “Gods,” borrowing a metaphor from Shinto animism. The future of AR and AI, he argued, was not disembodied, seemingly omniscient chatbots or voice assistants, but a pantheon of helpful, playful, and occasionally silly gods communicating with their users and each other. “This world will envelop our physical cities,” he wrote, “and gods will come to be our connection to our appliances, our institutions, our applications, and much more.”
In the animist tradition, these agents will belong not only to people but also to places. Just as Shinto shrines honor local spirits, Matsuda imagines every landmark, park, and coffee shop hosting AIR incarnations of its own. Prior to creating Wol, he and his team prototyped “The Spirit of the [San Francisco] Ferry Building,” an agent in the form of an ancient AR mariner. In a world in which agents are abundant, and in which AIR glasses are continually scanning, analyzing, and describing their environments, it only makes sense to give AI as many forms as the technology will allow.
In this sense, it’s possible to imagine cities enchanted by flocks of Wols and the spirits of places conversing with one’s own agents. They might take orders, give directions, or simply banter with passers-by. It’s equally plausible to imagine malicious ones trained to scam, harass, and otherwise haunt people rather than help. (Even city-sanctioned AIs run the risk of rogue behavior, as New Yorkers discovered after the official MyCity chatbot repeatedly advised them to break laws.)
Taken together, this twin definition of AIR — AI augmenting the world, and AR giving form to AI — points to a new trajectory for both technologies that is fundamentally urban. That is one more reason both AR and AI must come under cities’ purview as these agents start to multiply: what happens when tomorrow’s Pokémon GO begins to think for itself?
Guest blog written for Darabase by Greg Lindsay, senior fellow, ASU Threatcasting Lab; 2022-2023 urban tech fellow, Cornell Tech.
To find out more about Darabase’s AI/AR offering, speak to our AR Immersive Studio team today.
