The Godmother of AI wants everyone to be a world builder

According to market-fixated techies and professional skeptics, the artificial intelligence bubble has burst and winter is back. Fei-Fei Li doesn’t believe this. In fact, Li, who has earned the nickname “the godmother of artificial intelligence,” is betting on the opposite. She is on part-time leave from Stanford University to co-found a company called World Labs. While current generative AI is language-based, she sees a frontier where systems build whole worlds with the physics, logic, and rich details of our physical reality. It’s an ambitious goal, and despite the dour nabobs who say AI progress has reached a grim plateau, World Labs is on the fast track to funding. The startup may be a year away from having a product — and it’s not at all clear how well it will perform when and if it arrives — but investors have put in $230 million and reportedly value the fledgling startup at $1 billion.

More than a decade ago, Li helped artificial intelligence turn a corner by creating ImageNet, a custom database of digital images that allowed neural networks to become significantly smarter. She believes today’s deep learning models need a similar boost if AI is to create actual worlds, whether realistic simulations or entirely imaginary universes. Future George R. R. Martins could compose their dream worlds as prompts instead of prose, then render them and wander around inside. “The physical world for computers is seen through cameras, and the computer brain behind the cameras,” Li says. “Turning that vision into reasoning, generation, and eventually interaction involves understanding the physical structure, the physical dynamics of the physical world. And that technology is called spatial intelligence.” World Labs calls itself a spatial intelligence company, and its fate will help determine whether that term becomes a revolution or a punch line.

Li has been obsessed with spatial intelligence for years. While everyone else was fooling around with ChatGPT, she and a former student, Justin Johnson, were excitedly talking on the phone about the next iteration of AI. “The next decade will be about generating new content that takes computer vision, deep learning, and AI out of the Internet world and embeds it in space and time,” says Johnson, now an assistant professor at the University of Michigan.

Li decided to start a company in early 2023, after dinner with Martin Casado, a network virtualization pioneer who is now a partner at Andreessen Horowitz, the venture capital firm known for its almost messianic embrace of AI. Casado sees AI following a path similar to that of computer games, which started with text, moved to 2D graphics, and now have dazzling 3D imagery. Spatial intelligence will drive the change. Ultimately, he says, “You can take your favorite book, throw it into a model, and then literally step into it and watch it play out in real time, in an immersive way.” The first step to making that happen, Casado and Li agreed, is moving from large language models to large world models.

Li began assembling a team, with Johnson as a co-founder. Casado suggested two others. One was Christoph Lassner, who had worked at Amazon, Meta’s Reality Labs, and Epic Games. He is the inventor of Pulsar, a rendering scheme that led to a well-known technique called 3D Gaussian Splatting. This sounds like an indie band at an MIT toga party, but it’s actually a way to synthesize whole scenes as opposed to one-off objects. Casado’s other suggestion was Ben Mildenhall, who had created a powerful technique called NeRF (Neural Radiance Fields) that converts 2D pixel images into 3D graphics. “We took real-world objects into VR and made them look perfectly real,” he says. He left his position as a senior research scientist at Google to join Li’s team.

One obvious purpose of a large world model would be to instill a sense of the world in robots. This is indeed in the World Labs plan, but not for a while. The first phase is building a model with a deep understanding of three-dimensionality, physicality, and notions of space and time. Then there will be a phase where the models support augmented reality. After that, the company can move on to robotics. If this vision is realized, large world models could improve autonomous cars, automated factories, and perhaps even humanoid robots.
