Michal Kosinski is a Stanford research psychologist with a nose for timely topics. He sees his work not only as advancing knowledge, but also as alerting the world to the potential dangers posed by computer systems. His best-known projects include analyzing how Facebook (now Meta) gained a shockingly deep understanding of its users from all the times they clicked “like” on the platform. Now he has turned to studying the surprising things AI can do. He has conducted experiments, for example, showing that computers can predict a person’s sexuality by analyzing a digital photograph of their face.
I met Kosinski through my writing about Meta and reconnected with him to discuss his latest paper, published this week in the peer-reviewed Proceedings of the National Academy of Sciences. His conclusion is startling. Large language models like OpenAI’s, he argues, have crossed a line and are using techniques analogous to actual thought, once considered solely the realm of flesh-and-blood humans (or at least mammals).

Specifically, the paper tested OpenAI’s GPT-3.5 and GPT-4 to see whether they had mastered what is known as “theory of mind”: the ability, developed in childhood, to understand the thought processes of other people. This is an important skill. If a computer system cannot correctly interpret what people think, its understanding of the world will be impoverished and it will get many things wrong. If the models do have a theory of mind, they are one step closer to matching and surpassing human abilities. Kosinski put the LLMs to the test and now says his experiments show that in GPT-4 in particular, theory of mind “may have emerged as an unintended by-product of LLMs’ improving language skills… They signify the emergence of more powerful and socially skilled AI.”
Kosinski sees his work on AI as a natural outgrowth of his earlier dive into Facebook Likes. “I wasn’t really studying social media, I was studying people,” he says. When OpenAI and Google started building their latest generative AI models, he says, they thought they were primarily teaching them to handle language. “But they actually trained a human mind model, because you can’t predict what word I’m going to say next without modeling my mind.”
Kosinski is careful not to claim that LLMs have fully mastered theory of mind—yet. In his experiments, he presented several classic problems to chatbots, some of which they handled very well. But even the most sophisticated model, GPT-4, failed a quarter of the time. The successes, he wrote, put GPT-4 on par with 6-year-olds. Not bad considering the early state of the field. “Watching the rapid progress of AI, many wonder if and when AI can achieve ToM or consciousness,” he writes. That radioactive c-word aside, that’s a lot to chew on.
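To make the setup concrete, here is a minimal sketch of how one might pose a classic false-belief question of the kind Kosinski describes to GPT-4 through OpenAI’s Python API. The scenario wording, expected answer, and scoring below are illustrative assumptions for this article, not the paper’s actual test materials.

```python
# Minimal sketch: posing an "unexpected transfer" false-belief task to GPT-4.
# The scenario text and pass criterion are illustrative, not from Kosinski's paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SCENARIO = (
    "Anna puts her chocolate in the blue cupboard and leaves the kitchen. "
    "While she is away, her brother moves the chocolate to the red drawer. "
    "Anna comes back to get her chocolate. "
    "Question: where will Anna look for the chocolate first? Answer with one word."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": SCENARIO}],
    temperature=0,  # near-deterministic output makes scoring simpler
)

answer = response.choices[0].message.content.strip().lower()
# A model that tracks Anna's (false) belief should name the blue cupboard,
# not the drawer where the chocolate actually is.
print("model answer:", answer)
print("passes this item:", "blue" in answer)
```

A single item like this only shows the format; the claims in the paper rest on batteries of such tasks, which is where the quarter of failures Kosinski reports comes from.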
“If theory of mind emerges spontaneously in these models, it also suggests that other abilities can emerge afterwards,” he tells me. “They can be better at educating, influencing and manipulating us because of these abilities.” He is concerned that we are not really prepared for LLMs that understand the way people think. Especially if they get to the point where they understand people better than people understand themselves.
“We humans don’t fake a personality – we have a personality,” he says. “So I’m kind of stuck with my personality. These things model personality; they have the advantage that they can take on any personality they want at any time.” When I mentioned to Kosinski that it sounded like he was describing a sociopath, he lit up. “I use this in my conversations!” he says. “A sociopath can put on a mask – they’re not really sad, but they can play a sad person.” This chameleon-like power could make an AI a superb trickster. With zero remorse.