To benefit humans, the best AI is a ‘naive’ AI

In this interview, Stuart Russell reflects on the future of human work, whether humans and AIs will ever see eye-to-eye, and the very real danger of weaponized AI.  

Stuart Russell will deliver the keynote at the 27th annual conference on Intelligent User Interfaces, hosted by the University of Helsinki, Aalto University and the Finnish Center for Artificial Intelligence FCAI. Russell is a professor of computer science at the University of California, Berkeley, the co-author of the standard textbook in the field, "Artificial Intelligence: A Modern Approach", and a leading voice on the long-term future of artificial intelligence and its relation to humanity.  

Russell’s keynote will be streamed on YouTube on Friday March 25, 2022 at 19:30 EET (Helsinki) / 1:30pm EDT (New York). 


You are the director of the Center for Human-Compatible AI at UC Berkeley, and your keynote will discuss provably beneficial AI. How do we show the public that AI can be beneficial, and can they be convinced?

At the highest level, AI can be an intelligence amplifier for humans. Our civilization is the result of our intelligence, and if we had more of it, maybe we could do a better job. That could mean anything from curing disease, fixing the climate or resolving conflict to educating every individual to fulfill their potential and making our cognitive environment richer and more interesting, giving us more scope for better lives. 

One failure mode for increasingly capable AI is that we give it objectives that it pursues and succeeds in achieving, but it turns out that we specified the objectives incorrectly. I would argue that this is the main way things go wrong, and it is the core question of my talk: how can things go wrong, and can we prevent it? I call it the King Midas problem. King Midas wanted everything he touched to turn to gold. That was his specification, and that specification was heard by the gods; all his food and drink and his family turned to gold, and he died in misery and starvation. In the context of AI, if we develop AI systems that are more intelligent than human beings, with the wrong objectives, the stakes are the future of the human race. We don’t want to be in conflict with a more intelligent machine.  

How do we get AI systems to be on our side and under our control permanently? It requires a complete change in the way we design AI systems, and there is a lot of overlap with thinking that is happening in the HCI (human-computer interaction) community. The notion of human-centered AI or computing has been around for a long time, but now we are trying to give it a specific technical meaning, to guarantee that we retain control and build systems that are beneficial to us. Such an AI would be required to pursue the benefit of human beings as its only objective, but would be explicitly uncertain about what that means: it knows that it doesn’t know what is beneficial. We don’t know what we want the future to be like either, until we experience it. A high schooler choosing a career, for example as a librarian, doesn’t know whether that is going to be a desirable life. An even larger source of uncertainty for AIs is all the stuff we don’t specify when we define objectives. If I ask you for a cup of coffee, I don’t have to specify that it shouldn’t cost a thousand euros or that you shouldn’t steal it. There are tons of things I don’t have to specify, because we as humans share that understanding. 

Trust in AI is central and if we can’t establish trust, people will stop wanting to use AI systems and push back on the intrusion of AI into their lives. 

 
Does compatibility between humans and AI come down to goals and values?  

Compatibility is about value alignment, and we are never going to achieve perfect alignment with AIs. AI systems will never fully understand the preference structures of all humans; they are always going to operate under uncertainty. There will be things that we care about that the AI does not know. And that leads to different behavior than you would get when there is certainty about the objectives. 

A system that knows it doesn’t know about human preferences will be cautious. If it has to make changes, it will ask permission. In HCI, there are things that computer systems know about humans, and there are interaction protocols, for example what a menu is. A menu allows a human to communicate, in a restricted form, what they want. A lot of the communication between humans and computers in ordinary interfaces is about preferences. 
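To make the kind of behavior Russell describes concrete, here is a minimal sketch in Python; it is our own illustration, not anything specified in the interview, and the scenario, probabilities and utilities are invented. The agent holds a few hypotheses about what the human wants and, because acting on its best guess risks a large downside under the other hypotheses, it chooses to ask permission instead of acting.

```python
# Illustrative sketch only: an agent that is uncertain about human preferences
# and defers to the human when acting on its best guess is too risky.
# The scenario, probabilities and utilities are invented for this example.

# Hypotheses about what the human wants, for the action
# "tidy up by throwing away the old papers on the desk".
preference_hypotheses = {
    "papers are junk":        (0.6,  +10),   # (probability, utility if the agent acts)
    "papers are sentimental": (0.3,  -50),
    "papers are tax records": (0.1, -100),
}

ASK_COST = 1  # small cost of interrupting the human to ask

def decide(hypotheses, ask_cost=ASK_COST):
    """Compare acting immediately against asking permission first."""
    # Expected utility of acting without checking.
    eu_act = sum(p * u for p, u in hypotheses.values())
    # Asking first lets the human veto the bad outcomes, so only the
    # non-negative utilities are realized, minus the cost of asking.
    eu_ask = sum(p * max(u, 0) for p, u in hypotheses.values()) - ask_cost
    return "act" if eu_act >= eu_ask else "ask permission"

print(decide(preference_hypotheses))  # -> "ask permission": the downside risk dominates
```

The only point of the sketch is that an agent which explicitly models its own ignorance of our preferences ends up deferring to the human, which is the cautious, permission-seeking behavior described above.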

 
This week’s IUI conference is being hosted by two universities in the Helsinki area. Where do you see Finland on the global AI radar?  

Finnish researchers have contributed a lot to the theory of machine learning and Bayesian inference. And in related areas of tech like mobile communications, there’s an amazing record in Finland.  

I’m involved in talking with governments about the future of education, and Finland understands, as few countries have, that we have to start planning now for the world that is going to exist in 20, 30, 40 years as AI systems become more prevalent and capable. That will have a huge impact on the economic roles available to humans. The short-term view of many governments is to train everyone as data scientists, supposedly the job of the future. 

That’s a huge mistake. The long-term view is that humans are more likely to be involved in interpersonal services than in manufacturing or programming or robot engineering; those roles can be filled by machines. Interpersonal roles are harder for machines, and we may not even want machines to fill them.  

Childcare, for example, is enormously underpaid and underskilled. There’s a reason we don’t pay babysitters as much as surgeons, even though children are important. We invested centuries in developing biology and medicine, but not the human sciences: psychology, the development of the individual, what it means to have a rich and fulfilling life. The emphasis in the education system of the future will have to be more humanistic, shifting away from physical objects like cell phone technology and toward research that understands the human well enough to fill roles like childcare much more successfully. And hopefully get paid more! 

 
As the global geopolitical situation is becoming very unstable, can AI help? Or are there only killer robots on the horizon? 

They are on the horizon, and the AI community has been pushing back against that. We absolutely don’t want killer robots to be the most prominent application in the public’s mind. For ethical reasons, we feel that turning the decision to kill humans over to algorithms is wrong in principle, but there are practical reasons too: it enables these weapons to become weapons of mass destruction. An AI weapon does not require a human to hold it or supervise it, and unlike piloted weapons or remotely operated drones it is immediately scalable. One person can launch a million autonomous weapons, which is a terrifying prospect. There’s a reason we don’t sell nuclear weapons in supermarkets, but this would be the equivalent.  

As for whether AI can help with conflict reduction, that’s a tough question. At the moment, it cannot, because AI systems don’t understand why people are in conflict, so they can’t propose solutions or mediate. In fact, AI has been facilitating conflict by supporting a social media environment in which people become polarized. People interact with algorithms hundreds of times a day, and at that frequency their political opinions can be modified very effectively over a few weeks. That is a partial, though not the only, contributor over the last five years to the sense that progress towards peace and coexistence has been going backwards.