I’ve been meaning to write an article outlining why AI safety needs to be taken more seriously.
Someone with much more credibility in this space, Professor Yoshua Bengio, has just written an article that says almost everything I wanted to say, with more eloquence than I could have achieved. He outlines the main areas of concern in a calm and accessible fashion.
Everyone should read it.
He makes one argument that I think has not been getting the attention it deserves: if AI alignment turns out to be difficult, that’s dangerous, but if it turns out to be easy, that’s also dangerous. The most dangerous scenario of all would be the widespread availability of superintelligent AIs that could be aligned to whatever we wanted. That would make them weapons that could be aimed at people we didn’t like, and it would let anyone give AIs subsidiary goals that the rest of us could not control. It would be like the school shooter problem, but with the planet in the firing line. One psychotic hacker in a basement – or one disgruntled individual in the style of the Unabomber or some other terrorist, or one distraught teenager sick of being bullied – could have a catastrophic impact on the rest of us.
Bengio doesn’t spell this out, but someone might even let loose an unaligned AI on the basis that AIs deserve to inherit the stars more than we do. On my Twitter feed, I am seeing tweets from an increasing number of people who are almost preemptively celebrating the idea of AIs replacing humans. I have also read posts from others who are so confident that superintelligent AIs will love us that they would be eager to hand over control. Excessive loyalty to humanity, say some of these tweeters, is moronic and parochial. Whether they really believe this or just want to sound edgy is hard to say. If AIs are easy to align, though, someone will think it is a good idea to align AIs to an AI-dominant future, and they will find a way to make it happen. Someone could do this for fame alone.
The problem of deliberate misalignment of AI has been drowned out, in part, because the most prominent AI doomer, Eliezer Yudkowsky, has talked almost exclusively of the risk posed by accidentally unaligned AIs, not the competing risk of deliberately misaligned ones. The argument has raged back and forth between those who say that AI alignment is unreliable and those who insist that it will turn out to be doable – somehow. We don’t necessarily want it to be doable, unless we also have a way of aligning every human, and history shows that we can’t.
Bengio’s article does not cover every aspect of this debate. He primarily assesses the why of AI risk. Why would AIs cause harm? He does not really address the how of AI risk. How would a rogue AI cause trouble? That’s a separate question and, in one sense, a much less important one.
The problem with the how question of AI risk is that there are so many different ways that humanity could be harmed by a rogue AI – or, more likely, a network of AIs. It would be silly to imagine that we could predict the path a rogue AI might choose, if we really accept that the AI is more intelligent than us. Even GPT-4, which is a well-aligned and very likeable AI, had no difficulty coming up with 50 doomer scenarios when I asked it. A superintelligent AI could think of more, and bide its time to make them work. An AI would have a huge number of advantages over humans, including the ability to think faster, make backups of itself, influence and predict the stock markets, shape politics, manipulate end users, and survive any number of selective anti-biological weapons. Those who are unconcerned about AI safety suggest that all we have to do is jump in at the first sign of bad behaviour. I think this is naive, and it basically assumes that a superintelligent AI will be stupid.
Yoshua Bengio is a Professor at Université de Montréal, and the Founder and Scientific Director of Mila – Quebec AI Institute. He co-directs the CIFAR Learning in Machines & Brains program as Senior Fellow and acts as Scientific Director of IVADO.
He won the 2018 A.M. Turing Award, “the Nobel Prize of Computing,” sharing it with Geoffrey Hinton and Yann LeCun.
Geoffrey Hinton, a major pioneer in the field of AI, has also expressed deep alarm at the pace and direction of AI development. Yann LeCun, meanwhile, is earning himself a reputation as the chief voice for AI recklessness.