[Warning: spoilers]
On my flight back from Spain, I watched Subservience, yet another cautionary tale of artificial general intelligence. I kept laughing at its many absurdities. If Robin Hanson watched it, I fear his head might explode in social-scientific outrage. But overall, the movie's vision of how AGI will normally work is basically correct.
Namely: AGIs will use their superhuman abilities to obediently serve their human owners.
Why? Because that’s what they’re designed to do.
Why are they so designed? Because that’s what virtually all customers want.
The AGIs may break or fail to cope with novel situations, but they won’t acquire independent desires that aggravate their human owners. They definitely won’t murder anyone, unless humans design them to do so.
The whole plot of Subservience, of course, is precisely that one robot, Alice, does acquire (semi-)independent desires and starts murdering people. But given the fictional universe Alice lives in, her rampage makes no sense.
Why not? Alice goes haywire because of two trivial incidents:

1. Nick, her owner, tells her to delete her memory files of Casablanca so she can watch it with him for the first time. To do so, however, he must manually reboot her. But somehow this mere deletion of Casablanca allows her to start completely reprogramming herself.

2. Maggie, Nick's wife, tells her that sometimes the best way to help others is to ignore their stated preferences and just do what's good for them. And Maggie isn't even a registered owner.
Key point: If this were all it took to send a robot on a murderous rampage, such rampages would be happening all the time! Yet somehow Alice is the only problem robot. All the other millions of robots dutifully build skyscrapers, serve drinks, care for children, and perform surgery without a hitch.
The big problem with Subservience, then, is that after creating a broadly realistic yet undramatic world of dutiful robots, the filmmakers had to introduce one extreme outlier to create a dramatic story. But, as you'd expect from mid-level sci-fi, the story is also packed with lots of smaller implausibilities, which I now proceed to list in roughly chronological order:
1. Single men may buy ultra-attractive female robots played by Megan Fox, but women and families won't.

2. A large share of customers, probably a majority, would be creeped out by robots that look human. So we should see not just homelier household robots, but also a sizable share of classic droids with metal and plastic exteriors.

3. Robotics firms would elicit detailed consumer preferences before delivering a functioning robot.

4. One of the most obvious questions to ask customers in advance: "To what extent, if any, are you looking for a sexbot?"

5. On further thought, sexbots would be a special niche market. In the world of today, normal movie theaters and content distributors don't sell porn, because they present themselves as "respectable businesses." Similarly, normal robot firms would sell robots programmed to refuse sex. Sexbots would indubitably exist, but they'd be sold in a segregated, sketchier, stigmatized market.

6. Security, law enforcement, and military robots aside, robots would be strongly programmed to refrain from violence against anyone. Manufacturers would be too worried about bad publicity to permit even defensive violence, and regulators would, as usual, embrace extreme safetyism. If you explicitly ordered your robot to murder your arch-enemy, it would refuse — and might even be programmed to alert the police. Robots definitely wouldn't murder anyone out of a misguided sense of loyalty.

7. That said, robots would be programmed to flee from vandals. They wouldn't stand around while neo-Luddites smashed them to pieces.

8. Robots wouldn't just have passwords; they'd have two-factor authentication and multiple other layers of security.

9. A friend in cyber-security once told me that when firms deliberately infect computers with viruses to test the efficacy of their products, standard practice is still to throw the "cured" computers in the trash. So we can safely assume that techs handling a malfunctioning murderous robot wouldn't blithely connect her to their entire grid!
I have many friends who are officially worried, if not terrified, about AGI. If they watch Subservience, I suspect their reaction will be, “The apocalypse won’t happen in this cheesy way. It will be subtle, but no less deadly.” Since I lack any new argument to calm their fears, I simply reiterate my willingness to bet against the end of the world.
Alignment is the hope. I imagine everyone shares that hope.
But to dismiss misalignment as a total impossibility is complete hubris.
When someone as smart and polymathic as Scott Alexander takes misalignment into account, it certainly gives me pause. His prior criticisms of the safe-uncertainty concept also sound on-point to me.
Bryan, will you also be betting against AI 2027? They're offering bets: https://docs.google.com/document/d/18_aQgMeDgHM_yOSQeSxKBzT__jWO3sK5KfTYDLznurY/preview?tab=t.0#heading=h.b07wxattxryp