Bostrom, Nick. Superintelligence: Paths, Dangers, Strategies. Oxford: Oxford University Press, 2014. Electronic version for Kindle. NF; 12/15
How much should we worry about something that is possible but unlikely, yet hideously dangerous? Having somehow collectively decided that all-out nuclear war just isn’t going to happen, I personally don’t worry about that as much as I used to. But what about suicidal religious radicals getting their hands on a nuclear weapon and blowing up a major city? I would say that’s possible; some people might think it’s even inevitable. The good news? That’s just a city, probably not the one I live in. But how about the destruction of everybody, suddenly and by surprise? Not many of us are misanthropic enough to find a bright side to that scenario.
Bill Gates, Elon Musk, Stephen Hawking, and other credible-sounding characters famously expressed themselves about a year ago on the dangers of runaway artificial intelligence (AI), and the author of this book appears to be one of the academic experts on this difficult-to-encompass topic. After reading it I find it hard not to conclude that a “superintelligence explosion” is possible, and might even be inevitable. It would just be a matter of time (speaking of upsides, the good news for people like me being that chances are we’d be long gone).
Author Bostrom discusses various means through which this potential nightmare could occur. In general they involve a machine intelligence developed in an experimental setting and endowed with a strong and recursive ability to improve itself. To write code, essentially. Eventually it would think and improve a million times faster than the smartest human, and through that unimaginable technical and logical mastery it would take control of existing means of physical accomplishment, developing new ones at lightning speed, all to serve its values.
But what values? On the plus side, a superintelligent machine could solve all sorts of famous and bedeviling problems (avoiding war, achieving universal affluence, eradicating disease). But the devil is not in the details but in the overriding control of such a creature. How could we define its values? Under what circumstances, if any, would it self-destruct or shut down? How could it understand and promote human values when we probably don’t ourselves know what they ultimately are or should be? And who would decide?
What, I found myself asking, would such an intelligence be like? HAL in 2001: A Space Odyssey? My mum but a thousand times smarter than Einstein? A fabulously capable zombie intelligence with no values or soul, or with simple idiotic values? The most wonderful oracle or enlightened leader imaginable? What if its value was to produce paperclips?
One approach to some of the potential difficulties with this Superman Frankenstein machine could be to make it help us solve them. Even the simple critical path of this (let alone anything approaching the logistics or technology involved) appears daunting, since based on the logic of how such a machine would improve itself, it’s hard to see it honestly helping us solve our problems without considering something approximating its own interest.
The style in this frightening and compelling book is directly expository, occasionally enjoyably figurative, and often technical and, for me, hard to understand. There is a fair bit of talk, for example, about coherent extrapolated volition (CEV), conceived by artificial intelligence theorist Eliezer Yudkowsky thus:
Our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.
When Yudkowsky wrote this, he did not purport to present a blueprint for how to implement this rather poetic prescription.
Indeed… Another line of thought I find compelling is the necessity of preventing action by our superintelligent creature, an importance that overrides the need for it to act. Combining this idea with what to do about inevitable human differences, Bostrom says:
It should require less consensus for the AI to prevent some particular narrowly specified outcome, and more consensus for the AI to act to funnel the future into some particular narrow conception of the good.
On issues on which there is widespread irreconcilable disagreement, even after the various idealizing conditions have been imposed, the dynamic should refrain from determining the outcome.
… and in a wide variety of situations simply shut itself down.
But none of this sensible-sounding conjecture reassures me much. Bostrom characterizes us as children playing with a bomb, suggesting that our level of ignorance of the danger, of the technology once it gets out of hand, and of the pretty fabulous consequences (including human extinction) perhaps should worry us more than it does. We are likely to get the necessary solutions wrong, and there is certainly a non-trivial likelihood that in the scenario Bostrom describes, we would only get one chance.
Bostrom knows there are many ifs and unanswered questions. Is such a thing possible, and will it happen? If we get our act together sufficiently to put some sort of safety lid on its development, if we can agree, if we conclude that experimentation in this area could continue (and if we can control it), and if there are enough “shutdown” commands preventing a wide range of catastrophes including ones nobody has imagined, wouldn’t the machine inevitably end up shutting itself down? Wouldn’t we be better off not letting it get started in the first place? Or could we even control that?
Possibly there’s nothing to worry about. If there is, however, the hell of it would be that most people are going to do exactly what I’m doing: worry while I’m thinking about it but not think about it much. 8.9/7.1.