The fantasies around “basic AI drives”
Through this well-written but, I think, misguided article in Vox about why we have to take the dangers of AI seriously, I came across Stephen Omohundro's paper on “basic AI drives”. The paper is also mentioned by Nick Bostrom in his book Superintelligence, and it has stoked some controversy.
The paper argues that the AIs we already have are “goal seeking systems”, and that the danger lies in the fact that for such systems, the goal will always justify the means: if a computer's goal is to win at chess, anything that stands in the way of achieving that narrowly defined goal (resources, humans) is under threat, and anything that increases the chances of reaching it (obtaining more computation power, adopting harmful tactics, torturing chess grandmasters) could eventually be actively pursued. So, according to Omohundro, we have to be afraid even of simple chess algorithms: they are potentially dangerous mechanisms, waiting for the right circumstances and opportunities to achieve their narrowly defined goal regardless of the means.
Now, I think that is a major mistake. I have searched online for somebody else voicing the same kind of criticism, but did not succeed. So I will try to briefly sketch my counterargument here.
First of all (though this is not my main point), AIs of the kind we have now (Turing machines) are clearly not “goal seeking systems”. Every AI we have built so far is based on a classical algorithm, determined by us, that goes through a search space that is defined by us. For a chess program, we define the search space to be the game of chess. The program that specifies how to traverse that search space, so as to find out what to do in which situation to maximise the chance of winning, has no goals: we do. And it makes no difference whether our algorithm is based on symbolic rules reflecting our knowledge of the game, or on neural networks, evolutionary techniques, or some other AI technique: all of these merely embody different meta-strategies for arriving at good strategies within the search space of the game of chess. No AI technique currently available gives rise to AIs that can be said to have goals in the same way that we ourselves have goals. But, for now, let us grant AI researchers their intentional stance, and acknowledge that saying ‘the AI has goal x’, although probably not strictly true, is at least a useful way of talking about algorithms: useful for AI designers.
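To make this concrete, here is a minimal sketch of a minimax search over a toy take-away game (not chess, and not anyone's actual engine; the game and function names are my own illustration). Notice that the “goal” of the program is nothing more than a score that the designer attaches to terminal states of a search space the designer defines; the algorithm has no way even to represent, let alone pursue, anything outside that space.

```python
def minimax(pile, maximizing):
    """Toy game: a pile of counters; each player removes 1 or 2 per
    turn; whoever takes the last counter wins. The entire 'world' the
    algorithm can reason about is the set of pile sizes -- a search
    space fixed by the designer. Returns (score, best_move)."""
    # Terminal state: an empty pile means the player to move has lost,
    # because the previous player took the last counter. This score is
    # the designer's stipulation, not a goal the program 'has'.
    if pile == 0:
        return (-1, None) if maximizing else (1, None)
    best_move = None
    if maximizing:
        best = float('-inf')
        for m in (1, 2):
            if m <= pile:
                score, _ = minimax(pile - m, False)
                if score > best:
                    best, best_move = score, m
        return best, best_move
    else:
        best = float('inf')
        for m in (1, 2):
            if m <= pile:
                score, _ = minimax(pile - m, True)
                if score < best:
                    best, best_move = score, m
        return best, best_move

# The algorithm 'pursues winning' only in the intentional-stance sense:
# it ranks moves inside the designer-defined state space and nothing else.
print(minimax(4, True))   # the maximiser can force a win by taking 1
print(minimax(3, True))   # from a pile of 3, the player to move loses
```

Giving this program more computation power lets it search deeper, but deeper into the same space: no amount of extra compute introduces states like “unplug the opponent” because no such state exists in its search space.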
My main criticism is thus not aimed at the possible misuse of the term “goal seeking systems”. Even if algorithms can have goals (and beliefs) only in the sense of the intentional stance, one might think they could still be dangerous in the way Omohundro suggests. However, I believe they cannot. And here comes my main point: the algorithms we use for specific optimisation problems have no use or meaning whatsoever outside the search space relative to which we define them to pursue their goals. To suggest that a chess-playing program, given enough computation power, would start to pursue ways of reaching its goal that require it to step outside the search space relative to which we gave it that goal makes no sense. It is impossible by definition: its goals (I am applying the intentional stance here) only have meaning relative to the search space over which we defined them, and therefore, by definition, do not apply to strategies referring to the world outside of it. The tactics Omohundro suggests therefore cannot get off the ground.
Ok, one could throw back: what about algorithms whose search space is extended to include tactics and strategies that, strictly speaking, are not part of the game of chess? Well, that is begging the question. Clearly, if we employ a chess-playing algorithm whose strategies and tactics encompass actions not limited to the game of chess, we can expect it to find the strategies and tactics Omohundro is talking about. If we teach chess to a military drone that has the capacity to identify combatants and kill them, we should not later complain or be surprised when it kills its chess opponent.
So, one might then ask: what is the general search space for human intelligence? I do not believe there is a good answer to that question, or that it is even a sensible question to ask. If there is such a thing as a most general search space, then we are probably indeed limited in our way of going through it (for instance, we need to reason correctly, although even that can be doubted), which might point to an algorithm of some kind (based on, or pointing at, decidable logics..?). But I have a hard time seeing what such a most general search space would look like and how it could be manageable: it would need to be infinite along so many different dimensions (concepts, time, experiences, classifications, etc.) that I find it hard to say anything more about it than I have already tried to do here.