Responsible Intelligent Systems


DeepMind discovers the prisoner's dilemma

Researchers from Google report on how DeepMind's learning agents behave in games resembling repeated prisoner's dilemmas. They seem to be surprised that in one game (Wolfpack) a higher complexity of allowed strategies led to more cooperation, while in the other game (Gathering) it led to less. I am surprised that they are surprised. Of course it all depends on the game. In some games it pays to cooperate, in others it does not. If you look at the games used, it is fairly obvious why the agents cooperate more in one than in the other. And the difference comes out more clearly when more sophisticated strategies are allowed (in fact, together with a master's student I once ran the same experiment). The most interesting games are the ones where you first have to cooperate and later compete. An example is Barricade. Gathering is also like that, but not in a very interesting way.
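To make the "it all depends on the game" point concrete, here is a minimal sketch that compares best responses in two one-shot 2x2 games. The payoff numbers, the names `PRISONERS_DILEMMA`, `STAG_HUNT`, `best_response` and `summarize`, and the choice of a stag hunt as the contrasting game are all my own illustrative assumptions. They are not the Gathering or Wolfpack games from the study, which are far richer.

```python
# Payoffs to "me", indexed by (my_action, other_action).
# "C" = cooperate, "D" = defect. All numbers are illustrative assumptions.
PRISONERS_DILEMMA = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}
STAG_HUNT = {
    ("C", "C"): 4, ("C", "D"): 0,
    ("D", "C"): 3, ("D", "D"): 2,
}

ACTIONS = ("C", "D")


def best_response(payoff, other_action):
    """Return the action that maximises my payoff against a fixed other_action."""
    return max(ACTIONS, key=lambda mine: payoff[(mine, other_action)])


def summarize(payoff):
    """Map each possible action of the other player to my best response."""
    return {other: best_response(payoff, other) for other in ACTIONS}


if __name__ == "__main__":
    print("Prisoner's dilemma:", summarize(PRISONERS_DILEMMA))
    print("Stag hunt:         ", summarize(STAG_HUNT))
```

With these numbers, defecting is the best response to everything in the prisoner's dilemma, while in the stag hunt cooperating is the best response to a cooperator. The same payoff-maximising rule therefore produces different amounts of cooperation, purely because the game differs.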

One should not draw the conclusion that DeepMind's agents, by learning how to cooperate in repeated games, acquire a sense of ‘compassion’ or ‘morality’. For now, the only thing we can say is that they optimise a function.
