Google Tests AI vs AI to See If They Become 'Aggressive' or Cooperate
Google's artificial intelligence subsidiary DeepMind is pitting AI agents against one another to test how they interact and how they react in various "social dilemmas". In a new study, researchers said they used two video games – Wolfpack and Gathering – together with principles from the social sciences and game theory to examine how AI agents change their behaviour depending on the environment and situation they are in.
"The
question of how and under what circumstances selfish agents cooperate is one of
the fundamental questions in the social sciences," DeepMind researchers wrote in a blog post. "One of the simplest and most
elegant models to describe this phenomenon is the well-known game of Prisoner's
Dilemma from game theory."
This well-known principle is based on a scenario in which two suspects, jointly accused of a crime, are arrested and questioned separately. The police have only enough evidence to charge them with a minor offence, but not with the main, more serious crime.
Both prisoners are offered the same deal: if you testify against the other prisoner – a move known as "defecting" – you will be released and the other will serve three years in prison. If both prisoners confess, both will be handed a two-year sentence; if both stay silent, each serves only a short term for the minor offence. Herein lies the social dilemma: defecting always looks tempting to each prisoner individually, yet both would fare better by cooperating.
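To make the incentive structure concrete, here is a minimal Python sketch of the payoff matrix described above. The two- and three-year terms come from the article; the one-year term for mutual silence is an assumed sentence for the minor offence, and the function name is our own:

```python
# Prisoner's Dilemma sentences in years (lower is better for the prisoner).
# Key: (my_move, their_move); "cooperate" = stay silent, "defect" = testify.
YEARS = {
    ("cooperate", "cooperate"): (1, 1),  # both silent: minor offence only (assumed 1 year)
    ("cooperate", "defect"):    (3, 0),  # I stay silent, they testify
    ("defect",    "cooperate"): (0, 3),  # I testify, they stay silent
    ("defect",    "defect"):    (2, 2),  # both testify: two years each
}

def best_response(their_move: str) -> str:
    """Return the move that minimises my own sentence, given theirs."""
    return min(("cooperate", "defect"),
               key=lambda my_move: YEARS[(my_move, their_move)][0])

# Defecting is the best reply to either move -- yet mutual defection (2, 2)
# leaves both prisoners worse off than mutual silence (1, 1). That tension
# is the dilemma.
print(best_response("cooperate"))  # -> defect
print(best_response("defect"))     # -> defect
```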
The researchers found that an AI's behaviour does change depending on its environment, and that it may react in an "aggressive manner" if it stands to lose out. However, the agents will cooperate and work as a team when there is more to be gained by doing so.
In the game Gathering, two players are tasked with collecting apples (represented as green pixels) from a central pile. However, each also has the option of "tagging" the other player with a laser beam to temporarily remove them from the game, buying the tagger more time to gather apples.
The researchers allowed the agents to play the game thousands of times over and "let them learn how to behave rationally using deep multi-agent reinforcement learning". When there were enough apples in the environment, the agents learned to "peacefully coexist" and collect as many apples as they could. When the number of apples was reduced, however, the agents learned "highly aggressive policies", firing their lasers at each other as resources grew scarcer and the risk of missing out on a reward rose.
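DeepMind's agents learned from raw pixels with deep networks; as a rough illustration only, the same abundance-versus-scarcity effect can be reproduced with two independent Q-learners in a one-shot matrix version of Gathering. All payoff numbers, action names and hyperparameters below are invented for this sketch, not taken from the paper:

```python
import random

# Hand-picked one-step payoffs for a toy version of Gathering.
# Key: (my_action, their_action) -> my reward.
PLENTY = {("gather", "gather"): 4, ("gather", "tag"): 2,
          ("tag", "gather"): 3,    ("tag", "tag"): 1}
SCARCE = {("gather", "gather"): 1, ("gather", "tag"): 0,
          ("tag", "gather"): 2,    ("tag", "tag"): 1}

def train(payoffs, episodes=5000, eps=0.1, lr=0.1):
    """Two independent epsilon-greedy learners in a repeated matrix game."""
    actions = ("gather", "tag")
    q = [{a: 0.0 for a in actions} for _ in range(2)]  # one table per agent
    for _ in range(episodes):
        chosen = [random.choice(actions) if random.random() < eps
                  else max(actions, key=q[i].get) for i in range(2)]
        for i in range(2):
            # Bandit-style Q update (the matrix game has no next state).
            r = payoffs[(chosen[i], chosen[1 - i])]
            q[i][chosen[i]] += lr * (r - q[i][chosen[i]])
    return [max(actions, key=qi.get) for qi in q]

print(train(PLENTY))  # apples abundant -> ['gather', 'gather']
print(train(SCARCE))  # apples scarce   -> ['tag', 'tag']
```

Under the PLENTY payoffs, gathering strictly dominates and both learners settle on peaceful coexistence; under SCARCE, tagging dominates and the learned policies turn aggressive – a toy analogue of the effect the researchers describe.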
"The
greed motivation reflects the temptation to take out a rival and collect all
the apples for oneself," the researchers wrote. "The fear motivation reflected the danger of
being taken out oneself by a defecting rival."
In the second game, Wolfpack, two red players acting as wolves have to hunt a third, blue player – the prey – in an environment peppered with grey obstacles. If both wolves are near the prey when it is captured, both receive a reward. If a lone wolf captures it, however, it risks losing the carcass to scavengers.
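A minimal Python sketch of this capture rule follows; the reward value and the scavenging probability are assumptions for illustration, not figures from the paper:

```python
import random

CAPTURE_REWARD = 10.0  # assumed value of a successful hunt
SCAVENGE_PROB = 0.5    # assumed chance a lone wolf loses the carcass

def capture_rewards(wolves_near_prey: int) -> list[float]:
    """Reward per nearby wolf when the prey is caught.

    If both wolves are within capture range, each gets the full reward;
    a lone captor keeps the carcass only if scavengers do not take it."""
    if wolves_near_prey >= 2:
        return [CAPTURE_REWARD, CAPTURE_REWARD]
    if random.random() < SCAVENGE_PROB:
        return [0.0]               # carcass lost to scavengers
    return [CAPTURE_REWARD]

print(capture_rewards(2))  # cooperative capture: both wolves rewarded
print(capture_rewards(1))  # lone capture: the reward is a gamble
```

The expected value of hunting together thus exceeds that of hunting alone, which is what makes cooperation worth learning in this game.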
The researchers found that the smarter the AI agent is, the more likely it is to cooperate and work with other players. They said this is likely because learning to work together in Wolfpack requires more coordination, and therefore more computational capacity, than Gathering does.
They noted that these experiments show AI agents may alter their behaviour, becoming more cooperative or more aggressive, depending on the situation, the rules in place and what is at stake.
"We
showed that we can apply the modern AI technique of deep multi-agent reinforcement
learning to age-old questions in social science such as the mystery of the
emergence of cooperation," researchers wrote. "As a consequence, we
may be able to better understand and control complex multi-agent systems such
as the economy, traffic systems, or the ecological health of our planet - all
of which depend on our continued cooperation."
Source | http://www.ibtimes.co.uk/