AI trained to play old Atari games uncovers puzzling Q*bert bug

Shawn Knight

Researchers from the University of Freiburg in Germany recently stumbled upon an “interesting solution” when an artificial intelligence system training to play classic Atari games uncovered a bug in Q*bert they still don’t fully understand.

In one of two curious solutions, the AI agent playing Q*bert completes the first level, then starts to jump from platform to platform in what appears to be a random manner. Inexplicably, the game does not advance to the second round as expected. Instead, the platforms start to blink and the agent quickly gains a large number of points.

(the unusual behavior starts around 20 seconds in)

Interestingly enough, the agent is not always able to exploit the in-game bug. In 22 out of 30 evaluation runs (with the same network weights but different initial environment conditions), the agent yields a low score.

Word of the quirky behavior found its way to Twitter, where one of the game’s creators chimed in. Warren Davis, who designed and programmed the original arcade version, said he couldn’t really speak to any port, but noted that this “certainly doesn’t look right” and that similar behavior likely wouldn’t be present in the arcade version (the AI was trained on a PC port of the game).

(the suicide cycle)

In the other interesting solution, the agent gathers some points at the beginning of the game, then seemingly loses interest in completing the level. Instead, it starts tricking an enemy into killing itself. Although the agent loses a life in the process, killing the enemy yields enough points to gain an extra life. The agent then repeats this suicide cycle indefinitely.


 
I've never played Q*bert so is the objective to beat as many levels as you can or get as many points as you can? Interesting that the AI went for the points route when it could. Maybe it associated high points with a higher level thus trying to get as many points as it could per level.
 
The AI behavior reminds me of people playing Mario and "programming" the game while playing by jumping on certain blocks and throwing stuff over and over. Similar thing in the Pokémon editions, where your name and the things you do can trigger stuff like MissingNo.
 
I've never played Q*bert so is the objective to beat as many levels as you can or get as many points as you can? Interesting that the AI went for the points route when it could. Maybe it associated high points with a higher level thus trying to get as many points as it could per level.
So it is human?
 