ChatGPT gets crushed at chess by a 1 MHz Atari 2600

Alfonso Maruccia

Editor's take: Despite being hailed as the next step in the evolution of artificial intelligence, large language models are no smarter than a piece of rotten wood. Every now and then, some odd experiment or test reminds everyone that so-called "intelligent" AI doesn't actually exist if you're living outside a tech company's quarterly report.

A cycle-exact emulation of the Atari 2600 CPU running at a meager 1.19 MHz is more than enough to utterly humiliate ChatGPT in a game of chess. Citrix engineer Robert Jr. Caruso conducted the "funny" little experiment over the weekend, pitting OpenAI's mighty chatbot against a virtual Atari 2600 console emulated by Stella. It didn't end well for the chatbot.

Caruso reportedly got the idea from ChatGPT itself, after chatting with the bot about the history of AI and chess. OpenAI's service volunteered to play "Atari Chess," which Caruso assumed referred to Video Chess – the only chess title ever released for the Atari 2600.

Despite being given a basic layout of the board to identify the pieces, ChatGPT struggled. The bot confused rooks for bishops, missed obvious pawn forks, and made a series of baffling blunders, according to Caruso. At one point, ChatGPT even blamed external factors like the abstract symbols used by Video Chess to depict the pieces for its inability to keep track of the game state.

"For 90 minutes, I had to stop it from making awful moves and correct its board awareness multiple times per turn," the engineer said of ChatGPT's performance against an emulated console from the '70s.

The bot apparently kept asking to restart the game in hopes of improving its performance, but was ultimately defeated by an 8-bit chess engine. A 1 MHz CPU should, at best, be able to think one or two moves ahead, while ChatGPT relies on an endless army of modern, power-hungry GPUs to keep its chat service running. And yet, the 1 MHz CPU won, thrashing the chatbot at beginner level.

Also check out:
The Legends of Tech Series: Atari 2600 - The Atlantis of Game Consoles

Caruso's experiment is a useful reminder of what LLMs actually are: complex, heuristics-based black-box search engines designed to constantly please the end user with some sort of captivating result. They don't "know" anything, have no reasoning or deduction capabilities, and certainly no intelligence of their own.

And they absolutely suck at chess.

I never owned an Atari 2600 back in the day, though I did spend some glorious afternoons with my mighty Intellivision console. Next time, I'll try to humble ChatGPT by making it play a round of Battle Chess on an emulated replica of my first x86 machine: an 80286 running at a blazing 16 MHz.

 
Purpose-built software will always beat any generic AI.

For the AI to become competitive, it would need to use chess-specific AI models. But it wouldn't go much further, because such models would require significantly more computational resources to compete with highly optimized chess algorithms in the specialized software.
 
Despite being hailed as the next step in the evolution of artificial intelligence, large language models are no smarter than a piece of rotten wood.
That's a revelation. ;)

I don't know how anyone would be dumb enough to think that an LLM could play chess. Heck, it probably hallucinated that it was making moves. 🤣
 
If I understand correctly, he was somehow capturing images of the screen and feeding them to ChatGPT. This required it to first use its image capabilities to identify the pieces. I would assume that it failed miserably at that, and thus anything else would have been pointless. I would be a lot more interested if he had just given it the locations of the pieces as text – how different would the result have been then?
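For what it's worth, chess already has a standard text representation for exactly this purpose: FEN (Forsyth-Edwards Notation). A minimal sketch of how board state could be handed to a chatbot as text rather than as a screenshot (the board layout and function name here are just for illustration):

```python
# Hypothetical sketch: encode an 8x8 board as the piece-placement field of a
# FEN string, which could be pasted into a prompt instead of a screenshot.
def board_to_fen_rows(board):
    """Convert an 8x8 list of piece letters ('' for empty squares) into the
    piece-placement field of a FEN string (ranks 8 down to 1)."""
    rows = []
    for rank in board:
        row, empties = "", 0
        for square in rank:
            if square == "":
                empties += 1          # FEN collapses runs of empty squares
            else:
                if empties:
                    row += str(empties)
                    empties = 0
                row += square
        if empties:
            row += str(empties)
        rows.append(row)
    return "/".join(rows)

# Standard starting position: uppercase = White, lowercase = Black
start = [
    list("rnbqkbnr"),
    list("pppppppp"),
    [""] * 8, [""] * 8, [""] * 8, [""] * 8,
    list("PPPPPPPP"),
    list("RNBQKBNR"),
]
print(board_to_fen_rows(start))
# -> rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR
```

Whether ChatGPT would keep the state straight over a whole game even with clean FEN input is another question, but at least the vision step would be out of the loop.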
 
If I understand correctly, he was somehow capturing images of the screen and feeding them to ChatGPT. This required it to first use its image capabilities to identify the pieces. I would assume that it failed miserably at that, and thus anything else would have been pointless. I would be a lot more interested if he had just given it the locations of the pieces as text – how different would the result have been then?
This. ChatGPT's image recognition still has a long way to go.
 
After looking at his LinkedIn page, this seems more like:

"Citrix engineer that's out of a job since December 2024 uses AI-hype to bring attention to his page and thus increase the likelihood someone contacts him for work."
 
GenAI is incredibly good at mimicking language, and sometimes at reasoning through common patterns, but it doesn't actually understand the world or apply consistent logic over time.
 
Haha 😂

Still waiting for true AI – not some behind-the-scenes, heavily human-moderated Encyclopedia Britannica that returns results like a human.
 
But isn't AI like a person? It can only be as good as what it's been taught. Don't you think that if it specialized in everything, it would be too big?
 
In the early 1990s I used to play those little pocket chess computers. They had slow processors (0.6 MHz) and fairly weak programs, but they used to thrash me at the time. A modern program running at 1 MHz would beat anyone in your local chess club. So it's not surprising that such a program would beat a chatbot playing at beginner level.

If anyone wants a laugh, I had a go at writing a chess program in Java. Google "bikes and kites fun chess" if you want to give it a game. It's fairly easy to use and plays at a decent club-player level. It was surprisingly difficult writing a program that plays a decent game. It was hard even writing one that played like a beginner :)
 
Purpose-built software will always beat any generic AI.

For the AI to become competitive, it would need to use chess-specific AI models. But it wouldn't go much further, because such models would require significantly more computational resources to compete with highly optimized chess algorithms in the specialized software.
So, if Atari 2600's were specifically programmed for world domination?? hmmm....
 
LLMs can provide chess insight. I've used them on positions where they actually identified and recognized brilliant moves. However, since they generate text probabilistically, and since chess is an exact game, a tiny error has enormous consequences, and over the course of a game that is deadly. Instead, neural networks have been developed that are brilliant at chess; for instance, Google's AlphaZero taught itself to play chess better than any chess engine or human in record time. I've played against one, Matthew Lai's Giraffe, and it played absolutely humanlike at a high level. It convinced me that such a thing as creativity is absolutely within the grasp of AI, and we've seen plenty of that lately, especially with LLMs creating art and music, etc. But tell an LLM to make a realistic chessboard with pieces and it will almost always introduce some kind of error. I tried it on what I believe was DALL-E 3, and it made a perfect chessboard except that the pawns on d2 and e2 were missing – which I didn't notice at first because the king and queen standing right behind them partially blocked the view. In itself, that could be considered an artistic painting or perception-related art.
 
LLMs can provide chess insight. I've used them on positions where they actually identified and recognized brilliant moves. However, since they generate text probabilistically, and since chess is an exact game, a tiny error has enormous consequences, and over the course of a game that is deadly. Instead, neural networks have been developed that are brilliant at chess; for instance, Google's AlphaZero taught itself to play chess better than any chess engine or human in record time. I've played against one, Matthew Lai's Giraffe, and it played absolutely humanlike at a high level. It convinced me that such a thing as creativity is absolutely within the grasp of AI, and we've seen plenty of that lately, especially with LLMs creating art and music, etc. But tell an LLM to make a realistic chessboard with pieces and it will almost always introduce some kind of error. I tried it on what I believe was DALL-E 3, and it made a perfect chessboard except that the pawns on d2 and e2 were missing – which I didn't notice at first because the king and queen standing right behind them partially blocked the view. In itself, that could be considered an artistic painting or perception-related art.
AlphaZero was given basic information about chess and then it basically played games against itself and "learned" from there. Chess engines, however, are not intelligent but actually very stupid. They are only strong when they can calculate millions of possible move combinations. Take that away and they mostly suck. A few years ago I trounced Stockfish with the following settings/rules: both sides had a maximum of 3 seconds per move, and Stockfish was limited to a search depth of 2. With its ability to calculate 50 moves ahead taken away, it just doesn't have enough "intelligence" to beat even an amateur like me. The same is evident when playing against online bots, like on chess.com – I have beaten the Maximum engine ("3200 ELO") multiple times.

And yeah, I don't even have an ELO rating, so as far as chess "intelligence" goes, engines still have a long way to go. (A friend rated around 2400 ELO estimated mine at around 1600 for general play; I'm terrible at openings because I really don't remember them.)
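The point about search depth is the crux: a classical engine's "skill" is mostly how far ahead it looks. A toy sketch of a depth-limited negamax search (on tic-tac-toe rather than chess, and in no way Stockfish's actual implementation) shows the mechanism – once the depth limit is hit, the engine falls back on a static evaluation and stops "seeing" anything beyond the horizon:

```python
# Toy depth-limited negamax on tic-tac-toe, illustrating how engine strength
# comes from search depth. Board is a list of 9 chars: "X", "O", or ".".
def winner(b):
    lines = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]
    for i, j, k in lines:
        if b[i] != "." and b[i] == b[j] == b[k]:
            return b[i]
    return None

def negamax(b, player, depth):
    """Return (score, best_move) from `player`'s perspective:
    +1 win, 0 draw/unknown, -1 loss."""
    w = winner(b)
    if w:
        return (1 if w == player else -1), None
    moves = [i for i, s in enumerate(b) if s == "."]
    if not moves or depth == 0:
        return 0, None            # draw, or horizon reached: static eval of 0
    other = "O" if player == "X" else "X"
    best, best_move = -2, None
    for m in moves:
        b[m] = player
        score, _ = negamax(b, other, depth - 1)
        b[m] = "."
        if -score > best:
            best, best_move = -score, m
    return best, best_move

# X to move, with an immediate win available on square 2 (top-right)
board = list("XX.OO....")
score, move = negamax(board, "X", depth=9)
print(score, move)   # full-depth search finds the winning square: 1 2
```

Capping `depth` is exactly the kind of handicap described above: the search returns 0 ("no idea") for anything past the horizon, which is why a depth-2 Stockfish plays so much worse than an unrestricted one.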
 
... there have been developed neural networks that are brilliant at chess; for instance google's Alpha Zero taught itself to play chess better than any chess engine or human in record time. I've played against one, Matthew Lai's Giraffe, and it played absolutely humanlike at a high level.

Thanks for that interesting post. Back in 2015, Popular Mechanics wrote an article about Giraffe.

"While it can't really compete against top-of-the-line engines, and sits somewhere in the middle rankings of the most active chess engines, it does stack up well against contemporaries."

https://www.popularmechanics.com/technology/robots/a17339/chess-engine-plays-against-itself/
 
It convinced me that such a thing as creativity is absolutely within the grasp of AI, and we've seen plenty of that lately, especially with LLMs creating art and music etc.

As for the "creativity" of generative AI, it's non-existent. It merely barfs up the creativity of the works of art it was trained on. These models are inherently deceptive, and people turning to them with this naive wanderlust is making the situation even more dire.
 