Thought Pokémon was a tough benchmark for AI? A group of researchers argues that Super Mario Bros. is even tougher.
Hao AI Lab, a research organization at the University of California, San Diego, threw AI directly into live Super Mario Bros. games on Friday. Anthropic's Claude 3.7 performed the best, followed by Claude 3.5. Google's Gemini 1.5 Pro and OpenAI's GPT-4o struggled.
To be clear, it wasn't the same version of Super Mario Bros. as the original 1985 release. The game ran in an emulator and was integrated with a framework, GamingAgent, to give the AIs control over Mario.

GamingAgent, which Hao developed in-house, fed the AI basic instructions, such as "If an obstacle or enemy is near, move/jump left to dodge," along with in-game screenshots. The AI then generated inputs in the form of code to control Mario.
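Stripped to its essentials, a loop like the one described above might look something like the following sketch. The function names, prompt text, and model stub here are illustrative assumptions, not GamingAgent's actual API:

```python
# Hypothetical sketch of a screenshot-in, code-out agent loop.
# Names and prompt wording are assumptions for illustration only.

def build_prompt(instruction: str) -> str:
    """Combine a basic gameplay instruction with a request for control code."""
    return (
        f"{instruction}\n"
        "Given the attached screenshot, reply with code that presses "
        "the appropriate controller buttons."
    )

def agent_step(screenshot: bytes, query_model) -> str:
    """One iteration: send instruction plus screenshot, get control code back."""
    prompt = build_prompt(
        "If an obstacle or enemy is near, move/jump left to dodge."
    )
    return query_model(prompt, screenshot)

# Stand-in for a real model call (Claude, Gemini, GPT-4o, ...).
def fake_model(prompt: str, screenshot: bytes) -> str:
    return "press('A')  # jump"

# Usage: in a real run, the returned snippet would be executed against
# the emulator, a new screenshot captured, and the loop repeated.
control_code = agent_step(b"<png bytes>", fake_model)
```

The key design point the researchers highlight is latency: each pass through a loop like this takes as long as the model's response, which is why slower "reasoning" models fare poorly in real time.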
Nevertheless, Hao says the game forced each model to "learn" complex maneuvers and develop gameplay strategies. Interestingly, the lab found that reasoning models like OpenAI's o1, which "think" through problems step by step to arrive at solutions, performed worse, despite being generally stronger on most benchmarks.
One of the main reasons reasoning models struggle with real-time games like this, according to the researchers, is that they take time, often seconds, to decide on actions. In Super Mario Bros., timing is everything. A second can mean the difference between a cleanly cleared jump and a plummet to your death.
Games have been used to benchmark AI for decades. But some experts have questioned the wisdom of drawing connections between AI's gaming skills and technological advancement. Unlike the real world, games are abstract and relatively simple, and they provide a theoretically infinite amount of data for AI training.
The recent wave of flashy gaming benchmarks points to what Andrej Karpathy, a research scientist and founding member at OpenAI, has called an "evaluation crisis."
"I don't really know what [AI] metrics to look at right now," he wrote in a post on X. "TLDR my reaction is I don't really know how good these models are right now."
At least we can watch AI play Mario.