AI Math Proof Milestone: DeepMind Cracks 9 Erdős Problems, Magnetar Confirmed

DeepMind’s AlphaProof Nexus cracked 9 open Erdős problems as Fermi confirmed a magnetar engine.


Artificial intelligence can now solve open research-level mathematics problems — not just competition questions — and the May 2026 issue of Science News documents the moment the field registered that shift. Published today, the issue covers three simultaneous advances that reconfigure how scientists understand language in the brain, proof-making in mathematics, and energy sources in the universe's most violent explosions. The most consequential of the three: Google DeepMind's AlphaProof Nexus, detailed in an arXiv preprint published May 21, 2026, formally solved nine open problems from Paul Erdős's legendary research list — questions that had stumped mathematicians for decades, some for more than half a century.

AI Proves Research-Level Math: AlphaProof Nexus Cracks 9 Erdős Problems

AlphaProof Nexus combines Google's Gemini 3.1 Pro language model with Lean, a formal proof assistant that checks every logical step against mathematical axioms. The system does not just sketch an argument: it generates a Lean file that the proof assistant checks mechanically, so every inference is confirmed to follow from the axioms and from previously proved results. That distinction matters because large language models are well documented to hallucinate plausible-looking proofs that contain hidden errors. By anchoring every step to Lean's verification kernel, AlphaProof Nexus produces proofs that compile and are formally correct rather than merely convincingly formatted.
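To make "machine-verified" concrete, here is a minimal Lean 4 sketch. It is a toy illustration only, not taken from the AlphaProof Nexus output; the theorem name `add_comm_example` is invented for this example, and the statements are far simpler than any Erdős problem.

```lean
-- A statement followed by its proof term. Lean's kernel accepts the
-- theorem only if the term really has the stated type.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete fact checked by computation: both sides reduce to 4.
example : 2 + 2 = 4 := rfl

-- A false claim cannot be proved the same way. Uncommenting the next
-- line makes the file fail to compile instead of silently passing.
-- example : 2 + 2 = 5 := rfl
```

The point of the sketch is the failure mode: an incorrect step produces a compile error rather than a persuasive paragraph, which is what separates a checked proof from model-generated prose.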

The system tackled 353 open problems from the Erdős problem catalog, a collection of questions Erdős posed over his career that researchers continue to update. It solved nine, including two that had been open for 56 years, and proved 44 conjectures from the On-Line Encyclopedia of Integer Sequences. Inference costs ran to a few hundred dollars per problem, a figure that signals a structural shift from expensive human-hours to affordable automated search. DeepMind CEO Demis Hassabis was careful to note the system is "still not AGI," but the output itself (formal proofs on GitHub, verified and publishable) marks a genuinely new threshold.
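The headline numbers are easier to weigh as a rate. The short Python sketch below works out the share of attempted problems that were solved and, under a loudly hypothetical assumption, what the total bill might look like. The article gives only "a few hundred dollars per problem"; the 300-dollar figure and the reading that the cost applies to every attempted problem are placeholders for illustration, not reported values.

```python
# Figures reported in the article.
ATTEMPTED_PROBLEMS = 353
SOLVED_PROBLEMS = 9

# Hypothetical placeholder: the article says only "a few hundred dollars
# per problem" and does not say whether that means attempted or solved.
ASSUMED_COST_PER_ATTEMPT_USD = 300.0


def solve_rate(solved: int, attempted: int) -> float:
    """Fraction of attempted problems that were solved."""
    return solved / attempted


def cost_per_solved(solved: int, attempted: int, cost_per_attempt: float) -> float:
    """Total spend divided by solved problems, if every attempt costs the same."""
    return attempted * cost_per_attempt / solved


if __name__ == "__main__":
    rate = solve_rate(SOLVED_PROBLEMS, ATTEMPTED_PROBLEMS)
    per_solved = cost_per_solved(
        SOLVED_PROBLEMS, ATTEMPTED_PROBLEMS, ASSUMED_COST_PER_ATTEMPT_USD
    )
    print(f"Solve rate: {rate:.1%}")
    print(f"Cost per solved problem (assumed inputs): ${per_solved:,.0f}")
```

Nine of 353 is about one problem in forty. If the per-problem cost applies to every attempt, the effective cost per solved problem is roughly forty times the per-attempt figure, which is still small next to years of researcher time.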

How AI Formal Proof Verification Works With Lean

The architecture behind this shift involves interactive theorem provers that have existed for years (Lean, Coq, Isabelle) but that required mathematicians to learn a formal language far removed from how mathematics is actually written. AI is dissolving that barrier. In February and March 2026, Math, Inc.'s AI autoformalization agent Gauss completed the machine certification of Ukrainian mathematician Maryna Viazovska's Fields Medal-winning sphere-packing proofs in dimensions 8 and 24. The dimension-24 case alone runs to roughly 180,000 lines of Lean code and was completed in approximately two weeks. Gauss caught and automatically fixed two subtle errors in the original arguments during the process, a demonstration that formal verification can strengthen mathematical rigor beyond what human review alone provides.

Fields Medal Proof Formalized: Viazovska Sphere-Packing Gets Machine Certification

Viazovska's 2016 proof that the E₈ lattice achieves the densest sphere packing in eight dimensions — and her subsequent proof, with collaborators, that the Leech lattice is optimal in 24 dimensions — earned her the Fields Medal in 2022, the highest distinction in mathematics. These results have implications for error-correcting codes used in smartphones and space probes. Until early 2026, the proofs had been verified by the mathematical community in the ordinary human sense: other specialists read them and judged them correct. The collaboration between Viazovska, young mathematician Sidharth Hariharan, and Math, Inc.'s Gauss produced something categorically different — a formal proof that a computer can independently confirm, leaving no room for the hidden steps that human experts often silently accept.

Jeremy Avigad, director of the Institute for Computer-Aided Reasoning in Mathematics at Carnegie Mellon University, wrote in a March 2026 essay that AI can now prove research-level theorems "formally and informally" and that the discipline of mathematics must engage with these tools before they outpace it. "We are running out of places to hide," he wrote. "We have to face up to the fact that AI will soon be able to prove theorems better than we can." Science News's May issue frames this cultural shift as the more significant story: not just the technical capacity of AI, but the question of how mathematics reorganizes itself in response.

What Does AI Math Progress Mean for Working Mathematicians?

For practitioners, the practical implication is that formal verification — once a niche research activity requiring years of specialized skill — is becoming accessible. A mathematician can now, in principle, hand a proof sketch to a system like Gauss or AlphaProof Nexus and receive back a Lean-verified proof that a computer has confirmed correct. This does not make the original mathematical insight less valuable; Viazovska's discovery remains human. What changes is the cost and reliability of checking that the discovery is sound. As Avigad noted, that is not a small change. It is a restructuring of mathematical practice.

What Powers Superluminous Supernovas? Fermi Finds First Gamma-Ray Answer

The astrophysics chapter of the May 2026 Science News issue connects to a paper published in Astronomy & Astrophysics on May 20, 2026. An international team led by Fabio Acero at the French National Centre for Scientific Research and the University of Paris-Saclay studied data from NASA's Fermi Gamma-ray Space Telescope and searched for gamma-ray signals from the six nearest superluminous supernovas observed during Fermi's first 16 years of operation. Only one — SN 2017egm, which erupted in the barred spiral galaxy NGC 3191 roughly 440 million light-years away in the constellation Ursa Major — showed confirmed evidence of gamma rays.

Superluminous supernovas are 10 to 100 times brighter than ordinary core-collapse supernovas and can briefly outshine entire galaxies. For decades the engine behind that extraordinary brightness was debated. The gamma-ray detection from SN 2017egm now provides strong observational support for the magnetar model: a newly formed neutron star with magnetic fields roughly 1,000 times stronger than those of an ordinary neutron star, spinning hundreds of times per second. That rapid rotation generates a powerful outflow of electrons and their antimatter counterparts, positrons, producing a particle cloud called a magnetar wind nebula. Within that cloud, interactions produce gamma rays, which cannot escape immediately but begin leaking out roughly two to three months after the explosion as the debris expands and cools. The gamma-ray signal from SN 2017egm appeared approximately 43 to 155 days after the supernova's discovery, broadly consistent with this timeline.
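A back-of-the-envelope calculation shows why a fast-spinning neutron star is a plausible energy reservoir. The Python sketch below applies the textbook rigid-rotator formula E = 1/2 I ω². The moment of inertia is a commonly quoted canonical value for a neutron star and the spin rates are illustrative; neither comes from the article or from the Fermi team's analysis, which modeled the system in far more detail.

```python
import math

# Assumed canonical neutron-star moment of inertia, about 1e45 g cm^2
# (1e38 kg m^2). This is a textbook value, not one from the article.
MOMENT_OF_INERTIA_KG_M2 = 1.0e38
JOULES_TO_ERG = 1.0e7


def rotational_energy_erg(
    spin_hz: float, moment_of_inertia: float = MOMENT_OF_INERTIA_KG_M2
) -> float:
    """Rotational kinetic energy E = 1/2 * I * omega^2, returned in erg."""
    omega = 2.0 * math.pi * spin_hz  # angular frequency in rad/s
    return 0.5 * moment_of_inertia * omega**2 * JOULES_TO_ERG


if __name__ == "__main__":
    # Illustrative spin rates spanning "hundreds of times per second".
    for spin_hz in (100.0, 300.0, 1000.0):
        energy = rotational_energy_erg(spin_hz)
        print(f"{spin_hz:6.0f} Hz -> {energy:.2e} erg")
```

Under these assumptions the reservoir ranges from about 2 × 10^50 erg at 100 rotations per second to about 2 × 10^52 erg at 1,000. The upper end exceeds the roughly 10^51 erg of kinetic energy usually attributed to an ordinary supernova, which is why rotation is a credible power source for the extra brightness.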

"Only SN 2017egm shows evidence for gamma rays, confirming earlier hints that some supernovas can be as luminous in gamma rays as they are in visible light," said Guillem Martí-Devesa, a fellow at the Institute of Space Sciences in Barcelona, in the NASA press release. "This opens up a new window for studying these fascinating events." The team's analysis modeled how radiation and particles from a newborn magnetar would interact with the expanding supernova debris. That magnetar model reproduced the observed gamma-ray flux and timing better than competing theories. Future observations with the Cherenkov Telescope Array Observatory could detect similar events out to roughly 500 million light-years, potentially opening a new class of gamma-ray sources.

Klingon and Dothraki Activate Same Brain Regions as English and Mandarin

The third story in the Science News May issue revisits a study of constructed languages published in PNAS. MIT neuroscientists led by Saima Malik-Moraleda used functional MRI to scan the brains of speakers of Esperanto and four fictional constructed languages (Klingon, Na'vi, High Valyrian, and Dothraki) as they processed sentences in those languages.

The result: constructed languages activate the same left-lateralized network of frontal and temporal brain areas as natural languages such as English and Mandarin. The brain did not distinguish Dothraki, invented by linguist David J. Peterson for a television series, from a language that evolved over centuries in a speech community. According to the study, what matters is not the age, origin, or user base of a language, but whether the symbolic system can express an open-ended range of meanings about the interior and exterior world. Programming languages, by contrast, recruit a different brain network, the so-called multiple demand network associated with difficult cognitive tasks, because they cannot convey that same open-ended range of meaning.

This finding has methodological implications. Constructed languages offer researchers precise experimental control over vocabulary, phonology, and grammar in ways impossible with natural languages that have evolved unpredictably over millennia. They are, in effect, tools for neurolinguistics — a way to isolate what the brain's language network actually responds to, stripped of the confounds embedded in any natural tongue.


Frequently Asked Questions

What did DeepMind's AlphaProof Nexus actually solve?

AlphaProof Nexus autonomously solved nine open problems from Paul Erdős's mathematics research list using Lean-verified formal proofs, including two problems that had been unsolved for 56 years. The system also proved 44 conjectures from the On-Line Encyclopedia of Integer Sequences. All proofs were published on GitHub and are machine-checkable.

How does AI verify mathematical proofs using Lean?

Lean is a formal proof assistant that checks every logical step in a mathematical argument against a set of axioms. When an AI system like AlphaProof Nexus or Gauss proposes a proof, Lean independently confirms each inference is valid. If any step fails, the proof is rejected — eliminating the possibility of plausible-sounding but subtly incorrect arguments, which large language models are known to produce on their own.

What is a magnetar and how does it power superluminous supernovas?

A magnetar is a neutron star with a magnetic field roughly 1,000 times stronger than that of a typical neutron star. When the core of a massive star collapses and the star explodes, the resulting magnetar can spin hundreds of times per second, generating a powerful outflow of particles that dumps rotational energy into the surrounding debris and amplifies the explosion's brightness far beyond what the initial collapse alone could produce. NASA's Fermi telescope found support for this mechanism in SN 2017egm by detecting gamma rays consistent with the magnetar model.

Why do constructed languages activate the same brain regions as natural languages?

According to the 2025 PNAS study, the brain's language network responds to any symbolic system capable of expressing an open-ended range of meanings about the world — not to features such as how old the language is or how many people speak it. Constructed languages like Klingon and Esperanto meet this criterion, which is why they recruit the same left-lateralized frontal and temporal network. Programming languages, which cannot convey that open-ended range of meaning, recruit a different brain system.

ⓒ 2026 TECHTIMES.com All rights reserved. Do not reproduce without permission.
