Google’s AI Cracks Decades-Old Math Problems — But DeepMind Says AGI Is Still Far Away
AI technology is moving quickly beyond image generators and chatbots into fields long considered the preserve of human genius, mathematics among them. Google DeepMind claims that its AlphaProof Nexus system has solved several open problems posed by the mathematician Paul Erdős and that every solution has been formally verified. The announcement has sparked global debate about the reliability of AI, hallucinated mathematical proofs, and the future of AGI. Even so, DeepMind CEO Demis Hassabis does not expect AGI to arrive anytime soon.

How AlphaProof Nexus Resolved 56-Year-Old Erdős Problems
Google DeepMind’s AlphaProof Nexus drew worldwide attention after its developers confirmed that the AI had independently found solutions to nine Erdős problems that had puzzled researchers for 56 years. Erdős problems are notoriously difficult mathematical challenges proposed or developed by the renowned mathematician Paul Erdős. According to Google DeepMind researchers, earlier AI systems only helped mathematicians solve problems, whereas this time the AI produced its own proofs and refined them until they were correct. Besides the nine Erdős problems, AlphaProof Nexus has reportedly resolved 44 open conjectures from the OEIS (On-Line Encyclopedia of Integer Sequences) and made advances in algebraic geometry and optimisation theory. Perhaps what surprised researchers most was how inexpensive the process was: the computational cost of solving a single problem was only a couple of hundred dollars. This reflects a new trend in applying artificial intelligence to scientific discovery, in which AI is no longer just an assistant but can generate original mathematical reasoning.
The Main Challenge with AI-created Mathematics: Hallucinated Proofs
AI systems generate impressive results, but they also create a major problem known as “hallucinated proofs.” This occurs when an AI produces an argument that sounds mathematically correct but is actually wrong because a logical step is missing or flawed. Some AI models invent false lemmas, misuse true ones, or simply skip the hardest steps of a proof and still declare it complete. Because advanced mathematics is written in dense, specialised language, it can be difficult for a human reader to spot the flawed reasoning in an AI-generated proof. The debate over AI-created mathematics has intensified as AI companies, Google DeepMind among them, claim to have solved historically important mathematical problems. These claims have made it clear that informal human inspection is no longer sufficient to guarantee the correctness of AI-created mathematics, so much greater emphasis is being placed on formal verification systems. A mathematical proof must be logically correct with complete certainty; “mostly correct” is not enough. If AI systems keep producing hallucinated proofs, the scientific community will lose trust in their results. Formal verification addresses this by mechanically checking every step of an AI-generated argument rather than relying on anyone’s judgment alone.

How Google Built AlphaProof Nexus on Lean, a Tool for Fully Checking Generated Proofs
To increase confidence in AI-generated proofs and reduce errors, DeepMind built on Lean, an established proof assistant designed to check mathematical proofs to extremely high standards of rigor and detail. Unlike earlier approaches to validating AI-generated proofs, Lean checks each mathematical claim produced by the AI step by step against strict logical rules. If an AI-generated proof relies on incorrect or unsupported assumptions, skips reasoning steps, or invents false lemmas to reach a conclusion, Lean rejects the entire proof. Reportedly, DeepMind combined Lean with Gemini 3.1 Pro, an AI model that supplies the reasoning capability, so that the AI proposes proof ideas while Lean formally verifies them. This combination is a significant breakthrough because it removes many sources of uncertainty about AI-generated proofs: instead of relying solely on professional mathematicians to review long and complex arguments, formal verification in Lean ensures that every step of a proof follows from established rules of mathematical logic. Formal verification of this kind is expected to change how mathematicians conduct research, letting them concentrate on the parts of a problem that remain unsolved rather than re-checking parts that already have a verified solution. Reportedly, even when AlphaProof Nexus fails to produce a complete proof, mathematicians can still use its verified proof sketches to gain further insight into the problem.
These developments show how AI combined with formal verification could change the way scientists collaborate in the not-too-distant future.
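As a rough illustration of the kind of check a proof assistant performs, the following Lean 4 snippet shows a proof the checker accepts, a placeholder it flags as incomplete, and a false claim it would reject. The theorem names are made up for this example and the statements are deliberately trivial; nothing here is taken from AlphaProof Nexus or from DeepMind's proofs.

```lean
-- Illustrative only: toy statements, not taken from AlphaProof Nexus.

-- A complete proof: Lean accepts it because the cited library lemma
-- `Nat.add_comm` states exactly this fact.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete claim that Lean confirms by evaluating both sides.
example : 2 + 2 = 4 := rfl

-- A skipped step: `sorry` stands in for a missing proof. Lean compiles
-- the file but warns that the declaration uses `sorry`, so the gap
-- cannot pass as a finished proof.
theorem toy_gap (a b : Nat) : a * b = b * a := by
  sorry

-- A false claim: uncommenting the next line produces an error, because
-- the two sides evaluate to different numbers.
-- example : 2 + 2 = 5 := rfl
```

In a pipeline of the sort described here, a language model would presumably propose proof text, and only output that passes the checker with no errors and no `sorry` warnings would count as verified.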
If AI Is Solving Math Problems Successfully, Is It Already AGI?
AlphaProof Nexus has prompted heated debate about how close AI now is to artificial general intelligence (AGI). According to Demis Hassabis, however, solving complex mathematical problems does not mean that AI has reached human-level intelligence. Commenting on the achievement, Hassabis said that today's AI remains highly specialized: it is a useful tool, but it lacks the creativity and ingenuity of a great mathematician such as Srinivasa Ramanujan. While the new system can handle vast amounts of information and produce formalized proofs, it still works within the narrow limits of predefined tasks. According to researchers, AGI would require an intelligent machine that can learn and adapt across many fields with human-like thinking and imagination. There is a clear difference between performing well on specific tasks and having the flexible intelligence to work with any kind of information and generate creative ideas.
Conclusion
The development of AI systems such as AlphaProof Nexus shows how quickly artificial intelligence is advancing and how it could transform advanced scientific research. Impressive as this progress is, some scientists argue that true AGI requires human-like creativity and originality. AI might not replace mathematicians anytime soon, but it has certainly become their strongest tool.







