DeepSeek V4 Achieves Perfect Score on Putnam-2025, Ties with Axiom in Formal Math Reasoning

Gate News, April 24: DeepSeek has published formal mathematical reasoning results for V4, which scored a perfect 120/120 on Putnam-2025 and tied with Axiom for first place.

In the practical regime, which uses LeanExplore and constrained sampling, V4-Flash-Max scored 81.00 on the Putnam-200 Pass@8 benchmark, well ahead of Seed-2.0-Prover (35.50), Gemini 3 Pro (26.50), and Seed-1.5-Prover (26.50). In the frontier regime, V4 finished ahead of Seed-1.5-Prover (110/120) and Aristotle (100/120).
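The report does not say how the Pass@8 figures were computed. As a rough illustration only, the sketch below assumes the score is the percentage of problems solved at least once within 8 attempts, and uses the commonly cited unbiased pass@k estimator for the general case where more than k samples are drawn. The function names and the toy inputs are hypothetical, not taken from DeepSeek's evaluation.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of them correct.

    Assumption: this is the standard unbiased estimator
    1 - C(n - c, k) / C(n, k); the article does not confirm it was used.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, reported on a 0-100 scale.

    `results` holds one hypothetical (n_samples, n_correct) pair per problem.
    """
    per_problem = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(per_problem) / len(per_problem)


# Toy data: two problems, 8 attempts each, one solved once and one never.
toy_results = [(8, 1), (8, 0)]
toy_score = benchmark_score(toy_results, k=8)
```

With exactly 8 samples per problem, the estimator reduces to "solved at least once", so under this assumption a score of 81.00 on a 200-problem set would correspond to 162 problems solved.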

V4 employs a hybrid formal-informal reasoning approach: informal reasoning generates candidate natural-language solutions, self-verification filters them, and a formal agent completes rigorous proofs in Lean. The frontier results relied on large-scale compute, while the practical-regime scores better reflect what the model can do in standard deployment.
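The three stages described above can be sketched as a simple control loop. This is a minimal illustration under stated assumptions, not DeepSeek's implementation: the callables `generate`, `self_verify`, and `formalize`, the confidence `threshold`, and the toy stand-ins are all hypothetical, and a real system would call a language model and a Lean checker where the stubs sit.

```python
from typing import Callable, List, Optional


def hybrid_prove(
    problem: str,
    generate: Callable[[str], List[str]],
    self_verify: Callable[[str, str], float],
    formalize: Callable[[str, str], Optional[str]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return a formal proof string, or None if no candidate can be formalized."""
    # Stage 1: informal reasoning proposes natural-language solutions.
    candidates = generate(problem)
    # Stage 2: self-verification scores each candidate; weak ones are dropped.
    scored = [(self_verify(problem, sketch), sketch) for sketch in candidates]
    kept = sorted(
        (pair for pair in scored if pair[0] >= threshold),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Stage 3: a formal agent tries to turn each survivor into a checked proof.
    for _, sketch in kept:
        proof = formalize(problem, sketch)
        if proof is not None:
            return proof
    return None


# Hypothetical stand-ins so the sketch runs end to end.
def toy_generate(problem: str) -> List[str]:
    return ["wrong idea", "induct on n"]


def toy_verify(problem: str, sketch: str) -> float:
    return 0.9 if "induct" in sketch else 0.1


def toy_formalize(problem: str, sketch: str) -> Optional[str]:
    return "theorem toy : True := trivial" if "induct" in sketch else None


toy_proof = hybrid_prove("toy problem", toy_generate, toy_verify, toy_formalize)
```

Filtering before formalization reflects the ordering in the description: the cheap informal check prunes candidates so the more expensive formal step runs only on the promising ones.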
