Gate News, April 24 — DeepSeek V4-Pro and DeepSeek V4-Flash were officially released and open-sourced on April 24, with the context window expanded from 128K to 1M tokens, roughly an 8-fold increase. Huawei Computing announced that its Ascend supernode products fully support the DeepSeek V4 series through close chip-model collaboration.
Huawei's Ascend 950 uses fused kernels and multi-stream parallelism to cut attention computation and memory-access overhead, enabling high-throughput, low-latency inference deployment of DeepSeek V4 models. For DeepSeek V4-Pro with 8K inputs, the Ascend 950 achieves roughly 20ms time per output token (TPOT) with 4,700 TPS single-card decode throughput; for DeepSeek V4-Flash under the same 8K inputs, it reaches roughly 10ms TPOT with 1,600 TPS. The Ascend A3 supernode series is also fully compatible, and training reference implementations are provided for rapid fine-tuning. On an Ascend A3 64-card supernode in large expert-parallel (EP) mode, DeepSeek V4-Flash exceeds 2,000 TPS single-card decode throughput in 8K/1K input/output scenarios using the vLLM inference engine. Huawei's full Ascend A2, A3, and 950 product lines support both DeepSeek V4-Flash and V4-Pro.
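The TPOT and throughput figures above can be sanity-checked with simple arithmetic: at steady state, each decode request emits one token per TPOT interval, so dividing aggregate single-card TPS by the per-request token rate gives the implied number of concurrent requests per card. A minimal sketch in Python; the concurrency estimates are our own derivation under this steady-state assumption, not figures from the announcement:

```python
# Sanity-check the reported decode figures. Assumption: steady-state decoding,
# where each request produces exactly 1 token per TPOT interval, so
# implied concurrency = aggregate TPS / (1 / TPOT).

def implied_concurrency(tpot_s: float, aggregate_tps: float) -> float:
    """Estimate concurrent decode requests from TPOT and aggregate throughput."""
    per_request_tps = 1.0 / tpot_s      # tokens/s a single request can receive
    return aggregate_tps / per_request_tps

# Reported Ascend 950 figures (8K input):
pro = implied_concurrency(0.020, 4700)    # V4-Pro: ~20ms TPOT, 4,700 TPS
flash = implied_concurrency(0.010, 1600)  # V4-Flash: ~10ms TPOT, 1,600 TPS

print(f"V4-Pro:   ~{pro:.0f} concurrent requests per card")
print(f"V4-Flash: ~{flash:.0f} concurrent requests per card")
```

Under this reading, the V4-Pro figure implies on the order of 94 concurrent requests per card and the V4-Flash figure about 16, i.e. the headline TPS numbers describe batched serving throughput, not single-request speed.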
Huawei Cloud announced first-mover compatibility with DeepSeek V4, offering developers one-click API token services through its MaaS platform. Huawei Cloud optimized capabilities at the system, operator, and cluster layers to ensure rapid model adaptation and high-performance deployment. Enterprises including Kingsoft WPS and 360 have already integrated DeepSeek's new models via Huawei Cloud.
Cambricon also announced Day-0 compatibility with DeepSeek V4-Flash and V4-Pro based on the vLLM inference framework, with the adaptation code open-sourced on GitHub. Cambricon previously achieved first-mover adaptation when DeepSeek V3.2 was released last year, and has carried out deep software-hardware co-optimization of the DeepSeek model series.