NVIDIA Blackwell vs. Huawei Ascend: Did DeepSeek V4 prove China doesn't need Western silicon?
Every Saturday, I write a Deep Dive for my newsletter at getsuperintel.com. Given how important the China-US chip race has become, I'm publishing today's Deep Dive here on X as a full article. Yesterday, I promised to take a closer look at Huawei chips vs. NVIDIA and DeepSeek. Here it is. Enjoy the read.
For the better part of three years, the Western technology establishment slept soundly on a reassuring premise: China was hopelessly behind in AI chips, and export controls would keep it that way. Chris Miller's bestselling book "Chip War" painted a vivid and persuasive picture of a global semiconductor supply chain so intricate, so dependent on Western chokepoints, that Chinese self-sufficiency seemed a decade or more away. ASML's monopoly on extreme ultraviolet lithography, NVIDIA's stranglehold on AI training through its CUDA software ecosystem, and TSMC's unmatched manufacturing prowess formed what appeared to be an impenetrable triple lock.
Then, in April 2026, DeepSeek released V4, a 1.6-trillion-parameter Mixture-of-Experts model with 49 billion active parameters and a one-million-token context window. On selected coding and reasoning benchmarks, it approaches frontier-class performance, even though CAISI's May 2026 evaluation still places it roughly eight months behind the absolute frontier. The model is deeply optimized for Huawei's domestic Ascend chip ecosystem and confirmed to run on Huawei's latest Ascend 950 infrastructure for inference and deployment. While the full details of V4's training hardware remain ambiguous, with some reports suggesting pre-training still relied on NVIDIA GPUs (ChinaTalk, 04/27/2026), the strategic significance is clear: DeepSeek has built a frontier model that no longer depends on Western hardware to operate at scale, and that may soon no longer need it to train, either. It runs on Huawei's Ascend processors, manufactured domestically by China's SMIC foundry using equipment that Western analysts said could never produce chips this advanced.
The implications are staggering, and they demand an honest reckoning with a central question: How did China close a gap that was supposed to take 10 to 15 years, in roughly three?
The chip gap was real, but measured wrong
To understand what happened, you first need to understand what the "chip gap" actually meant, and where the framing went wrong. On the level of a single chip, Western superiority remains overwhelming. NVIDIA's current flagship, the Blackwell B200, is fabricated on TSMC's cutting-edge 4-nanometer process and delivers around 2,250 teraflops of computing power at BF16 precision, paired with 192 gigabytes of the latest HBM3e memory running at 8 terabytes per second of bandwidth.
Huawei's earlier domestic alternative, the Ascend 910C, illustrates the scale of the gap. Built on SMIC's optimized 7-nanometer process using older lithography tools, it manages roughly 700 teraflops and offers only 3.2 terabytes per second of memory bandwidth, roughly a third of the compute and less than half the bandwidth of a single B200. Huawei's newer Ascend 950 generation, which is now central to the DeepSeek V4 story, narrows the gap further but still appears to trail NVIDIA's most advanced chips significantly.
This is the metric much of the Western chip-control debate focused on, and on this metric, the diagnosis was largely correct. China remains one to two hardware generations behind. But here is where the Western analysis made a critical error: it assumed the chip-level gap would translate directly into a capability gap. It did not.
Brute Force at Scale
Huawei's answer to NVIDIA's chip-level dominance is what engineers call a "scale-out" strategy, and it is as elegant in concept as it is brutal in execution. Where NVIDIA's reference data center system, the GB200 NVL72, connects 72 Blackwell GPUs into a unified computing fabric delivering about 180 petaflops, Huawei simply built bigger. Its CloudMatrix 384 system packs 384 Ascend 910C chips into a densely interconnected cluster, delivering a theoretical 300 petaflops of BF16 compute, roughly 1.7 times the NVIDIA system's raw output. It also offers 3.6 times the aggregate memory capacity and 2.1 times the total memory bandwidth.
The trade-off is enormous. A single NVIDIA NVL72 rack consumes about 145 kilowatts. The Huawei CloudMatrix 384 devours 560 kilowatts, nearly four times as much, which by the figures above makes it roughly 2.3 times less energy-efficient per unit of theoretical compute. In any normal commercial context, this would be economic suicide. No Western cloud provider would willingly operate hardware this inefficient when cheaper, more performant alternatives exist.
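The system-level comparison is simple arithmetic, and it is worth checking. The sketch below uses only the peak-compute and power figures quoted in this section; the variable names are mine, and theoretical peak numbers say nothing about sustained real-world throughput.

```python
# Figures as quoted in the text (theoretical BF16 peak and system power draw).
NVL72_PFLOPS, NVL72_KW = 180.0, 145.0
CM384_PFLOPS, CM384_KW = 300.0, 560.0

compute_ratio = CM384_PFLOPS / NVL72_PFLOPS   # raw throughput advantage of the larger system
power_ratio = CM384_KW / NVL72_KW             # how much more power it draws
# Kilowatts needed per petaflop, Huawei relative to NVIDIA.
kw_per_pflop_ratio = (CM384_KW / CM384_PFLOPS) / (NVL72_KW / NVL72_PFLOPS)

print(f"compute: {compute_ratio:.2f}x, power: {power_ratio:.2f}x, "
      f"kW per PFLOP: {kw_per_pflop_ratio:.2f}x")
```

On these numbers the Huawei system delivers about 1.7 times the compute for about 3.9 times the power, which works out to roughly 2.3 times the energy per unit of theoretical compute.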
But China is not operating under normal commercial logic. The development of domestic AI infrastructure is treated as a matter of national sovereignty. State-backed telecommunications giants and government investment funds subsidize the astronomical energy costs. When the goal is strategic independence from a hostile technology embargo, electricity bills become a secondary variable.
Software Ate the Hardware Gap
The CUDA moat falls?
The brute-force hardware story only gets you halfway to an explanation. Even with 384 chips wired together, you still need software sophisticated enough to orchestrate them. This was supposed to be NVIDIA's second, even more durable advantage: its CUDA software platform, the invisible infrastructure that makes AI training on NVIDIA hardware almost effortless and that locked in developers through massive switching costs.
Huawei's alternative, called CANN (Compute Architecture for Neural Networks), was for years considered unstable and painful to use. Training runs on Huawei clusters frequently crashed. Hardware utilization rates hovered around a dismal 60 percent, meaning 40 percent of the expensive compute was being lost to coordination failures and software bugs.
DeepSeek V4 is the proof that this barrier has been overcome. DeepSeek engineers worked directly with Huawei to write custom software kernels, specifically designed for the Ascend chip's architecture, that overlap computation, memory access, and network communication simultaneously. These optimizations pushed hardware utilization from 60 percent to over 85 percent, fundamentally changing the economics of Chinese AI clusters.
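To see why a utilization jump matters so much, here is a minimal sketch that treats useful throughput as peak compute multiplied by utilization. The 300-petaflop peak is the CloudMatrix 384 figure quoted earlier and is used purely for illustration; the function name is hypothetical.

```python
def effective_pflops(peak_pflops: float, utilization: float) -> float:
    """Useful throughput: theoretical peak times the fraction actually used."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_pflops * utilization

PEAK = 300.0  # theoretical system peak quoted earlier (illustrative only)
before = effective_pflops(PEAK, 0.60)
after = effective_pflops(PEAK, 0.85)
gain = after / before

print(f"before: {before:.0f} PFLOPS, after: {after:.0f} PFLOPS, gain: {gain:.2f}x")
```

Going from 60 to 85 percent is a gain of about 42 percent in useful compute from the same hardware and the same electricity bill.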
Algorithmic genius as compensation
But the truly revolutionary contribution of DeepSeek V4 is not the hardware adaptation. It is the model architecture itself, a masterclass in using software innovation to compensate for hardware limitations.
The model employs a Mixture-of-Experts (MoE) architecture. While it has 1.6 trillion total parameters, only 49 billion, roughly 3 percent, are activated for any given computation. The network consists of hundreds of specialized sub-networks, or "experts," each trained for specific tasks like mathematical reasoning, Chinese grammar, or Python code generation. A dynamic routing system decides which experts to engage for each input token. The result is a model with the knowledge capacity of a 1.6-trillion-parameter giant but the computational cost of something far smaller.
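The routing idea can be sketched in a few lines. This is a generic top-k Mixture-of-Experts toy, not DeepSeek's actual router: the sizes, the random weights, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def top_k_route(token: np.ndarray, router_w: np.ndarray, k: int):
    """Return indices and normalized gate weights of the k best-scoring experts."""
    logits = router_w @ token                    # one score per expert
    top = np.argsort(logits)[-k:][::-1]          # k highest scores, best first
    gates = np.exp(logits[top] - logits[top].max())
    return top, gates / gates.sum()              # softmax over the selected experts only

def moe_forward(token, router_w, expert_ws, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    top, gates = top_k_route(token, router_w, k)
    mixed = sum(g * (expert_ws[i] @ token) for i, g in zip(top, gates))
    return mixed, top

rng = np.random.default_rng(0)
d_model, n_experts, k = 8, 16, 2                 # toy sizes, not V4's real configuration
router_w = rng.normal(size=(n_experts, d_model))
expert_ws = rng.normal(size=(n_experts, d_model, d_model))
token = rng.normal(size=d_model)

out, used = moe_forward(token, router_w, expert_ws, k)
active_fraction = 49e9 / 1.6e12                  # parameter counts quoted in the text

print(f"experts used: {sorted(used.tolist())} of {n_experts}")
print(f"active share of V4 parameters: {active_fraction:.1%}")
```

Only the two selected expert matrices are ever multiplied; the other fourteen cost memory but no compute for this token. That is the whole trick, scaled up.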
Earlier MoE systems suffered from a problem called "routing collapse," where a few popular experts got overwhelmed while others sat idle. DeepSeek solved this with what they call "Anticipatory Routing," computing expert assignments asynchronously in advance using slightly older network weights. This decouples the routing decision from the critical computation path and dramatically stabilizes training (DeepSeek-AI, Technical Report, 04/2026).
The team also deployed the Muon optimizer, a departure from the AdamW optimizer used across virtually the entire Western AI industry. Muon works by orthogonalizing the update applied to each weight matrix during training, so that no single direction dominates a step. This helps prevent the kind of unstable updates that can cause training to collapse, a risk that is especially acute on less reliable hardware.
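The core operation behind this kind of optimizer is turning an update matrix into an approximately orthogonal one. The sketch below uses the classic cubic Newton-Schulz iteration as a simplified stand-in; publicly described Muon implementations use a tuned polynomial variant with fewer steps, and nothing here is taken from DeepSeek's code. The function name and the example matrix are hypothetical.

```python
import numpy as np

def orthogonalize(update: np.ndarray, steps: int = 20) -> np.ndarray:
    """Approximate the nearest semi-orthogonal matrix with Newton-Schulz steps."""
    # Dividing by the Frobenius norm keeps every singular value at or below 1,
    # which is inside the region where the iteration converges.
    x = update / (np.linalg.norm(update) + 1e-12)
    transposed = x.shape[0] > x.shape[1]
    if transposed:                               # iterate on the wide orientation
        x = x.T
    for _ in range(steps):
        x = 1.5 * x - 0.5 * (x @ x.T) @ x        # pushes each singular value toward 1
    return x.T if transposed else x

# A stand-in for the accumulated gradient (momentum) of one weight matrix.
momentum = np.array([[3.0, 1.0, 0.0],
                     [1.0, 2.0, 1.0]])
step = orthogonalize(momentum)

print(np.round(step @ step.T, 6))                # close to the identity matrix
```

After orthogonalization, every direction in the update carries equal weight, regardless of how lopsided the raw gradient was.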
Perhaps most impressively, DeepSeek introduced FP4 quantization-aware training. While most AI labs train their models in 16-bit or 8-bit numerical precision, DeepSeek trained its expert weights in just 4-bit precision. Because each expert handles only a narrow domain, this extreme compression works without meaningful quality loss, and it dramatically reduces memory bandwidth consumption, precisely the resource where Huawei's chips are most disadvantaged relative to NVIDIA.
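To make 4-bit precision concrete, here is a minimal sketch of the "fake quantization" step that quantization-aware training applies in the forward pass. It assumes the common E2M1 4-bit float layout and a single scale per tensor; the text does not say which FP4 variant or scaling scheme DeepSeek used, so treat both as assumptions. A real training loop would also keep higher-precision master weights and pass gradients through the rounding step.

```python
import numpy as np

# Non-negative magnitudes representable in the E2M1 4-bit float format.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quantize_fp4(weights: np.ndarray) -> np.ndarray:
    """Snap each weight to the nearest scaled FP4 value, keeping float storage."""
    max_abs = np.abs(weights).max()
    if max_abs == 0.0:
        return weights.copy()
    scale = max_abs / E2M1_GRID[-1]              # map the largest weight onto 6.0
    scaled = np.abs(weights) / scale
    nearest = np.abs(scaled[..., None] - E2M1_GRID).argmin(axis=-1)
    return np.sign(weights) * E2M1_GRID[nearest] * scale

w = np.array([0.9, -0.31, 0.07, -1.2, 0.6])
w_q = fake_quantize_fp4(w)

BYTES_PER_PARAM_BF16 = 2.0                       # 16 bits
BYTES_PER_PARAM_FP4 = 0.5                        # 4 bits, ignoring scale metadata
traffic_ratio = BYTES_PER_PARAM_BF16 / BYTES_PER_PARAM_FP4

print("original: ", w)
print("quantized:", np.round(w_q, 3))
print(f"bytes moved per weight shrink by {traffic_ratio:.0f}x")
```

Each weight lands on one of only fifteen possible values, yet every weight that has to be streamed from memory now costs a quarter of the bandwidth.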
The cumulative effect of these innovations is staggering. DeepSeek V4-Pro can process contexts of one million tokens, the equivalent of 15 to 20 full novels, while requiring only 27 percent of the compute and 10 percent of the memory cache compared to its predecessor, DeepSeek V3.2.
The Lithography Question: Did China Copy ASML?
The question of how SMIC (Semiconductor Manufacturing International Corporation, the largest and most advanced pure-play semiconductor foundry in mainland China) manufactures advanced chips without access to ASML's extreme ultraviolet (EUV) lithography machines is perhaps the most technically fascinating part of this story. EUV uses light with a wavelength of 13.5 nanometers to print transistor patterns onto silicon wafers. It is widely considered essential for manufacturing chip features below 7 nanometers at commercial yields, and the Netherlands has blocked its export to China since 2019.
SMIC's workaround is a technique called Self-Aligned Quadruple Patterning (SAQP). Since the older deep ultraviolet (DUV) light it has access to, at 193 nanometers, is too coarse to draw fine features in a single pass, SMIC prints a coarser pattern and then multiplies its density fourfold through two successive rounds of spacer deposition and etching, each of which must be controlled with extraordinary precision. The result is structures equivalent to 7-nanometer and, as of late 2025, even 5-nanometer-class processes. Independent analysis by TechInsights confirmed that Huawei's Kirin 9030 uses SMIC's N+3 process, a scaled evolution of its 7nm-class technology that shows how close SMIC is getting to 5nm-class manufacturing without EUV, while still remaining meaningfully behind leading commercial 5nm nodes from TSMC and Samsung (TechInsights, 12/11/2025).
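A back-of-the-envelope calculation shows why quadruple patterning can stand in for EUV on paper. The sketch uses the Rayleigh criterion, under which the smallest printable half-pitch is k1 times the wavelength divided by the numerical aperture. The aperture values (1.35 for immersion DUV, 0.33 for current EUV scanners) and the process factor k1 = 0.3 are my assumptions, not figures from the sources cited here, and real processes depend on far more than this formula.

```python
def half_pitch_nm(wavelength_nm: float, numerical_aperture: float, k1: float = 0.3) -> float:
    """Rayleigh criterion: smallest printable half-pitch for a single exposure."""
    return k1 * wavelength_nm / numerical_aperture

duv_pitch = 2 * half_pitch_nm(193.0, 1.35)   # immersion DUV, single exposure
euv_pitch = 2 * half_pitch_nm(13.5, 0.33)    # EUV, single exposure
saqp_pitch = duv_pitch / 4                   # quadruple patterning divides the pitch by four

print(f"DUV single exposure: {duv_pitch:.1f} nm pitch")
print(f"DUV with SAQP:       {saqp_pitch:.1f} nm pitch")
print(f"EUV single exposure: {euv_pitch:.1f} nm pitch")
```

In this idealized arithmetic, quadruple-patterned DUV reaches a line pitch in the same range as a single EUV exposure. What the formula hides is the price: many more process steps per layer, each a fresh chance for a defect.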
The catch is yield. SMIC's multi-patterning approach produces catastrophic defect rates, with only 30 to 40 percent of chips coming off the line in working condition. For comparison, TSMC achieves yields above 80 percent with its EUV processes. Each wafer takes longer to produce, the machinery wears out faster, and the cost per working chip is astronomical. For any company operating in a free market, this approach would mean bankruptcy. For China, it is a matter of state policy: hundreds of billions of yuan in subsidies from government investment funds absorb the losses.
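The yield penalty translates directly into cost per working chip. In the sketch below, the wafer cost and the number of chips per wafer are hypothetical placeholders that cancel out of the ratio; only the yield figures come from the text, with 35 percent taken as the midpoint of the quoted range.

```python
def cost_per_good_chip(cost_per_wafer: float, chips_per_wafer: int, yield_rate: float) -> float:
    """Spread the wafer cost over only the chips that actually work."""
    if not 0.0 < yield_rate <= 1.0:
        raise ValueError("yield_rate must be in (0, 1]")
    return cost_per_wafer / (chips_per_wafer * yield_rate)

WAFER_COST = 10_000.0    # hypothetical, same for both foundries
CHIPS_PER_WAFER = 60     # hypothetical

smic = cost_per_good_chip(WAFER_COST, CHIPS_PER_WAFER, 0.35)
tsmc = cost_per_good_chip(WAFER_COST, CHIPS_PER_WAFER, 0.80)
penalty = smic / tsmc

print(f"cost per good chip at 35% yield: {smic:.0f}")
print(f"cost per good chip at 80% yield: {tsmc:.0f}")
print(f"yield penalty alone: {penalty:.2f}x")
```

Yield alone makes each working chip about 2.3 times as expensive, and that assumes equal wafer costs. Since multi-patterned wafers take longer and cost more to process, the real gap is likely wider.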
China's EUV Manhattan Project
In the long term, the DUV workaround has a ceiling. Pushing beyond the current 5nm-class toward the 3nm and emerging 2nm frontier becomes dramatically harder without EUV. Each additional patterning step adds cost, defect risk, and cycle time, and the economics deteriorate rapidly. DUV can be stretched further, but not indefinitely, and not competitively.
An ASML EUV machine costs over 370 million dollars, weighs more than 180 tons, contains over 100,000 specialized components, and requires three Boeing 747 cargo planes to transport. The precision of its mirror system, supplied by Germany's Carl Zeiss, operates at tolerances measured in picometers, a fraction of the width of a single atom. You cannot reverse-engineer this from a blueprint. The knowledge is embedded in people.
China has pursued exactly this route: the people. Reporting from late 2025 revealed that China had initiated a classified research program of extraordinary scale, internally compared to the Manhattan Project (Reuters, 12/17/2025). Under high-level political coordination, a secured laboratory in Shenzhen produced a functioning EUV prototype in early 2025. The effort relied heavily on recruiting former ASML engineers, including key figures from the company's light-source development division, with signing bonuses reportedly reaching up to $700,000. Within 18 months, one recruited team filed eight critical EUV-related patents.
The prototype is far from commercially viable. It fills nearly an entire factory hall, uses secondary-market optics from Nikon and Canon rather than Zeiss-grade components, and achieves only about 3.4 percent conversion efficiency, far too low for high-volume manufacturing. Still, it marks an important proof-of-concept milestone. Western intelligence agencies, which had projected a Chinese EUV machine for 2035 at the earliest, were caught off guard. The timeline has compressed by nearly a decade, with Chinese officials targeting functional EUV chip production by 2028 to 2030.
A preliminary verdict
The evidence leads to a clear, if uncomfortable, set of conclusions. DeepSeek V4 is not a benchmark stunt. On selected coding tasks, V4-Pro is highly competitive. It achieves 80.6% on the SWE-bench Verified coding benchmark, essentially matching Claude Opus 4.6 at 80.8%, and surpasses it on LiveCodeBench with 93.5% versus 88.8%, though real-world usage does of course differ from benchmarks. It accomplishes this while offering API prices 90 to 97 percent lower than Western equivalents, a cost advantage driven not by predatory pricing but by genuine architectural efficiency.
China did not close the chip gap. It went around it! The hardware remains inferior chip-for-chip, but radical system-level scaling, extraordinary software innovation, state-subsidized energy costs, and a willingness to accept manufacturing inefficiencies that would destroy any commercial enterprise combined to produce an outcome that the sanctions were specifically designed to prevent.
The sanctions paradox
The deepest irony of this story is that the export controls may have accelerated the very outcome they sought to prevent. Before October 2022, Chinese AI labs were happy NVIDIA customers, content to buy American hardware and train their models on CUDA. The sanctions forced them into an uncomfortable but ultimately productive marriage with Huawei, compelled DeepSeek to invent algorithmic solutions to hardware problems, and gave the Chinese government the political mandate to pour unlimited resources into semiconductor independence.
Chris Miller's analysis in "Chip War" was not wrong about the physics. EUV lithography is genuinely hard, and NVIDIA's chips are genuinely superior. What it underestimated was the degree to which software innovation, system-level engineering, and state-directed economic irrationality could neutralize those advantages in practice. The 10-to-15-year gap was measured in hardware generations. China's response was to make the hardware generation gap matter less.
The question going forward is not whether China can match NVIDIA chip for chip. It probably cannot, at least not soon. The question is whether chip-for-chip superiority still matters when the competition is being fought on a different axis entirely, one where algorithmic efficiency, system architecture, and political will have proven to be just as decisive as nanometers and transistors.
The West built a fortress around its silicon. China built a ladder out of software, and climbed over the wall.
A few final words and personal views
The future of AI infrastructure is more open than anyone in Washington or Silicon Valley assumed even 12 months ago, and the comfortable narrative of permanent Western dominance no longer holds. What we are watching is the emergence of a genuine two-player race between the US and China, one that will be fought across hardware, software, and industrial policy simultaneously, with escalating intensity on both sides. Europe, absent any frontier chip design capability or hyperscaler of its own, risks being reduced to a spectator in this contest. But one European lever remains decisive: as long as ASML remains the only supplier of production-grade EUV lithography, Europe is not merely watching the game. It holds one of the few choke points that still shapes the board.
P.S. This text is essentially the answer to my open question:
TechInsights: SMIC N+3 Confirmed, Kirin 9030 Analysis (12/11/2025) https://www.techinsights.com/blog/smic-n3-confirmed-kirin-9030-analysis-reveals-how-close-smic-5nm
Reuters (via Modern Diplomacy): Inside China's Secret Push to Build Its Own EUV Chip Machine (12/17/2025) https://moderndiplomacy.eu/2025/12/18/inside-chinas-secret-push-to-build-its-own-euv-chip-machine/ (Original Reuters article is paywalled; this is the most complete openly accessible version citing Reuters directly)
MIT Technology Review: Three Reasons Why DeepSeek's New Model Matters (04/24/2026) https://www.technologyreview.com/2026/04/24/1136422/why-deepseeks-v4-matters/
NIST/CAISI Evaluation of DeepSeek V4 Pro (05/02/2026) https://www.nist.gov/news-events/news/2026/05/caisi-evaluation-deepseek-v4-pro
EE Times: China EUV Breakthrough and the Rise of the 'Silicon Curtain' (12/23/2025) https://www.eetimes.com/china-euv-breakthrough-and-the-rise-of-the-silicon-curtain/
Asia Times: Made-in-China EUV Machine Targets AI Chip Output by 2028 (12/24/2025) https://asiatimes.com/2025/12/made-in-china-euv-machine-targets-ai-chip-output-by-2028/