# Snowflake CEO's hands-on test: GLM-5.2 and Opus 4.7 are close in coding ability, at a fraction of the cost

- Source: The Decoder: AI News (RSS)
- Author: Matthias Bastian
- Published: 2026-06-25 01:07
- AIHOT score: 59
- AIHOT link: https://aihot.virxact.com/items/cmqscu32y01yqslfun3zveqe2
- Original link: https://the-decoder.com/snowflake-ceo-finds-glm-5-2-competitive-with-opus-4-7-at-a-fraction-of-the-cost

## AI Summary

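Snowflake's internal benchmark shows that, with three attempts per task, GLM-5.2 solved 66% of the programming problems and Anthropic's Opus 4.7 solved 67%, a near tie. First-attempt accuracy was 53.7% for Opus and 47.6% for GLM. GLM averaged 99 iterations per task and consumed 860 million tokens, versus 80 iterations and 439 million tokens for Opus. On cost, GLM-5.2 output tokens are priced at $4.40 per million, far below Opus's $25 and GPT-5.5's $30, and its input tokens cost just $1.40 per million. GLM has weaknesses such as giving up too early and over-checking, but its pricing advantage could put pressure on the high valuations of Western AI companies.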

## Full Text

Snowflake CEO finds GLM-5.2 competitive with Opus 4.7 at a fraction of the cost

Matthias Bastian

Jun 24, 2026

Key Points

- In a real-world programming benchmark conducted by Snowflake, the Chinese AI model GLM-5.2 and Anthropic's Opus 4.7 performed nearly identically when given three attempts per task, solving 66 and 67 percent of problems, respectively.
- Opus holds an edge on first-attempt accuracy at 53.7 percent versus GLM's 47.6 percent, and is more efficient overall: GLM requires an average of 99 iterations per task compared to 80 for Opus and consumes nearly twice as many tokens.
- Despite these efficiency gaps, GLM-5.2 is dramatically cheaper at $4.40 per million output tokens, creating significant price pressure that could challenge the high valuations of Western AI companies like OpenAI.

Snowflake compared GLM-5.2 and Opus 4.7 in a hands-on benchmark. The Chinese model held its own.

The test covered 103 tasks, each run three times, where models had to write code that works on both DuckDB and Snowflake. When each model got three attempts per task, the two were neck and neck: 66% vs. 67% of tasks solved.

First-attempt accuracy diverges: Opus hit 53.7%, GLM only 47.6%, showing GLM's output is less consistent. The Chinese model also averaged 99 iterations per task versus Opus's 80 and burned through 860 million tokens, nearly double Opus's 439 million.

Opus 4.7 is the better model, but GLM is competitive in Snowflake's code benchmark and costs far less. | Image: via X

GLM's strength is validating code reliably across both platforms (DuckDB and Snowflake) at the same time. According to Snowflake CEO Sridhar Ramaswamy, that's why only GLM could solve certain tasks.

Its weaknesses are giving up too early and obsessively checking the wrong things. On one task, GLM fired off 411 tool calls in 24 minutes, checking row counts, distributions, null values, and column types, and still failed all three attempts. Opus solved the same task with 49 calls in 9 minutes.

The claim that GLM produces cleaner code didn't hold up, Ramaswamy said. More checks don't lead to more correct results. Still, the team is excited about GLM-5.2 and wants to make it available to customers.

### China's pricing puts real pressure on the Western AI bubble

The results matter most in the context of price. GLM-5.2 costs $1.40 per million input tokens and $4.40 per million output tokens, according to Zhipu's official price sheet. Some third-party providers undercut Zhipu's price even further. Claude Opus 4.7 runs $5 input and $25 output. GPT-5.5 costs $5 input and $30 output.

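| Model | Input ($/1M tokens) | Cached Input ($/1M tokens) | Output ($/1M tokens) |
| --- | --- | --- | --- |
| GLM-5.2 | $1.40 | $0.26 | $4.40 |
| Claude Opus 4.7 | $5.00 | $0.50 (Cache Hit) | $25.00 |
| GPT-5.5 | $5.00 | $0.50 | $30.00 |
| GPT-5.4 | $2.50 | $0.25 | $15.00 |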

GLM's higher token usage eats into that price gap somewhat. Even so, Anthropic and OpenAI face serious pricing pressure, and it lands squarely on coding, the flagship use case both Western AI labs are betting on.
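To get a rough sense of how much the extra tokens narrow the gap, the quoted prices and token counts can be combined in a back-of-the-envelope calculation. This is only a sketch under stated assumptions: the article does not report how tokens split between input and output, so `output_share` is a hypothetical parameter, both models are assumed to share the same split, the token counts are read as benchmark-wide totals, and cached-input discounts are ignored.

```python
# Prices in USD per million tokens, as quoted in the article.
PRICES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "Claude Opus 4.7": {"input": 5.00, "output": 25.00},
}

# Token consumption in millions, as quoted in the article
# (read here as benchmark-wide totals, which is an assumption).
TOKENS_MILLIONS = {"GLM-5.2": 860, "Claude Opus 4.7": 439}


def cost_usd(model: str, output_share: float) -> float:
    """Estimate benchmark cost for a model.

    `output_share` is a hypothetical parameter: the fraction of tokens billed
    at the output price. The article does not report the real split, and
    cached-input pricing is ignored.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    price = PRICES[model]
    blended = (1 - output_share) * price["input"] + output_share * price["output"]
    return TOKENS_MILLIONS[model] * blended


def glm_to_opus_ratio(output_share: float) -> float:
    """GLM cost as a fraction of Opus cost, assuming the same split for both."""
    return cost_usd("GLM-5.2", output_share) / cost_usd("Claude Opus 4.7", output_share)


if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5, 1.0):
        glm = cost_usd("GLM-5.2", share)
        opus = cost_usd("Claude Opus 4.7", share)
        print(f"output share {share:.2f}: GLM ${glm:,.0f} vs Opus ${opus:,.0f} "
              f"(ratio {glm_to_opus_ratio(share):.2f})")
```

Under these assumptions, GLM's estimated bill lands between roughly a third and a bit over half of Opus's, depending on the split. On list prices alone the ratio would be 0.28 for input and about 0.18 for output, so the extra tokens narrow the gap without closing it.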

If that pressure slows revenue growth, or worse, shrinks it, the already inflated AI market faces a real stress test. OpenAI's and Anthropic's valuations rest on the assumption that revenue keeps climbing fast. Those valuations are tied to billions in bets on AI infrastructure buildout, from data centers to chip orders.
