ChatGPT’s “powerful new image engine”

2026-04-22 22:43 · 71 days ago · Gary Marcus

There seems to be some excitement around “ChatGPT’s powerful new image engine”, but as ever, its functional understanding of the world seems limited.

I first learned about the new system when some smart aleck on X sent me an example of it trying to label a bike (an example I have considered before), with the caption “Uh oh”, apparently believing that my longstanding challenges to image generation had been solved.

It does look impressive on first inspection, better than some examples I showed here before.

But if you look closely, there are several errors, and those errors are revealing. For example, the rear center-pull (?) brake is mislabeled as the seat stay, and the big gear on the back is mislabeled as the rear brake. There is a label for a spoke that is pointing to blank space.

In many modern bikes, of course, a rear brake can be found back there, but not in this diagram. Instead this system has combined a typical position for a modern disc brake system with a diagram of an older (though still in use) caliper (or similar) system. The system doesn’t actually understand how the various parts function.

And of course there are literally hundreds of labeled bikes on the internet as a quick Google Image Search would reveal. (Which is why my usual test here has been a tandem bike, to make things a little more challenging.)

To up the degree of difficulty, I asked ChatGPT to “please draw a taller than average tandem bike, and include a bike rack and panniers”, which is not something you could readily find on the internet, and not something I used here before, and got this:

Bike nuts would have a field day finding problems with this. (Feel free to drop your favorite error in the comments).

Suffice it to say that most people don’t stuff their rear derailleur in the back wheel. And I don’t even know what to say about that “rear brake lever”, or the saddle-shaped rear handlebar, let alone the “rear brake” that is somehow part of the rear rack.

As in the first example, the lack of functional understanding is manifest.

Of course, to be fair, the average human couldn’t complete this task, either.

But anybody knowledgeable about bikes (racers, mechanics, designers, etc.) would immediately see numerous problems.

And honestly, is anybody tall enough to ride in the front?

Gary Marcus: The Road to AI We Can Trust (RSS)

Read the original: garymarcus.substack.com
