What I find very funny about these "leaks" is that they don't even bother to get ballpark benchmarks to feed into the image generators. Ask the model to look up real data, at least. It's easy!
Like, GPQA is over 90% for all recent models. https://t.co/XljT8L3QCJ