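Tensordyne has released a breakthrough inference system built on logarithmic AI compute chips. Compared with NVIDIA Blackwell, tokens per watt improve 17x and throughput improves 13x. The core innovation is efficient logarithmic math implemented in hardware, which turns multiplication into addition. That shrinks the compute circuits, cuts transistor count, lowers power consumption, and frees die area for more tensor engines, high-bandwidth SRAM, and HBM3e memory. On DeepSeek-R1, a single rack reaches 363K tokens/sec, versus only 27.4K for the comparison system. The Napier processor has taped out and is being manufactured on TSMC's 3nm process.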
Tensordyne just announced a breakthrough inference system.
It is built on logarithmic AI compute chips, which the company says deliver 17x more tokens per watt and 13x higher throughput than NVIDIA Blackwell.
The main advance they say they unlocked is efficient logarithmic math implemented directly in hardware. In log space, multiplication turns into addition, and adders are much easier to build than multiplier circuits.
That allows smaller compute circuits on the chip than those in today's FP8 and INT8 GPUs. With fewer transistors, the chips stay cooler and use less energy, while the extra die space can hold more tensor engines, additional high-bandwidth SRAM and HBM3e memory, plus a fast interconnect fabric.
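To make the log-space idea concrete, here is a minimal Python sketch of multiplying two numbers by adding their logarithms. It is an illustration only: the sign-plus-log2 encoding and the helper names `to_log`, `log_mul`, and `from_log` are assumptions made for this example, not Tensordyne's actual number format or hardware design, which the announcement does not describe.

```python
import math

# Hypothetical encoding for illustration: a value is stored as
# (sign, log2 of its magnitude). Not Tensordyne's real format.

def to_log(x: float) -> tuple[int, float]:
    """Encode a nonzero number as (sign, log2|x|)."""
    if x == 0:
        # Zero has no finite logarithm; a real format would need a special code.
        raise ValueError("zero cannot be encoded in this simple sketch")
    return (1 if x > 0 else -1, math.log2(abs(x)))

def log_mul(a: tuple[int, float], b: tuple[int, float]) -> tuple[int, float]:
    """Multiply in log space: signs multiply, exponents simply add."""
    return (a[0] * b[0], a[1] + b[1])

def from_log(v: tuple[int, float]) -> float:
    """Decode back to an ordinary floating-point number."""
    return v[0] * 2.0 ** v[1]

product = from_log(log_mul(to_log(3.0), to_log(-4.0)))
print(product)  # approximately -12.0, up to floating-point rounding
```

The only arithmetic inside `log_mul` is one addition, which is the property that lets a multiplier circuit be replaced by an adder. The trade-off in log number systems generally is that addition of two encoded values is no longer trivial and usually needs an approximation or lookup; how Tensordyne handles that is not stated in the source.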