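Catnip has released MaineCoon, a 22B-parameter streaming model for real-time interactive audio-visual generation that puts a living AI character on screen. The first frame is generated in under 1 second, and inference reaches 47.5 FPS on a single H100, 7x faster than existing audio-visual models. The model supports interactions of unlimited length and emphasizes an AI that stays continuously present rather than replying in turns, aiming to turn passive video into a real-time AI presence.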
Catnip has introduced MaineCoon, a new real-time interactive audio-visual model that puts a live AI character on screen.
> This is a 22B streaming model built for real-time processing that keeps the character alive rather than pausing to render.
> The first frame lands in under a second, and the generation runs up to 7x faster than existing audio-visual models, holding around 47.5 FPS on a single H100.
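
To put the reported throughput in perspective, the arithmetic behind it can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the 24 FPS playback rate is an assumption that does not come from the announcement, and the baseline figure assumes "7x faster" refers to a ratio of frames per second.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available to generate one frame, in milliseconds."""
    return 1000.0 / fps


def implied_baseline_fps(fps: float, speedup: float) -> float:
    """Throughput of a baseline model, assuming the speedup is an FPS ratio."""
    return fps / speedup


def realtime_headroom(fps: float, playback_fps: float) -> float:
    """How many times faster than playback the model generates frames."""
    return fps / playback_fps


REPORTED_FPS = 47.5          # from the announcement (single H100)
REPORTED_SPEEDUP = 7.0       # from the announcement ("up to 7x")
ASSUMED_PLAYBACK_FPS = 24.0  # assumption: not stated in the announcement

if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(REPORTED_FPS):.1f} ms")
    print(f"Implied baseline: {implied_baseline_fps(REPORTED_FPS, REPORTED_SPEEDUP):.1f} FPS")
    print(f"Headroom over playback: {realtime_headroom(REPORTED_FPS, ASSUMED_PLAYBACK_FPS):.2f}x")
```

Under these assumptions, each frame has a budget of roughly 21 ms, and a model running at one seventh of that throughput would fall below a 24 FPS playback rate, which is why it would have to pause to render.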