Probably staying a little longer.

Introducing Agent Skills. Build an ad campaign, create a commercial, localize your ads and more with a simple command. Type /, choose a Skill and Agent gets to work. Scale your marketing. On command. Get started at the link below.

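Berryxia.AI@berryxia · 7 hours ago · 15

Got a new assistant!!! The new assistant says everyone needs a digital human? Then what do I still need her for? What do you think, bros~

Quoting @berryxia: I've started having a pretty teaching assistant sell the course 😂 Silky smooth~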

Rohan Paul@rohanpaul_ai · 10 hours ago · 52

You can now create ads directly inside Slack. Arcads turns Slack into an AI ad studio that also researches competitors and generates creatives. Claude Tag lets Slack users tag @ Claude and delegate work across connected channels and tools. MCP gives Claude a controlled way to call Arcads skills from Slack. Veo 3.1, Kling Motion Control, Nano Banana, and Sora 2 Pro become accessible from one interface.

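Summary: Arcads turns Slack into an AI ad studio where users can create ads, research competitors, and generate creatives directly inside Slack. It supports Claude Tag (tagging @Claude to delegate work across channels and tools) and MCP (a controlled way for Claude to call Arcads skills from Slack). Veo 3.1, Kling Motion Control, Nano Banana, and Sora 2 Pro are all accessible from one interface. The official announcement says Claude x Arcads in Slack is live today, and users can get viral ads in their Slack DMs.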

PixVerse@PixVerse_ · 11 hours ago · 25

From a single grey-model motion clip to a cinematic 4K zombie scene. Character appearance from one reference image, motion from a 3D reference, environment kept consistent across every shot. Created with Seedance 2.0 on PixVerse.

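meng shao@shao__meng · 11 hours ago · 79

AI video-editing Skill share: "video-use" https://github.com/browser-use/video-use

An open-source Skill from the @browser_use team, positioned as a video-editing Skill for AI coding agents (Codex, Claude Code, Cursor, Hermes Agent, and others). It is not a Premiere / CapCut replacement in the traditional sense; it is a collection of prompt engineering and tool scripts that lets an LLM understand video by "reading the transcript + visualizing on demand" and then call tools such as ffmpeg to do the editing.

# Core idea: the LLM does not "watch" the video, it "reads" it

Layer 1: audio transcript (always loaded). ElevenLabs Scribe provides word-level timestamps, speaker diarization, and audio event tags (such as laughter, sighs, applause), packed into a takes_packed.md of about 12KB. This is the LLM's main "reading material".

Layer 2: visual timeline view (on demand). Only at decision points (ambiguous pauses, retake comparisons, cut-point checks) is timeline_view.py called to generate a composite PNG of filmstrip frames + waveform + subtitles.

Compared with the naive approach of "30,000 frames × 1,500 tokens = 45 million tokens of noise", the project takes the lightweight path of "12KB of text + a few PNGs". This matches Browser Use's approach of having the LLM read a structured DOM rather than look directly at screenshots.

# Pipeline: Transcribe → Pack → Reason → EDL → Render → Self-Eval

1. Transcribe: transcribe.py / transcribe_batch.py extract 16kHz mono audio, call ElevenLabs Scribe, and cache the result as transcripts/<name>.json
2. Pack: pack_transcripts.py merges the word-level JSON into takes_packed.md, breaking phrases at 0.5s of silence or a speaker change
3. Reason: the LLM itself reads the packed transcript and visualizes with timeline_view.py when needed
4. Generate EDL: subagents output edl.json in JSON format, containing source files, cut points, pacing tags, quotes, and reasons
5. Render: render.py extracts per segment → lossless concat → overlays animations → burns in subtitles → normalizes loudness
6. Self-eval: timeline_view.py + the LLM check within ±1.5s of every cut point in the output file for frame jumps, audio pops, and obscured subtitles, for at most 3 rounds

# Key engineering details: an ffmpeg-centered editing implementation

1. Per-segment extraction + -c copy concat (avoids a second encode when overlays are added)
2. 30ms audio fade in/out at every segment boundary (removes pops at cut points)
3. Overlays are time-shifted with setpts=PTS-STARTPTS+T/TB so that frame 0 of the animation aligns with the output timeline
4. Subtitles are always applied last (so animations cannot cover them)
5. The master SRT uses output-timeline offsets: output_time = word.start - segment_start + segment_offset
6. Cut points must land on word boundaries, with 30–200ms of padding to absorb Scribe's 50–100ms timestamp drift
7. HDR sources are tone-mapped automatically (HLG/PQ → Rec.709 SDR)
8. Vertical sources are scaled by height automatically
9. Two-pass loudnorm: -14 LUFS / -1 dBTP / LRA 11, in line with mainstream social platform standards

# Animation and packaging: multiple engines in parallel

1. HyperFrames: HTML/CSS/GSAP compositions, suited to product UI, web-to-video, and kinetic typography
2. Remotion: React component-based compositions
3. Manim: math / technical / 3Blue1Brown-style explainer animations
4. PIL + PNG sequence + ffmpeg: simple cards, counters, typing effects

# The 12 "iron rules" of SKILL.md: production correctness first

1. Twelve hard rules that must be followed: subtitles last, per-segment extraction then concat, 30ms fades, PTS time-shift, SRT output-time offsets, no cuts mid-word, cut padding, word-level ASR, cached transcripts, parallel animations, confirm the strategy before executing, outputs in <videos_dir>/edit/
2. Everything else is an adjustable "worked example": color grade, subtitle chunking, animation duration, pacing, and so on can all be tailored to the material and the user's brand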
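The packing step described above (word-level transcript merged into phrases, with a break at 0.5s of silence or a speaker change) can be sketched roughly as follows. This is an illustration only, not the project's pack_transcripts.py: the `Word` fields, the function names, and the output line format are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical word record; the real Scribe JSON schema may differ.
    text: str
    start: float    # seconds from the start of the source file
    end: float
    speaker: str


def pack_words(words, max_gap=0.5):
    """Group words into phrases, breaking on a pause >= max_gap or a speaker change."""
    phrases, current = [], []
    for word in words:
        if current:
            prev = current[-1]
            if word.speaker != prev.speaker or word.start - prev.end >= max_gap:
                phrases.append(current)
                current = []
        current.append(word)
    if current:
        phrases.append(current)
    return phrases


def render_phrase(phrase):
    """Format one phrase as a compact line: [start-end] speaker: text."""
    text = " ".join(w.text for w in phrase)
    return f"[{phrase[0].start:.2f}-{phrase[-1].end:.2f}] {phrase[0].speaker}: {text}"


words = [
    Word("hello", 0.00, 0.40, "S0"),
    Word("world", 0.50, 0.90, "S0"),
    Word("again", 1.60, 1.95, "S0"),   # 0.70 s pause before this word
    Word("sure", 2.00, 2.30, "S1"),    # speaker change
]
packed = [render_phrase(p) for p in pack_words(words)]
```

Keeping each phrase on one short line with its time range is what lets a text-only model pick cut points without ever seeing frames.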

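Summary: The browser-use team released "video-use", an open-source Skill for AI coding agents such as Codex and Claude Code. The LLM works from an ElevenLabs Scribe transcript of about 12KB (word-level timestamps, speaker diarization, event tags) and calls timeline_view.py to generate PNG frame views only at decision points. The pipeline covers transcription, packing, JSON EDL generation, ffmpeg rendering, and up to 3 rounds of self-evaluation. Key rendering details: per-segment extraction + `-c copy` concat, 30ms audio fades, PTS time-shifting, subtitles applied last, automatic HDR tone mapping, vertical-source scaling, and two-pass loudnorm. Animation is supported through engines such as HyperFrames, Remotion, and Manim. The project ships with 12 hard rules to ensure production correctness.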
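Two of the timing rules in the post lend themselves to a small worked example: the SRT offset formula `output_time = word.start - segment_start + segment_offset`, and padding cut points by 30–200ms around word boundaries. The sketch below is a hedged illustration; the function names and the default 100ms pad are assumptions, not code from the video-use repository.

```python
def output_time(word_start, segment_start, segment_offset):
    """Map a source-file timestamp to the output timeline (formula from the post)."""
    return word_start - segment_start + segment_offset


def padded_cut(first_word_start, last_word_end, pad=0.1):
    """Widen a cut that sits on word boundaries by `pad` seconds on each side.

    The post gives 30-200 ms as the padding range; 100 ms is an assumed default.
    """
    if not 0.03 <= pad <= 0.2:
        raise ValueError("pad should stay within 30-200 ms")
    return max(0.0, first_word_start - pad), last_word_end + pad


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


# A word spoken at 12.5 s in the source, inside a segment that starts at 10.0 s
# in the source and is placed at 4.0 s on the output timeline.
t = output_time(12.5, segment_start=10.0, segment_offset=4.0)
stamp = srt_timestamp(t)   # "00:00:06,500"
```

Because subtitles are generated against the output timeline rather than the source, they stay aligned even after segments are reordered or trimmed.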

fofr@fofrAI · 12 hours ago · 60

These combine nicely with Omni: > a single unbroken scene of this strange creature <IMG_REF_0>, no dialogue, camera zooms shakily in from a distance and out, a bit of blur before it focuses, it's raining. Use the image as a reference not a first frame. One long scene filmed by an amateur.

Kling AI@Kling_ai · 12 hours ago · 31

🏆 Cannes Lions Bronze Winner — Lorem Ipsum Samurai, cowboys, mafia, why are they all speaking “Lorem Ipsum”? Bronze Lion winner in Cannes Lions Film: B2B, Lorem Ipsum was created by Argentine studio Purga Films to promote its AI advertising business. Though every character speaks in meaningless placeholder words “Lorem Ipsum”, the emotions still hit hard. As the film says: “We have the craft. We need the scripts.” The entire film was created with Kling AI. Across wildly different styles and worlds, Kling delivered consistent performances, emotional depth, and cinematic control throughout the piece. Some stories still hit you deeply — even when you don’t understand a single word. Huge congrats to Purga Films on the win!

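Summary: Kling AI officially announced that the ad film Lorem Ipsum, generated with Kling, won a Bronze Lion at Cannes Lions in Film: B2B. The film was produced by Argentine studio Purga Films; every character speaks only the meaningless placeholder words "Lorem Ipsum", yet the emotion still comes through strongly. The entire film was generated with Kling AI across multiple styles and worlds, showing its consistency of performance, emotional depth, and cinematic control.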

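小互@xiaohu · 15 hours ago · 66

Kind of interesting

Summary: A user used Doubao to recreate, from a detailed prompt, a funny short video that composites live-action footage with a 2D anime sticker character. The video is a first-person view of cooking in a kitchen, with 4 shots: the sticker character pours in salt to cause trouble, gets bonked on the head with a spatula, is fed the over-salted dish, and collapses from the saltiness. The prompt specifies the style (8K ultra-HD, vertical), the duration (10 seconds), the scene (a real kitchen), the character (a chibi sticker figure with long blonde hair and a sailor uniform), and the action and sound effects for each shot.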

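PixVerse@PixVerse_ · 17 hours ago · 37

@hq4ai Thank you for sharing this amazing workshop! It’s inspiring to see your amazing creation. Excited to support more creative education and future designers.

Summary: The fashion design department at Osaka Seikei University in Japan ran an AI fashion design workshop on the PixVerse platform, where students completed a high-quality, end-to-end image-to-video deliverable within two hours. PixVerse thanked them for sharing and said it will support more creative education.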

Alibaba Cloud@alibaba_cloud · 18 hours ago · 35

🥉 3rd Place at the AI Film Festival Monaco Hackathon! 🎬 Introducing 《The Glow of First Love》 by Iuliia Kiseleva — a hauntingly beautiful one-minute film that earned third place among global creators. After her husband dies in a 2003 car crash, a pregnant woman raises their daughter alone—but never truly alone. For 53 years, an unseen figure of light watches over her… until her final breath, when the glow fades and they reunite as they were at the beginning. A quiet, devastating meditation on enduring love and presence after loss—with light itself as a character—crafted using festival-grade AI tools that turn emotion into ethereal visual poetry. This is what’s possible when intimate storytelling meets Happy Horse.✨ Discover the creative engine behind it: https://int.alibabacloud.com/m/1000415018/

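Summary: The short film The Glow of First Love, created with Alibaba Cloud's Happy Horse, took third place at the AI Film Festival Monaco Hackathon. The one-minute piece follows a woman who raises her daughter alone after her husband dies in a 2003 car crash, watched over for 53 years by an unseen figure of light until they reunite at her death. The film uses Happy Horse to turn emotion into poetic visuals, showing the potential of festival-grade AI tools in storytelling.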

PixVerse@PixVerse_ · 20 hours ago · 52

Viewport in → 4K out Left: Raw motion rig blocking the pose Right: Seedance 2.0 turning that exact motion into a stylized 4K warrior Cape sway, landing weight, punch follow-through — all carried straight from the reference. Feed it clean motion. The model handles the rest.

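AYi@AYi_AInotes · 23 hours ago · 73

This really doesn't look AI-generated, it's so realistic!! Seedance 2.0 prompt:

Main character: Young Korean woman, early 20s, natural everyday makeup, faded charcoal-gray sleeveless crop top, loose high-waisted light-wash jeans, black canvas sneakers, black cord necklace, long wavy black hair tied in a messy side ponytail with a few wispy bangs. Realistic skin texture, light makeup, a warm and approachable personality. Keep identity, outfit, hairstyle, and appearance consistent throughout the video.

Location: A quiet afternoon in a real Korean residential neighborhood. Narrow concrete alleys, low-rise residential buildings, small terraces, potted plants, clotheslines, bicycles, utility poles, overhead wires, shifting shadows cast by mature trees, a quiet residential atmosphere. No shops, advertisements, cafes, crowds, or commercial activity.

Visual style: Hyperrealistic documentary realism. Authentic improvised behavior. Natural body language. The feel of unscripted slices of daily life. Strong environmental realism. Rich real-world detail and believable human movement.

Camera style: The aesthetic of an early-2000s consumer DV camcorder. A friend casually recording everyday moments. Heavy handheld shake, imperfect framing, frequent autofocus hunting, lens breathing, exposure fluctuation when moving between sun and shade, occasional motion blur, slight rolling shutter, moderate digital compression artifacts, faded colors, soft contrast, slight sensor noise. No stabilization. No cinematic camera moves. No modern color grading.

00:00–00:02 Outside the entrance of a small house. She sits on a low concrete wall and raises both hands to adjust her ponytail. A breeze moves loose strands of hair. She smiles naturally while the camera struggles to hold focus.
00:02–00:04 The camera follows her into a narrow alley lined with potted plants and concrete walls. She notices a stray cat approaching and crouches down. The framing drifts off-center as the operator tries to keep up.
00:04–00:06 She gently pets and feeds the cat. Autofocus repeatedly switches between her face and the animal. Morning light flickers through the leaves overhead.
00:06–00:08 The small front yard beside her house. She hangs laundry on the clothesline, the fabric swaying in the breeze. Exposure shifts as clouds pass briefly overhead.
00:08–00:10 On a quiet terrace, holding a ceramic coffee cup. She sits comfortably watching the neighborhood, occasionally tucking her hair behind her ear. A loose handheld side angle with natural camera drift.
00:10–00:12 Close side profile. Someone off-screen greets her. She turns, raises a hand, smiles warmly, and casually says: "Annyeong." The camera catches the moment slightly late.
00:12–00:15 Holding the coffee cup, she walks slowly along a tree-shaded residential path. She notices the camera, gives a small, genuine smile, then looks away and keeps walking. The recording cuts abruptly to black partway through, as if the camera had been switched off.

Audio: Natural ambient sound only: morning birdsong, distant motorcycles, light wind, rustling leaves, faint neighborhood chatter, cat meows, footsteps on concrete, fabric moving on the clothesline, subtle residential ambience. No music. No sound design. No voice-over.

Goal: Capture authentic Korean neighborhood life as if it were a forgotten early-2000s home video: improvised, imperfect, real, warm, and highly convincing.

https://x.com/john_my07/status/2071977017474789557/video/1

Summary: Using a detailed prompt, Seedance 2.0 generates a hyperrealistic video of a Korean woman's daily life in an early-2000s DV camcorder aesthetic: handheld shake, autofocus hunting, exposure fluctuation, motion blur, and other imperfections, combined with natural ambient sound (birdsong, wind, neighborhood chatter), for a convincing home-video feel.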

PixVerse@PixVerse_ · 1 day ago · 59

The #1 reason most content never ships? Having to film yourself on camera. Meet Lip-Sync on PixVerse APP. Add an image or video, type your script (or upload audio), and generate. Use built-in voices, clone your own, or drive it with any audio file. RT + Follow + Reply = 150 credits in DMs (72h only)

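AYi@AYi_AInotes · 1 day ago · 47

Here's a zero-cost AI side hustle nobody wants to talk about: no showing your face, no going on camera, and you can make $5,000 in a single month from animated videos. The niche is YouTube early-childhood education. One person can run the whole workflow with AI, the upfront investment is zero, and free compute is enough to get started. Post one or two videos a day and you will see view revenue in the first month; done well, it can exceed $10,000 a month. There are five steps; follow them and it works:
1️⃣ Find references: search nursery rhymes plus viral keywords, and find popular videos in the same niche to benchmark against.
2️⃣ Rewrite the script: use AI to rewrite the storyline and swap the characters and scenes to avoid plagiarism risk.
3️⃣ Make the animation: use Wan2.7 or Pika to generate coherent animation, not stitched-together still images.
4️⃣ Make the audio: AI voice-over in a child's voice plus background music; audio quality directly determines the completion rate.
5️⃣ Optimize: target early-education keywords in titles and tags for SEO to hit the traffic entry points.
📌 Three pitfalls to remember:
1️⃣ Children's content is moderated most strictly; it must be fully original, not re-edited, or the account gets banned outright.
2️⃣ Don't use a generic robotic voice; use a professional child's voice, because a poor experience won't retain viewers.
3️⃣ Comply with COPPA requirements and remember to turn off personalized ads to avoid penalties.
This isn't limited to the kids' niche; TikTok e-commerce and paid knowledge products can use the same logic, just swap the niche. Tool links are in the comments; grab them if you want to try.

Summary: The main post introduces a zero-cost AI side hustle: making YouTube early-childhood education animations with AI. Five steps: benchmark against nursery-rhyme videos, rewrite the script with AI, generate animation with Wan2.7 or Pika, add an AI child's voice plus background music, and do SEO on titles and tags. Post 1-2 videos a day, see revenue in the first month, and earn $5,000-$10,000 a month. Keep content fully original, use a professional child's voice, and comply with COPPA by turning off personalized ads. The post also quotes the marketing agent Lev8 in a finding-overseas-customers scenario: 90 valid results (Exa 58.2, Codex 20), match precision 83.3% (Exa 76.5, Codex 71.8), and a cost per result of $0.052 (Exa $0.061). Lev8 aggregates 50+ data sources and 1B+ professional contacts, and supports sending customized icebreaker messages across 5 channels.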

Kling AI@Kling_ai · 1 day ago · 26

End of video. Start of game. Welcome to Choose Your Journey, our new interactive story series. Find your way out. Three doors. One choice. Choose wisely.

Kling AI@Kling_ai · 1 day ago · 26

🎁1000 Credits Giveaway How to enter: ✓ Follow @Kling_ai ✓ Repost this post ✓ Reply with your choice and write what happens next. The Top 10 best replies will each win 1,000 Credits. Choose wisely. The next chapter may follow your comment. Duration: 72 hours

Luma@LumaLabsAI · 1 day ago · 29

Watch the take become the world. Green screen on one side, open ocean on the other, the same motion holding both. By @heydin_ai. Made with Luma.

Runway@runwayml · 1 day ago · 36

Introducing Another Big Ad Contest For Products That Don't Exist. Your chance to make any ad you can imagine for up to $100K in cash prizes. No client notes. No producers saying no. Just 7 new briefs to choose from and 4 weeks to make your wildest concepts come to life. Big ideas win big. Learn more and get started at the link below.

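Berryxia.AI@berryxia · 1 day ago · 29

Omini's scenes are a good fit for outfit-change videos, as well as before-and-after scenarios such as home renovation.

Berryxia.AI@berryxia · 1 day ago · 15

I'll have a beer too~

Summary: Omini 1.0 does a decent job of editing video, and the demo shows notable improvements in spatial and perspective handling. A new version should be usable soon, but because it is a strongly edit-focused tool, it is not getting much buzz yet.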

Kling AI@Kling_ai · 1 day ago · 53

Powered by Kling, Awarded at Cannes Lions 🏆 L'Ultimo Uomo Reale (The Last Real Man) won a Silver Lion and a Bronze Lion at Cannes Lions 2026 — in the Film – Consumer Goods and the newly introduced Film Craft – AI Craft categories, respectively. Directed by Sebastian Strasser and produced by Lipstick, the film used Kling AI for the majority of its shots. From the character’s nuanced micro-expressions to fantastical worlds built from wild imaginations, Kling AI delivered industry-leading character consistency, cinematic visuals, and motion quality. It proved to be the perfect creative partner in bringing the director’s vision to life. Huge congrats to Lipstick and Team One for the win!

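Summary: Kling AI announced that the short film L'Ultimo Uomo Reale (The Last Real Man), produced by Lipstick and directed by Sebastian Strasser, won at Cannes Lions 2026: a Silver Lion in Film – Consumer Goods and a Bronze Lion in the newly introduced Film Craft – AI Craft category. Most of the film's shots were generated with Kling AI, which showed industry-leading character consistency, cinematic visuals, and motion quality, and proved a perfect partner for the director's vision.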

fofr@fofrAI · 1 day ago · 15

It’s amazing what you can script with agents these days. I gave a subagent a hyperframes skill, some Omni outputs and prompts, and it made this. Music generated with Lyria 3

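Berryxia.AI@berryxia · 1 day ago · 26

Omini 1.0 is also decent at editing video; judging from the demo, spatial handling and perspective should be much improved. A new version should be released for use soon, but because its strength is editing, the buzz doesn't seem very high.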

Luma@LumaLabsAI · 2 days ago · 20

Some doorways only open once you have walked far enough to find them. AOI, a short by Paola Rocchetti. Made with Luma.

Luma@LumaLabsAI · 2 days ago · 54

Seedance 2.0 Mini is now available in Luma. Bring it your boldest idea and watch it move. Generate fast, refine in the same canvas, and take your concept from spark to screen without leaving your flow. Create now → http://lumalabs.ai/app

NotebookLM@NotebookLM · 2 days ago · 68

There seems to be a *lot* of discourse about our new Short Video Overviews. Want to join in on the fun? Short VOs have officially rolled out to ALL users on Web in English. Share your examples below! Here's one of our faves about this year's World Cup ⚽️:

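Summary: NotebookLM has fully rolled out Short Video Overviews to English-language users on Web. The feature automatically turns complex material into a 60-second vertical video that explains any concept in depth. It was previously available to Google AI Ultra and Pro subscribers (mobile and Web), and free users will get it soon.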

Runway@runwayml · 2 days ago · 49

Generate and edit video with Gemini Omni Flash, now in Runway. Start with a prompt, image or video and create anything you can imagine. Get started at the link below or ask Agent to use Omni.

Luma@LumaLabsAI · 2 days ago · 31

A lonely dinosaur. One shared ice cream. A friendship. The whole tender little world built alongside an agent, by Anurag Tiwari. Made with Luma.

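AYi@AYi_AInotes · 2 days ago · 62

Whoa, Google didn't drop the ball this time. This is how short video should really work: NotebookLM can turn complex material directly into a sixty-second vertical overview, so you can digest a hardcore concept in the time it takes to scroll your feed 🤯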

Artificial Analysis@ArtificialAnlys · 2 days ago · 68

Alibaba's HappyHorse 1.1 lands at #2 on the Artificial Analysis Text to Video and Image to Video leaderboards, behind only ByteDance’s Seedance 2.0! HappyHorse 1.1 is the latest version of Alibaba's video generation model, a refinement of 1.0 on the same unified transformer architecture. Alibaba positions the upgrade around stronger audio-visual sync, including native audio with better lip-synced dialogue in seven languages, alongside gains in motion, character, and scene consistency. It supports up to nine reference images and generates at 720p and 1080p. Our results line up with that focus: HappyHorse 1.1's largest gains over 1.0 come in our Image to Video with Audio modality, where it now ranks #2, up from #5. HappyHorse 1.1 is priced at $9.90 per minute of generated video at 1080p, and is available now on Alibaba Cloud Model Studio (Bailian), Qwen Cloud, and fal. Congratulations to @HappyHorseATH and @alibaba_cloud on the release! See below for comparisons between HappyHorse 1.1 and other leading models in the Artificial Analysis Video Arena 🧵

Summary: Alibaba's HappyHorse 1.1 ranks #2 on the Artificial Analysis Text to Video and Image to Video leaderboards, behind only ByteDance's Seedance 2.0. Built on the same unified transformer architecture, it is a refinement of 1.0 focused on audio-visual sync, with native audio and lip-synced dialogue in seven languages, plus gains in motion, character, and scene consistency. It supports up to 9 reference images and generates at 720p and 1080p. In Image to Video with Audio it rose from #5 to #2. Pricing is $9.90 per minute at 1080p, and it is available on Alibaba Cloud Model Studio, Qwen Cloud, and fal.
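For a sense of scale, the quoted $9.90 per minute at 1080p can be converted to a per-clip cost. The sketch below assumes linear per-second proration, which the post does not state; actual billing granularity may differ, and the function name is made up for the example.

```python
from decimal import Decimal

# Price quoted in the post: USD per minute of generated 1080p video.
HAPPYHORSE_1080P_PER_MINUTE = Decimal("9.90")


def happyhorse_cost(seconds):
    """Estimated cost in USD, assuming the per-minute price prorates linearly."""
    return HAPPYHORSE_1080P_PER_MINUTE * Decimal(seconds) / Decimal(60)


per_second = happyhorse_cost(1)     # 9.90 / 60 = 0.165
ten_second_clip = happyhorse_cost(10)   # 1.65
```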

Rohan Paul@rohanpaul_ai · 2 days ago · 72

Google released Nano Banana 2 Lite, a 4-second image model, alongside Gemini Omni Flash. Image generation usually breaks creative work because every trial costs time, money, and attention. The lighter image model lowers that friction with 4-second outputs at $0.034 per 1K-resolution image. Chaining both models is the real product shape, not either model alone. Nano Banana 2 Lite makes reference images, then Gemini Omni Flash animates them. Google positions it as the replacement for gemini-2.5-flash-image across high-volume developer pipelines. Users still need prompt adherence, stable characters, and readable text during fast visual testing. Gemini Omni Flash extends the workflow from image drafts to editable 10-second video outputs. It accepts text, image, and video inputs, then edits clips through conversation. Pricing: $0.10 per second of video output, matching Veo 3.1 Fast. Gemini Omni Flash currently generates 10-second clips and lacks API audio reference support. Google says the API accepts video references up to 3 seconds, but Gemini Omni Flash does not process them correctly yet. Interactions API keeps session context, so users can stack 3 sequential edits.

Summary: Google released the fast image model Nano Banana 2 Lite (4-second generation, $0.034 per 1K-resolution image) and the video editing model Gemini Omni Flash (10-second clips, $0.10 per second, text/image/video inputs, conversational editing). The two can be chained: Nano Banana 2 Lite makes reference images and Omni animates them; Nano Banana 2 Lite is positioned as the replacement for gemini-2.5-flash-image. The Omni Flash API currently lacks audio reference support, and video references of up to 3 seconds are accepted but not yet processed correctly. The Interactions API keeps session context and supports 3 sequential edits.
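The chained workflow (reference images from Nano Banana 2 Lite, then a clip from Gemini Omni Flash) has a simple cost shape that can be worked through with the prices quoted above. This is a back-of-the-envelope sketch only: it assumes each clip is billed for its full 10 seconds of output and ignores any input or editing charges the post does not mention. The function name is hypothetical.

```python
from decimal import Decimal

# Prices and clip length as quoted in the post.
NANO_BANANA_2_LITE_PER_IMAGE = Decimal("0.034")   # USD per 1K-resolution image
OMNI_FLASH_PER_SECOND = Decimal("0.10")           # USD per second of video output
OMNI_FLASH_CLIP_SECONDS = 10                      # current clip length


def chain_cost(reference_images, clips):
    """Estimated USD cost of generating reference images, then animating clips."""
    image_cost = NANO_BANANA_2_LITE_PER_IMAGE * reference_images
    video_cost = OMNI_FLASH_PER_SECOND * OMNI_FLASH_CLIP_SECONDS * clips
    return image_cost + video_cost


one_image_one_clip = chain_cost(reference_images=1, clips=1)    # 0.034 + 1.00
four_drafts_one_clip = chain_cost(reference_images=4, clips=1)  # 0.136 + 1.00
```

Under these assumptions the video step dominates, so trying several cheap image drafts before animating one adds little to the total.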

fofr@fofrAI · 2 days ago · 52

Omni Flash is a smart model. The way the hand is wet, the water ripples, the refraction, the shadows, the sound effects 🤯 > Change the table to be a shallow pool of water I'm excited to see what y'all build now it's available in the API. The edit capabilities of this model were made for cool pipelines.

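fofr@fofrAI · 2 days ago · 73

You can bootstrap your agent quickly with the Omni API using the skill we published: https://github.com/google-gemini/gemini-skills

It includes:
- video editing
- text to video
- video generation with image references
- first frame to video

But it also has some helper tools for:
- prepping input videos for editing (10s, 720p)
- audio stripping if you want to generate new audio
- video inspection

Summary: Google published the gemini-skills skill pack for the Gemini Omni API, covering video editing, text to video, video generation with image references, and first frame to video, plus helper tools for prepping input videos to 10 seconds at 720p, audio stripping, and video inspection. The same author also showed Omni Flash's editing ability: given "Change the table to be a shallow pool of water", the model rendered the wet hand, water ripples, refraction, shadows, and sound effects. The API is open and can be used to build video editing pipelines.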
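The "prepping input videos for editing (10s, 720p)" and "audio stripping" helpers could plausibly be approximated with a plain ffmpeg call. The sketch below only builds the argument list and is not the gemini-skills implementation; the function name is hypothetical, and interpreting 720p as a 720-pixel output height is an assumption. The flags used (`-y`, `-i`, `-t`, `-vf scale`, `-an`) are standard ffmpeg options.

```python
def prep_args(src, dst, max_seconds=10, height=720, strip_audio=False):
    """Build an ffmpeg command that trims to max_seconds and scales to `height`.

    scale=-2:<height> keeps the aspect ratio and rounds the width to an even number.
    """
    args = [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-i", src,
        "-t", str(max_seconds),    # keep only the first max_seconds of the input
        "-vf", f"scale=-2:{height}",
    ]
    if strip_audio:
        args.append("-an")         # drop the audio stream
    args.append(dst)
    return args


cmd = prep_args("input.mp4", "prepped.mp4", strip_audio=True)
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed.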

fofr@fofrAI · 2 days ago · 32

> Change the table to be underwater sand

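Summary: Omni Flash has strong editing capabilities: it can turn the table into a shallow pool of water and realistically render the wet hand, ripples, refraction, shadows, and sound effects. The model is now available through the API, and its editing capabilities are well suited to building cool pipelines.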

elvis@omarsar0 · 2 days ago · 45

Love how Google continues to drive down the cost of building with their models. <4s image and $0.034 / 1K image. Wow! We have a bunch of stuff (education & research) we're building @dair_ai using Nano Banana and Gemini. Testing out Nano Banana 2 Lite and sharing more soon.

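Summary: Elvis Saravia praised Google for continuing to drive down the cost of building with its models. Google launched two new models in the Gemini API and AI Studio: Nano Banana 2 Lite generates an image in under 4 seconds at $0.034 per 1K image, and Gemini Omni Flash is SOTA at video editing at $0.10 per second, the same as Veo 3.1 Fast. Saravia said DAIR.AI is building education and research projects with Nano Banana and Gemini and has started testing Nano Banana 2 Lite.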

Logan Kilpatrick@OfficialLoganK · 2 days ago · 78

Introducing Nano Banana 2 Lite 🍌 and Gemini Omni Flash 🔮, our new generative media models in the Gemini API and AI Studio! Nano Banana 2 Lite is extremely fast (<4s image) & cheap ($0.034 / 1K image). Omni Flash is SOTA at video editing at $0.10 / sec, same as Veo 3.1 Fast!

🚨 AI News | TestingCatalog@testingcatalog · 2 days ago · 62

GOOGLE 🔥: Besides Nano Banana 2 Lite, Google also announced Gemini Omni Flash Preview on APIs and Google AI Studio! > Omni Flash is SOTA at video editing at $0.10 / sec, same as Veo 3.1 Fast! Flashes everywhere ⚡

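Summary: Google launched two new generative media models in the Gemini API and AI Studio: Nano Banana 2 Lite generates images extremely fast (under 4 seconds each) at $0.034 per 1K image, and Gemini Omni Flash Preview is SOTA at video editing at $0.10 per second, the same as Veo 3.1 Fast. Omni Flash is now available as an API preview.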

Google DeepMind@GoogleDeepMind · 2 days ago · 66

We’re shipping 2 major releases:  🔘 Nano Banana 2 Lite: our fastest and cheapest Gemini Image model 🔘 Gemini Omni Flash: now available via the Gemini API and in @GoogleAIStudio to help developers generate and edit high-quality videos.

Google AI@GoogleAI · 2 days ago · 74

We’re shipping two major updates to streamline your creative workflow, allowing you to generate high-speed images with one model and then instantly animate them with the other—all at a fraction of the cost 🍌⚡️ 1️⃣ Introducing Nano Banana 2 Lite: Our fastest and most cost-efficient Gemini Image model yet delivers text-to-image outputs in under 4 seconds. Now available via the Gemini API and Google AI Studio, and rolling out soon across @NotebookLM, @FlowbyGoogle, @geminiapp, @stitchbygoogle, Google Search and @GooglePhotos. 2️⃣ Gemini Omni Flash in Public Preview: Our natively multimodal model for cost-efficient video generation and conversational editing. Now available via the Gemini API, @googleaistudio, and Gemini Enterprise Agent Platform so you can integrate the model into your workflow. While exciting on their own, the real magic happens when you build using these models together. Watch how our interior design demo integrates Nano Banana 2 Lite and Omni to instantly reimagine any space. Upload a photo, swipe through tailored design concepts, and see Omni bring the details to life in cinematic motion. Try out the demo app in AI Studio: http://goo.gle/443xPqw

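Summary: Google AI announced two major model updates: 1) Nano Banana 2 Lite, its fastest and most cost-efficient Gemini Image model, with text-to-image in under 4 seconds, now in the Gemini API and AI Studio and coming soon to NotebookLM, Google Search, Google Photos, and more; 2) Gemini Omni Flash in public preview, a natively multimodal model for cost-efficient video generation and conversational editing, available via the Gemini API, AI Studio, and Gemini Enterprise Agent Platform. Used together, the two models can quickly reimagine a space: upload a photo, swipe through design concepts, and Omni brings the details to life in cinematic motion. The demo app is available in AI Studio.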