fofr @fofrAI

2026-07-01 00:36 · 2 days ago

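AI Summary

Google has published the gemini-skills skill pack for use with the Gemini Omni API. It supports video editing, text-to-video, video generation with image references, and first-frame-to-video, and it provides helper tools for prepping input videos to 10 seconds at 720p, stripping audio, and inspecting video. The same author also shows the editing ability of the Omni Flash model: given the prompt "change the table to a shallow pool", the model outputs a wet hand, water ripples, refraction, shadows, and sound effects. The API is open and can be used to build video-editing pipelines.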

You can bootstrap your agent quickly with the Omni API using the skill we published:

https://github.com/google-gemini/gemini-skills

It includes:

video editing
text to video
video generation with image references
first frame to video

But it also has some helper tools for:

prepping input videos for editing (10s, 720p)
audio stripping if you want to generate new audio
video inspection

fofr: Omni Flash is a smart model. The way the hand is wet, the water ripples, the refraction, the shadows, the sound effects 🤯 > Change the table to be a shallow po...
