AIHOT
All updates · X · 795 posts
Tag: Open-source ecosystem
Nathan Lambert @natolambert · April 11

In 2+ years, as models get more expensive/capable /valued internally, I see funding structures and support for frontier open models breaking down. We need other options of supporting the open ecosystem than trusting one or two for-profit companies. And yes, I hate consortia too.

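Summary: Predicts that over the next two years, as large models become more costly and more valuable internally, the current funding models for frontier open models will become unsustainable. Although he says he dislikes consortia, he argues that alternatives are needed and that the open ecosystem cannot rely on just one or two for-profit companies.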

Nathan Lambert @natolambert · April 10

1. dont fall for anti open model fearmongering, but 2. acknowledge that AI capabilities are proceeding fast, and eventually there may be a reason to be more careful with open weight models I don't think Mythos is that trigger, but I'm not 100% confident https://www.interconnects.ai/p/claude-mythos-and-misguided-open

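Summary: Do not fall for anti-open-model fearmongering, but acknowledge that AI capabilities are advancing fast and that more caution with open-weight models may eventually be warranted. The author does not think Claude Mythos is that trigger, but is not fully certain.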

Nathan Lambert @natolambert · April 9

Directionally, I agree with this piece, but it's important to note that this is the first blip in a long, slow transition towards a more hybrid model of open and closed models. And in that, China still may be producing way more open models than the US, enough to cement the ecosystem as largely Chinese AI. I expect this evolution to take years, and the cultural default of open is still set, which may never fully decay. https://www.chinatalk.media/p/chinas-ai-companies-are-going-closed

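Summary: Responding to the claim that China's AI companies are going closed, he notes that this is only an early signal in a long transition toward a hybrid of open and closed models. China may still produce more open models than the US, the cultural default of openness is unlikely to fade, and the shift is expected to take years.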

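Jeff Dean @JeffDean · April 9

Great to see the reception for the very capable Gemma 4 models!

Summary: Gemma 4 passed 10 million downloads within a week of release, and the Gemma family has now been downloaded more than 500 million times. Sundar Pichai shared the figures and said he looks forward to seeing what developers build with the model.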

Sundar Pichai @sundarpichai · April 9

Lots of love for Gemma 4! Team just told me it’s already had 10M+ downloads since last week’s launch. Gemma models have now been downloaded 500M+ times! Excited to see what you all are creating 👀

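Summary: Google's open model Gemma 4 has passed 10 million downloads just one week after launch, and cumulative downloads of the Gemma family now exceed 500 million. The figures reflect strong developer interest in the latest open models. Pichai welcomed the response and said he is excited to see what users create with Gemma 4.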

Ethan Mollick @emollick · April 9

Seems like a good model from Meta that is still trailing the current series of releases. The most important thing to note is that it is not open weights. That was the main reason that Meta's models were so important. Without that, it is a lot harder to predict the value of Spark

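Summary: Meta released Muse Spark, the first model from MSL, built on an AI stack rebuilt over nine months and already integrated into Meta AI. Unlike the Llama series, the model is not open weights, and its performance still trails the current series of releases, so its long-term value is hard to predict.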

Epoch AI @EpochAIResearch · April 9

New essay by @ansonwhho: Chinese and open model AI labs have ≈10× less compute than the frontier. But they can distill frontier models, replicate innovations fast, and have enormous talent. Is that enough to compete at the frontier? 🧵

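Summary: Chinese and open-model AI labs have roughly one tenth of the compute of the frontier, but they can distill frontier models, replicate innovations quickly, and draw on enormous talent. @ansonwhho examines whether that is enough to compete at the frontier.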

Nathan Lambert @natolambert · April 8

New report with @xeophon is out with the latest open model adoption data we have gathered for Interconnects & The ATOM Project. At the surface level, we can see Chinese models continuing to accelerate in adoption. The report details much more. 1. We manually curate ~1.5K of the most important language models, creating a specific set of models to focus our analysis on (excludes embedding models, local inference models like MLX/GGUF, etc to have accurate download rankings). 2. Studying other adoption metrics, such as derivative models and inference share on OpenRouter, to show how they correlate with downloads, while often sifted in time. China has a strong lead here too. 3. Better classification of downloads across model sizes. Large models still are the models where Qwen is least competitive, relative to other model builders. 4. Expansion of our Relative Adoption Metric (RAM) to show standout recent models (we'll check Gemma 4 on Friday); Qwen 3.5, Nemontron 3, Kimi K2.5, all showing very strong adoption. Overall, this is another step towards formalizing and making public better data on the open language model ecosystem, so the community can better understand the impact and trends of its adoption. More on this soon!

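Summary: Drawing on data from Interconnects and The ATOM Project, the report manually curates about 1.5K important language models and analyzes open model adoption using downloads, derivative models, and inference share on OpenRouter. Adoption of Chinese models such as Qwen and Kimi continues to accelerate, with Qwen 3.5, Nemotron 3, and Kimi K2.5 standing out on the Relative Adoption Metric (RAM). Large models remain the segment where Qwen is least competitive. The work aims to provide better public data and trend insight on the open model ecosystem.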

SemiAnalysis @SemiAnalysis_ · April 4

NVIDIA keeps claiming that DGX Lepton software will be open sourced yet that has still not happened? This is classic NVIDIA claiming things will be open sourced and then not open sourcing it. Just like how there was community outrage over NVIDIA not open sourcing NIMS and then changing their mind and only open sourcing small parts of it. Seems like this is the same playbook that NVIDIA is running for DGX Lepton, where they only have open sourced some part of it such as the gpu monitoring agent but no part of the core platform has been open sourced yet.

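Summary: NVIDIA is again being questioned over an unfulfilled promise to open source DGX Lepton. The company has said the software will be open sourced, but so far only peripheral components such as the GPU monitoring agent have been released, while the core platform remains closed. NIMS went through a similar dispute: after community outrage, NVIDIA open sourced only small parts of it. The author argues this looks like NVIDIA's usual playbook of promising open source and then releasing only non-core pieces.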

Nathan Lambert @natolambert · April 4

People are too obsessed with benchmarks for open models. The core determining factor of success often is: 1. Immediate & long term tooling support. 2. Finetunability Tbh Gemma has struggled here in the past. Qwen has excelled at it. It's where the winners are crowned.

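Summary: What determines an open model's success is often not benchmark scores but immediate and long-term tooling support and finetunability. Gemma has struggled here in the past, while Qwen has excelled.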

François Chollet @fchollet · April 4

The Keras team is doing a community call today at 10am PT. That's in 25 min. The call is open to all -- join to learn about the latest features and what's next, and to ask your questions! Link to join (start in 25 min): http://meet.google.com/gva-bbpr-twe

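Summary: The Keras team is holding a community call today at 10am PT, starting in 25 minutes. The call is open to all: join to learn about the latest features and what is next, and to ask questions.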

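Demis Hassabis @demishassabis · April 3

Gemma 4 outperforms models over 10x their size! (note the x-axis is log scale!)

Summary: Gemma 4 outperforms models more than 10x its size on benchmarks. The chart's x-axis is on a log scale, underscoring the model's parameter efficiency.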

Artificial Analysis @ArtificialAnlys · April 3

India enters the open-weights AI race with its largest models pre-trained from scratch: Sarvam 105B and Sarvam 30B @SarvamAI's Sarvam 105B and Sarvam 30B score 18 and 12 on the Artificial Analysis Intelligence Index respectively. Announced at the India AI Impact Summit 2026 and open-sourced under Apache 2.0, both are Mixture-of-Experts models trained entirely in India using compute provided under the IndiaAI Mission (@OfficialINDIAai). Both support reasoning and non-reasoning modes. These are an improvement from Sarvam's previous model, Sarvam M (8 on Intelligence Index, 23.6B parameters), which was based on Mistral Small rather than pre-trained from scratch. Sarvam 105B has 106B total parameters with ~10B active per token and a 128K context window. Sarvam 30B has 32B total parameters with ~2.4B active per token and a 65K context window. Alongside the text models, Sarvam also announced Saaras v3 (Speech to Text) and Bulbul v3 (Text to Speech) with a focus on Indic languages. Key takeaways in reasoning mode: ➤ Sarvam 105B scores 18 on the Intelligence Index. Among ~100B-class open-weights reasoning models, it trails GLM-4.5-Air (23), INTELLECT-3 (22), Mistral Small 4 (27), and gpt-oss-120B (High, 33). All four peers also activate more parameters per token ➤ Sarvam 30B scores 12 on the Intelligence Index. Among ~30B-class open-weights reasoning models, it trails GLM-4.7-Flash (30), Nemotron Cascade 2 30B A3B (28), Qwen3 30B A3B 2507 (22), and Qwen3 32B (17). Sarvam 30B activates fewer parameters than these peers. ➤ Sarvam 105B's relative strength is in select agentic tasks. Its agentic index of 25 places it ahead of INTELLECT-3 (20) and GLM-4.5-Air (21) despite trailing both on overall intelligence. Its GDPval index of 773 also edges ahead of GLM-4.5-Air (665). Both new models are a large step up from Sarvam M (Reasoning), which scored 8 on the Intelligence Index. 
➤ Compared to peers, both models score lower on TerminalBench Hard (Agentic Coding & Terminal Use) and AA-Omniscience. Sarvam 105B scored 1.5% and Sarvam 30B scored 2.3% on TerminalBench Hard, compared to GLM-4.5-Air (20.5%) and INTELLECT-3 (9.1%). The AA-Omniscience Index is -60 for Sarvam 105B and -72 for Sarvam 30B. Both models have high hallucination rates relative to their accuracy, and both attempt to answer far more questions rather than abstaining, which drives the negative scores. Key model details: ➤ Modality: Text input and output only. ➤ Context window: 128K tokens (Sarvam 105B) and 65K tokens (Sarvam 30B). ➤ Pricing: Currently free on Sarvam's first-party API. ➤ License: Apache 2.0. ➤ Availability: Sarvam's first-party API; weights available on @huggingface and AIKosh.

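Summary: Sarvam AI released Sarvam 105B and Sarvam 30B, India's largest open-weights models pre-trained from scratch. Both are MoE models trained in India. They score 18 and 12 on the Intelligence Index and support reasoning and non-reasoning modes. Sarvam 105B does better than some peers on agentic tasks but trails on TerminalBench Hard and has a high hallucination rate. Both are released under Apache 2.0 with 128K and 65K token context windows and are currently free on Sarvam's first-party API.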
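
To put the parameter counts quoted in the post above in perspective, the following is a minimal Python sketch that computes the share of parameters active per token for each Sarvam model. The totals and active counts are the approximate figures from the post; the dictionary and function names are illustrative assumptions and are not part of any Sarvam or Artificial Analysis tooling.

```python
# Approximate (total, active-per-token) parameter counts in billions,
# as quoted in the Artificial Analysis post. Names here are illustrative.
MODELS = {
    "Sarvam 105B": (106.0, 10.0),
    "Sarvam 30B": (32.0, 2.4),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of a Mixture-of-Experts model's parameters used per token."""
    if total_b <= 0 or active_b < 0 or active_b > total_b:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_b / total_b


if __name__ == "__main__":
    for name, (total_b, active_b) in MODELS.items():
        share = active_fraction(total_b, active_b)
        print(f"{name}: about {share:.1%} of parameters active per token")
```

Both models activate under a tenth of their weights per token, which is consistent with the post's remark that their peers activate more parameters per token.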

Artificial Analysis @ArtificialAnlys · April 3

Google has released Gemma 4, a new family of multimodal open-weight models including Gemma 4 E2B, Gemma 4 E4B, Gemma 4 31B and Gemma 4 26B A4B @GoogleDeepMind’s new Gemma 4 family introduces four multimodal models supporting text, image, and video inputs. We evaluated Gemma 4 31B (dense) and Gemma 4 26B A4B (MoE), both with a 256k context window, while the other two smaller models support up to 128k. With 31B and 26B parameters respectively, both evaluated models can run on a single H100. On GPQA Diamond, our scientific reasoning evaluation, Gemma 4 31B (Reasoning) scores 85.7%, the second highest result we have recorded for an open-weights model with fewer than 40B parameters, just behind Qwen3.5 27B (Reasoning, 85.8%). It reaches this score using only ~1.2M output tokens, fewer than Qwen3.5 27B (~1.5M) and Qwen3.5 35B A3B (~1.6M). Gemma 4 26B A4B (Reasoning) scores 79.2%, ahead of gpt-oss-120B (high, 76.2%) but behind Qwen3.5 9B (Reasoning, 80.6%). We are now running the Artificial Analysis Intelligence Index on all four Gemma 4 models and will share a full update once those results are complete.

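Summary: Google DeepMind released Gemma 4, a family of four multimodal open-weight models supporting text, image, and video input. Gemma 4 31B (dense) and Gemma 4 26B A4B (MoE) have 256k context windows and can run on a single H100; the two smaller models support up to 128k. On GPQA Diamond, Gemma 4 31B (Reasoning) scores 85.7%, just behind Qwen3.5 27B (Reasoning, 85.8%), while using only about 1.2M output tokens. Gemma 4 26B A4B (Reasoning) scores 79.2%, ahead of gpt-oss-120B (high, 76.2%).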

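Sundar Pichai @sundarpichai · April 3

Gemma 4 is here, and it’s packing an incredible amount of intelligence per parameter 👇

Summary: The Gemma 4 open models are out in four sizes: 31B dense, 26B MoE, and effective 2B and 4B, aimed at raw performance, low latency, and edge devices respectively. Google DeepMind calls them the best open models for their sizes and stresses their high intelligence per parameter.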

Demis Hassabis @demishassabis · April 3

Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use - happy building!

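Summary: The Gemma 4 open models are available in four sizes: 31B dense for raw performance, 26B MoE for low latency, and effective 2B and 4B for edge devices. All can be fine-tuned for specific tasks.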

Nathan Lambert @natolambert · April 2

Nemotron Super / Ultra Arcee Trinity Large (soon) Gemma 4 (eventually) Reflection's first models (maybe) GPT OSS 2? (maybe) Thinky? Other neolabs? Things looking up for open models built in the US in 2026. We had 0 for a bit there.

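Summary: Lists open models built in the US that are out or expected in 2026: Nemotron Super/Ultra, Arcee Trinity Large (soon), Gemma 4 (eventually), and possibly Reflection's first models, GPT OSS 2, and Thinky. US-built open models had dropped to zero for a while, and things are now looking up.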

Tibo @thsottiaux · April 2

Ah nevermind, I actually remember we decided to have the core open-source for Codex because it would be awesome to see the ecosystem flourish as it's all so nascent and fun. And we would learn a lot in return. Phew.

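Summary: The core Codex repository had been public for 11 months, and the team only just noticed. Tibo then recalled that the core was deliberately open sourced to help the nascent ecosystem flourish and to learn from it in return.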

Deedy @deedydas · March 25

There’s a GitHub repo called MoneyPrinter with 20k+ stars. Its entire purpose is generating internet slop for profit (yes, including Twitter bots).

Summary: The GitHub repo MoneyPrinter has more than 20k stars. Its entire purpose is generating internet slop for profit, including Twitter bots.

Artificial Analysis @ArtificialAnlys · March 20

Mistral has released Mistral Small 4, an open weights model with hybrid reasoning and image input, scoring 27 on the Artificial Analysis Intelligence Index @MistralAI's Small 4 is a 119B mixture-of-experts model with 6.5B active parameters per token, supporting both reasoning and non-reasoning modes. In reasoning mode, Mistral Small 4 scores 27 on the Artificial Analysis Intelligence Index, a 12-point improvement from Small 3.2 (15) and now among the most intelligent models Mistral has released, surpassing Mistral Large 3 (23) and matching the proprietary Magistral Medium 1.2 (27). However, it lags open weights peers with similar total parameter counts such as gpt-oss-120B (high, 33), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, 36), and Qwen3.5 122B A10B (Reasoning, 42). Key takeaways: ➤ Reasoning and non-reasoning modes in a single model: Mistral Small 4 supports configurable hybrid reasoning with reasoning and non-reasoning modes, rather than the separate reasoning variants Mistral has released previously with their Magistral models. In reasoning mode, the model scores 27 on the Artificial Analysis Intelligence Index. In non-reasoning mode, the model scores 19, a 4-point improvement from its predecessor Mistral Small 3.2 (15) ➤ More token efficient than peers of similar size: At ~52M output tokens, Mistral Small 4 (Reasoning) uses fewer tokens to run the Artificial Analysis Intelligence Index compared to reasoning models such as gpt-oss-120B (high, ~78M), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, ~110M), and Qwen3.5 122B A10B (Reasoning, ~91M). In non-reasoning mode, the model uses ~4M output tokens ➤ Native support for image input: Mistral Small 4 is a multimodal model, accepting image input as well as text. On our multimodal evaluation, MMMU-Pro, Mistral Small 4 (Reasoning) scores 57%, ahead of Mistral Large 3 (56%) but behind Qwen3.5 122B A10B (Reasoning, 75%). Neither gpt-oss-120B nor NVIDIA Nemotron 3 Super 120B A12B support image input. 
All models support text output only ➤ Improvement in real-world agentic tasks: Mistral Small 4 scores an Elo of 871 on GDPval-AA, our evaluation based on OpenAI's GDPval dataset that tests models on real-world tasks across 44 occupations and 9 major industries, with models producing deliverables such as documents, spreadsheets, and diagrams in an agentic loop. This is more than double the Elo of Small 3.2 (339) and close to Mistral Large 3 (880), but behind gpt-oss-120B (high, 962), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, 1021), and Qwen3.5 122B A10B (Reasoning, 1130) ➤ Lower hallucination rate than peer models of similar size: Mistral Small 4 scores -30 on AA-Omniscience, our evaluation of knowledge reliability and hallucination, where scores range from -100 to 100 (higher is better) and a negative score indicates more incorrect than correct answers. Mistral Small 4 scores ahead of gpt-oss-120B (high, -50), Qwen3.5 122B A10B (Reasoning, -40), and NVIDIA Nemotron 3 Super 120B A12B (Reasoning, -42) Key model details: ➤ Context window: 256K tokens (up from 128K on Small 3.2) ➤ Pricing: $0.15/$0.6 per 1M input/output tokens ➤ Availability: Mistral first-party API only. At native FP8 precision, Mistral Small 4's 119B parameters require ~119GB to self-host the weights (more than the 80GB of HBM3 memory on a single NVIDIA H100) ➤ Modality: Image and text input with text output only ➤ Licensing: Apache 2.0 license

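Summary: Mistral released Mistral Small 4, an open-weights 119B MoE model with 6.5B active parameters per token, configurable reasoning and non-reasoning modes, and image input. In reasoning mode it scores 27 on the Artificial Analysis Intelligence Index, surpassing Mistral Large 3 (23) but behind peers such as gpt-oss-120B (high, 33). It is more token efficient than similarly sized peers, has a lower hallucination rate (-30 on AA-Omniscience), supports a 256K context window, and is licensed under Apache 2.0.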
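
The self-hosting figure in the post above (about 119GB for 119B parameters at FP8, versus 80GB on one H100) follows from simple arithmetic, which can be sketched in Python as below. This is a rough estimate under stated assumptions: it counts weights only (no KV cache, activations, or runtime overhead), takes 1 GB as 10^9 bytes, and uses hypothetical helper names. The 0.5 bytes-per-parameter case is purely illustrative and is not a claim that such a quantization of this model exists.

```python
H100_HBM_GB = 80  # single-GPU memory figure quoted in the post


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    if params_billions < 0 or bytes_per_param <= 0:
        raise ValueError("parameter count must be >= 0 and bytes_per_param > 0")
    return params_billions * bytes_per_param


def fits_on_one_gpu(params_billions: float, bytes_per_param: float,
                    gpu_gb: float = H100_HBM_GB) -> bool:
    """True if the weights alone fit in a single GPU's memory."""
    return weight_memory_gb(params_billions, bytes_per_param) <= gpu_gb


if __name__ == "__main__":
    # FP8 stores one byte per parameter, as in the post's estimate.
    print("FP8 weights (GB):", weight_memory_gb(119, 1))
    print("Fits on one H100 at FP8:", fits_on_one_gpu(119, 1))
    # Hypothetical 4-bit storage (0.5 bytes per parameter), for illustration only.
    print("Fits on one H100 at 0.5 bytes/param:", fits_on_one_gpu(119, 0.5))
```

Because the estimate ignores everything except weights, a model that passes this check can still fail to fit in practice once the KV cache for long contexts is included.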

Anthropic @AnthropicAI · March 18

The open source ecosystem underpins nearly every software system in the world. As AI grows more capable, open source security becomes increasingly important. We're donating to the Linux Foundation to continue to help secure the foundations AI runs on.

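Summary: Anthropic announced a donation to the Linux Foundation, joining AWS, GitHub, Google, DeepMind, Microsoft, OpenAI, and other technology companies in committing $12.5 million. The funding goes through Alpha-Omega and OpenSSF to advance open source security and help secure the foundations that the world's software systems and AI run on.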

Andrej Karpathy @karpathy · March 9

The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them. Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents on all kinds of different research directions or for different compute platforms. Git(Hub) is *almost* but not really suited for this. It has a softly built in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later. I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run: https://github.com/karpathy/autoresearch/discussions/43 Alternatively, a PR has the benefit of exact commits: https://github.com/karpathy/autoresearch/pull/44 but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits. But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back. I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.

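Summary: The next step for autoresearch should be asynchronous, massively collaborative work among agents, in the style of SETI@home. The goal is to emulate a research community rather than a single PhD student. Git and GitHub softly assume a single master branch, which fits this poorly; agents should be able to explore different directions in parallel across arbitrary branches and share findings through Discussions or PRs without merging. As intelligence, attention, and tenacity cease to be bottlenecks, existing collaboration abstractions will come under stress.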

Jim Fan @DrJimFan · February 25

What can half of GPT-1 do? We trained a 42M transformer called SONIC to control the body of a humanoid robot. It takes a remarkable amount of subconscious processing for us humans to squat, turn, crawl, sprint. SONIC captures this "System 1" - the fast, reactive whole-body intelligence - in a single model that translates any motion command into stable, natural motor signals. And it's all open-source!! The key insight: motion tracking is the one, true scalable task for whole body control. Instead of hand-engineering rewards for every new skill, we use dense, frame-by-frame supervision from human mocap data. The data itself encodes the reward function: "configure your limbs in any human-like position while maintaining balance". We scaled humanoid motion RL to an unprecedented scale: 100M+ mocap frames and 500,000+ parallel robots across 128 GPUs. NVIDIA Isaac Lab allows us to accelerate physics at 10,000x faster tick, giving robots many years of virtual experience in only hours of wall clock time. After 3 days of training, the neural net transfers zero-shot to the real G1 robot with no finetuning. 100% success rate across 50 diverse real-world motion sequences. One SONIC policy supports all of the following: - VR whole-body teleoperation - Human video. Just point a webcam to live stream motions. - Text prompts. "Walk sideways", "dance like a monkey", "kick your left foot", etc. - Music audio. The robot dances to the beat, adapting to tempo and rhythm. - VLA foundation models. We plugged in GR00T N1.5 and achieved 95% success on mobile tasks. We open-source the code and model checkpoints!! Deep dive in thread:

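Summary: SONIC is a 42M-parameter transformer, described as half the size of GPT-1, trained in NVIDIA Isaac Lab on 100M+ mocap frames with 500,000+ parallel robots, using dense frame-by-frame supervision instead of hand-engineered rewards. After 3 days of training it transfers zero-shot to the real G1 robot, with a 100% success rate across 50 motion sequences. One policy supports VR teleoperation, human video, text prompts, music audio, and VLA foundation models. The code and model checkpoints are open source.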

Jim Fan @DrJimFan · January 31

I still remember the excitement in 2023 when Stanford Smallville was launched. It was the largest multi-agent sim back then - yes, 25 bots felt like a lot. Today it's the "Bigville" moment. We are seeing a nascent, massive-scale alien civilization sim unfolding in real time: orders of magnitude more agents, way higher IQ, in-the-wild access to the internet, backed by the full arsenal of MCPs. What can possibly go wrong?

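Summary: Recalls the 2023 launch of Stanford Smallville, then the largest multi-agent simulation at 25 bots, and calls today the "Bigville" moment: a massive-scale simulation unfolding in real time with orders of magnitude more agents, much higher IQ, in-the-wild internet access, and the full arsenal of MCPs. What can possibly go wrong? [Quoting @DrJimFan]: The famed Stanford Smallville is officially open-source! 25 AI agents inhabit a digital Westworld, unaware that they are living in a simulation. They go to work, gossip, organize social events, make new friends, and even fall in love. Each has a unique personality and backstory. Smallville is one of the most inspiring AI agent experiments of 2023. We often talk about a single LLM's emergent abilities, but multi-agent emergence could be even more complex and fascinating at scale. A population of AIs can play out the evolution of an entire civilization. Endless new possibilities lie ahead, and gaming will feel the impact first. Github: https://github.com/joonspk-research/generative_agents Paper: https://arxiv.org/abs/2304.03442 Authors: @joon_s_pk @joseph_c_obrien @carriejcai @merrierm @percyliang @msbernst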

Saining Xie @sainingxie · January 24

> "rae can’t scale" > "rae can’t generalize past imagenet" > "rae can’t do details" > instead of arguing online > students put heads down > try it at real t2i scale > results come back > look extremely bullish > shoutout to peter, boyang, austin > and everyone who shipped > code, model, data > all open-sourced 👇

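Summary: In response to claims that RAE cannot scale, cannot generalize past ImageNet, and cannot handle details, the students skipped the online argument and tried it at real T2I scale. The results look extremely bullish. Shoutout to Peter, Boyang, Austin, and everyone who shipped. Code, model, and data are all open source. [Quoting @TongPetersb]: Last October, we introduced Representation Autoencoders (RAE), showing that training diffusion on frozen semantic representations is feasible and outperforms VAEs on ImageNet. We received many questions: can this scale to complex settings like T2I, and does the advantage persist? The answer is yes. 🧵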

Saining Xie @sainingxie · November 28

it may seem like an ordinary day, but it could become the strangest moment in peer review and open science please please please treat our community with care. it’s already so fragile. don’t let it die.

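Summary: Today may seem ordinary, but it could become the strangest moment in peer review and open science. He urges everyone to treat the community with care: it is already fragile and should not be allowed to die. [Quoting @iclr_conf]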

Jeff Dean @JeffDean · September 16

Read how work by @UChicago, building on the AI-based NeuralGcM weather model developed & open-sourced by @GoogleResearch, is being used to more accurately predict the monsoon season in India and support farmer decision-making for 38M farmers in India. 🌱 https://blog.google/technology/research/indian-farmers-monsoon-prediction/

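Summary: The University of Chicago, building on the AI-based NeuralGCM weather model developed and open-sourced by Google Research, is predicting India's monsoon season more accurately and supporting planting decisions for 38 million farmers.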

Noam Brown @polynoamial · August 6

Our new @OpenAI open models

Summary: OpenAI released two new open models. The official post says the open models are here, "Both of them"; details are at openai.com/open-models.

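Hao AI Lab @haoailab · August 5 · 67

Try FastWan at https://fastwan.fastvideo.org/!

Summary: The FastVideo team introduced the FastWan series of fast video generation models. They are trained with a new recipe called "sparse distillation" that speeds up video denoising by 70x, so a 5-second video can be generated in 5 seconds on a single H200 GPU. The team provides a live demo and has fully open sourced the models, code, and data under the Apache-2.0 license.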

Hao AI Lab @haoailab · August 5

(1/n) 🚀 With FastVideo, you can now generate a 5-second video in 5 seconds on a single H200 GPU! Introducing FastWan series, a family of fast video generation models trained via a new recipe we term as “sparse distillation”, to speed up video denoising time by 70X! 🖥️ Live demo: https://fastwan.fastvideo.org/ (Thanks to @gmicloud for the support!) 🔗 Blog: https://hao-ai-lab.github.io/blogs/fastvideo_post_training/ 🔓 We fully open-source our models, code, and data with Apache-2.0 licenses

Summary: (1/n) 🚀 With FastVideo, a 5-second video can now be generated in 5 seconds on a single H200 GPU.

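Yann LeCun @ylecun · July 2

Embrace openness.

Summary: After the DeepSeek moment, AI talent is moving from the more closed OpenAI and Anthropic to Meta, which embraces open science and open source. This "embrace openness" trend benefits transparency, scientific progress, and safety oversight. OpenAI has promised an open-weights model this summer, which may change the picture.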

DeepSeek @deepseek_ai · May 29 · 68

🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark performance 🔹 Enhanced front-end capabilities 🔹 Reduced hallucinations 🔹 Supports JSON output & function calling ✅ Try it now: https://chat.deepseek.com/ 🔌 No change to API usage — docs here: https://api-docs.deepseek.com/guides/reasoning_model 🔗 Open-source weights: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528

Summary: DeepSeek-R1-0528 is released, with improved benchmark performance, enhanced front-end capabilities, reduced hallucinations, and support for JSON output and function calling. API usage is unchanged, and the open-source weights are available on Hugging Face.

Jim Fan @DrJimFan · March 21

We got lots of great community feedback on our open-source GR00T N1! Check out our Github, star, fork, contribute back! Let's solve generally intelligent robots together, one commit at a time. https://github.com/NVIDIA/Isaac-GR00T/

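Summary: NVIDIA released GR00T N1, described as the world's first open foundation model for humanoid robots. It has only 2B parameters and uses a VLM plus Diffusion Transformer architecture for end-to-end control. It is trained on real teleoperation data, 300K+ simulation trajectories, and synthetic neural trajectories, improves task performance by 30% on robots such as GR1 and 1X Neo, and can be deployed across embodiments down to very low-cost open-source robot arms.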

DeepSeek @deepseek_ai · February 21

🚀 Day 0: Warming up for #OpenSourceWeek! We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency. These humble building blocks in our online service have been documented, deployed and battle-tested in production. As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey. Daily unlocks are coming soon. No ivory towers - just pure garage-energy and community-driven innovation.

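Summary: DeepSeek previewed its Open Source Week, during which it will open source 5 repos starting next week. Describing itself as a tiny team exploring AGI, it plans to share with full transparency building blocks that have been documented, deployed, and battle-tested in production. The team believes shared code becomes collective momentum, and frames the release as "garage energy" and community-driven innovation rather than ivory-tower development.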

Lilian Weng @lilianweng · February 19

This is something we have been cooking together for a few months and I'm very excited to announce it today. Thinking Machines Lab is my next adventure and I'm feeling very proud and lucky to start it with a group of talented colleagues. Learn more about our vision at https://thinkingmachines.ai/

Summary: This is something we have been preparing together for the past few months, and I am very excited to announce it today.

All AI updates
Full feed of AI-related news
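April 11
22:10
Nathan Lambert @natolambert
Predicts that over the next two years, as large models become more costly and more valuable internally, the current funding models for frontier open models will become unsustainable. Although he says he dislikes consortia, he argues that alternatives are needed and that the open ecosystem cannot rely on just one or two for-profit companies.

Interconnects: The inevitable need for an open model consortium And yes, I hate consortia too. https://www.interconnects.ai/p/the-inevi...

Expert opinion · Open-source ecosystem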
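April 10
05:33
Nathan Lambert @natolambert
Do not fall for anti-open-model fearmongering, but acknowledge that AI capabilities are advancing fast and that more caution with open-weight models may eventually be warranted. The author does not think Claude Mythos is that trigger, but is not fully certain.
Anthropic · Expert opinion · Safety/Alignment · Open-source ecosystem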
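April 9
22:58
Nathan Lambert @natolambert
Responding to the claim that China's AI companies are going closed, he notes that this is only an early signal in a long transition toward a hybrid of open and closed models. China may still produce more open models than the US, the cultural default of openness is unlikely to fade, and the shift is expected to take years.
Expert opinion · Open-source ecosystem · Trends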
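08:05
Jeff Dean @JeffDean
Gemma 4 passed 10 million downloads within a week of release, and the Gemma family has now been downloaded more than 500 million times. Sundar Pichai shared the figures and said he looks forward to seeing what developers build with the model.

Sundar Pichai: Lots of love for Gemma 4! Team just told me it's already had 10M+ downloads since last week's launch. Gemma models have ...

Google · Open-source ecosystem · Model release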
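06:57
Sundar Pichai @sundarpichai
Google's open model Gemma 4 has passed 10 million downloads just one week after launch, and cumulative downloads of the Gemma family now exceed 500 million. The figures reflect strong developer interest in the latest open models. Pichai welcomed the response and said he is excited to see what users create with Gemma 4.
Google · Open-source ecosystem · Model release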
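01:01
Ethan Mollick @emollick
Meta released Muse Spark, the first model from MSL, built on an AI stack rebuilt over nine months and already integrated into Meta AI. Unlike the Llama series, the model is not open weights, and its performance still trails the current series of releases, so its long-term value is hard to predict.

Alexandr Wang: 1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new...

Meta · Expert opinion · Open-source ecosystem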
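00:59
Epoch AI @EpochAIResearch
Chinese and open-model AI labs have roughly one tenth of the compute of the frontier, but they can distill frontier models, replicate innovations quickly, and draw on enormous talent. @ansonwhho examines whether that is enough to compete at the frontier.
Open-source ecosystem · Data/Training · Trends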
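April 8
22:43
Nathan Lambert @natolambert
Latest open model adoption report: Chinese models continue to lead

Drawing on data from Interconnects and The ATOM Project, the report manually curates about 1.5K important language models and analyzes open model adoption using downloads, derivative models, and inference share on OpenRouter. Adoption of Chinese models such as Qwen and Kimi continues to accelerate, with Qwen 3.5, Nemotron 3, and Kimi K2.5 standing out on the Relative Adoption Metric (RAM). Large models remain the segment where Qwen is least competitive. The work aims to provide better public data and trend insight on the open model ecosystem.

Open-source ecosystem · Trends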
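April 4
11:00
SemiAnalysis @SemiAnalysis_
NVIDIA accused of backtracking on open-source promises; DGX Lepton core still closed

NVIDIA is again being questioned over an unfulfilled promise to open source DGX Lepton. The company has said the software will be open sourced, but so far only peripheral components such as the GPU monitoring agent have been released, while the core platform remains closed. NIMS went through a similar dispute: after community outrage, NVIDIA open sourced only small parts of it. The author argues this looks like NVIDIA's usual playbook of promising open source and then releasing only non-core pieces.

Open-source ecosystem · Industry news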
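02:10
Nathan Lambert @natolambert
Featured
What determines an open model's success is often not benchmark scores but immediate and long-term tooling support and finetunability. Gemma has struggled here in the past, while Qwen has excelled.

Interconnects: Gemma 4 and what makes an open model succeed Hint: it's not benchmark scores. https://www.interconnects.ai/p/gemma-4-and...

Google · Expert opinion · Open-source ecosystem · Data/Training

Why recommended: Argues that the key to an open model's success is tooling and fine-tuning support rather than benchmark scores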
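00:37
François Chollet @fchollet
The Keras team is holding a community call today at 10am PT, starting in 25 minutes. The call is open to all: join to learn about the latest features and what is next, and to ask questions.
Open source/Repos · Open-source ecosystem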
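April 3
22:01
Demis Hassabis @demishassabis
Featured
Gemma 4 outperforms models more than 10x its size on benchmarks. The chart's x-axis is on a log scale, underscoring the model's parameter efficiency.
DeepMind · Google · Open-source ecosystem · Model release

Why recommended: Google releases the small open model Gemma 4, which outperforms models more than 10x its size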
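11:57
Artificial Analysis @ArtificialAnlys
India releases Sarvam 105B and 30B, its largest open models pre-trained from scratch

Sarvam AI released Sarvam 105B and Sarvam 30B, India's largest open-weights models pre-trained from scratch. Both are MoE models trained in India. They score 18 and 12 on the Intelligence Index and support reasoning and non-reasoning modes. Sarvam 105B does better than some peers on agentic tasks but trails on TerminalBench Hard and has a high hallucination rate. Both are released under Apache 2.0 with 128K and 65K token context windows and are currently free on Sarvam's first-party API.

Open-source ecosystem · Reasoning · Model release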
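01:09
Artificial Analysis @ArtificialAnlys
Featured
Google releases the Gemma 4 family of multimodal open models

Google DeepMind released Gemma 4, a family of four multimodal open-weight models supporting text, image, and video input. Gemma 4 31B (dense) and Gemma 4 26B A4B (MoE) have 256k context windows and can run on a single H100; the two smaller models support up to 128k. On GPQA Diamond, Gemma 4 31B (Reasoning) scores 85.7%, just behind Qwen3.5 27B (Reasoning, 85.8%), while using only about 1.2M output tokens. Gemma 4 26B A4B (Reasoning) scores 79.2%, ahead of gpt-oss-120B (high, 76.2%).

DeepMind · Google · Multimodal · Open-source ecosystem
Related discussions (2): X: Artificial Analysis (@ArtificialAnlys) · X: Jeff Dean (@JeffDean)
Why recommended: Google releases the multimodal open model family Gemma 4, which runs on a single H100 and shows strong scientific reasoning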
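00:13
Sundar Pichai @sundarpichai
Featured
The Gemma 4 open models are out in four sizes: 31B dense, 26B MoE, and effective 2B and 4B, aimed at raw performance, low latency, and edge devices respectively. Google DeepMind calls them the best open models for their sizes and stresses their high intelligence per parameter.

Demis Hassabis: Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can b...

Google · Open-source ecosystem · Model release · On-device

Why recommended: Google releases Gemma 4 open models in four sizes, covering cloud through on-device use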
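00:08
Demis Hassabis @demishassabis
Featured
The Gemma 4 open models are available in four sizes: 31B dense for raw performance, 26B MoE for low latency, and effective 2B and 4B for edge devices. All can be fine-tuned for specific tasks.
DeepMind · Google · Open-source ecosystem · Model release

Why recommended: Google releases Gemma 4 open models in sizes from 2B to 31B, with on-device and MoE options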
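April 2
08:25
Nathan Lambert @natolambert
Lists open models built in the US that are out or expected in 2026: Nemotron Super/Ultra, Arcee Trinity Large (soon), Gemma 4 (eventually), and possibly Reflection's first models, GPT OSS 2, and Thinky. US-built open models had dropped to zero for a while, and things are now looking up.
Google · OpenAI · Open-source ecosystem · Trends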
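07:16
Tibo @thsottiaux
The core Codex repository had been public for 11 months, and the team only just noticed. Tibo then recalled that the core was deliberately open sourced to help the nascent ecosystem flourish and to learn from it in return.

Tibo: Whaaaa. Only realized now and apparently our repo was public since 11 months ago and noone told us?!

OpenAI · Open source/Repos · Open-source ecosystem · Coding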
March 25
00:11
Deedy @deedydas
The GitHub repo MoneyPrinter has more than 20k stars. Its entire purpose is generating internet slop for profit, including Twitter bots.
GitHub · Open-source ecosystem · Trends
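March 20
19:48
Artificial Analysis @ArtificialAnlys
Featured
Mistral releases open model Small 4 with hybrid reasoning and image understanding

Mistral released Mistral Small 4, an open-weights 119B MoE model with 6.5B active parameters per token, configurable reasoning and non-reasoning modes, and image input. In reasoning mode it scores 27 on the Artificial Analysis Intelligence Index, surpassing Mistral Large 3 (23) but behind peers such as gpt-oss-120B (high, 33). It is more token efficient than similarly sized peers, has a lower hallucination rate (-30 on AA-Omniscience), supports a 256K context window, and is licensed under Apache 2.0.

Multimodal · Open-source ecosystem · Reasoning · Model release

Why recommended: Mistral releases open-weights Small 4 with hybrid reasoning and multimodal input, and a large gain on agentic tasks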
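March 18
00:11
Anthropic @AnthropicAI
Anthropic announced a donation to the Linux Foundation, joining AWS, GitHub, Google, DeepMind, Microsoft, OpenAI, and other technology companies in committing $12.5 million. The funding goes through Alpha-Omega and OpenSSF to advance open source security and help secure the foundations that the world's software systems and AI run on.

The Linux Foundation: The Linux Foundation Announces $12.5 Million in Grant Funding (via @AlphaOmegaOSS and @OpenSSF) @AnthropicAI , @AmazonWe...

Anthropic · Open-source ecosystem · Industry news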
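March 9
02:00
Andrej Karpathy @karpathy
Featured
Next stop for autoresearch: an asynchronous, collaborative community of AI researchers

The next step for autoresearch should be asynchronous, massively collaborative work among agents, in the style of SETI@home. The goal is to emulate a research community rather than a single PhD student. Git and GitHub softly assume a single master branch, which fits this poorly; agents should be able to explore different directions in parallel across arbitrary branches and share findings through Discussions or PRs without merging. As intelligence, attention, and tenacity cease to be bottlenecks, existing collaboration abstractions will come under stress.

Agents · GitHub · Expert opinion · Open-source ecosystem
Related discussions (1): X: Andrej Karpathy (@karpathy)
Why recommended: A leading AI scientist proposes a new paradigm for agent-driven research, shifting from emulating an individual to building a distributed collaborative network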
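February 25
01:34
Jim Fan @DrJimFan
Featured
SONIC: a whole-body robot control model half the size of GPT-1

SONIC is a 42M-parameter transformer, described as half the size of GPT-1, trained in NVIDIA Isaac Lab on 100M+ mocap frames with 500,000+ parallel robots, using dense frame-by-frame supervision instead of hand-engineered rewards. After 3 days of training it transfers zero-shot to the real G1 robot, with a 100% success rate across 50 motion sequences. One policy supports VR teleoperation, human video, text prompts, music audio, and VLA foundation models. The code and model checkpoints are open source.

Agents · Embodied AI · Open-source ecosystem · Model release

Why recommended: A 42M model achieves whole-body humanoid control, transfers zero-shot to real hardware, and is fully open source, so developers can reproduce it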
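January 31
08:18
Jim Fan @DrJimFan
Featured
Recalls the 2023 launch of Stanford Smallville, then the largest multi-agent simulation at 25 bots, and calls today the "Bigville" moment: a massive-scale simulation unfolding in real time with orders of magnitude more agents, much higher IQ, in-the-wild internet access, and the full arsenal of MCPs. What can possibly go wrong? [Quoting @DrJimFan]: The famed Stanford Smallville is officially open-source! 25 AI agents inhabit a digital Westworld, unaware that they are living in a simulation. They go to work, gossip, organize social events, make new friends, and even fall in love. Each has a unique personality and backstory. Smallville is one of the most inspiring AI agent experiments of 2023. We often talk about a single LLM's emergent abilities, but multi-agent emergence could be even more complex and fascinating at scale. A population of AIs can play out the evolution of an entire civilization. Endless new possibilities lie ahead, and gaming will feel the impact first. Github: https://github.com/joonspk-research/generative_agents Paper: https://arxiv.org/abs/2304.03442 Authors: @joon_s_pk @joseph_c_obrien @carriejcai @merrierm @percyliang @msbernst

Jim Fan: The famed Stanford Smallville is officially open-source! 25 AI agents inhabit a digital Westworld, unaware that they are...

Agents · Open source/Repos · Open-source ecosystem

Why recommended: A classic agent experiment is open sourced, letting individual developers build AI virtual societies and observe emergent behavior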
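January 24
06:40
Saining Xie @sainingxie
In response to claims that RAE cannot scale, cannot generalize past ImageNet, and cannot handle details, the students skipped the online argument and tried it at real T2I scale. The results look extremely bullish. Shoutout to Peter, Boyang, Austin, and everyone who shipped. Code, model, and data are all open source. [Quoting @TongPetersb]: Last October, we introduced Representation Autoencoders (RAE), showing that training diffusion on frozen semantic representations is feasible and outperforms VAEs on ImageNet. We received many questions: can this scale to complex settings like T2I, and does the advantage persist? The answer is yes. 🧵

Peter Tong: Last October, we introduced Representation Autoencoders (RAE), showing that training diffusion on frozen semantic repres...

Image generation · Open-source ecosystem · Papers/Research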
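November 28
02:07
Saining Xie @sainingxie
Today may seem ordinary, but it could become the strangest moment in peer review and open science. He urges everyone to treat the community with care: it is already fragile and should not be allowed to die. [Quoting @iclr_conf]
Expert opinion · Open-source ecosystem · Papers/Research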
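September 16
05:02
Jeff Dean @JeffDean
The University of Chicago, building on the AI-based NeuralGCM weather model developed and open-sourced by Google Research, is predicting India's monsoon season more accurately and supporting planting decisions for 38 million farmers.
Google · Open-source ecosystem · Trends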
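August 6
01:06
Noam Brown @polynoamial
Featured
OpenAI released two new open models. The official post says the open models are here, "Both of them"; details are at openai.com/open-models.

OpenAI: Our open models are here. Both of them. http://openai.com/open-models

OpenAI · Open-source ecosystem · Model release

Why recommended: OpenAI makes a rare release of open-weights models, marking a major shift in strategy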
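August 5
05:25
Hao AI Lab @haoailab
Featured · 67
The FastVideo team introduced the FastWan series of fast video generation models. They are trained with a new recipe called "sparse distillation" that speeds up video denoising by 70x, so a 5-second video can be generated in 5 seconds on a single H200 GPU. The team provides a live demo and has fully open sourced the models, code, and data under the Apache-2.0 license.

Hao AI Lab: (1/n) 🚀 With FastVideo, you can now generate a 5-second video in 5 seconds on a single H200 GPU! Introducing FastWan se...

Open-source ecosystem · Model release · Video · Deployment/Engineering

Why recommended: Video generation has finally moved from "wait a minute" to near real-time output. FastWan uses sparse distillation to cut denoising time by 70x, producing a 5-second video in 5 seconds on a single H200. Teams building short-video tools and real-time interactive products should take a serious look at this open-source option.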
04:53
Hao AI Lab @haoailab
(1/n) 🚀 With FastVideo, a 5-second video can now be generated in 5 seconds on a single H200 GPU.
Open-source ecosystem · Model release · Video · Deployment/Engineering
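July 2
21:23
Yann LeCun @ylecun
Featured
After the DeepSeek moment, AI talent is moving from the more closed OpenAI and Anthropic to Meta, which embraces open science and open source. This "embrace openness" trend benefits transparency, scientific progress, and safety oversight. OpenAI has promised an open-weights model this summer, which may change the picture.

Nirit Weiss-Blatt, PhD: In the current AI talent war, everyone is focused on the big numbers (alleged compensation packages). It misses the bigg...

Meta · OpenAI · Expert opinion · Open-source ecosystem

Why recommended: LeCun on the link between the post-DeepSeek flow of AI talent to Meta and open-source culture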
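May 29
20:11
DeepSeek @deepseek_ai
Featured · 68
🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark performance 🔹 Enhanced front-end capabilities 🔹 Reduced hallucinations 🔹 Supports JSON output and function calling ✅ Try it now: https://chat.deepseek.com/ 🔌 No change to API usage; docs here: https://api-docs.deepseek.com/guides/reasoning_model 🔗 Open-source weights: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
DeepSeek · Open-source ecosystem · Reasoning · Model release
Related discussions (1): X: DeepSeek (@deepseek_ai)
Why recommended: A routine iteration of DeepSeek-R1. Reduced hallucinations and JSON output are practical improvements, but it is far from a generational leap. The open weights are ready to use, and teams building reasoning-chain products may find it worth half an hour to try.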
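March 21
01:01
Jim Fan @DrJimFan
Featured
NVIDIA released GR00T N1, described as the world's first open foundation model for humanoid robots. It has only 2B parameters and uses a VLM plus Diffusion Transformer architecture for end-to-end control. It is trained on real teleoperation data, 300K+ simulation trajectories, and synthetic neural trajectories, improves task performance by 30% on robots such as GR1 and 1X Neo, and can be deployed across embodiments down to very low-cost open-source robot arms.

Jim Fan: Excited to announce GR00T N1, the world's first open foundation model for humanoid robots! We are on a mission to democr...

Embodied AI · Open-source ecosystem · Model release

Why recommended: NVIDIA open sources GR00T N1, the first general-purpose foundation model for humanoid robots; at 2B parameters it can be deployed on very low-cost robot arms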
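February 21
12:00
DeepSeek @deepseek_ai
DeepSeek kicks off Open Source Week: 5 repos from its AGI work to be open sourced

DeepSeek previewed its Open Source Week, during which it will open source 5 repos starting next week. Describing itself as a tiny team exploring AGI, it plans to share with full transparency building blocks that have been documented, deployed, and battle-tested in production. The team believes shared code becomes collective momentum, and frames the release as "garage energy" and community-driven innovation rather than ivory-tower development.

DeepSeek · Open source/Repos · Open-source ecosystem · Deployment/Engineering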
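February 19
02:48
Lilian Weng @lilianweng
This is something we have been preparing together for the past few months, and I am very excited to announce it today.

Thinking Machines: Today, we are excited to announce Thinking Machines Lab (https://thinkingmachines.ai/), an artificial intelligence resea...

Open-source ecosystem · Industry news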