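Just came across gpt-oss-20b-tq3 on Hugging Face, and it's honestly pretty great! OpenAI's own open-weight 20B-parameter MoE model, after the community applied TurboQuant 3-bit quantization plus MLX optimization, can run smoothly and locally on an ordinary MacBook. No internet connection, no monthly fee, and it supports a 131K long context. Everyday chat, writing, and coding can now all be handled on your own laptop. A good fit for some company departments! Running a large model locally used to require a high-end GPU; now an M-series Mac is enough. Model link 👉 https://huggingface.co/manjunathshiva/gpt-oss-20b-tq3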

AK @_akhaliq · May 6 · 61

Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs paper: https://huggingface.co/papers/2605.00814

AK @_akhaliq · May 5 · 68

UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors paper: https://huggingface.co/papers/2605.00658

AK @_akhaliq · May 2 · 56

Heterogeneous Scientific Foundation Model Collaboration paper: https://huggingface.co/papers/2604.27351


Ant Ling @AntLingAGI · May 1 · 76

Ecosystem-first approach continued! Ling-2.6-1T officially landed on @huggingface and the official inference is now live via @novita_labs. Experience the efficiency of Ling-2.6-1T for yourself, front and center on HF model card page! 🔥

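Summary: The AntLingAGI team announced that Ling-2.6-1T is officially open-sourced, available on Hugging Face, with official inference provided through Novita Labs. The model uses a mixture-of-experts architecture with 1 trillion total parameters and 63 billion active parameters, and its core optimization target is "token efficiency" for real production needs. Specifically: low token overhead, keeping strong intelligence without lengthy reasoning chains; reliable multi-step execution, with better control over instructions, tools, context, and workflows; and production-ready deployment, covering tasks from code generation to bug fixing, with broad compatibility across agent frameworks. The team aims to create value for developers by making it easier to test, deploy, customize, and build.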

Berryxia.AI @berryxia · April 30 · 59

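🚀 Qwen has open-sourced Qwen-Scope! A full sparse autoencoder (SAE) suite is officially out, turning SAE features into practical tools you can actually use and giving model interpretability a real boost.
1. Inference: steer outputs by directly manipulating internal features, with no prompt engineering at all
2. Data: classify and synthesize targeted data from very few seed samples, addressing long-tail capabilities
3. Training: trace code-switching and repetitive generation precisely to their source and fix them at the root
4. Evaluation: analyze feature activation patterns to pick benchmarks intelligently and cut redundancy
A deep interpretability tool for the Qwen model family; community, come dig out new mechanisms and applications! Project: https://huggingface.co/collections/Qwen/qwen-scope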

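Summary: Qwen has open-sourced Qwen-Scope, a full sparse autoencoder suite designed for the Qwen model family that aims to turn SAE features into practical tools. It offers four core functions: for inference, directly manipulating internal features to control output without relying on prompt engineering; for data, classifying and synthesizing targeted data from very few samples to strengthen long-tail capabilities; for training, precisely tracing problems such as code-switching and repetitive generation to their source and fixing them; for evaluation, analyzing feature activation patterns to select benchmarks intelligently and cut redundancy. Qwen hopes the community will use the tool to explore model internals and build more applications.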

Qwen @Alibaba_Qwen · April 30 · 73

Today we’re releasing Qwen-Scope 🔭, an open suite of sparse autoencoders for the Qwen model family. It turns SAE features into practical tools:
🎯 Inference: Steer model outputs by directly manipulating internal features, no prompt engineering needed
📂 Data: Classify & synthesize targeted data with minimal seed examples, boosting long-tail capabilities
🏋️ Training: Trace code-switching & repetitive generation back to their source, fix them at the root
📊 Evaluation: Analyze feature activation patterns to select smarter benchmarks and cut redundancy
We hope the community uses Qwen-Scope to uncover new mechanisms inside Qwen models and build applications beyond what we explored. Excited to see what you build! 🚀
🔗 Blog: https://qwen.ai/blog?id=qwen-scope
HuggingFace: https://huggingface.co/collections/Qwen/qwen-scope
ModelScope: https://modelscope.cn/collections/Qwen/Qwen-Scope
Technical Report: https://qianwen-res.oss-accelerate.aliyuncs.com/qwen-scope/Qwen_Scope.pdf

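Summary: The Qwen team has released Qwen-Scope, an open-source sparse autoencoder suite that turns SAE features into practical tools. It supports four application areas: steering model outputs by directly manipulating internal features, with no prompt engineering; classifying and synthesizing targeted data from very few samples to improve long-tail capabilities; tracing code-switching and repetitive generation to their source and fixing them; and analyzing feature activation patterns to improve benchmark selection and cut redundancy. The team hopes the community will use Qwen-Scope to explore the internals of Qwen models and build applications beyond the scope of the existing work. The related resources are publicly available.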
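To make "steering by directly manipulating internal features" concrete, here is a minimal toy sketch of the general SAE idea: encode an activation into features, then push the activation along one feature's decoder direction. This is not the Qwen-Scope API. The weights, the tied encoder/decoder, and the names `encode` and `steer` are all illustrative assumptions.

```python
from typing import List

Vector = List[float]

# Toy SAE over a 3-dimensional activation with 4 features.
# All weights are made up for illustration; encoder and decoder are tied.
W_DEC: List[Vector] = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
]
B_ENC: Vector = [-0.1, -0.1, -0.1, -0.1]


def encode(activation: Vector) -> Vector:
    """Feature activations: ReLU(W_enc @ x + b), with W_enc tied to W_DEC."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, activation)) + b)
        for row, b in zip(W_DEC, B_ENC)
    ]


def steer(activation: Vector, feature: int, strength: float) -> Vector:
    """Push the activation along one feature's decoder direction."""
    return [x + strength * d for x, d in zip(activation, W_DEC[feature])]


activation = [0.2, 0.0, 0.0]
before = encode(activation)
after = encode(steer(activation, feature=1, strength=2.0))
print("feature 1 before:", before[1], "after:", after[1])
```

In this toy setup feature 3 overlaps with feature 1, so steering one also shifts the other. Real SAE features are not orthogonal either, which is one reason steering strength usually needs tuning.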

Artificial Analysis @ArtificialAnlys · April 29 · 63

IBM has released three new non-reasoning Granite 4.1 models (30B, 8B, 3B) as open weights under Apache 2.0. All three are notably token-efficient relative to peer non-reasoning models, with the 8B standing out for its token efficiency relative to intelligence.
@IBM has released three new instruct models in the Granite 4.1 family: Granite 4.1 30B (15 on the Intelligence Index), Granite 4.1 8B (12), and Granite 4.1 3B (9). The release continues IBM's focus on small, efficient, and open models for enterprise and edge deployment, alongside the existing Granite 4.0 Nano family (1B and 350M variants released in October 2025). The Intelligence Index is the Artificial Analysis synthesis metric incorporating 10 evaluations covering agentic tasks, coding, and scientific reasoning.
Key benchmarking results:
➤ All three Granite 4.1 models score 61 on the Artificial Analysis Openness Index, standing out among peer open weights non-reasoning models. This is driven by full open weights under Apache 2.0 plus partial disclosures across pre-training data, post-training data, and training methodology. Granite 4.1 sits well above peers like Qwen3.5 (39), Gemma 4 (39) and GLM-4.7-Flash (44), and represents a meaningful improvement over the Granite 4.0 family (56), driven by stronger methodology disclosure. Olmo 3.1 and K2 Think V2 (both 89) remain leaders as the most ‘open’ models.
➤ Granite 4.1 8B uses just 4M output tokens to run the Intelligence Index. This is ~20x fewer than Qwen3.5 9B (78M tokens), ~3x fewer than Ministral 3 8B (13M), and ~2x fewer than Gemma 4 E4B (8M). The pattern holds across the family: Granite 4.1 30B uses 4.6M output tokens (vs 7M for Gemma 4 31B and 25M for Qwen3.5 27B), and Granite 4.1 3B uses 2.7M.
➤ Token efficiency comes at the cost of intelligence relative to peer non-reasoning models. Granite 4.1 30B (15) trails leading peers like Qwen3.5 27B (37) and Gemma 4 31B (32). Granite 4.1 8B (12) trails Ministral 3 8B (15) and Gemma 4 E4B (15). Granite 4.1 3B (9) trails Gemma 4 E2B (12).
➤ Granite 4.1 30B and 3B both gain on the Intelligence Index over their Granite 4.0 predecessors. Granite 4.1 30B (15) gains 4 points over Granite 4.0 H Small (32B / 9B active, 11), with the largest gains in tool use (τ²-Bench: 42% vs 17%) and agentic tasks (GDPval-AA: 493 vs 344 Elo). Granite 4.1 3B (9) gains 1 point over Granite 4.0 Micro (8).
Other information:
➤ License: Apache 2.0 (open weights, permissive commercial use)
➤ Context window: 128K tokens
➤ Availability: Granite 4.1 8B is available via @WandB ($0.05/$0.1 per 1M input/output tokens) and @replicate. Weights for all three models are available via @huggingface.

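Summary: IBM released three Granite 4.1 open models (30B, 8B, 3B) under the Apache 2.0 license. Their defining trait is very high token efficiency; for example, the 8B model needs only 4M output tokens to run the Intelligence Index, far fewer than comparable models. All three score 61 on the Openness Index, ahead of most peers. The efficiency comes with a trade-off in Intelligence Index scores, which are lower than those of competitors such as Qwen3.5 and Gemma 4. Compared with the previous Granite 4.0 family, though, intelligence still improves. The models have a 128K-token context window, target enterprise and edge deployment, and are available via WandB, Replicate, and Hugging Face.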
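The "~20x / ~3x / ~2x fewer tokens" claims for the 8B model can be checked directly from the output-token counts quoted in the post. A small sketch, where the dictionary and the helper name `times_more_tokens` are just illustrative:

```python
# Output tokens (in millions) used to run the Intelligence Index, as quoted in the post.
OUTPUT_TOKENS_M = {
    "Granite 4.1 8B": 4,
    "Qwen3.5 9B": 78,
    "Ministral 3 8B": 13,
    "Gemma 4 E4B": 8,
}


def times_more_tokens(peer: str, baseline: str = "Granite 4.1 8B") -> float:
    """How many times more output tokens the peer used than the baseline."""
    return OUTPUT_TOKENS_M[peer] / OUTPUT_TOKENS_M[baseline]


for peer in ("Qwen3.5 9B", "Ministral 3 8B", "Gemma 4 E4B"):
    print(f"{peer}: {times_more_tokens(peer):.2f}x the tokens of Granite 4.1 8B")
```

The ratios come out to 19.5, 3.25, and 2.0, consistent with the rounded figures in the post.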

Tencent Hy @TencentHunyuan · April 29 · 67

We're open-sourcing Hy-MT1.5-1.8B-1.25bit, a 440MB translation model that runs fully offline on your phone, supports 33 languages, and outperforms Google Translate. At 1.8B parameters, it matches commercial translation APIs and 235B-scale models on standard benchmarks. By quantizing to 1.25-bit, memory drops from 3.3GB (FP16) to 440MB, which is 25% smaller and ~10% faster than prior 1.67-bit approaches, with no accuracy loss. Covers 33 languages, 5 dialects, and 1,056 translation directions including minority languages like Tibetan and Mongolian. Our translation model has won 30 first-place rankings in international MT competitions and is already deployed across multiple Tencent products. 🏆
📲 Demo APK (Android): https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk
🤗 Hugging Face: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit
🔗 GitHub: https://github.com/tencent/AngelSlim
📄 Paper: https://arxiv.org/abs/2601.07892

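Summary: Tencent has open-sourced the Hy-MT1.5-1.8B-1.25bit translation model. It has 1.8 billion parameters, is only 440MB after quantization, and runs fully offline on a phone. It supports 33 languages, 5 dialects, and 1,056 translation directions, including minority languages such as Tibetan and Mongolian. On standard benchmarks its performance is comparable to commercial translation APIs and 235B-parameter models. Quantizing to 1.25 bits sharply reduces memory from 3.3GB in FP16; compared with prior 1.67-bit approaches it is 25% smaller and about 10% faster, with no accuracy loss. The model has won 30 first-place rankings in international machine translation competitions and is deployed across multiple Tencent products.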
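The size figures can be sanity-checked with simple arithmetic. The sketch below assumes every parameter is stored at one uniform bit width with no overhead, which is a simplification and not how the post describes the file layout.

```python
# Back-of-the-envelope weight-storage arithmetic for the figures quoted in the post.
# Assumption: every parameter is stored at the same bit width, with no overhead.

N_PARAMS = 1.8e9  # 1.8B parameters, as stated in the post


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits_per_weight / 8


fp16_gib = weight_bytes(N_PARAMS, 16) / 2**30   # FP16 baseline, in GiB
q125_mb = weight_bytes(N_PARAMS, 1.25) / 1e6    # pure 1.25-bit weights, in MB
q167_mb = weight_bytes(N_PARAMS, 1.67) / 1e6    # pure 1.67-bit weights, in MB
shrink_vs_167 = 1 - q125_mb / q167_mb           # relative size reduction

print(f"FP16: {fp16_gib:.2f} GiB")
print(f"1.25-bit weights only: {q125_mb:.0f} MB")
print(f"1.67-bit weights only: {q167_mb:.0f} MB")
print(f"1.25-bit vs 1.67-bit: {shrink_vs_167:.0%} smaller")
```

Under that assumption FP16 comes to about 3.35 GiB, in line with the quoted 3.3GB, and 1.25-bit is about 25% smaller than 1.67-bit, matching the post. Pure 1.25-bit weights would take roughly 281 MB, less than the quoted 440MB. The post does not say what makes up the difference; layers kept at higher precision and file metadata are plausible, but that is a guess.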

SenseTime @SenseTime_AI · April 29 · 56

Thank you @liuziwei7 for co‑creating the future of #multimodal intelligence with us!


SenseTime @SenseTime_AI · April 29 · 65

𝗬𝗲𝘀, 𝗦𝗲𝗻𝘀𝗲𝗡𝗼𝘃𝗮 𝗨1 𝗶𝘀 𝗻𝗼𝘄 𝗮𝘃𝗮𝗶𝗹𝗮𝗯𝗹𝗲 𝗼𝗻 𝗛𝘂𝗴𝗴𝗶𝗻𝗴 𝗙𝗮𝗰𝗲 𝗮𝗻𝗱 𝗚𝗶𝘁𝗛𝘂𝗯! Discover how it enables complex #infographic creation with semantic precision and pixel‑level fidelity. Hugging Face: https://huggingface.co/collections/sensenova/sensenova-u1 GitHub: https://github.com/OpenSenseNova/SenseNova-U1 Discord: https://discord.gg/cxkwXWjp


AK @_akhaliq · April 29 · 44

SenseNova U1 is out on Hugging Face https://huggingface.co/collections/sensenova/sensenova-u1


AK @_akhaliq · April 29 · 57

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company paper: https://huggingface.co/papers/2604.22446

AK @_akhaliq · April 25 · 39

Context Unrolling in Omni Models paper: https://huggingface.co/papers/2604.21921


AK @_akhaliq · April 24 · 40

over 1.2 million AI apps on Hugging Face the biggest AI app store probably


AK @_akhaliq · April 23

OpenAI just released privacy-filter on Hugging Face: a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. model: https://huggingface.co/openai/privacy-filter
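To illustrate the "detection and masking" step, here is a minimal sketch that masks text given character spans from a token-classification model. It does not load privacy-filter itself. It assumes the spans look like the aggregated output of a Hugging Face token-classification pipeline (dicts with `entity_group`, `start`, `end`), and the label names `NAME` and `EMAIL` are hypothetical, not the model's documented label set.

```python
from typing import Dict, List


def mask_pii(text: str, spans: List[Dict]) -> str:
    """Replace each detected span with a bracketed label placeholder."""
    masked = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        placeholder = f"[{span['entity_group']}]"
        masked = masked[: span["start"]] + placeholder + masked[span["end"]:]
    return masked


# Hand-written spans standing in for model output (hypothetical labels).
text = "Contact Jane Doe at jane@example.com."
spans = [
    {"entity_group": "NAME", "start": 8, "end": 16},
    {"entity_group": "EMAIL", "start": 20, "end": 36},
]
print(mask_pii(text, spans))
```

Processing spans from the end of the string backwards avoids recomputing offsets when a placeholder is shorter or longer than the text it replaces.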

AK @_akhaliq · April 22

Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation paper: https://huggingface.co/papers/2604.18168


AK @_akhaliq · April 21 · 42

Kimi K2.6 is available in huggingchat


AK @_akhaliq · April 21

Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips paper: https://huggingface.co/papers/2502.07408
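For readers unfamiliar with the term, a sign-bit flip changes only the top bit of a floating-point weight, which negates the value while leaving its magnitude untouched. The sketch below shows that on a single IEEE 754 float32. It does not reproduce the paper's method for choosing which weights to flip, and the helper name `flip_sign_bit` is illustrative.

```python
import struct

SIGN_BIT_MASK = 0x80000000  # bit 31 of an IEEE 754 single-precision float


def flip_sign_bit(value: float) -> float:
    """Flip the sign bit of value's float32 representation."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits ^ SIGN_BIT_MASK))
    return flipped


weight = 0.75
print(weight, "->", flip_sign_bit(weight))
```

Because only one bit changes, applying the flip twice returns the original value.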

AK @_akhaliq · April 21 · 56

Kimi K2.6 is out on Hugging Face https://huggingface.co/moonshotai/Kimi-K2.6


Nathan Lambert @natolambert · April 17

New video! Talking through my 10+ open model pieces from early 2026 and how they fit together. They're all trying to figure out where open models go next. Mostly, 10min video form of the thread below.
00:00 Intro & Recap piece
02:57 High-level trends & capabilities
07:09 State of the ecosystem
08:21 Better American Models
10:10 Long-term strategy & control of AI
12:05 Conclusion

[Quoting @natolambert]: I spent some time trying to distill all the complex factors affecting open models (economics, capabilities, distribution, policy, and so on) into a clear list of beliefs. Here is the full thing. 1. Surprisingly, given the differences in compute for training and research, top closed models have not shown a growing capability lead over open models, especially from the second half of 2025 to now.

AK @_akhaliq · April 15

Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind paper: https://huggingface.co/papers/2604.11666

AK @_akhaliq · April 14 · 35

GLM-5.1 sunset racing game on Hugging Face is kind of fun to play app: https://huggingface.co/spaces/victor/sunset-racing-glm-5.1


AK @_akhaliq · April 14 · 48

WildDet3D: Scaling Promptable 3D Detection in the Wild paper: https://huggingface.co/papers/2604.08626

AK @_akhaliq · April 12

MiniMax-M2.7 is out on Hugging Face model: https://huggingface.co/MiniMaxAI/MiniMax-M2.7

Summary: MiniMax-M2.7 is now available on Hugging Face; the model can be obtained via the official repository link.

AK @_akhaliq · April 11

MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping paper: https://huggingface.co/papers/2604.08364

Summary: MegaStyle proposes building a diverse, scalable style dataset through consistent text-to-image style mapping. The paper is on Hugging Face (2604.08364).

AK @_akhaliq · April 11

HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents paper: https://huggingface.co/papers/2604.07430

Summary: HY-Embodied-0.5 has been released: embodied foundation models built for real-world agents. The paper is public on Hugging Face.

AK @_akhaliq · April 11

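Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability paper: https://huggingface.co/papers/2604.06628

Summary: The paper analyzes the generalization of reasoning SFT along three conditional dimensions (optimization process, data composition, and model capability), re-examining how supervised fine-tuning generalizes on reasoning tasks and which factors matter most.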

AK @_akhaliq · April 11

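SkillClaw: Let Skills Evolve Collectively with Agentic Evolver paper: https://huggingface.co/papers/2604.08377

Summary: SkillClaw proposes a framework based on an Agentic Evolver that lets skills evolve collectively and be optimized collaboratively within agent systems. The paper is on Hugging Face.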

AK @_akhaliq · April 10

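DMax: Aggressive Parallel Decoding for dLLMs paper: https://huggingface.co/papers/2604.08302

Summary: DMax proposes an aggressive parallel decoding scheme for diffusion language models (dLLMs) that moves beyond conventional sequential generation and significantly speeds up inference. The paper has been released.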

AK @_akhaliq · April 10

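FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling paper: https://huggingface.co/papers/2604.06916

Summary: The paper proposes a diffusion reinforcement learning method that samples in low-precision FP4 during the rollout exploration stage and trains in BF16, using this mixed-precision strategy to balance compute efficiency against training stability and to scale efficiently.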

AK @_akhaliq · April 10

MARS: Enabling Autoregressive Models Multi-Token Generation paper: https://huggingface.co/papers/2604.07023

Summary: MARS lets autoregressive models generate multiple tokens per step, removing the efficiency limit of token-by-token decoding. The paper is public.

AK @_akhaliq · April 10

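RAGEN-2: Reasoning Collapse in Agentic RL paper: https://huggingface.co/papers/2604.06268

Summary: The RAGEN-2 paper studies "reasoning collapse" in agentic reinforcement learning (Agentic RL), where an agent's reasoning ability degrades during training. The paper is on Hugging Face.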

AK @_akhaliq · April 10

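Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning paper: https://huggingface.co/papers/2604.04746

Summary: The paper proposes a process-driven image generation method that uses interleaved reasoning to imitate the stroke-by-stroke process of painting rather than generating pixels directly, producing image synthesis closer to how humans draw.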

AK @_akhaliq · April 10

Embarrassingly Simple Self-Distillation Improves Code Generation paper: https://huggingface.co/papers/2604.01193

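Summary: An "embarrassingly simple" self-distillation method effectively improves LLM code generation without complex architectures or extra data, and outperforms existing, more complex approaches. The paper is on Hugging Face Papers.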

AK @_akhaliq · April 9

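INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling paper: https://huggingface.co/papers/2604.07209

Summary: INSPATIO-WORLD performs real-time 4D world simulation through spatiotemporal autoregressive modeling, generating dynamic 3D environments in real time with support for interaction. The technical paper is on Hugging Face.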

AK @_akhaliq · April 9

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding paper: https://huggingface.co/papers/2604.05015

Summary: The Video-MME benchmark has released its v2, moving comprehensive video understanding evaluation to a new stage. The paper is on Hugging Face.

AK @_akhaliq · April 7

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale paper: https://huggingface.co/papers/2604.04771

Summary: MinerU2.5-Pro has been released, focused on pushing the limits of large-scale, data-driven document parsing. The paper is on Hugging Face.

AK @_akhaliq · April 7

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models paper: https://huggingface.co/papers/2604.04707

Summary: OpenWorldLib has been released, providing a unified codebase and a standardized definition for advanced world models. The paper is on Hugging Face.

AK @_akhaliq · April 7

gradio.Server: Any Custom Frontend with Gradio's Backend. Build entirely with your own frontend framework, like React, Svelte, or even plain HTML/JS, while still benefiting from Gradio's queuing system, API infrastructure, MCP support, and ZeroGPU on Spaces. blog: https://huggingface.co/blog/introducing-gradio-server

Summary: gradio.Server lets developers build apps with any frontend framework, such as React, Svelte, or plain HTML/JS, while fully retaining Gradio backend capabilities such as the queuing system, API infrastructure, MCP support, and ZeroGPU on Spaces.