Bounding the Black Box: A Statistical Certification Framework for AI Risk Regulation

2026-04-25 12:00 · 57 days ago · Natan Levy, Gadi Perl

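Why it was selected

The EU AI Act is already in force, yet nobody knows how "acceptable risk" should actually be quantified. This paper directly offers a black-box statistical verification tool. Teams working on compliance and safety should read it closely, although it is still some distance from engineering practice.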

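AI summary

AI systems already make decisions in high-risk domains such as loan approval, flagging people for criminal investigation, and autonomous-vehicle braking. Regulatory frameworks such as the EU AI Act require systems to demonstrate safety before deployment, but none defines a quantitative standard for "acceptable risk," and none offers a workable method for verifying that the standard is met. Drawing on the aviation certification paradigm, the study proposes a two-stage statistical certification framework. In the first stage, a competent authority explicitly sets an acceptable failure probability δ and an operational input domain ε. In the second stage, the RoMA and gRoMA statistical verification tools compute an auditable upper bound on the system's true failure rate without relying on the model's internal structure. The framework applies to black-box models of any architecture, shifts accountability upstream to developers, and connects with existing legal systems.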

Original text · untranslated

Computer Science > Artificial Intelligence

[Submitted on 23 Apr 2026]

Title: Bounding the Black Box: A Statistical Certification Framework for AI Risk Regulation

Authors: Natan Levy, Gadi Perl


Abstract: Artificial intelligence now decides who receives a loan, who is flagged for criminal investigation, and whether an autonomous vehicle brakes in time. Governments have responded: the EU AI Act, the NIST Risk Management Framework, and the Council of Europe Convention all demand that high-risk systems demonstrate safety before deployment. Yet beneath this regulatory consensus lies a critical vacuum: none specifies what "acceptable risk" means in quantitative terms, and none provides a technical method for verifying that a deployed system actually meets such a threshold. The regulatory architecture is in place; the verification instrument is not.
This gap is not theoretical. As the EU AI Act moves into full enforcement, developers face mandatory conformity assessments without established methodologies for producing quantitative safety evidence, and the systems most in need of oversight are opaque statistical inference engines that resist white-box scrutiny.
This paper provides the missing instrument. Drawing on the aviation certification paradigm, we propose a two-stage framework that transforms AI risk regulation into engineering practice. In Stage One, a competent authority formally fixes an acceptable failure probability $\delta$ and an operational input domain $\varepsilon$, a normative act with direct civil liability implications. In Stage Two, the RoMA and gRoMA statistical verification tools compute a definitive, auditable upper bound on the system's true failure rate, requiring no access to model internals and scaling to arbitrary architectures. We demonstrate how this certificate satisfies existing regulatory obligations, shifts accountability upstream to developers, and integrates with the legal frameworks that exist today.
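
To make the two-stage idea concrete, the following is a minimal sketch of how an upper bound on a failure rate can be obtained from black-box queries alone. It is not RoMA or gRoMA, whose internals the abstract does not describe; it swaps in a generic one-sided Hoeffding bound as a stand-in. The names `failure_rate_upper_bound`, `certify`, `sample_input`, `is_failure`, and the confidence parameter `alpha` are all assumptions introduced for illustration, as is the toy model in the usage section. Only `delta` (the acceptable failure probability) and the notion of an operational input domain come from the abstract.

```python
import math
import random
from typing import Any, Callable, Tuple


def failure_rate_upper_bound(failures: int, trials: int, alpha: float) -> float:
    """One-sided Hoeffding bound on a failure probability.

    With probability at least 1 - alpha over the random sample, the true
    failure rate is no larger than the returned value.
    """
    if trials <= 0 or not 0 <= failures <= trials:
        raise ValueError("need trials > 0 and 0 <= failures <= trials")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    empirical = failures / trials
    margin = math.sqrt(math.log(1.0 / alpha) / (2.0 * trials))
    return min(1.0, empirical + margin)


def certify(
    model: Callable[[Any], Any],
    sample_input: Callable[[random.Random], Any],
    is_failure: Callable[[Any, Any], bool],
    delta: float,
    trials: int,
    alpha: float,
    seed: int = 0,
) -> Tuple[float, bool]:
    """Query the model as a black box and compare the bound against delta.

    sample_input (hypothetical) draws i.i.d. inputs from the operational
    input domain; is_failure (hypothetical) decides whether an output
    counts as a failure. Returns (upper_bound, upper_bound <= delta).
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        x = sample_input(rng)
        if is_failure(x, model(x)):
            failures += 1
    bound = failure_rate_upper_bound(failures, trials, alpha)
    return bound, bound <= delta


if __name__ == "__main__":
    # Toy setup (assumption): inputs uniform on [0, 1); the model's decision
    # threshold is slightly off, so it errs on roughly 1% of inputs.
    bound, ok = certify(
        model=lambda x: x >= 0.51,
        sample_input=lambda rng: rng.random(),
        is_failure=lambda x, y: y != (x >= 0.5),
        delta=0.05,
        trials=20_000,
        alpha=0.01,
    )
    print(f"upper bound on failure rate: {bound:.4f}, certified: {ok}")
```

Under this stand-in, the bound can only fall below `delta` when `trials` is at least `ln(1/alpha) / (2 * delta**2)`, which is about 23,000 samples for `delta = 0.01` and `alpha = 0.01`. Hoeffding's inequality is deliberately conservative; an exact binomial bound would be tighter, and a purpose-built tool may use different assumptions entirely.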

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2604.21854 [cs.AI]
	(or arXiv:2604.21854v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2604.21854 arXiv-issued DOI via DataCite

Submission history

From: Natan Levy
[v1] Thu, 23 Apr 2026 16:50:35 UTC (96 KB)


Safety/Alignment · Policy/Regulation · Papers/Research

arXiv: cs.AI (all categories)

Featured: 62