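Guide Labs has launched Clarity, the first inherently interpretable AI platform, aimed at the "black box" problem of models. Clarity splits generated text into chunks; clicking a chunk shows the concepts the model used to generate it (such as "marine life," "African wildlife," and "computer science"). It can also link a generated chunk to similar chunks of training data, which makes errors easier to diagnose. A new concept-steering control layer lets users directly amplify or suppress specific concepts without rewriting the prompt or retraining the model.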
This is brilliant.
The first inherently interpretable AI platform just launched: "Clarity" by Guide Labs.
It attacks the "black box" problem of AI.
The model generates text in chunks. You can click a chunk and see what concepts the model used to generate it.
With normal LLMs, if the model gives a wrong or biased answer, you mostly have to guess which words to change in the prompt.
Clarity changes that by trying to show the concepts the model is using while generating the answer, such as "marine life," "African wildlife," "computer science," or "male role descriptions."
That is, you are not only seeing the final answer; you are also seeing some of the hidden ingredients that pushed the model toward that answer.
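To make the "click a chunk, see its concepts" idea concrete, here is a toy sketch of what per-chunk concept attribution could look like as data. This is not Clarity's API: the `Chunk` class, the `top_concepts` helper, the example sentence, and every score below are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One generated span of text plus the concepts attributed to it."""
    text: str
    # Hypothetical mapping of concept name -> contribution score.
    # These numbers are made up; a real system would compute them.
    concepts: dict[str, float] = field(default_factory=dict)


def top_concepts(chunk: Chunk, k: int = 2) -> list[str]:
    """Return the k concept names with the largest scores, highest first."""
    ranked = sorted(chunk.concepts.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# A made-up answer split into two chunks, each with made-up concept scores.
answer = [
    Chunk("Dolphins are mammals",
          {"marine life": 0.8, "computer science": 0.1}),
    Chunk("that hunt using echolocation.",
          {"marine life": 0.6, "African wildlife": 0.2}),
]

# "Clicking" a chunk amounts to looking up its highest-scoring concepts.
for chunk in answer:
    print(chunk.text, "->", top_concepts(chunk))
```

The point of the sketch is only the shape of the output: each chunk carries its own ranked list of concepts, so a wrong or biased chunk can be traced to a concept instead of guessed at through prompt edits.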