10:36.432
I'm not a doomer. I'm a doer. My team and I are working on a technical solution. We call it Scientist AI. It's modeled after a selfless, ideal scientist who is only trying to understand the world, without agency, unlike current AI systems, which are trained to imitate us or please us, and which give rise to these untrustworthy agentic behaviors. So what could we do with this? Well, one important question is that we might need agentic AIs in the future. So how could the Scientist AI, which is not agentic, fit the bill? Well, here's the good news: the Scientist AI could be used as a guardrail against the bad actions of an untrusted AI agent. And it works because, in order to predict that an action could be dangerous, you don't need to be an agent; you just need to make good, trustworthy predictions. In addition, the Scientist AI, by the nature of its design, could help us accelerate scientific research for the betterment of humanity. We need many more of these scientific projects to explore solutions to the AI safety challenges, and we need to do it quickly.