Frontier risk and preparedness


As part of our mission of building safe AGI, we take seriously the full spectrum of safety risks related to AI, from the systems we have today to the furthest reaches of superintelligence. In July, we joined other leading AI labs in making a set of voluntary commitments to promote safety, security and trust in AI. These commitments encompassed a range of risk areas, centrally including the frontier risks that are the focus of the UK AI Safety Summit. As part of our contributions to the Summit, we have detailed our progress on frontier AI safety, including work within the scope of our voluntary commitments.

## Our approach to preparedness

We believe that frontier AI models, which will exceed the capabilities currently present in the most advanced existing models, have the potential to benefit all of humanity. But they also pose increasingly severe risks. Managing the catastrophic risks from frontier AI will require answering questions like:

We need to ensure we have the understanding and infrastructure needed for the safety of highly capable AI systems.

## Our new Preparedness team

To minimize these risks as AI models continue to improve, we are building a new team called Preparedness. Led by Aleksander Madry, the Preparedness team will tightly connect capability assessment, evaluations, and internal red teaming for frontier models, from the models we develop in the near future to those with AGI-level capabilities. The team will help track, evaluate, forecast and protect against catastrophic risks spanning multiple categories including:

The Preparedness team mission also includes developing and maintaining a Risk-Informed Development Policy (RDP). Our RDP will detail our approach to developing rigorous frontier model capability evaluations and monitoring, creating a spectrum of protective actions, and establishing a governance structure for accountability and oversight across that development process. The RDP is meant to complement and extend our existing risk mitigation work, which contributes to the safety and alignment of new, highly capable systems, both before and after deployment.

Interested in working on Preparedness? We are recruiting exceptional talent from diverse technical backgrounds to our Preparedness team to push the boundaries of our frontier AI models.

## Preparedness challenge

To identify less obvious areas of concern (and build the team!), we are also launching our AI Preparedness Challenge for catastrophic misuse prevention. We will offer $25,000 each in API credits to up to 10 top submissions, publish novel ideas and entries, and look for candidates for Preparedness from among the top contenders in this challenge.

Update: This challenge is now completed. See below for information about the submissions and key learnings.

## Preparedness challenge winners

As part of our ‘unknown unknowns’ work stream from the Preparedness Framework, the Preparedness Team offered $25K each in API credits for the ten best submissions to the Preparedness Challenge. These submissions aimed to identify unique, but still plausible, risk areas for frontier AI. We received hundreds of submissions in half a dozen languages and are excited to announce our ten winners below. This exercise helped us surface new types of risk, so that we can improve our preemptive testing and mitigation strategy.

We reviewed and graded each submission by assessing technical rigor, uniqueness, scale of potential damage caused, and clarity. The top ten submissions, some of which are listed below, combined thoughtful ideas with proofs of concept, and highlighted the advantages of their approach over an approach that did not utilize AI-related tools.¹

While grading the challenge, we noticed similarities in the topics that entrants identified as key threats. Roughly 70% of entrants emphasized the potential for OpenAI’s models to enhance malicious actors’ persuasive capabilities. These entrants detailed threat models that included online radicalization, polarization, and political influence. We are currently conducting studies on AI’s impact on persuasiveness, and look forward to sharing more information with the community soon. Thank you to everyone who participated in the challenge. There were many excellent submissions.

1. To avoid information hazards, we have kept descriptions of projects intentionally vague, and will not be publishing full proposals. Additionally, some participants did not wish for their names to be shared.

OpenAI Preparedness Team

Originally published on OpenAI News.