gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are two open-weight reasoning models, post-trained from the gpt-oss models to reason from a provided policy and label content under that policy. In this report, we describe gpt-oss-safeguard’s capabilities and provide our baseline safety evaluations of the gpt-oss-safeguard models, using the underlying gpt-oss models as a baseline. For more information about the development and architecture of the underlying gpt-oss models, see the original gpt-oss model card.
Originally published on OpenAI News.