New and improved content moderation tooling


To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content, an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.
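As a rough sketch of what a single call to the endpoint looks like, the snippet below constructs the HTTP request using only the Python standard library. The endpoint URL and JSON body follow the public API documentation; the helper name and the separation of "build" from "send" are illustrative choices, not part of the API itself.

```python
import json
import urllib.request

MODERATION_URL = "https://api.openai.com/v1/moderations"

def build_moderation_request(text: str, api_key: str) -> urllib.request.Request:
    """Construct (but do not send) an HTTP request for the Moderation endpoint."""
    payload = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# Sending the request requires a valid API key:
# with urllib.request.urlopen(build_moderation_request("some text", key)) as resp:
#     result = json.load(resp)
```

Splitting request construction from transmission keeps the example runnable offline and makes the payload easy to inspect or test.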

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm, all of which is prohibited by our content policy. The endpoint has been trained to be quick, accurate, and robust across a range of applications. Importantly, this reduces the chances of products "saying" the wrong thing, even when deployed to users at scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.
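The endpoint returns a per-category verdict alongside confidence scores. The snippet below shows one way an application might interpret that response; the sample response is hardcoded and illustrative (its scores are made up), though the field names mirror the documented response shape.

```python
# Illustrative response in the documented shape: an overall "flagged"
# boolean, per-category booleans, and per-category confidence scores.
SAMPLE_RESPONSE = {
    "id": "modr-example",
    "results": [
        {
            "flagged": True,
            "categories": {
                "sexual": False,
                "hate": False,
                "violence": True,
                "self-harm": False,
            },
            "category_scores": {
                "sexual": 0.01,
                "hate": 0.02,
                "violence": 0.91,
                "self-harm": 0.00,
            },
        }
    ],
}

def violated_categories(response: dict) -> list[str]:
    """Return the names of the categories the classifier flagged."""
    result = response["results"][0]
    return [name for name, hit in result["categories"].items() if hit]

print(violated_categories(SAMPLE_RESPONSE))  # ['violence'] for the sample above
```

An application would typically branch on the top-level `flagged` field for a simple allow/block decision, and inspect the per-category results when different categories warrant different handling.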

The Moderation endpoint helps developers benefit from our infrastructure investments. Rather than build and maintain their own classifiers (an extensive process, as we document in our paper), they can instead access accurate classifiers through a single API call.

As part of OpenAI’s commitment to making the AI ecosystem safer, we are providing this endpoint to allow free moderation of all OpenAI API-generated content. For instance, Inworld, an OpenAI API customer, uses the Moderation endpoint to help their AI-based virtual characters remain appropriate for their audiences. By leveraging OpenAI’s technology, Inworld can focus on their core product: creating memorable characters. We currently do not support monitoring of third-party traffic.

Get started with the Moderation endpoint by checking out the documentation. More details of the training process and model performance are available in our paper. We have also released an evaluation dataset, featuring Common Crawl data labeled within these categories, which we hope will spur further research in this area.

* View documentation

Todor Markov, Chong Zhang, Sandhini Agarwal, Tyna Eloundou, Teddy Lee, Steven Adler, Angela Jiang, Lilian Weng


Originally published on OpenAI News.