gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are two open-weight reasoning models post-trained from the gpt-oss models and trained to reason from a provided policy in order to label content under that policy. In this report, we describe gpt-oss-safeguard's capabilities and provide our baseline safety evaluations of the gpt-oss-safeguard models, using the underlying gpt-oss models as a baseline. For more information about the development and architecture of the underlying gpt-oss models, see the original gpt-oss model card.

