Source linked

Немотрон 3.5 Корабли настраивают политику для мультимодальной безопасности

Единая 4B-модель, которая понимает текст, изображения и ответы помощников на более чем 140 языках при соблюдении политики безопасности для каждого развертывания.

nvidianemotron 3 5 content safetygemma 3multimodal aicontent safetyenterprise ai

NVIDIA's Nemotron 3.5 Content Safety model does something most guard systems can't: it accepts a custom policy specification alongside the input and reasons over that policy to produce a verdict. That architectural choice—built on a fine-tuned Gemma 3 4B IT base with 128K context window—moves content safety from a one-size-fits-all classifier to a configurable auditor that enterprise teams can actually trust.

Unified Multimodal Evaluation Closes a Gap

Nemotron 3.5 takes a user prompt, an optional image, and an optional assistant response as a single context window and returns a coherent safety verdict over the combined input. Scoring all three together catches policy violations that only emerge from the interaction between text and image, or between request and response. Previous multimodal safety stacks scored each modality independently and missed those cross-modal attacks.

The model maintains the 12-language explicit training coverage of its predecessors—English, French, Spanish, German, Chinese, Japanese, Korean, Arabic, Hindi, Russian, Portuguese, Italian—and inherits strong zero-shot generalization across roughly 140 languages from the Gemma 3 base. Deployments in markets with sparse training data (Southeast Asian languages, Scandinavian languages, less-resourced African languages) benefit without separate fine-tuning.

Custom Policy Enforcement Is the Real Story

Production deployments never share a single safety taxonomy. A healthcare chatbot's risk profile differs from a financial services assistant, a developer IDE, or a children's education app. Nemotron 3.5 accepts a custom policy specification alongside the input. The model reasons over that policy when producing its verdict rather than deferring to the built-in taxonomy. This extends the reasoning capability first introduced in the Nemotron Content Safety Reasoning 4B to the full multimodal, multilingual setting.

Every safety verdict can be accompanied by an auditable reasoning trace via an optional THINK mode. The model outputs step-by-step reasoning before delivering a safe/unsafe label and, optionally, the violated categories from the Aegis 2.0 taxonomy (13 core categories plus 10 fine-grained subcategories). When latency is the constraint, THINK mode can be disabled to return to the same low-latency binary verdict.

What the Release Includes

NVIDIA is also releasing the training and evaluation dataset behind Nemotron 3.5—a rarity for open safety models, especially in the multimodal space where images and videos often carry restrictive licensing. The Nemotron 3.5 Content Safety Dataset is multimodal, multilingual, and includes the safety reasoning traces used to train the model. The LoRA adapter keeps the model compact enough for real-time deployment on 8GB+ VRAM GPUs.

Three inference modes are available: Mode 1 for low-latency binary verdict, Mode 2 adds violated categories, Mode 3 outputs the full reasoning trace. That flexibility lets teams optimise for latency or auditability depending on the endpoint. For enterprises running regulated AI services, the ability to point to a reasoning trace that explains why a response was blocked is the difference between a deployable guardrail and a black box.


Source: Nemotron 3.5 Content Safety: Customizable Multimodal Safety for Global Enterprise AI
Domain: huggingface.co

Read original source ->

External source stays available while the OJO article and comment thread stay local.

Comments load interactively on the live page.