Sharon_Barnes
ServiceNow Employee

Configuring Now Assist Guardian: Offensive Content & Prompt Injection Protection

If you're deploying Now Assist in your organization, you've probably asked yourself: What happens when someone tries to misuse it? Now Assist Guardian is your answer. It's a built-in layer of protection that lets you detect and block offensive content and prompt injection attacks before they ever reach your users or your data.
Now Assist Guardian is a set of guardrails that sit between user inputs/AI outputs and the rest of your ServiceNow environment. It covers two main threat categories: Offensive Content Detection, which flags or blocks LLM responses that contain harmful, inappropriate, or abusive language, and Prompt Injection Protection, which defends against malicious inputs designed to manipulate the AI into doing something it shouldn't. Think of it as a security policy layer for your AI interactions—the same way you'd configure content filtering on a firewall, but purpose-built for LLM behavior.
In this post, we'll walk through exactly how to configure it, explain what each setting actually does, and help you decide which options make sense for your environment.
 

Family ReleaseAustralia

Release: Now Assist Suite 28.7.12
Roles Required: admin
 

00:00: This video demonstrates how to configure Now Assist Guardian to detect and manage

00:04: offensive content and prompt injection threats.

00:08: It covers activating settings, selecting severity levels, and managing guardrail

00:13: service providers for enhanced security.

00:18: Hello and welcome. Today

00:20: we are looking at configuring Now Assist Guardian.

00:25: And our first step is to go to Admin, Now

00:27: Assist Admin.

00:30: Once we're in the console, we're going to go to Settings, and under Settings,

00:34: expand the Now Assist Guardian.

00:40: Inside of Now Assist Guardian, we have Offensiveness.

00:43: And currently you see we have no active detections of offensiveness.

00:48: So let's go see what's available for us and activate offensiveness for now

00:52: assist ITSM.

00:57: Inside of here you will see you have options for either simply logging the output or

01:02: blocking the response as well as logging the output.

01:06: Do note that if you block the response, it does increase latency for your LLMs.

01:12: You can also select the severity of content detection.

01:17: Do note that the higher severity means the system only targets to log and block the

01:21: responses that are highly severe, whereas the low setting means it will detect even

01:26: the mildest of offensive content.

01:37: So let's go for low and log the output and save and activate.

01:42: This will then take us back to our Now

01:44: Assist Guardian Offensiveness page, and we can see that it is active for ITSM.

01:50: Next,

01:52: Let's take a look at prompt injection.

01:56: Now prompt injection protects you against attacks from malicious prompts.

02:02: So we're going to start by activating this.

02:06: And once active, let's also go ahead and select to block the response and log

02:11: the output. Do note that by blocking it,

02:13: there is a potential increase in latency for the LLM.

02:18: And then for our severity, we could select either the high severity, which means that

02:23: the system targets certain high certainty attacks, or low severity, where it's going

02:28: to detect even the most subtle of injections. So I'm going to go for low.

02:37: Block the response and log the output.

02:39: Then click Save.

02:43: And we get our confirmation message that it has been saved.

02:47: Next, let's go into Guardrail Service Providers.

02:52: Here, you can manage the Guardrail Service Providers specifically for Now Assist

02:56: Guardian.

02:58: The out of box default is ServiceNow

03:02: Guardrail. You may select any of the other three P options.

03:06: Or go into the AI control tower to configure and set up your own provider.

03:11: This includes bringing your own key to create Guardian guardrails.

03:17: Thank you for joining us today.

03:22: Learn to activate and customize

03:25: Now Assist Guardian's offensiveness detection,

03:27: prompt injection protection, and guardrail service provider settings for improved

03:32: content security.

 

Accessing Now Assist Guardian Settings

  1. Navigate to the Now Assist Admin.
  2. Select the Settings tab.
  3. Expand the Now Assist Guardian section.
  4. Locate the two primary sections: Offensiveness and Prompt Injection.
    2026-03-25_15-17-21.png


Configuring Offensiveness Detection

 

Understanding Offensiveness Detection

This setting monitors AI-generated responses and either logs or blocks any content that's flagged as offensive. Out of the box, there are no active detections—you need to enable it per Now Assist application (like ITSM).
2026-03-25_15-24-35.png


Choosing Between Log and Block Modes

Log Only Mode
  1. Enable this mode to record offensive content events without blocking responses.
  2. Use this setting to establish a baseline of what content is being flagged in your environment.
  3. Review logs regularly to understand the frequency and nature of offensive content detections.
Block and Log Mode
  1. Enable this mode to prevent offensive responses from reaching users while maintaining event records.
  2. Consider the latency impact before implementing this mode in production environments.
  3. Start with log-only mode first to assess before switching to blocking.

Selecting the Appropriate Severity Level (low, medium, high)

Understanding Severity Levels
The severity setting controls detection sensitivity, but works counterintuitively:
  1. Select High severity to catch only the most extreme offensive content (more permissive in practice).
  2. Select Low severity to detect even mild offensive language (more restrictive, with higher false positive potential).
Recommended Initial Configuration
  1. Start with Low severity combined with Log only mode for most enterprise deployments.
  2. Monitor the logs to understand what content is being flagged in your specific environment.
  3. Adjust the severity level based on actual log data and organizational requirements.

Configuring Prompt Injection Protection

 

Understanding Prompt Injection Threats

Prompt injection occurs when malicious actors embed instructions inside a prompt to trick the AI into ignoring its guidelines, leaking data, or taking unintended actions. This represents one of the most common attack vectors against LLM-based systems.
2026-03-25_15-25-38.png


Enabling Detection and Response

Log Only Mode
  1. Enable this mode to observe potential prompt injection attempts without blocking them.
  2. Use this setting in trusted environments or during initial assessment periods.
  3. Review detection logs to understand the types of injection attempts your system encounters.
Block and Log Mode
  1. Enable this mode to actively prevent prompt injection attacks while maintaining audit records.
  2. Factor in the additional evaluation latency when calculating SLA expectations for high-volume use cases.
  3. Implement this mode for customer-facing applications or environments where data security is critical.

Configuring Severity Thresholds

  1. Select High to flag only high-certainty attacks (may miss subtle injection attempts).
  2. Select Low to detect even subtle manipulation attempts (more protective but potentially noisier).

Managing Guardrail Service Providers

 

Understanding Provider Options

By default, Guardian uses ServiceNow Guardrail as the out-of-the-box provider, but you have additional flexibility. 
2026-03-25_15-26-24.png

Selecting Alternative Providers

  1. Access the Guardrail Service Providers section within Now Assist Guardian settings.
  2. Choose from three additional pre-built provider options available in the settings UI.
  3. Navigate to AI Control Tower to configure custom provider settings if needed.

Recommended Starting Configurations

 

Initial Setup and Audit Mode

  1. Configure Low severity for both Offensiveness and Prompt Injection detections.
  2. Enable Log only mode for both detection types.
  3. Monitor system behavior and review logs before implementing blocking policies.

Production User-Facing Deployments

  1. Set Prompt Injection to Low severity with Block and Log mode enabled. 
  2. Configure Offensiveness to Log only mode until baseline review is complete.
  3. Accept the latency trade-off as necessary protection for public-facing systems.

High-Security Environments

  1. Enable Low severity for both Offensiveness and Prompt Injection detections.
  2. Activate Block and Log mode for both detection types.
  3. Accept additional latency as the cost of comprehensive protection.
  4. Establish regular review processes for blocked content and injection attempts.

Conclusion

 

Now Assist Guardian gives you real control over how your AI behaves and the ability to respond quickly when something looks off. Setting it up takes a few minutes, but the peace of mind is worth it. By carefully configuring offensiveness detection and prompt injection protection with the appropriate severity levels and response modes, you can balance security requirements with system performance.
Start with log-only modes to understand your environment's specific patterns, then gradually move toward blocking as you gain confidence in your configuration. Remember that the counterintuitive severity levels mean Low severity provides the most comprehensive protection, while High severity is more permissive. Whether you're in audit mode, running production deployments, or managing high-security environments, these guardrails provide essential protection for your Now Assist implementation.