Configuring Now Assist Guardian: Offensive Content & Prompt Injection Protection
Family Release: Australia
00:00: This video demonstrates how to configure Now Assist Guardian to detect and manage
00:04: offensive content and prompt injection threats.
00:08: It covers activating settings, selecting severity levels, and managing guardrail
00:13: service providers for enhanced security.
00:18: Hello and welcome. Today
00:20: we are looking at configuring Now Assist Guardian.
00:25: And our first step is to go to Admin, Now
00:27: Assist Admin.
00:30: Once we're in the console, we're going to go to Settings, and under Settings,
00:34: expand the Now Assist Guardian.
00:40: Inside of Now Assist Guardian, we have Offensiveness.
00:43: And currently you see we have no active detections of offensiveness.
00:48: So let's go see what's available for us and activate offensiveness for Now
00:52: Assist ITSM.
00:57: Inside of here you will see you have options for either simply logging the output or
01:02: blocking the response as well as logging the output.
01:06: Do note that if you block the response, it does increase latency for your LLMs.
01:12: You can also select the severity of content detection.
01:17: Do note that the high severity setting means the system only logs and blocks the
01:21: responses that are highly severe, whereas the low setting means it will detect even
01:26: the mildest of offensive content.
01:37: So let's go for low and log the output and save and activate.
01:42: This will then take us back to our Now
01:44: Assist Guardian Offensiveness page, and we can see that it is active for ITSM.
01:50: Next,
01:52: let's take a look at prompt injection.
01:56: Now, prompt injection protection guards you against attacks from malicious prompts.
02:02: So we're going to start by activating this.
02:06: And once active, let's also go ahead and select to block the response and log
02:11: the output. Do note that by blocking it,
02:13: there is a potential increase in latency for the LLM.
02:18: And then for our severity, we could select either the high severity, which means that
02:23: the system targets only high-certainty attacks, or low severity, where it's going
02:28: to detect even the most subtle of injections. So I'm going to go for low.
02:37: Block the response and log the output.
02:39: Then click Save.
02:43: And we get our confirmation message that it has been saved.
02:47: Next, let's go into Guardrail Service Providers.
02:52: Here, you can manage the Guardrail Service Providers specifically for Now Assist
02:56: Guardian.
02:58: The out-of-box default is ServiceNow
03:02: Guardrail. You may select any of the other 3P (third-party) options,
03:06: or go into the AI Control Tower to configure and set up your own provider.
03:11: This includes bringing your own key to create Guardian guardrails.
03:17: Thank you for joining us today.
03:22: Learn to activate and customize
03:25: Now Assist Guardian's offensiveness detection,
03:27: prompt injection protection, and guardrail service provider settings for improved
03:32: content security.
Accessing Now Assist Guardian Settings
- Navigate to the Now Assist Admin.
- Select the Settings tab.
- Expand the Now Assist Guardian section.
- Locate the two primary sections: Offensiveness and Prompt Injection.
Configuring Offensiveness Detection
Understanding Offensiveness Detection
Choosing Between Log and Block Modes
- Enable Log only mode to record offensive content events without blocking responses.
- Use Log only mode to establish a baseline of what content is being flagged in your environment.
- Review logs regularly to understand the frequency and nature of offensive content detections.
- Enable Block and Log mode to prevent offensive responses from reaching users while maintaining event records.
- Consider the latency impact before implementing Block and Log mode in production environments.
- Start with Log only mode to assess what is detected before switching to blocking.
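The difference between the two modes can be sketched in a few lines of Python. This is an illustrative model only, not how Now Assist Guardian is implemented: the `Mode` enum, the `GuardrailResult` class, and the `apply_guardrail` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Mode(Enum):
    LOG_ONLY = "log_only"
    BLOCK_AND_LOG = "block_and_log"


@dataclass
class GuardrailResult:
    delivered_text: Optional[str]  # None means the response was blocked
    events: List[str] = field(default_factory=list)


def apply_guardrail(response: str, flagged: bool, mode: Mode) -> GuardrailResult:
    """Hypothetical model: both modes log a detection, only one withholds the text."""
    result = GuardrailResult(delivered_text=response)
    if not flagged:
        return result  # nothing detected, nothing logged
    result.events.append("offensive content detected")
    if mode is Mode.BLOCK_AND_LOG:
        result.delivered_text = None  # the user never sees the flagged response
    return result
```

In this model, a flagged response still reaches the user in Log only mode, which is why that mode suits baseline assessment rather than protection.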
Selecting the Appropriate Severity Level (low, medium, high)
- Select High severity to catch only the most extreme offensive content (more permissive in practice).
- Select Low severity to detect even mild offensive language (more restrictive, with higher false positive potential).
- Start with Low severity combined with Log only mode for most enterprise deployments.
- Monitor the logs to understand what content is being flagged in your specific environment.
- Adjust the severity level based on actual log data and organizational requirements.
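One way to picture the severity setting is as a cut-off on a detector score: Low sets a low bar, so even mild content is flagged, while High sets a high bar, so only extreme content is flagged. The sketch below assumes such a score exists; the score values and thresholds are invented for illustration, and the source does not describe how Now Assist Guardian evaluates content internally.

```python
# Hypothetical cut-offs: a detection fires when score >= threshold.
THRESHOLDS = {"low": 0.2, "medium": 0.5, "high": 0.8}


def is_flagged(score: float, severity: str) -> bool:
    """Return True when the (made-up) detector score meets the chosen cut-off."""
    return score >= THRESHOLDS[severity]


# Made-up scores for three example responses.
samples = {"mild": 0.3, "moderate": 0.55, "extreme": 0.9}

for severity in ("low", "medium", "high"):
    flagged = [name for name, score in samples.items() if is_flagged(score, severity)]
    print(severity, flagged)
# low ['mild', 'moderate', 'extreme']
# medium ['moderate', 'extreme']
# high ['extreme']
```

The Low setting flags the most items, which matches its description as the more restrictive option with higher false positive potential.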
Configuring Prompt Injection Protection
Understanding Prompt Injection Threats
Enabling Detection and Response
- Enable Log only mode to observe potential prompt injection attempts without blocking them.
- Use Log only mode in trusted environments or during initial assessment periods.
- Review detection logs to understand the types of injection attempts your system encounters.
- Enable Block and Log mode to actively prevent prompt injection attacks while maintaining audit records.
- Factor in the additional evaluation latency when calculating SLA expectations for high-volume use cases.
- Implement Block and Log mode for customer-facing applications or environments where data security is critical.
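The latency point above comes down to simple budget arithmetic. The sketch below assumes the guardrail check adds to user-facing response time only when blocking is enabled, which is how the video describes it. All numbers are placeholders, not measurements from any ServiceNow instance, and the function names are hypothetical.

```python
def user_facing_latency_ms(llm_ms: float, guardrail_ms: float, blocking: bool) -> float:
    """Assumption: the guardrail check delays the response only when blocking is on."""
    return llm_ms + (guardrail_ms if blocking else 0.0)


def within_sla(latency_ms: float, sla_ms: float) -> bool:
    return latency_ms <= sla_ms


# Placeholder values; replace with numbers measured in your own environment.
LLM_MS = 1800.0
GUARDRAIL_MS = 250.0
SLA_MS = 2000.0

for blocking in (False, True):
    latency = user_facing_latency_ms(LLM_MS, GUARDRAIL_MS, blocking)
    print(f"blocking={blocking}: {latency:.0f} ms, within SLA: {within_sla(latency, SLA_MS)}")
# blocking=False: 1800 ms, within SLA: True
# blocking=True: 2050 ms, within SLA: False
```

With these placeholder figures, turning on blocking pushes the response past the target, so either the target or the use case would need revisiting before enabling Block and Log mode.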
Configuring Severity Thresholds
- Select High to flag only high-certainty attacks (may miss subtle injection attempts).
- Select Low to detect even subtle manipulation attempts (more protective but potentially noisier).
Managing Guardrail Service Providers
Understanding Provider Options
Selecting Alternative Providers
- Access the Guardrail Service Providers section within Now Assist Guardian settings.
- Choose from the other third-party (3P) provider options available in the settings UI.
- Navigate to AI Control Tower to configure custom provider settings if needed.
Recommended Starting Configurations
Initial Setup and Audit Mode
- Configure Low severity for both Offensiveness and Prompt Injection detections.
- Enable Log only mode for both detection types.
- Monitor system behavior and review logs before implementing blocking policies.
Production User-Facing Deployments
- Set Prompt Injection to Low severity with Block and Log mode enabled.
- Configure Offensiveness to Log only mode until baseline review is complete.
- Accept the latency trade-off as necessary protection for public-facing systems.
High-Security Environments
- Enable Low severity for both Offensiveness and Prompt Injection detections.
- Activate Block and Log mode for both detection types.
- Accept additional latency as the cost of comprehensive protection.
- Establish regular review processes for blocked content and injection attempts.
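For planning or change-review purposes, the three starting configurations above can be written down as plain data. This is a documentation aid only, not a ServiceNow API or import format: the `DetectionSetting` class, the profile names, and the `blocking_detections` helper are hypothetical. A severity of `None` marks a value the list above does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DetectionSetting:
    severity: Optional[str]  # "low" or "high"; None where the guidance gives no value
    mode: str                # "log_only" or "block_and_log"


PROFILES: Dict[str, Dict[str, DetectionSetting]] = {
    "initial_audit": {
        "offensiveness": DetectionSetting("low", "log_only"),
        "prompt_injection": DetectionSetting("low", "log_only"),
    },
    "production_user_facing": {
        # Severity for Offensiveness is not stated for this profile.
        "offensiveness": DetectionSetting(None, "log_only"),
        "prompt_injection": DetectionSetting("low", "block_and_log"),
    },
    "high_security": {
        "offensiveness": DetectionSetting("low", "block_and_log"),
        "prompt_injection": DetectionSetting("low", "block_and_log"),
    },
}


def blocking_detections(profile: str) -> List[str]:
    """List the detections that block responses, and so carry the latency trade-off."""
    return [name for name, s in PROFILES[profile].items() if s.mode == "block_and_log"]


for name in PROFILES:
    print(name, blocking_detections(name))
```

Listing the blocking detections per profile makes it easy to see where the latency trade-off applies before anything is changed in the Now Assist Admin console.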
Conclusion
