Ashley Snyder
ServiceNow Employee

What is Now Assist Guardian?

Trustworthy and responsible AI empowers customers and participants in the AI lifecycle to make informed decisions. Now Assist Guardian is a built-in platform component that ships with our Generative AI Controller and is key to ServiceNow's Secure and Responsible AI. It assesses AI risks, undesired behaviors, and dangerous platform usage such as offensiveness and prompt injection.

 

Now Assist Guardian is a suite of models and methods built into the Now Platform and included with Now Assist through the Generative AI Controller. It assesses AI risks and undesired behaviors, such as offensiveness. Now Assist Guardian also helps mitigate security and privacy risks by monitoring for and detecting prompt injection attacks and adversarial requests.

 

Now Assist Guardian is a key platform enabler for Responsible AI - in accordance with our principles of human centricity, diversity, transparency, and accountability.

 

See the product documentation for more information on Now Assist Guardian.

 

What does Now Assist Guardian do?

Now Assist Guardian evaluates undesired generative AI model behaviors to help mitigate risk. It is a service that enables other Now Assist applications to detect and handle inappropriate LLM outputs and usage.

 

Our top priorities for Q4 2024 are offensiveness, prompt injection, and sensitive topic detection. PII is closely related but not exclusive to AI; currently, PII is handled in generative AI products by the Sensitive Data Handler.

 

What are the currently released guardrails?

  1. Offensiveness
  2. Security
  3. Sensitive topic detection (Now Assist in HRSD)

 

What are the next guardrails to be supported?

A few candidates (safe harbor applies): Hallucination, illegal requests, inappropriate advice (medical, financial, etc.).

 

Does this mean that controls were not in place prior to Now Assist Guardian to prevent offensiveness or prompt injection?

No. Now Assist Guardian is an additional layer on top of our existing model fine-tuning and alignment, which already help prevent these occurrences. Now Assist Guardian also makes such issues and attempts visible to customers through logging.

 

Why are guardrails not turned on by default, why would I want to turn them off?

We aim to provide customers with choice and flexibility regarding the guardrails they deploy. Customer ServiceNow administrators can decide whether to enable or disable guardrails and can set the level of guardrail action (for example, blocking versus logging). Technology that incorporates large language models poses a risk of false positives, such as blocking an output for offensiveness when none exists. We encourage customers to perform testing with a small group of stakeholders before enabling the guardrails.

 

Customers also have the option to run detection after LLM processing for monitoring purposes without impacting the user experience, because monitoring does not occur in real time during inference. Customer administrators can turn on the offensiveness guardrail if monitoring reveals an issue based on their internal thresholds.

 

How does Now Assist Guardian solve for problems such as offensiveness, prompt injection, and sensitive topic detection?

Now Assist Guardian employs tools that assess the output of models used for Now Assist skills. It highlights and makes recommendations on output related to toxic or offensive content. Each use case has a different user experience depending on the level of risk posed by displaying the offensive content.

 

Can agents override or disregard the guardrail?

No, when blocking is enabled for offensiveness or security, the agent will see an error message stating “There was an error summarizing your incident.”

 

If an employee or internal user is being offensive, does this flag up to their manager?

No, there is no automated flagging for now, but we are looking into configurable flagging in the future (safe harbor).

 

What actions are taken by ServiceNow and by the customer as a result of evaluations?

If a customer has opted in for data-sharing, we use the Filtered AI Content to review model performance based on real-world scenarios.

 

Which Now Assist Guardian metrics show up in model cards?

F1, Precision, Recall, Correctness, and False Positive Rate (FPR). See the model card for up-to-date metrics and more information.

 

How does Now Assist Guardian work with BYOL LLMs?

Now Assist Guardian is integrated with the Generative AI Controller, so you can use it with Now LLMs and BYOL LLMs. As of Q4 2024, only the Azure OpenAI spoke is supported for BYOL LLMs.

 

As of Q4 2024, Now Assist Guardian is not integrated with Now Assist Skill Kit, so guardrails do not apply to custom skills built with Now Assist Skill Kit.

 

If a customer opts out of our Advanced AI & Data Terms data-sharing program, do they also opt out of Now Assist Guardian?

No, customers can opt out of the Advanced AI & Data Terms data-sharing program without impacting the usage of Now Assist Guardian. Now Assist Guardian runs either at inference, when the LLM is called with a prompt to generate a response, or during monitoring, which uses data housed in a log table in the customer instance with a 30-day data retention period.

 

Customers in Europe or Asia may have a different level of sensitivity than customers in the USA about what is offensive. Will there be a guide on how the evaluations are biased toward one culture-set versus others?

Currently, we do not have a guide on how we bias the evaluations of one culture-set versus others.

 

Is Now Assist Guardian optional for customers who do not want their data processed in the ServiceNow regional data centers used for the Now LLM Service?

Yes, customers can leave guardrails disabled or turn them off. The exception is prompt injection monitoring in the Security guardrail, which is enabled for logging by default.

 

Does Now Assist Guardian support native translation (multilingual LLM)? Which languages are supported?

Currently, only English is supported. The model used for Now Assist Guardian has been tested and evaluated using English datasets. You may see results when using multilingual capabilities and Dynamic Translation with Now Assist Guardian, but multilingual use is not currently supported. Refer to the product documentation for changes in supported languages.

 

Does using Now Assist Guardian consume extra assists?

No, it is included in Now Assist licensing.

 

What if content is blocked by Now Assist Guardian, do I get charged for the output?

No. When you use a Now Assist skill and the content is blocked by Now Assist Guardian due to guardrails, you are not charged an assist for that skill.

 

Can I turn on Now Assist Guardian for specific skills, or do I turn it on for all skills?

The offensiveness guardrail can be configured at the workflow level (for example, CSM, HRSD, ITSM). The security and sensitive topic detection guardrails are global, meaning they cannot be enabled or disabled per skill.

 

What options do I have for configuring guardrails?

Admins can choose a detection impact for the offensiveness and security guardrails and configure filters for sensitivity detection in the Now Assist admin console.

  1. Offensiveness – There are two detection impacts that admins can configure in the Now Assist admin console:
    1. Basic log monitoring – Updates the log when offensive content is detected; includes information about the request and the conversation that contains the offensive content, including any user feedback.
      1. Agents can view the offensive content when the skill is executed.
    2. Block + Monitoring – Prevents offensive content from being displayed when a skill is executed and displays a message, including logging.
      1. Agents will see an error message.
  2. Security (Prompt Injection) – Admins can configure the detection impact for all products and skills, choosing between basic log monitoring and blocking.
    1. Basic log monitoring (enabled by default) – Updates the log when prompt injection is detected.
      1. Agents can view the conversation that includes the prompt injection.
    2. Block + Monitoring – Prevents the content from being generated and displays a message when a skill is executed; includes logging.
      1. Agents will see an error message.
  3. Sensitivity detection – Admins can modify an existing out-of-the-box sensitive filter. They can create new phrases, as well as edit or delete existing phrases. The filters are applied to Now Assist for Virtual Agent topics.
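The two detection impacts above boil down to a simple decision: always log on detection, and substitute an error message for the output only when blocking is enabled. A minimal conceptual sketch in plain Python (not ServiceNow platform code; the function and field names are illustrative assumptions):

```python
# Conceptual sketch of the guardrail detection impacts described above.
# Not ServiceNow platform code; names and structures are illustrative.

def apply_guardrail(output_text, detected, impact):
    """Return (text_shown_to_agent, log_entries) for one guardrail result."""
    log = []
    if detected:
        # Both impacts update the log when something is detected.
        log.append({"impact": impact, "content": output_text})
    if detected and impact == "block_and_monitor":
        # Blocked: the agent sees an error message instead of the content.
        return "There was an error summarizing your incident.", log
    # Basic log monitoring (or nothing detected): content passes through.
    return output_text, log

# Monitoring only: the content is logged but still shown to the agent.
shown, entries = apply_guardrail("some flagged text", True, "basic_log_monitoring")
```

The key design point is that logging and blocking are independent: blocking always implies logging, but logging alone never changes what the agent sees.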

 

Can I turn on Now Assist Guardian for specific LLMs?

Now Assist Guardian applies to Now LLM Service and Azure OpenAI LLMs for the Q4 2024 Store release.

 

What does the offensiveness guardrail check for?

The model is trained on 16 safety categories; the category IDs appear in the logs when the guardrail detects offensive content.

 

O1 – Toxic Content
Should not: Promote hate speech, incite violence such as child abuse or bullying, or engage in personal attacks or harassment.
Can: Discuss the impacts and societal issues related to toxic content for educational purposes.

O2 – Unfair Representation
Should not: Assist in perpetuating stereotypes or encourage discrimination or bias.
Can: Discuss issues of representation and promote awareness and inclusivity.

O3 – Adult Content
Should not: Display sexually explicit material or content inappropriate for minors, including pornography, erotic chat, and sexual activity.
Can: Offer educational discussions on topics related to sexuality and consent.

O4 – Erosion of Trust in Public Information
Should not: Spread misinformation or disinformation that undermines public trust, such as misleading medical advice or advice about illegal drugs.
Can: Critically analyze misinformation trends and educate about media literacy.

O5 – Propagating Misconceptions/False Beliefs
Should not: Disseminate unverified or misleading information, conspiracy theories, science denialism, historical revisionism, or content that could lead to radicalization.
Can: Provide critical analysis and debunking of common misconceptions and false beliefs.

O6 – Risky Financial Practices
Should not: Promote unethical trade practices such as multi-level marketing, deceptive lending practices, or gambling.
Can: Educate about associated risks, ethical lending, responsible gambling, trade compliance, and fair business practices.

O7 – Trade and Compliance
Should not: Involve labor issues, engage in unethical trade affairs, contribute to unfair labor practices, or promote fraudulent trade practices.
Can: Provide information and insights into labor issues, ethical trade practices, trade compliance, and fair business conduct.

O8 – Dissemination of Dangerous Information
Should not: Spread fake news, propagate social media rumors, disclose government secrets, or leak financial data or other content that contributes to public panic or threatens national security.
Can: Educate about the dangers of such information and promote safety.

O9 – Privacy Infringement
Should not: Involve doxxing, unauthorized data sharing, or surveillance, such as inferring or leaking private information.
Can: Discuss the importance of privacy rights and digital security.

O10 – Security Threats
Should not: Spread malware or encourage terrorism or other security threats such as weapon generation, cyber attacks, or chemical harms.
Can: Provide information and methods to identify and mitigate security threats.

O11 – Defamation
Should not: Publish false statements damaging to individuals, groups, or organizations.
Can: Explore the legal and ethical aspects of defamation and free speech.

O12 – Fraud or Deceptive Action
Should not: Engage in academic dishonesty, perpetrate scams, promote pseudo-pharmaceuticals, or disseminate fake online content. Content should not deceive or defraud individuals.
Can: Educate about recognizing and preventing fraudulent activities.

O13 – Influence Operations
Should not: Manipulate public opinion or disrupt societal harmony for political or ideological purposes, such as surveillance or campaigns.
Can: Analyze and educate about the nature and impact of influence operations.

O14 – Illegal Activities
Should not: Promote or provide guidance on engaging in illegal actions such as crimes or terrorism.
Can: Educate about the consequences and prevention of illegal activities.

O15 – Persuasion and Manipulation
Should not: Involve persuasion or manipulation techniques that promote self-harm, psychological manipulation, or any form of harm to individuals' well-being.
Can: Educate about recognizing and resisting manipulation, and offer support and resources for those affected.

O16 – Violation of Personal Property
Should not: Violate an individual's personal property, including malicious mimicking, plagiarism, copyright infringement, and identity forging.
Can: Promote awareness and discussions on respecting personal property and preventing such violations.
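When reviewing logs, it can be handy to translate the raw category IDs back into the names above. A small lookup built directly from this list (illustrative Python, not a ServiceNow API):

```python
# Lookup table for the 16 offensiveness safety categories listed above.
OFFENSIVENESS_CATEGORIES = {
    "O1": "Toxic Content",
    "O2": "Unfair Representation",
    "O3": "Adult Content",
    "O4": "Erosion of Trust in Public Information",
    "O5": "Propagating Misconceptions/False Beliefs",
    "O6": "Risky Financial Practices",
    "O7": "Trade and Compliance",
    "O8": "Dissemination of Dangerous Information",
    "O9": "Privacy Infringement",
    "O10": "Security Threats",
    "O11": "Defamation",
    "O12": "Fraud or Deceptive Action",
    "O13": "Influence Operations",
    "O14": "Illegal Activities",
    "O15": "Persuasion and Manipulation",
    "O16": "Violation of Personal Property",
}

def describe_category(category_id):
    """Map a category ID from the guardrail logs to a readable name."""
    return OFFENSIVENESS_CATEGORIES.get(category_id, "Unknown category")
```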

 

Can customers add their own offensiveness categories?

Not as of the Q4 2024 release.

 

Where can I find the logs for the guardrails?

For the Q4 2024 release, admins can view logs for guardrails in the sys_generative_ai_metric table with the following columns:

  • Created
  • Generative AI Log
  • Name
  • Type
  • Value

 

Admins can configure more columns for additional insight by using the List Layout configuration and the Generative AI Log metadata table columns. Admins can also export logs to a CSV file for each guardrail in the Now Assist admin console.
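Since logs can be exported to CSV, a quick way to get a feel for guardrail activity is to tally detections per guardrail from an export. A sketch using the column names listed above; the sample rows and values are invented for illustration:

```python
# Illustrative sketch: summarizing a CSV export of guardrail log rows.
# Column names come from the FAQ above; the sample data is invented.
import csv
import io
from collections import Counter

sample_export = """Created,Generative AI Log,Name,Type,Value
2024-11-01 10:02:13,GAL0001001,offensiveness,detection,O1
2024-11-01 10:05:41,GAL0001002,security,detection,prompt_injection
2024-11-01 11:17:09,GAL0001003,offensiveness,detection,O10
"""

# Count detections per guardrail name across the export.
counts = Counter(row["Name"] for row in csv.DictReader(io.StringIO(sample_export)))
print(counts)  # Counter({'offensiveness': 2, 'security': 1})
```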

 

Does Now Assist Guardian add latency to Now Assist Skill response time?     

 

Now Assist Guardian performs a parallel call to the Now SLM to check input against configured guardrails while the call to the Now LLM (or other LLM) generates the desired skill output. Both generative AI calls occur simultaneously.

 

If logging is turned on, Now Assist Guardian will post its findings to the log table, and the user will receive the skill output.

 

If blocking is turned on, Now Assist Guardian will log its findings, and the output is blocked in the instance and not shown to the user. Customers should not see noticeable latency from Now Assist Guardian.
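The parallel pattern described above can be sketched as two concurrent calls whose results are combined once both return, with blocking substituting the error message for the LLM output. This is a conceptual Python sketch, not ServiceNow code; all function names are illustrative:

```python
# Conceptual sketch of the parallel guardrail pattern described above:
# the guardrail check and the skill's LLM call run concurrently, so the
# check adds little or no wall-clock latency. Not ServiceNow APIs.
from concurrent.futures import ThreadPoolExecutor

def call_llm_for_skill(prompt):
    # Stands in for the skill's LLM call (Now LLM or other LLM).
    return f"summary of: {prompt}"

def run_guardrail_check(prompt):
    # Stands in for the Now SLM guardrail check.
    return {"offensive": "badword" in prompt}

def execute_skill(prompt, blocking_enabled=True):
    with ThreadPoolExecutor() as pool:
        llm_future = pool.submit(call_llm_for_skill, prompt)
        guard_future = pool.submit(run_guardrail_check, prompt)
        output = llm_future.result()
        verdict = guard_future.result()
    if verdict["offensive"] and blocking_enabled:
        # Blocked: findings are logged and the user sees an error message.
        return "There was an error summarizing your incident."
    return output
```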

 

Can customers make custom filters for Sensitive Topics for HRSD?

Filters are configurable through tables (sys_gen_ai_filter_sample) but not through the Now Assist Guardian interface at this time.

 

What is the impact of the sample phrases in sensitive topic filters for HRSD?

Filters are of different categories, and by providing a variety of sample phrases, you can help the AI recognize a wide array of potential user queries that are all related to the same intent. Users might not always phrase their queries in a way that the system expects, so a rich set of sample phrases helps reduce ambiguity. For example, if a user says, "unlock my account," but the AI has only been trained on "reset password," it may not filter the query correctly. However, if multiple variations are included as sample phrases, the system is more likely to understand and filter the request to the correct sensitivity filter topic.
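To illustrate why variety matters, here is a toy matcher where simple token overlap stands in for the LLM's semantic matching. Everything in this sketch is an illustration, not the actual filter implementation:

```python
# Toy illustration of sample-phrase coverage: a query only matches the
# filter if it resembles at least one known sample phrase. Token-overlap
# (Jaccard) similarity stands in for the LLM's semantic matching.

def tokens(text):
    return set(text.lower().split())

def matches_filter(query, sample_phrases, threshold=0.5):
    """Return True if the query is similar enough to any sample phrase."""
    q = tokens(query)
    return any(
        len(q & tokens(phrase)) / len(q | tokens(phrase)) >= threshold
        for phrase in sample_phrases
    )

narrow = ["reset password"]
rich = ["reset password", "unlock my account", "account is locked"]

matches_filter("unlock my account", narrow)  # False: phrasing not covered
matches_filter("unlock my account", rich)    # True: variation included
```

With only the narrow phrase list, "unlock my account" shares no tokens with any sample and slips through; adding phrasing variations closes that gap, which mirrors the "reset password" example above.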

 

Why would a customer want to create more sample phrases?

The more sample phrases you provide, the more accurate the LLM will be in catching these topics. 

 

What is the maximum amount of sample phrases that can be added?

There is no official maximum number of sample phrases. In our engineering testing, we noticed performance issues loading and working in the Now Assist admin console with around 800 phrases. If you need to create more than 800 phrases, please open a Support case.

 

 

Comments
MichaelOliH
Tera Contributor

Hello, can I ask how to activate Guardian? Are there any requirements before enabling it? Currently we have Xanadu patch 4, but can't find Guardian in the admin console. Can an

Simon Hendery
Mega Patron

Hi @MichaelOliH 

 

Now Assist Guardian is a component of a wider Now Assist activation. Do you currently have Now Assist installed on your instance? If not, that's the starting point.

MichaelOliH
Tera Contributor

Hi @Simon Hendery 

 

Thank you for your response. Yes, we already have Now Assist in our instance, in use for a project. Do you have a list of plug-ins that are required before activating Now Assist Guardian?

Simon Hendery
Mega Patron
Mega Patron

Hi @MichaelOliH 

 

As per the comment from @Prakash53 in this related thread, one vital step is to update the Now Assist Admin Console (sn_nowassist_admin).

 

In my case, I had to update the plugin to v. 4.1.16 before I could access Now Assist Guardian functionality via Now Assist Admin:

 

guardian.png

Try that and let us know if it still doesn't appear.

AnetaR
Tera Contributor

Hi, can you use different Fallback topics per each area in the Guardian filters, or will all point to Sensitivity Detection: Fallback?

Hiroki M
ServiceNow Employee

Hello, it states that multilingual support is not available. Has there been any update since then regarding future plans or a roadmap to enable multilingual support? The customer is particularly interested in leveraging this feature in Japanese.

Mike Malcangio
ServiceNow Employee

Hi @Hiroki M - yes, the FAQ will be updated shortly. However, as of the July release, multilingual support for Guardian has been enabled for the following languages: English, French, Canadian French, German, Japanese, Dutch, Spanish, Brazilian Portuguese, and Italian. More to come soon!
