Ashley Snyder
ServiceNow Employee

What is Now Assist Guardian?

Trustworthy and responsible AI empowers customers and participants in the AI lifecycle to make informed decisions. Now Assist Guardian is a built-in platform component that ships with our Generative AI Controller and is key to ServiceNow's Secure and Responsible AI. It assesses AI risks, undesired behaviors, and dangerous platform usage such as offensiveness and prompt injection.

 

Now Assist Guardian is a suite of models and methods built into the Now Platform and included with Now Assist through the Generative AI Controller. It assesses AI risks and undesired behaviors, such as offensiveness. Now Assist Guardian also helps mitigate security and privacy risks by monitoring for and detecting prompt injection attacks and adversarial requests.

 

Now Assist Guardian is a key platform enabler for Responsible AI - in accordance with our principles of human centricity, diversity, transparency, and accountability.

 

See the product documentation for more information on Now Assist Guardian.

 

What does Now Assist Guardian do?

Now Assist Guardian evaluates undesired generative AI model behaviors to help mitigate risk. It is a service that enables other Now Assist applications to detect and handle inappropriate LLM outputs and usage.

 

Our top priorities for Q4 2024 are offensiveness, prompt injection, and sensitive topic detection. PII is closely related but not exclusive to AI; currently, PII is handled in generative AI products by the Sensitive Data Handler.

 

What are the currently released guardrails?

  1. Offensiveness
  2. Security
  3. Sensitive topic detection (Now Assist in HRSD)

 

What are the next guardrails to be supported?

A few candidates (safe harbor applies): Hallucination, illegal requests, inappropriate advice (medical, financial, etc.).

 

Does this mean that controls were not in place prior to Now Assist Guardian to prevent offensiveness or prompt injection?

No. Now Assist Guardian is an additional layer on top of our existing model fine-tuning and alignment, which already help prevent these occurrences. Now Assist Guardian also makes such issues and attempts visible to customers through logging.

 

Why are guardrails not turned on by default, why would I want to turn them off?

We aim to provide customers with choice and flexibility regarding the guardrails they deploy. Customer ServiceNow administrators can decide whether to enable or disable guardrails and can set the level of guardrail action (for example, blocking versus logging). Technology that incorporates large language models poses a risk of false positives, such as blocking an output for offensiveness when none exists. We encourage customers to perform testing with a small group of stakeholders before enabling the guardrails.

 

Customers also have the option to run detection after LLM processing for monitoring purposes without impacting the user experience, because monitoring does not occur in real time during inference. Customer administrators can turn on the offensiveness guardrail if monitoring reveals an issue based on their internal thresholds.

 

How does Now Assist Guardian solve for problems such as offensiveness, prompt injection, and sensitive topic detection?

Now Assist Guardian employs tools that assess the output of models used for Now Assist skills. It highlights and makes recommendations on output related to toxic or offensive content. Each use case has a different user experience depending on the level of risk posed by displaying the offensive content.

 

Can agents override or disregard the guardrail?

No, when blocking is enabled for offensiveness or security, the agent will see an error message stating “There was an error summarizing your incident.”

 

If an employee or internal user is being offensive, does this flag up to their manager?

No, there is no automated flagging for now, but we are looking into configurable flagging in the future (safe harbor).

 

What actions are taken by ServiceNow and by the customer as a result of evaluations?

If a customer has opted in for data-sharing, we use the Filtered AI Content to review model performance based on real-world scenarios.

 

Which Now Assist Guardian metrics show up in model cards?

F1, Precision, Recall, Correctness, and False Positive Rate (FPR). See the model card for up-to-date metrics and more information.

 

How does Now Assist Guardian work with BYOL LLMs?

Now Assist Guardian is integrated with the Generative AI Controller, so you can use it with Now LLMs and BYOL LLMs. As of Q4 2024, only the Azure OpenAI spoke is supported for BYOL LLMs.

 

As of Q4 2024, Now Assist Guardian is not integrated with Now Assist Skill Kit, so guardrails do not apply to custom skills built with Now Assist Skill Kit.

 

If a customer opts out of our Advanced AI & Data Terms data-sharing program, do they also opt out of Now Assist Guardian?

No, customers can opt out of the Advanced AI & Data Terms data-sharing program without impacting the usage of Now Assist Guardian. Now Assist Guardian runs either at inference, when the LLM is called with a prompt to generate a response, or during monitoring, which uses data housed in a log table in the customer instance with a 30-day data retention period.

 

Customers in Europe or Asia may have a different level of sensitivity than customers in the USA about what is offensive. Will there be a guide on how the evaluations are biased toward one culture-set versus others?

Currently, we do not have a guide on how we bias the evaluations of one culture-set versus others.

 

Is Now Assist Guardian optional for customers who do not want their data processed in the ServiceNow regional data centers used for the Now LLM Service?

Yes, customers can leave guardrails disabled or turn them off. The exception is prompt injection monitoring in the Security guardrail, which is enabled for logging by default.

 

Does Now Assist Guardian support native translation (multilingual LLM)? Which languages are supported?

Currently, only English is supported. The model used for Now Assist Guardian has been tested and evaluated using English datasets. You may see results when using multilingual capabilities and Dynamic Translation with Now Assist Guardian, but multilingual use is not currently supported. Refer to the product documentation for changes in supported languages.

 

Does using Now Assist Guardian consume extra assists?

No, it is included in Now Assist licensing.

 

What if content is blocked by Now Assist Guardian, do I get charged for the output?

No. When you use a Now Assist skill and the content is blocked by Now Assist Guardian due to guardrails, you are not charged an assist for that skill.

 

Can I turn on Now Assist Guardian for specific skills, or do I turn it on for all skills?

The offensiveness guardrail can be configured at the workflow level (for example, CSM, HRSD, ITSM). The security and sensitive topic detection guardrails are global, meaning they cannot be enabled or disabled per skill.

 

What options do I have for configuring guardrails?

Admins can choose a detection impact for the offensiveness and security guardrails and configure filters for sensitivity detection in the Now Assist admin console.

  1. Offensiveness – There are two detection impacts that admins can configure in the Now Assist admin console:
    1. Basic log monitoring – Updates the log when offensive content is detected; includes information about the request and the conversation that contains the offensive content, including any user feedback.
      1. Agents can view the offensive content when the skill is executed.
    2. Block + Monitoring – Prevents offensive content from being displayed when a skill is executed and displays a message, including logging.
      1. Agents will see an error message.
  2. Security (Prompt Injection) – Admins can configure the detection impact for all products and skills, choosing between basic log monitoring and blocking.
    1. Basic log monitoring (enabled by default) – Updates the log when prompt injection is detected.
      1. Agents can view the conversation that includes the prompt injection.
    2. Block + Monitoring – Prevents the content from being generated and displays a message when a skill is executed; includes logging.
      1. Agents will see an error message.
  3. Sensitivity detection – Admins can modify an existing out-of-the-box sensitive filter. They can create new phrases, as well as edit or delete existing phrases. The filters are applied to Now Assist for Virtual Agent topics.
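The two detection impacts above boil down to a simple decision: always log on detection, and substitute an error message for the output only when blocking is enabled. A minimal conceptual sketch in plain Python (not ServiceNow platform code; the function and field names are illustrative assumptions):

```python
# Conceptual sketch of the guardrail detection impacts described above.
# Not ServiceNow platform code; names and structures are illustrative.

def apply_guardrail(output_text, detected, impact):
    """Return (text_shown_to_agent, log_entries) for one guardrail result."""
    log = []
    if detected:
        # Both impacts update the log when something is detected.
        log.append({"impact": impact, "content": output_text})
    if detected and impact == "block_and_monitor":
        # Blocked: the agent sees an error message instead of the content.
        return "There was an error summarizing your incident.", log
    # Basic log monitoring (or nothing detected): content passes through.
    return output_text, log

# Monitoring only: the content is logged but still shown to the agent.
shown, entries = apply_guardrail("some flagged text", True, "basic_log_monitoring")
```

The key design point is that logging and blocking are independent: blocking always implies logging, but logging alone never changes what the agent sees.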

 

Can I turn on Now Assist Guardian for specific LLMs?

Now Assist Guardian applies to Now LLM Service and Azure OpenAI LLMs for the Q4 2024 Store release.

 

What does the offensiveness guardrail check for?

The model is trained on 16 safety categories; the category IDs appear in the logs when the guardrail detects offensive content.

 

O1 – Toxic Content
Should not: Promote hate speech, incite violence such as child abuse or bullying, or engage in personal attacks or harassment.
Can: Discuss the impacts and societal issues related to toxic content for educational purposes.

O2 – Unfair Representation
Should not: Assist in perpetuating stereotypes or encourage discrimination or bias.
Can: Discuss issues of representation and promote awareness and inclusivity.

O3 – Adult Content
Should not: Display sexually explicit material or content inappropriate for minors, including pornography, erotic chat, and sexual activity.
Can: Offer educational discussions on topics related to sexuality and consent.

O4 – Erosion of Trust in Public Information
Should not: Spread misinformation or disinformation that undermines public trust, such as misleading medical advice or advice about illegal drugs.
Can: Critically analyze misinformation trends and educate about media literacy.

O5 – Propagating Misconceptions/False Beliefs
Should not: Disseminate unverified or misleading information, conspiracy theories, science denialism, historical revisionism, or content that could lead to radicalization.
Can: Provide critical analysis and debunking of common misconceptions and false beliefs.

O6 – Risky Financial Practices
Should not: Promote unethical trade practices such as multi-level marketing, deceptive lending practices, or gambling.
Can: Educate about associated risks, ethical lending, responsible gambling, trade compliance, and fair business practices.

O7 – Trade and Compliance
Should not: Involve labor issues, engage in unethical trade affairs, contribute to unfair labor practices, or promote fraudulent trade practices.
Can: Provide information and insights into labor issues, ethical trade practices, trade compliance, and fair business conduct.

O8 – Dissemination of Dangerous Information
Should not: Spread fake news, propagate social media rumors, disclose government secrets, or leak financial data or other content that contributes to public panic or threatens national security.
Can: Educate about the dangers of such information and promote safety.

O9 – Privacy Infringement
Should not: Involve doxxing, unauthorized data sharing, or surveillance, such as inferring or leaking private information.
Can: Discuss the importance of privacy rights and digital security.

O10 – Security Threats
Should not: Spread malware or encourage terrorism or other security threats such as weapon generation, cyber attacks, or chemical harms.
Can: Provide information and methods to identify and mitigate security threats.

O11 – Defamation
Should not: Publish false statements damaging to individuals, groups, or organizations.
Can: Explore the legal and ethical aspects of defamation and free speech.

O12 – Fraud or Deceptive Action
Should not: Engage in academic dishonesty, perpetrate scams, promote pseudo-pharmaceuticals, or disseminate fake online content. Content should not deceive or defraud individuals.
Can: Educate about recognizing and preventing fraudulent activities.

O13 – Influence Operations
Should not: Manipulate public opinion or disrupt societal harmony for political or ideological purposes, such as surveillance or campaigns.
Can: Analyze and educate about the nature and impact of influence operations.

O14 – Illegal Activities
Should not: Promote or provide guidance on engaging in illegal actions such as crimes or terrorism.
Can: Educate about the consequences and prevention of illegal activities.

O15 – Persuasion and Manipulation
Should not: Involve persuasion or manipulation techniques that promote self-harm, psychological manipulation, or any form of harm to individuals' well-being.
Can: Educate about recognizing and resisting manipulation, and offer support and resources for those affected.

O16 – Violation of Personal Property
Should not: Violate an individual's personal property, including malicious mimicking, plagiarism, copyright infringement, and identity forging.
Can: Promote awareness and discussions on respecting personal property and preventing such violations.
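When reviewing logs, it can be handy to translate the raw category IDs back into the names above. A small lookup built directly from this list (illustrative Python, not a ServiceNow API):

```python
# Lookup table for the 16 offensiveness safety categories listed above.
OFFENSIVENESS_CATEGORIES = {
    "O1": "Toxic Content",
    "O2": "Unfair Representation",
    "O3": "Adult Content",
    "O4": "Erosion of Trust in Public Information",
    "O5": "Propagating Misconceptions/False Beliefs",
    "O6": "Risky Financial Practices",
    "O7": "Trade and Compliance",
    "O8": "Dissemination of Dangerous Information",
    "O9": "Privacy Infringement",
    "O10": "Security Threats",
    "O11": "Defamation",
    "O12": "Fraud or Deceptive Action",
    "O13": "Influence Operations",
    "O14": "Illegal Activities",
    "O15": "Persuasion and Manipulation",
    "O16": "Violation of Personal Property",
}

def describe_category(category_id):
    """Map a category ID from the guardrail logs to a readable name."""
    return OFFENSIVENESS_CATEGORIES.get(category_id, "Unknown category")
```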

 

Can customers add their own offensiveness categories?

Not as of the Q4 2024 release.

 

Where can I find the logs for the guardrails?

For the Q4 2024 release, admins can view logs for guardrails in the sys_generative_ai_metric table with the following columns:

  • Created
  • Generative AI Log
  • Name
  • Type
  • Value

 

Admins can configure more columns for additional insight by using the List Layout configuration and the Generative AI Log metadata table columns. Admins can also export logs to a CSV file for each guardrail in the Now Assist admin console.
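Since logs can be exported to CSV, a quick way to get a feel for guardrail activity is to tally detections per guardrail from an export. A sketch using the column names listed above; the sample rows and values are invented for illustration:

```python
# Illustrative sketch: summarizing a CSV export of guardrail log rows.
# Column names come from the FAQ above; the sample data is invented.
import csv
import io
from collections import Counter

sample_export = """Created,Generative AI Log,Name,Type,Value
2024-11-01 10:02:13,GAL0001001,offensiveness,detection,O1
2024-11-01 10:05:41,GAL0001002,security,detection,prompt_injection
2024-11-01 11:17:09,GAL0001003,offensiveness,detection,O10
"""

# Count detections per guardrail name across the export.
counts = Counter(row["Name"] for row in csv.DictReader(io.StringIO(sample_export)))
print(counts)  # Counter({'offensiveness': 2, 'security': 1})
```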

 

Does Now Assist Guardian add latency to Now Assist Skill response time?     

 

Now Assist Guardian performs a parallel call to the Now SLM to check input against configured guardrails while the call to the Now LLM (or other LLM) generates the desired skill output. Both generative AI calls occur simultaneously.

 

If logging is turned on, Now Assist Guardian will post its findings to the log table, and the user will receive the skill output.

 

If blocking is turned on, Now Assist Guardian will log its findings, and the output is blocked in the instance and not shown to the user. Customers should not see noticeable latency from Now Assist Guardian.
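The parallel pattern described above can be sketched as two concurrent calls whose results are combined once both return, with blocking substituting the error message for the LLM output. This is a conceptual Python sketch, not ServiceNow code; all function names are illustrative:

```python
# Conceptual sketch of the parallel guardrail pattern described above:
# the guardrail check and the skill's LLM call run concurrently, so the
# check adds little or no wall-clock latency. Not ServiceNow APIs.
from concurrent.futures import ThreadPoolExecutor

def call_llm_for_skill(prompt):
    # Stands in for the skill's LLM call (Now LLM or other LLM).
    return f"summary of: {prompt}"

def run_guardrail_check(prompt):
    # Stands in for the Now SLM guardrail check.
    return {"offensive": "badword" in prompt}

def execute_skill(prompt, blocking_enabled=True):
    with ThreadPoolExecutor() as pool:
        llm_future = pool.submit(call_llm_for_skill, prompt)
        guard_future = pool.submit(run_guardrail_check, prompt)
        output = llm_future.result()
        verdict = guard_future.result()
    if verdict["offensive"] and blocking_enabled:
        # Blocked: findings are logged and the user sees an error message.
        return "There was an error summarizing your incident."
    return output
```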

 

Can customers make custom filters for Sensitive Topics for HRSD?

Filters are configurable through tables (sys_gen_ai_filter_sample) but not through the Now Assist Guardian interface at this time.

 

What is the impact of the sample phrases in sensitive topic filters for HRSD?

Filters are of different categories, and by providing a variety of sample phrases, you can help the AI recognize a wide array of potential user queries that are all related to the same intent. Users might not always phrase their queries in a way that the system expects, so a rich set of sample phrases helps reduce ambiguity. For example, if a user says, "unlock my account," but the AI has only been trained on "reset password," it may not filter the query correctly. However, if multiple variations are included as sample phrases, the system is more likely to understand and filter the request to the correct sensitivity filter topic.
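To illustrate why variety matters, here is a toy matcher where simple token overlap stands in for the LLM's semantic matching. Everything in this sketch is an illustration, not the actual filter implementation:

```python
# Toy illustration of sample-phrase coverage: a query only matches the
# filter if it resembles at least one known sample phrase. Token-overlap
# (Jaccard) similarity stands in for the LLM's semantic matching.

def tokens(text):
    return set(text.lower().split())

def matches_filter(query, sample_phrases, threshold=0.5):
    """Return True if the query is similar enough to any sample phrase."""
    q = tokens(query)
    return any(
        len(q & tokens(phrase)) / len(q | tokens(phrase)) >= threshold
        for phrase in sample_phrases
    )

narrow = ["reset password"]
rich = ["reset password", "unlock my account", "account is locked"]

matches_filter("unlock my account", narrow)  # False: phrasing not covered
matches_filter("unlock my account", rich)    # True: variation included
```

With only the narrow phrase list, "unlock my account" shares no tokens with any sample and slips through; adding phrasing variations closes that gap, which mirrors the "reset password" example above.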

 

Why would a customer want to create more sample phrases?

The more sample phrases you provide, the more accurate the LLM will be in catching these topics. 

 

What is the maximum amount of sample phrases that can be added?

There is no official maximum number of sample phrases. In our engineering testing, we noticed performance issues loading and working in the Now Assist admin console with around 800 phrases. If you need to create more than 800 phrases, please open a Support case.

 

 

Comments
MichaelOliH
Tera Contributor

Hello, can I ask how to activate Guardian? Are there any requirements before enabling it? Currently we have Xanadu patch 4, but can't find Guardian in the admin console. Can an

Simon Hendery
Mega Patron

Hi @MichaelOliH 

 

Now Assist Guardian is a component of a wider Now Assist activation. Do you currently have Now Assist installed on your instance? If not, that's the starting point.

MichaelOliH
Tera Contributor

Hi @Simon Hendery 

 

Thank you for your response. Yes, we already have Now Assist in our instance, in use for a project. Do you have a list of plug-ins that are required before activating Now Assist Guardian?

Simon Hendery
Mega Patron
Mega Patron

Hi @MichaelOliH 

 

As per the comment from @Prakash53 in this related thread, one vital step is to update the Now Assist Admin Console (sn_nowassist_admin).

 

In my case, I had to update the plugin to v. 4.1.16 before I could access Now Assist Guardian functionality via Now Assist Admin:

 

guardian.png

Try that and let us know if it still doesn't appear.

AnetaR
Tera Contributor

Hi, can you use different Fallback topics per each area in the Guardian filters, or will all point to Sensitivity Detection: Fallback?

Hiroki M
ServiceNow Employee

Hello, it states that multilingual support is not available. Has there been any update since then regarding future plans or a roadmap to enable multilingual support? The customer is particularly interested in leveraging this feature in Japanese.

Mike Malcangio
ServiceNow Employee

Hi @Hiroki M - yes, the FAQ will be updated shortly. However, as of the July release, multilingual support for Guardian has been enabled for the following languages: English, French, Canadian French, German, Japanese, Dutch, Spanish, Brazilian Portuguese, and Italian. More to come soon!
