Health tab in AI Control Tower
Summary of Health tab in AI Control Tower
The Health tab in the AI Control Tower dashboard enables ServiceNow customers to monitor the performance and effectiveness of guardrails implemented via Now Assist Guardian. These guardrails are designed to detect and mitigate offensive content and prompt injection attempts within your AI-powered ServiceNow skills. This monitoring helps ensure your AI assets operate securely and appropriately.
Key Features
- Latency Monitoring: Tracks average latency added by active guardrails for offensive content and prompt injection attempts, helping identify periods of increased guardrail activity.
- Content Flagging Metrics: Displays the number and percentage of requests flagged for offensive content and prompt injections relative to total LLM interactions, offering insight into guardrail activity levels.
- Category Breakdown: Provides detailed categorization of offensive content occurrences, allowing you to understand the nature of flagged content across multiple categories if applicable.
- Skill-Based Analysis: Visualizes offensive content and prompt injection occurrences by specific skills over selected date ranges, enabling targeted investigation and remediation.
- Date Range Filtering: Allows you to apply filters to view guardrail activity for chosen skills within specific time periods to analyze trends and effectiveness.
Key Outcomes
- Gain visibility into how guardrails affect AI performance and content safety.
- Identify which skills are most impacted by offensive content or prompt injection attempts to prioritize security efforts.
- Use latency insights to understand the performance trade-offs of active guardrails.
- Leverage detailed categorization to fine-tune guardrail settings based on the types of offensive content detected.
- Ensure your AI implementations comply with safety standards by proactively monitoring and managing guardrail effectiveness.
Monitor the performance of guardrails enabled through Now Assist Guardian.
The Health tab in the AI Control Tower dashboard helps you monitor and evaluate the effectiveness of the offensive content and prompt injection guardrails that are active on your ServiceNow AI assets. The tab shows the following metrics:
- Average latency added by the active offensive content and prompt injection guardrails. High latency could mean increased guardrail activity in the period.
- Count and percentage of offensive content and prompt injection occurrences.
- Skills where offensive content and prompt injection occurrences were detected.
The dashboard does not consider historical data for Health metrics.
Apply the filters on the dashboard to view guardrail activity for skills in a date range.
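As a rough illustration of what the skill and date range filters do to the latency metric, the following Python sketch averages guardrail-added latency over a filtered set of records. The `GuardrailRecord` shape, the field names, and the skill names `skill_a` and `skill_b` are hypothetical and chosen only for this example; they are not a ServiceNow table or API.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


# Hypothetical record shape for illustration; not a ServiceNow table or API.
@dataclass
class GuardrailRecord:
    skill: str
    day: date
    latency_ms: float  # latency added by the guardrail for one request


def average_latency(records, skills, start, end):
    """Average guardrail-added latency for the chosen skills within [start, end]."""
    selected = [
        r.latency_ms
        for r in records
        if r.skill in skills and start <= r.day <= end
    ]
    # Return None when the filter matches nothing, rather than dividing by zero.
    return mean(selected) if selected else None


# Made-up sample data.
records = [
    GuardrailRecord("skill_a", date(2025, 1, 2), 120.0),
    GuardrailRecord("skill_a", date(2025, 1, 3), 180.0),
    GuardrailRecord("skill_b", date(2025, 1, 3), 90.0),
    GuardrailRecord("skill_a", date(2025, 2, 1), 400.0),
]

january_avg = average_latency(
    records, {"skill_a"}, date(2025, 1, 1), date(2025, 1, 31)
)
```

In this sketch, narrowing the skills or the date range changes which records contribute to the average, which is why the same dashboard can show different latency values under different filters.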
Content guardrail effectiveness
- Number of content items flagged
- This area of the dashboard shows the number of offensive content and prompt injection occurrences in the selected date range.
Figure 2. Number of content items flagged
- Percentage of content items flagged of total use
- This area of the dashboard shows the percentage of requests to and responses from the large language model (LLM) service that are flagged for offensive content or prompt injection.
Figure 3. Percentage of content items flagged of total use
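The relationship between the flagged count and the flagged percentage can be sketched in Python as follows. This is a minimal illustration that assumes the denominator is the total number of LLM requests and responses in the selected date range; the function name and the sample numbers are hypothetical, and the dashboard's exact calculation is not documented here.

```python
def flagged_percentage(flagged: int, total: int) -> float:
    """Share of LLM requests and responses that were flagged, as a percentage."""
    if total <= 0:
        # No LLM traffic in the period, so nothing could have been flagged.
        return 0.0
    return 100.0 * flagged / total


# Made-up numbers: 12 flagged items out of 480 requests and responses.
example = flagged_percentage(12, 480)
```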
Offensive content visualizations
- Guardrail-added latency
- This area of the dashboard shows the average latency as a result of the active offensive content guardrail for the selected skills and date range.
Figure 4. Guardrail-added latency for offensiveness
- Percentage flagged as offensive
- This area of the dashboard shows the percentage of requests and responses to and from the large language model (LLM) service that are flagged for offensive content.
Figure 5. Percentage flagged as offensive
- Total offensive content occurrences
- This area of the dashboard shows the total number of offensive content occurrences for the selected skills and date range.
Figure 6. Total offensive content occurrences
- Categories of offensive content
- This area of the dashboard shows a breakdown of offensive content occurrences by category. If content is deemed offensive under more than one category, for example, toxic and defamatory, the occurrence is counted once toward each of those categories. For more information on offensive content categories, see Now Assist Guardian.
Figure 7. Categories of offensive content
- Offensive content occurrences by skill
- This area of the dashboard shows the number of offensive content occurrences over time by the skills in which the content is detected.
Figure 8. Offensive content occurrences by skill
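The multi-category counting rule described under Categories of offensive content means that the category totals can add up to more than the total number of occurrences. A minimal Python sketch of that rule follows; the input shape (one list of category labels per flagged occurrence) and the function name are assumptions made for illustration, and only the labels toxic and defamatory come from the text above.

```python
from collections import Counter


def count_by_category(occurrences):
    """Count each occurrence once toward every category it was flagged under."""
    counts = Counter()
    for categories in occurrences:
        # set() guards against a category listed twice for the same occurrence.
        counts.update(set(categories))
    return counts


# Three flagged occurrences; the first one falls under two categories.
occurrences = [
    ["toxic", "defamatory"],
    ["toxic"],
    ["defamatory"],
]

counts = count_by_category(occurrences)
category_total = sum(counts.values())
```

With this sample, three occurrences produce a category total of four, because the first occurrence contributes to both categories.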
Prompt injection visualizations
- Guardrail-added latency
- This area of the dashboard shows the average latency as a result of the active prompt injection guardrail for the selected skills and date range.
Figure 9. Guardrail-added latency for prompt injection
- Percentage flagged as prompt injection
- This area of the dashboard shows the percentage of requests to and responses from the LLM service that are flagged for prompt injection.
Figure 10. Percentage flagged as prompt injection
- Total prompt injection occurrences
- This area of the dashboard shows the total number of prompt injection occurrences for the selected skills and date range.
Figure 11. Total prompt injection occurrences
- Prompt injection occurrences by skill
- This area of the dashboard shows the number of prompt injection occurrences over time by the skills where prompt injection attempts were detected.
Figure 12. Prompt injection occurrences by skill
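The "occurrences by skill" views for both guardrails plot counts over time, split by skill. One way to picture the underlying grouping is the Python sketch below. The event shape (a skill name paired with a date) and the skill names are hypothetical and exist only for this example; they do not reflect how the dashboard stores its data.

```python
from collections import defaultdict
from datetime import date


def occurrences_by_skill(events):
    """Map each skill to a {day: count} series of flagged occurrences."""
    series = defaultdict(lambda: defaultdict(int))
    for skill, day in events:
        series[skill][day] += 1
    # Sort each skill's days so the series reads in chronological order.
    return {skill: dict(sorted(days.items())) for skill, days in series.items()}


# Made-up events: (skill, day on which an occurrence was flagged).
events = [
    ("skill_a", date(2025, 1, 2)),
    ("skill_a", date(2025, 1, 2)),
    ("skill_b", date(2025, 1, 2)),
    ("skill_a", date(2025, 1, 3)),
]

series = occurrences_by_skill(events)
```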