Yogesh Shinde
ServiceNow Employee

Introduction:

Caching for Now Assist Q&A Genius Results is essential for performance and cost optimization in ServiceNow. However, stale cache entries can lead to incorrect answers and bloated tables. This article provides a technical guide, validated in my own instances, covering the following topics:

  • Cache architecture (L1 vs L2)
  • Purge and refresh
  • KB reindexing and automation
  • Operational best practices and edge cases

Cache Architecture Overview

  1. First-Level Cache (L1)
    • Location: In-memory (node-local)
    • Structure: Key-value pairs
      • Key: Search Query + KB sys_id
      • Value: Summary generated by Now LLM
    • Match Type: Exact keyword match required
    • Behavior: Extremely fast but not shared across nodes.
  2. Second-Level Cache (L2)
    • Location: sn_ais_assist_semantic_cache table
    • Indexing: Semantic vector search (matches by meaning)
    • Fields: Query (query_term), KB, Source Identifier, Updates (sys_mod_count), sys_updated_on
    • Behavior:
      • If L1 misses → Check L2
      • If L2 misses → Query goes to LLM
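
The two-tier lookup above can be sketched in plain JavaScript. This is a conceptual model, not platform code: the real L2 match uses semantic vector search, which the `tokenOverlap` stand-in only crudely approximates, and the function names are illustrative.

```javascript
// Crude stand-in for semantic similarity: shared-token ratio.
function tokenOverlap(a, b) {
  var ta = a.toLowerCase().split(/\W+/).filter(Boolean);
  var tb = b.toLowerCase().split(/\W+/).filter(Boolean);
  var shared = ta.filter(function (t) { return tb.indexOf(t) !== -1; });
  return shared.length / Math.max(ta.length, tb.length);
}

// L1: exact key match on (query + KB sys_id); L2: match by meaning;
// otherwise fall through to the LLM.
function lookup(query, kbSysId, l1, l2, threshold) {
  var l1Key = query + '|' + kbSysId;
  if (l1[l1Key]) return { source: 'L1', summary: l1[l1Key] };

  var best = null;
  l2.forEach(function (entry) {
    var score = tokenOverlap(query, entry.query_term);
    if (score >= threshold && (!best || score > best.score)) {
      best = { score: score, entry: entry };
    }
  });
  if (best) return { source: 'L2', summary: best.entry.summary };
  return { source: 'LLM', summary: null }; // both caches missed
}
```

Note how a rephrased query ("reset my password" vs. a cached "how to reset password") misses L1 but can still hit L2, which is exactly the value the semantic layer adds.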

Language Support

  1. Only English is supported for cache hits.
  2. With Dynamic Translation:
    • Cache is bypassed → Query goes to LLM.
    • Impact: Lower cache hit rates → Higher LLM usage.
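
The language rules above amount to a simple gate in front of any cache lookup. A minimal sketch (the function and parameter names are illustrative, not platform APIs):

```javascript
// Returns true only when a cache lookup is even attempted:
// Dynamic Translation bypasses the cache entirely, and only
// English queries are eligible for cache hits.
function shouldUseCache(queryLanguage, dynamicTranslationUsed) {
  if (dynamicTranslationUsed) return false; // translated queries go straight to the LLM
  return queryLanguage === 'en';            // only English queries are cacheable
}
```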

Purge Before Populate:

The Update Semantic Cache job (in the sys_trigger table) must purge stale entries before repopulating. Here’s why:

  • No built-in TTL → Old summaries persist indefinitely.
  • Performance risk → Large L2 table slows semantic search.
  • Accuracy risk → Outdated KB content leads to wrong answers.

Note: While there’s no execute UI action on this job, you can run it on demand by setting the “Next Action” field to the current time.

Scheduled Job: Update Semantic Cache

Located in sys_trigger table. Sample script:

 

gs.info('[Update Semantic Cache] Forced run (purge+populate) at ' + gs.nowDateTime());
var util = new snaisassist.SemanticCacheUpdate();

// 1) Purge all UNPINNED L2 entries not used in the past 7 days
util.purgeUnused(7);

// 2) Populate L2 from recent Search Events (adjust window if needed)
util.process(gs.minutesAgo(1440)); // last 24 hours

 

Update Semantic Cache Job

 

Manual Purge Option: use this if you want to maintain a manual script outside of the OOTB job.

Run it in Background Scripts for emergency cleanup (or as a Fix Script in the Now Assist in AI Search scope):

 

// Fix Script (executes in App scope: Now Assist in AI Search)
(function () {
  gs.info('[Purge Semantic Cache] scope=' + gs.getCurrentScopeName());

  var gr = new GlideRecord('sn_ais_assist_semantic_cache');
  gr.addQuery('sys_updated_on', '<', gs.minutesAgoStart(5));
  gr.addQuery('pinned', '!=', true);
  gr.query();

  var purged = 0;
  while (gr.next()) { gr.deleteRecord(); purged++; }

  gs.info('[Purge Semantic Cache] Purged ' + purged + ' records older than 5 minutes.');
})();

 

KB Reindexing After Purge

Once purge and populate are complete, reindex KB articles to ensure semantic cache aligns with updated content.

Step 1: Reindex KB – Create a scheduled job (sys_trigger table).

// Create a scheduled job (sys_trigger) that runs this one-liner
new sn_ais.IndexEvent().indexTableNoBlock('kb_knowledge');

 

Step 2: Validate Ingestion Status (Optional)

(function () {
  var stats = new GlideRecord('ais_ingest_datasource_stats');
  stats.addQuery('datasource', 'kb_knowledge');
  stats.orderByDesc('sys_created_on');
  stats.setLimit(1);
  stats.query();
  if (stats.next()) {
    gs.info('[KB REINDEX STATUS] state=' + stats.state +
      ', semantic=' + stats.semantic_ingestion_state +
      ', keyword=' + stats.keyword_ingestion_state +
      ', records=' + stats.records_processed);
  } else {
    gs.info('No ingest stats found for datasource=kb_knowledge');
  }
})();
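
If you script around the validation step, the three state fields can be folded into a single status string. A plain-JavaScript sketch, assuming illustrative state values ('complete', 'error', 'in_progress') that may not match the platform's actual constants:

```javascript
// Collapse the ingestion-state fields from the validation script
// into one human-readable status. The state values used here are
// assumptions for illustration, not documented constants.
function summarizeIngestion(stats) {
  if (!stats) return 'no stats found';
  if (stats.semantic === 'complete' && stats.keyword === 'complete') {
    return 'reindex complete (' + stats.records + ' records)';
  }
  if (stats.semantic === 'error' || stats.keyword === 'error') {
    return 'reindex failed';
  }
  return 'reindex in progress';
}
```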

 

Operational Best Practices

  • Pinning Strategy:
    • Pin high-value entries to avoid purge.
    • Review pinned entries monthly (they do not auto-refresh).
  • Monitoring:
    • Use sn_ais_assist_qna_log for cache hit/miss tracking.
    • Use sys_generative_ai_log and sys_generative_ai_metric for LLM metrics.
  • Dashboard: Create a dashboard to track cache hit ratio and LLM fallback frequency using the tables sn_ais_assist_qna_log, sn_ais_assist_semantic_cache, and sys_generative_ai_log. (Note: from ZP4 onwards, sys_generative_ai_log is accessible with admin access; maint access is no longer required.)

 

Edge Cases

  • KB deletion → Cache entries remain until purge.
  • Query variants → L1 requires exact match; punctuation differences cause misses.
  • Multi-node environments → L1 cache is node-local → inconsistent hits.
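
The query-variant edge case is easy to demonstrate: two queries that differ only in casing and punctuation produce different exact-match keys and therefore miss L1. The `normalizeQuery` helper below is purely illustrative (the platform's actual key format is not public):

```javascript
// Hypothetical normalization: lowercase, strip punctuation, collapse whitespace.
function normalizeQuery(q) {
  return q.toLowerCase().replace(/[^\w\s]/g, '').replace(/\s+/g, ' ').trim();
}

var raw1 = 'How do I reset my password?';
var raw2 = 'how do i reset my password';

// Raw strings differ, so an exact-match L1 key differs → cache miss:
var exactHit = (raw1 === raw2);                                      // false
// Normalized keys match, so a normalized L1 key would hit:
var normalizedHit = (normalizeQuery(raw1) === normalizeQuery(raw2)); // true
```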

 

Performance Considerations

  • L1 cache = fast but node-local.
  • L2 cache = global but adds latency (for example, ~100 ms per semantic lookup).
  • Dynamic Translation → expect higher LLM usage.

 


#NowAssist
#AISearch
#KnowledgeManagement
#ServiceNowPlatform
#GenAI
#QnAGeniusResults
#ZurichRelease
#Servicenow