Yogesh Shinde
ServiceNow Employee

Introduction:

Caching for Now Assist Q&A Genius Results is essential for performance and cost optimization in ServiceNow. However, stale cache entries can lead to incorrect answers and bloated tables. This article provides a technical guide, validated in my own instances, covering the following topics:

  • Cache architecture (L1 vs L2)
  • Purge and refresh
  • KB reindexing and automation
  • Operational best practices and edge cases

Cache Architecture Overview

  1. First-Level Cache (L1)
    • Location: In-memory (node-local)
    • Structure: Key-value pairs
      • Key: Search Query + KB sys_id
      • Value: Summary generated by Now LLM
    • Match Type: Exact keyword match required
    • Behavior: Extremely fast but not shared across nodes.
  2. Second-Level Cache (L2)
    • Location: sn_ais_assist_semantic_cache table
    • Indexing: Semantic vector search (matches by meaning)
    • Fields: Query (query_term), KB, Source Identifier, Updates (sys_mod_count), sys_updated_on
    • Behavior:
      • If L1 misses → Check L2
      • If L2 misses → Query goes to LLM
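
The two-tier lookup above can be sketched in plain JavaScript. This is a conceptual model, not platform code: the real L2 match uses semantic vector search, which the `tokenOverlap` stand-in only crudely approximates, and the function names are illustrative.

```javascript
// Crude stand-in for semantic similarity: shared-token ratio.
function tokenOverlap(a, b) {
  var ta = a.toLowerCase().split(/\W+/).filter(Boolean);
  var tb = b.toLowerCase().split(/\W+/).filter(Boolean);
  var shared = ta.filter(function (t) { return tb.indexOf(t) !== -1; });
  return shared.length / Math.max(ta.length, tb.length);
}

// L1: exact key match on (query + KB sys_id); L2: match by meaning;
// otherwise fall through to the LLM.
function lookup(query, kbSysId, l1, l2, threshold) {
  var l1Key = query + '|' + kbSysId;
  if (l1[l1Key]) return { source: 'L1', summary: l1[l1Key] };

  var best = null;
  l2.forEach(function (entry) {
    var score = tokenOverlap(query, entry.query_term);
    if (score >= threshold && (!best || score > best.score)) {
      best = { score: score, entry: entry };
    }
  });
  if (best) return { source: 'L2', summary: best.entry.summary };
  return { source: 'LLM', summary: null }; // both caches missed
}
```

Note how a rephrased query ("reset my password" vs. a cached "how to reset password") misses L1 but can still hit L2, which is exactly the value the semantic layer adds.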

Language Support

  1. Only English is supported for cache hits.
  2. With Dynamic Translation:
    • Cache is bypassed → Query goes to LLM.
    • Impact: Lower cache hit rates → Higher LLM usage.
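
The language rules above amount to a simple gate in front of any cache lookup. A minimal sketch (the function and parameter names are illustrative, not platform APIs):

```javascript
// Returns true only when a cache lookup is even attempted:
// Dynamic Translation bypasses the cache entirely, and only
// English queries are eligible for cache hits.
function shouldUseCache(queryLanguage, dynamicTranslationUsed) {
  if (dynamicTranslationUsed) return false; // translated queries go straight to the LLM
  return queryLanguage === 'en';            // only English queries are cacheable
}
```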

Purge Before Populate:

The Update Semantic Cache job (in the sys_trigger table) must purge stale entries before repopulating. Here’s why:

  • No built-in TTL → Old summaries persist indefinitely.
  • Performance risk → Large L2 table slows semantic search.
  • Accuracy risk → Outdated KB content leads to wrong answers.

Note: While there’s no execute UI action on this job, you can run it on demand by setting the “Next Action” field to the current time.

Scheduled Job: Update Semantic Cache

Located in sys_trigger table. Sample script:

 

gs.info('[Update Semantic Cache] Forced run (purge+populate) at ' + gs.nowDateTime());
var util = new snaisassist.SemanticCacheUpdate();

// 1) Purge all UNPINNED L2 entries not used in the past 7 days
util.purgeUnused(7);

// 2) Populate L2 from recent Search Events (adjust window if needed)
util.process(gs.minutesAgo(1440)); // last 24 hours

 

Update Semantic Cache Job

 

Manual Purge Option: use this if you want to maintain a manual script outside of the OOTB job.

Run it in Background Scripts for emergency cleanup (or as a Fix Script in the Now Assist in AI Search scope):

 

// Fix Script (executes in App scope: Now Assist in AI Search)
(function () {
  gs.info('[Purge Semantic Cache] scope=' + gs.getCurrentScopeName());

  var gr = new GlideRecord('sn_ais_assist_semantic_cache');
  gr.addQuery('sys_updated_on', '<', gs.minutesAgoStart(5));
  gr.addQuery('pinned', '!=', true);
  gr.query();

  var purged = 0;
  while (gr.next()) { gr.deleteRecord(); purged++; }

  gs.info('[Purge Semantic Cache] Purged ' + purged + ' records older than 5 minutes.');
})();

 

KB Reindexing After Purge

Once purge and populate are complete, reindex KB articles to ensure semantic cache aligns with updated content.

Step 1: Reindex KB – Create a scheduled job (sys_trigger table).

// Create a scheduled job (sys_trigger) that runs this one-liner
new sn_ais.IndexEvent().indexTableNoBlock('kb_knowledge');

 

Step 2: Validate Ingestion Status (Optional)

(function () {
  var stats = new GlideRecord('ais_ingest_datasource_stats');
  stats.addQuery('datasource', 'kb_knowledge');
  stats.orderByDesc('sys_created_on');
  stats.setLimit(1);
  stats.query();
  if (stats.next()) {
    gs.info('[KB REINDEX STATUS] state=' + stats.state +
      ', semantic=' + stats.semantic_ingestion_state +
      ', keyword=' + stats.keyword_ingestion_state +
      ', records=' + stats.records_processed);
  } else {
    gs.info('No ingest stats found for datasource=kb_knowledge');
  }
})();
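
If you script around the validation step, the three state fields can be folded into a single status string. A plain-JavaScript sketch, assuming illustrative state values ('complete', 'error', 'in_progress') that may not match the platform's actual constants:

```javascript
// Collapse the ingestion-state fields from the validation script
// into one human-readable status. The state values used here are
// assumptions for illustration, not documented constants.
function summarizeIngestion(stats) {
  if (!stats) return 'no stats found';
  if (stats.semantic === 'complete' && stats.keyword === 'complete') {
    return 'reindex complete (' + stats.records + ' records)';
  }
  if (stats.semantic === 'error' || stats.keyword === 'error') {
    return 'reindex failed';
  }
  return 'reindex in progress';
}
```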

 

Operational Best Practices

  • Pinning Strategy:
    • Pin high-value entries to avoid purge.
    • Review pinned entries monthly (they do not auto-refresh).
  • Monitoring:
    • Use sn_ais_assist_qna_log for cache hit/miss tracking.
    • Use sys_generative_ai_log and sys_generative_ai_metric for LLM metrics.
  • Dashboard: Create a dashboard to track cache hit ratio and LLM fallback frequency using the tables sn_ais_assist_qna_log, sn_ais_assist_semantic_cache, and sys_generative_ai_log. (Note: from ZP4 onwards, sys_generative_ai_log is accessible with admin access; maint access is no longer required.)

 

Edge Cases

  • KB deletion → Cache entries remain until purge.
  • Query variants → L1 requires exact match; punctuation differences cause misses.
  • Multi-node environments → L1 cache is node-local → inconsistent hits.
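
The query-variant edge case is easy to demonstrate: two queries that differ only in casing and punctuation produce different exact-match keys and therefore miss L1. The `normalizeQuery` helper below is purely illustrative (the platform's actual key format is not public):

```javascript
// Hypothetical normalization: lowercase, strip punctuation, collapse whitespace.
function normalizeQuery(q) {
  return q.toLowerCase().replace(/[^\w\s]/g, '').replace(/\s+/g, ' ').trim();
}

var raw1 = 'How do I reset my password?';
var raw2 = 'how do i reset my password';

// Raw strings differ, so an exact-match L1 key differs → cache miss:
var exactHit = (raw1 === raw2);                                      // false
// Normalized keys match, so a normalized L1 key would hit:
var normalizedHit = (normalizeQuery(raw1) === normalizeQuery(raw2)); // true
```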

 

Performance Considerations

  • L1 cache = fast but node-local.
  • L2 cache = global but adds latency (for example, ~100 ms per semantic lookup).
  • Dynamic Translation → expect higher LLM usage.

 


#NowAssist
#AISearch
#KnowledgeManagement
#ServiceNowPlatform
#GenAI
#QnAGeniusResults
#ZurichRelease
#Servicenow