Yogesh Shinde
ServiceNow Employee

Introduction:

Caching for Now Assist Q&A Genius Results is essential for performance and cost optimization in ServiceNow. However, stale cache entries can lead to incorrect answers and bloated tables. This article provides a technical guide, validated in my own instances, covering:

  • Cache architecture (L1 vs L2)
  • Purge and refresh
  • KB reindexing and automation
  • Operational best practices and edge cases

Cache Architecture Overview

  1. First-Level Cache (L1)
    • Location: In-memory (node-local)
    • Structure: Key-value pairs
      • Key: Search Query + KB sys_id
      • Value: Summary generated by Now LLM
    • Match Type: Exact keyword match required
    • Behavior: Extremely fast but not shared across nodes.
  2. Second-Level Cache (L2)
    • Location: sn_ais_assist_semantic_cache table
    • Indexing: Semantic vector search (matches by meaning)
    • Fields: Query (query_term), KB, Source Identifier, Updates (sys_mod_count), sys_updated_on
    • Behavior (see the sketch after this list):
      • If L1 misses → Check L2
      • If L2 misses → Query goes to LLM
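
To make the cascade concrete, here is a minimal pseudocode sketch of the lookup flow. This is illustrative only: the function and cache object names (getCachedSummary, l1Cache, semanticLookup, callNowLLM) are my own, not actual platform internals.

// Illustrative pseudocode only - not actual platform code
function getCachedSummary(query, kbSysId) {
  var l1Key = query + '|' + kbSysId;          // L1 key: exact query + KB sys_id
  var summary = l1Cache.get(l1Key);           // node-local, exact keyword match
  if (summary) return summary;                // L1 hit

  summary = semanticLookup(query, kbSysId);   // L2: vector search over sn_ais_assist_semantic_cache
  if (summary) {
    l1Cache.put(l1Key, summary);              // warm L1 on this node
    return summary;                           // L2 hit
  }

  return callNowLLM(query, kbSysId);          // both caches missed → LLM generates the answer
}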

Language Support

  1. Only English is supported for cache hits.
  2. With Dynamic Translation:
    • Cache is bypassed → Query goes to LLM.
    • Impact: Lower cache hit rates → Higher LLM usage.

Purge Before Populate:

The Update Semantic Cache job (in the sys_trigger table) must purge stale entries before repopulating. Here’s why:

  • No built-in TTL → Old summaries persist indefinitely.
  • Performance risk → Large L2 table slows semantic search.
  • Accuracy risk → Outdated KB content leads to wrong answers.

Note: While there’s no execute UI action on this job, you can run it on demand by setting the “Next Action” field to the current time (or earlier).
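
For example, the following background script pulls the job’s next run time forward. This is a minimal sketch, assuming the job name matches the OOTB label in your instance; verify it in sys_trigger first.

// Background script: queue the "Update Semantic Cache" job for the
// scheduler's next sweep. The job name is an assumption - verify it.
var job = new GlideRecord('sys_trigger');
job.addQuery('name', 'Update Semantic Cache');
job.query();
if (job.next()) {
  job.setValue('next_action', new GlideDateTime()); // set Next Action to now
  job.update();
  gs.info('[Semantic Cache] Job queued; next_action=' + job.getValue('next_action'));
} else {
  gs.info('[Semantic Cache] Job not found - check the name in sys_trigger.');
}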

Scheduled Job: Update Semantic Cache

Located in the sys_trigger table. Sample script:

gs.info('[Update Semantic Cache] Forced run (purge+populate) at ' + gs.nowDateTime());
var util = new snaisassist.SemanticCacheUpdate();

// 1) Purge all UNPINNED L2 entries not used in the past 7 days
util.purgeUnused(7);

// 2) Populate L2 from recent Search Events (adjust window if needed)
util.process(gs.minutesAgo(1440)); // last 24 hours

[Image: Update Semantic Cache Job]

Manual Purge Option - use this if you want to maintain a manual script outside of the OOTB job.

Run it in Background Scripts for emergency cleanup (or as a Fix Script in the Now Assist in AI Search scope):


// Fix Script (executes in App scope: Now Assist in AI Search)
(function () {
  gs.info('[Purge Semantic Cache] scope=' + gs.getCurrentScopeName());

  var gr = new GlideRecord('sn_ais_assist_semantic_cache');
  gr.addQuery('sys_updated_on', '<', gs.minutesAgoStart(5)); // entries not updated in the last 5 minutes
  gr.addQuery('pinned', '!=', true);                         // never purge pinned entries
  gr.query();

  var purged = 0;
  while (gr.next()) { gr.deleteRecord(); purged++; }

  gs.info('[Purge Semantic Cache] Purged ' + purged + ' records older than 5 minutes.');
})();


KB Reindexing After Purge

Once purge and populate are complete, reindex the KB articles to ensure the semantic cache aligns with updated content.

Step 1: Reindex KB – Create a scheduled job (sys_trigger table).

// Create a scheduled job (sys_trigger)
new sn_ais.IndexEvent().indexTableNoBlock('kb_knowledge');


Step 2: Validate Ingestion Status (Optional)

(function () {
  var stats = new GlideRecord('ais_ingest_datasource_stats');
  stats.addQuery('datasource', 'kb_knowledge');
  stats.orderByDesc('sys_created_on');
  stats.setLimit(1);
  stats.query();
  if (stats.next()) {
    gs.info('[KB REINDEX STATUS] state=' + stats.state +
            ', semantic=' + stats.semantic_ingestion_state +
            ', keyword=' + stats.keyword_ingestion_state +
            ', records=' + stats.records_processed);
  } else {
    gs.info('No ingest stats found for datasource=kb_knowledge');
  }
})();


Operational Best Practices

  • Pinning Strategy:
    • Pin high-value entries to avoid purge.
    • Review pinned entries monthly (they do not auto-refresh).
  • Monitoring:
    • Use sn_ais_assist_qna_log for cache hit/miss tracking.
    • Use sys_generative_ai_log and sys_generative_ai_metric for LLM metrics.
  • Dashboard: Create a dashboard to track cache hit ratio and LLM fallback frequency using sn_ais_assist_qna_log, sn_ais_assist_semantic_cache, and sys_generative_ai_log (note: from ZP4 onwards, sys_generative_ai_log is accessible with admin access; maint access is no longer needed). A starting-point script is sketched below.
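
As a minimal sketch for that dashboard, the GlideAggregate script below approximates the hit ratio. Note that the cache_hit field name is hypothetical; verify the actual column on sn_ais_assist_qna_log in your instance before using it.

// Minimal sketch: approximate the cache hit ratio from the Q&A log.
// ASSUMPTION: 'cache_hit' is a hypothetical flag field - verify the
// real column name on sn_ais_assist_qna_log first.
var total = new GlideAggregate('sn_ais_assist_qna_log');
total.addAggregate('COUNT');
total.query();
var all = total.next() ? parseInt(total.getAggregate('COUNT'), 10) : 0;

var hits = new GlideAggregate('sn_ais_assist_qna_log');
hits.addQuery('cache_hit', true); // hypothetical field name
hits.addAggregate('COUNT');
hits.query();
var hit = hits.next() ? parseInt(hits.getAggregate('COUNT'), 10) : 0;

gs.info('[QnA Cache] hit ratio = ' + (all ? (100 * hit / all).toFixed(1) + '%' : 'n/a'));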


Edge Cases

  • KB deletion → Cache entries remain until purge (see the orphan-check sketch after this list).
  • Query variants → L1 requires exact match; punctuation differences cause misses.
  • Multi-node environments → L1 cache is node-local → inconsistent hits.
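
As a sketch of how to detect the first case, the script below walks the L2 table and flags entries whose source KB article no longer exists. It assumes the KB reference column is named kb; confirm the actual column name in your instance.

// Sketch: flag L2 cache entries whose source KB article was deleted.
// ASSUMPTION: the KB reference field is 'kb' - confirm on your instance.
var cache = new GlideRecord('sn_ais_assist_semantic_cache');
cache.query();
var orphans = 0;
while (cache.next()) {
  var kb = new GlideRecord('kb_knowledge');
  if (!kb.get(cache.getValue('kb'))) {  // no matching KB record
    gs.info('[Orphan] cache entry ' + cache.getUniqueValue());
    orphans++;
    // cache.deleteRecord(); // uncomment to purge orphaned entries
  }
}
gs.info('[Orphan] total orphaned entries found: ' + orphans);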


Performance Considerations

  • L1 cache = fast but node-local.
  • L2 cache = global but adds latency (for example, ~100 ms per lookup).
  • Dynamic Translation → expect higher LLM usage.



#NowAssist
#AISearch
#KnowledgeManagement
#ServiceNowPlatform
#GenAI
#QnAGeniusResults
#ZurichRelease
#Servicenow
Comments
ABHAY_M
Tera Guru

Hi @Yogesh Shinde ,
The Now Assist Q&A Genius results appear to be outdated, and I’ve heard that there will be no further updates to this. Could you please confirm this? Also, does caching apply to Now Assist Multi-Content Response Genius results?
