Caching for Now Assist Q&A Genius Results
- Updated Oct 3, 2025
- Zurich
- AI Search
AI Search provides two query-time caches to improve search performance for Now Assist Q&A Genius Results. Caching enables AI Search to return previously generated Now Assist Q&A Genius Result answers without submitting Knowledge articles to the Now LLM Service for answer generation.
Now Assist Q&A Genius Results caching overview
The query-time caches support only English-language searches, Knowledge articles, and answers.
Cache levels
- First-level cache
The first-level cache comprises a list of key-value pairs stored in memory for fast access. Each cache entry has a key that includes a search query and the sys_id of a Knowledge article returned by that query. The cache entry's value includes the summary generated by the Now LLM Service for the specified search query and Knowledge article.
When checking the first-level cache, AI Search compares your search query and the sys_id of your Knowledge article search result to the cache entry keys. If it finds a matching key, it returns the article summary from the corresponding cache entry value. Otherwise, it goes on to check the second-level cache.
Note: The first-level cache only yields a result when your search query is an exact keyword match for the cached search query. For example, if you search for avoiding scams, you won't get a result for a cached entry with the search query how to prevent scams because the two search queries don't contain the same terms.
- Second-level cache
The second-level cache comprises a table that is configured as an AI Search indexed source. Each record on this table is a cache entry, and includes a search query, the sys_id for an associated Knowledge article search result, the summary generated for that query and Knowledge article, and other fields such as pinned, sys_updated_on, and run_as. AI Search updates the index for this table whenever its records are created, updated, or deleted. This index update operation can take up to a minute.
When checking the second-level cache, AI Search queries the indexed table, looking for an entry that matches your search query and the sys_id of your Knowledge article search result. If it finds a matching entry, it returns the article summary stored in the indexed table. Otherwise, it goes on to submit your search query and Knowledge article search result to the Now LLM Service.
Note: Unlike the first-level cache, the second-level cache compares search query meanings using semantic vector search, so you may get a cache result even if your search query isn't an exact keyword match for the cached search query. For example, if you search for avoiding scams, you might get the result for a cached entry with the search query how to prevent scams because the meanings of the two search queries are similar. For more information on semantic vector search, see Semantic vector search in AI Search.
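The two lookup levels described above can be sketched as follows. This is a minimal illustration, not the actual AI Search implementation: the bag-of-words embed function, the cosine-similarity threshold, and all class and function names are simplifying stand-ins for the real semantic vector model and cache storage.

```python
def embed(text):
    # Stand-in for a real sentence-embedding model: a bag-of-words vector.
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

class TwoLevelCache:
    def __init__(self, similarity_threshold=0.75):
        self.first_level = {}    # (query, article_sys_id) -> summary; exact match only
        self.second_level = []   # (query, embedding, article_sys_id, summary) entries
        self.threshold = similarity_threshold

    def lookup(self, query, article_sys_id):
        # First level: exact keyword match on the (query, sys_id) key.
        hit = self.first_level.get((query, article_sys_id))
        if hit is not None:
            return hit
        # Second level: semantic match on query meaning for the same article.
        qvec = embed(query)
        for cached_query, cvec, sys_id, summary in self.second_level:
            if sys_id == article_sys_id and cosine(qvec, cvec) >= self.threshold:
                # A second-level hit also populates the first-level cache.
                self.first_level[(query, article_sys_id)] = summary
                return summary
        return None  # Miss: the caller decides whether to call the LLM service.

    def store(self, query, article_sys_id, summary):
        self.first_level[(query, article_sys_id)] = summary
        self.second_level.append((query, embed(query), article_sys_id, summary))
```

With a suitably low threshold, this toy model reproduces the doc's example: avoiding scams can hit a cached entry for how to prevent scams at the second level, but never at the exact-match first level.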
Benefits of caching
- Decreases average response time for common Now Assist Q&A Genius Result answers
- Lowers Now Assist entitlement consumption by reducing the number of search query results sent to the Now LLM Service for Now Assist Q&A answer extraction
- Increases the likelihood of returning a Now Assist Q&A Genius Result answer
- Improves search consistency by returning the same Now Assist Q&A Genius Result answer for similar searches
Content Security for cached queries
Because AI Search applies Content Security restrictions to your search before it matches Knowledge articles and checks the caches, neither cache returns hits for Knowledge articles that you don't have access to. For full details on AI Search's Content Security model, see Content security in AI Search.
Cache modes
- off: Use the first-level cache and the Now LLM Service to find Now Assist Q&A Genius Result answers.
AI Search looks in the first-level cache for Now Assist Q&A Genius Result answers that exactly match your search query and Knowledge article result. If it doesn't find a matching answer, it sends your query and Knowledge article sys_id to the Now LLM Service for answer generation.
Now Assist Q&A Genius Result answers generated by the Now LLM Service populate the first-level cache.
When using Dynamic Translation, AI Search bypasses the caches and queries the Now LLM Service to generate an answer for the Now Assist Q&A Genius Result.
- offline: Use the first-level and second-level caches to find Now Assist Q&A Genius Result answers. Don't submit queries to the Now LLM Service.
AI Search looks in the first-level cache for Now Assist Q&A Genius Result answers that exactly match your search query and Knowledge article result. If it doesn't find a matching answer, it uses semantic vector search to look for answers that match the meaning of your query in the second-level cache. If no cached answers match your query and Knowledge article result, AI Search returns no answer for the Now Assist Q&A Genius Result.
Now Assist Q&A Genius Result answers found in the second-level cache populate the first-level cache.
When using Dynamic Translation, AI Search bypasses the caches and returns no answer for the Now Assist Q&A Genius Result.
- online: Use the first-level and second-level caches and the Now LLM Service to find Now Assist Q&A Genius Result answers.
AI Search looks in the first-level cache for Now Assist Q&A Genius Result answers that exactly match your search query and Knowledge article result. If it doesn't find a matching answer, it uses semantic vector search to look for answers that match the meaning of your query in the second-level cache. If no cached answers match your query and Knowledge article result, AI Search submits the query and article sys_id to the Now LLM Service for answer generation.
Now Assist Q&A Genius Result answers generated by the Now LLM Service populate the first-level and second-level caches.
When using Dynamic Translation, AI Search bypasses the caches and queries the Now LLM Service to generate an answer for the Now Assist Q&A Genius Result.
The default operational mode is off.
Administrators can change the operational mode for the Now Assist Q&A Genius Result answer caches by setting the value for the sn_ais_assist.semantic_cache_mode system property to off, offline, or online. For details on system property settings, see Add a system property.
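Taken together, the three modes differ only in which lookup stages run and which caches are populated afterward. The dispatch can be sketched as below; the helper-function parameters are hypothetical stand-ins injected by the caller, not AI Search internals.

```python
def answer(query, article_sys_id, mode,
           first_level_lookup, second_level_lookup, llm_generate,
           first_level_store, second_level_store):
    # All three modes check the in-memory first-level cache first.
    summary = first_level_lookup(query, article_sys_id)
    if summary is not None:
        return summary

    if mode in ("offline", "online"):
        # Semantic vector search over the indexed second-level cache table.
        summary = second_level_lookup(query, article_sys_id)
        if summary is not None:
            first_level_store(query, article_sys_id, summary)  # promote
            return summary

    if mode in ("off", "online"):
        # Fall through to the Now LLM Service for answer generation.
        summary = llm_generate(query, article_sys_id)
        if summary is not None:
            first_level_store(query, article_sys_id, summary)
            if mode == "online":
                second_level_store(query, article_sys_id, summary)
        return summary

    return None  # offline mode with no cached answer returns no answer
```

Note how only online mode writes back to the second-level cache, and only offline mode can return no answer without ever contacting the LLM.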
For more details on using Dynamic Translation with Now Assist Genius Results, see Dynamic Translation for Now Assist Q&A Genius Results.
Scheduled job for cache management
The Update Semantic Cache scheduled job performs the following cache management tasks:
- Populate the second-level cache with results for the most frequently submitted queries found in the Search Event [sys_search_event] search signal table. For more information on this table, see Search signal tables.
- Purge all unpinned second-level cache entries that have not been used in the past seven days. Search administrators can pin results in the second-level cache table to prevent them from being purged. For more details on this procedure, see Pin cached answers for Now Assist Q&A Genius Results.
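The purge pass can be sketched as a simple retention filter. The record shape below is a simplification for illustration, not the actual table schema, though the pinned flag and seven-day window mirror the behavior described above.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)

def purge_unpinned(entries, now=None):
    """Keep entries that are pinned or were used within the retention window."""
    now = now or datetime.utcnow()
    return [e for e in entries
            if e["pinned"] or now - e["last_used"] <= RETENTION]
```

A pinned entry survives no matter how stale it is; an unpinned entry survives only if it was used within the last seven days.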
Pin cached answers for Now Assist Q&A Genius Results
Improve performance for Now Assist Q&A Genius Results by pinning frequently used answers in the second-level cache. Pinning an answer exempts it from the cache's purge mechanism.
Before you begin
The Now Assist in AI Search ServiceNow® Store application must be installed on your instance. For details on installing this application, see Install Now Assist in AI Search.
Role required: ais_admin
About this task
Search administrators can pin entries in the second-level cache for Now Assist Q&A Genius Result answers. The Update Semantic Cache scheduled job ignores pinned entries when purging the second-level cache.
Pinning frequently used entries helps improve search performance by enabling AI Search to return previously generated Now Assist Q&A Genius Result answers without submitting Knowledge articles to the Now LLM Service for answer generation.
To learn more about the second-level Now Assist Q&A Genius Result answer cache and its usage, see Caching for Now Assist Q&A Genius Results.
Procedure
Result
The Update Semantic Cache scheduled job ignores your pinned entries when purging the second-level cache.