Caching for Now Assist Q&A Genius Results
- Updated Oct 3, 2025
- Zurich
- AI Search
AI Search provides two query-time caches to improve search performance for Now Assist Q&A Genius Results. Caching enables AI Search to return previously generated Now Assist Q&A Genius Result answers without submitting Knowledge articles to the Now LLM Service for answer generation.
Now Assist Q&A Genius Results caching overview
The query-time caches support only English-language searches, Knowledge articles, and answers.
Cache levels
- First-level cache
The first-level cache comprises a list of key-value pairs stored in memory for fast access. Each cache entry has a key that includes a search query and the sys_id of a Knowledge article returned by that query. The cache entry's value includes the summary generated by the Now LLM Service for the specified search query and Knowledge article.
When checking the first-level cache, AI Search compares your search query and the sys_id of your Knowledge article search result to the cache entry keys. If it finds a matching key, it returns the article summary from the corresponding cache entry value. Otherwise, it goes on to check the second-level cache.
Note: The first-level cache only yields a result when your search query is an exact keyword match for the cached search query. For example, if you search for avoiding scams, you won't get a result for a cached entry with the search query how to prevent scams because the two search queries don't contain the same terms.
- Second-level cache
The second-level cache comprises a table that is configured as an AI Search indexed source. Each record on this table is a cache entry, and includes a search query, the sys_id for an associated Knowledge article search result, the summary generated for that query and Knowledge article, and other fields such as pinned, sys_updated_on, and run_as. AI Search updates the index for this table whenever its records are created, updated, or deleted. This index update operation can take up to a minute.
When checking the second-level cache, AI Search queries the indexed table, looking for an entry that matches your search query and the sys_id of your Knowledge article search result. If it finds a matching entry, it returns the article summary stored in the indexed table. Otherwise, it goes on to submit your search query and Knowledge article search result to the Now LLM Service.
Note: Unlike the first-level cache, the second-level cache compares search query meanings using semantic vector search, so you may get a cache result even if your search query isn't an exact keyword match for the cached search query. For example, if you search for avoiding scams, you might get the result for a cached entry with the search query how to prevent scams because the meanings of the two search queries are similar. For more information on semantic vector search, see Semantic vector search in AI Search.
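The two lookup levels described above can be sketched as follows. This is a minimal illustration, not the actual AI Search implementation: the bag-of-words embed function, the cosine-similarity threshold, and all class and function names are simplifying stand-ins for the real semantic vector model and cache storage.

```python
def embed(text):
    # Stand-in for a real sentence-embedding model: a bag-of-words vector.
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

class TwoLevelCache:
    def __init__(self, similarity_threshold=0.75):
        self.first_level = {}    # (query, article_sys_id) -> summary; exact match only
        self.second_level = []   # (query, embedding, article_sys_id, summary) entries
        self.threshold = similarity_threshold

    def lookup(self, query, article_sys_id):
        # First level: exact keyword match on the (query, sys_id) key.
        hit = self.first_level.get((query, article_sys_id))
        if hit is not None:
            return hit
        # Second level: semantic match on query meaning for the same article.
        qvec = embed(query)
        for cached_query, cvec, sys_id, summary in self.second_level:
            if sys_id == article_sys_id and cosine(qvec, cvec) >= self.threshold:
                # A second-level hit also populates the first-level cache.
                self.first_level[(query, article_sys_id)] = summary
                return summary
        return None  # Miss: the caller decides whether to call the LLM service.

    def store(self, query, article_sys_id, summary):
        self.first_level[(query, article_sys_id)] = summary
        self.second_level.append((query, embed(query), article_sys_id, summary))
```

With a suitably low threshold, this toy model reproduces the doc's example: avoiding scams can hit a cached entry for how to prevent scams at the second level, but never at the exact-match first level.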
Benefits of caching
- Decreases average response time for common Now Assist Q&A Genius Result answers
- Lowers Now Assist entitlement consumption by reducing the number of search query results sent to the Now LLM Service for Now Assist Q&A answer extraction
- Increases the likelihood of returning a Now Assist Q&A Genius Result answer
- Improves search consistency by returning the same Now Assist Q&A Genius Result answer for similar searches
Content Security for cached queries
Because AI Search applies Content Security restrictions to your search before it matches Knowledge articles and checks the caches, neither cache returns hits for Knowledge articles that you don't have access to. For full details on AI Search's Content Security model, see Content security in AI Search.
Cache modes
- off: Use the first-level cache and the Now LLM Service to find Now Assist Q&A Genius Result answers.
AI Search looks in the first-level cache for Now Assist Q&A Genius Result answers that exactly match your search query and Knowledge article result. If it doesn't find a matching answer, it sends your query and Knowledge article sys_id to the Now LLM Service for answer generation.
Now Assist Q&A Genius Result answers generated by the Now LLM Service populate the first-level cache.
When using Dynamic Translation, AI Search bypasses the caches and queries the Now LLM Service to generate an answer for the Now Assist Q&A Genius Result.
- offline: Use the first-level and second-level caches to find Now Assist Q&A Genius Result answers. Don't submit queries to the Now LLM Service.
AI Search looks in the first-level cache for Now Assist Q&A Genius Result answers that exactly match your search query and Knowledge article result. If it doesn't find a matching answer, it uses semantic vector search to look for answers that match the meaning of your query in the second-level cache. If no cached answers match your query and Knowledge article result, AI Search returns no answer for the Now Assist Q&A Genius Result.
Now Assist Q&A Genius Result answers found in the second-level cache populate the first-level cache.
When using Dynamic Translation, AI Search bypasses the caches and returns no answer for the Now Assist Q&A Genius Result.
- online: Use the first-level and second-level caches and the Now LLM Service to find Now Assist Q&A Genius Result answers.
AI Search looks in the first-level cache for Now Assist Q&A Genius Result answers that exactly match your search query and Knowledge article result. If it doesn't find a matching answer, it uses semantic vector search to look for answers that match the meaning of your query in the second-level cache. If no cached answers match your query and Knowledge article result, AI Search submits the query and article sys_id to the Now LLM Service for answer generation.
Now Assist Q&A Genius Result answers generated by the Now LLM Service populate the first-level and second-level caches.
When using Dynamic Translation, AI Search bypasses the caches and queries the Now LLM Service to generate an answer for the Now Assist Q&A Genius Result.
The default operational mode is off.
Administrators can change the operational mode for the Now Assist Q&A Genius Result answer caches by setting the value for the sn_ais_assist.semantic_cache_mode system property to off, offline, or online. For details on system property settings, see Add a system property.
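Taken together, the three modes differ only in which lookup stages run and which caches are populated afterward. The dispatch can be sketched as below; the helper-function parameters are hypothetical stand-ins injected by the caller, not AI Search internals.

```python
def answer(query, article_sys_id, mode,
           first_level_lookup, second_level_lookup, llm_generate,
           first_level_store, second_level_store):
    # All three modes check the in-memory first-level cache first.
    summary = first_level_lookup(query, article_sys_id)
    if summary is not None:
        return summary

    if mode in ("offline", "online"):
        # Semantic vector search over the indexed second-level cache table.
        summary = second_level_lookup(query, article_sys_id)
        if summary is not None:
            first_level_store(query, article_sys_id, summary)  # promote
            return summary

    if mode in ("off", "online"):
        # Fall through to the Now LLM Service for answer generation.
        summary = llm_generate(query, article_sys_id)
        if summary is not None:
            first_level_store(query, article_sys_id, summary)
            if mode == "online":
                second_level_store(query, article_sys_id, summary)
        return summary

    return None  # offline mode with no cached answer returns no answer
```

Note how only online mode writes back to the second-level cache, and only offline mode can return no answer without ever contacting the LLM.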
For more details on using Dynamic Translation with Now Assist Genius Results, see Dynamic Translation for Now Assist Q&A Genius Results.
Scheduled job for cache management
The Update Semantic Cache scheduled job performs the following cache management tasks:
- Populate the second-level cache with results for the most frequently submitted queries found in the Search Event [sys_search_event] search signal table. For more information on this table, see Search signal tables.
- Purge all unpinned second-level cache entries that have not been used in the past seven days. Search administrators can pin results in the second-level cache table to prevent them from being purged. For more details on this procedure, see Pin cached answers for Now Assist Q&A Genius Results.
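The purge pass can be sketched as a simple retention filter. The record shape below is a simplification for illustration, not the actual table schema, though the pinned flag and seven-day window mirror the behavior described above.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)

def purge_unpinned(entries, now=None):
    """Keep entries that are pinned or were used within the retention window."""
    now = now or datetime.utcnow()
    return [e for e in entries
            if e["pinned"] or now - e["last_used"] <= RETENTION]
```

A pinned entry survives no matter how stale it is; an unpinned entry survives only if it was used within the last seven days.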
Pin cached answers for Now Assist Q&A Genius Results
Improve performance for Now Assist Q&A Genius Results by pinning frequently used answers in the second-level cache. Pinning an answer exempts it from the cache's purge mechanism.
Before you begin
The Now Assist in AI Search ServiceNow® Store application must be installed on your instance. For details on installing this application, see Install Now Assist in AI Search.
Role required: ais_admin
About this task
Search administrators can pin entries in the second-level cache for Now Assist Q&A Genius Result answers. The Update Semantic Cache scheduled job ignores pinned entries when purging the second-level cache.
Pinning frequently used entries helps improve search performance by enabling AI Search to return previously generated Now Assist Q&A Genius Result answers without submitting Knowledge articles to the Now LLM Service for answer generation.
To learn more about the second-level Now Assist Q&A Genius Result answer cache and its usage, see Caching for Now Assist Q&A Genius Results.
Procedure
Result
The Update Semantic Cache scheduled job ignores your pinned entries when purging the second-level cache.