Segments in the Query Generation semantic layer
Summary of Segments in the Query Generation semantic layer
Segments are predefined filter conditions in the Query Generation semantic layer that map business terminology to specific query filters. They help translate natural language queries into accurate database queries by providing context that assists in selecting the correct entities, dimensions, and values. For example, segments clarify terms like "open" or "emergency" by linking them to precise filters and classifications.
There are two segment types:
- Automated Segments: System-generated from reports, dashboards, filters, modules, and indicator sources with typically technical names (e.g., "Incidents.Open").
- Manual Segments: Created by administrators with user-friendly names (e.g., "Critical Open Incidents") and managed through a configuration table, allowing better control and customization.
How Segments Work
The system uses AI search to find segments semantically similar to the user's query by indexing their names, descriptions, entities, and filters. The Query Generation engine then passes these segments to the large language model (LLM), which uses them as building blocks to generate precise executable queries.
For example, a query like "Unassigned Incidents located in San Diego" leverages the "Unassigned Incidents" segment filter as a base, adding location criteria on top.
Manual Segments Specifics
- Entity Discovery: Manual segments help identify or boost relevant entities when no prior context exists, improving query understanding.
- Filter Provision: Manual segments are included in the LLM prompt with their detailed filters, guiding the model to fully or partially reuse these filters for query construction.
- Scoring Boost: Manual segments receive a configurable boost to their semantic similarity score (5% by default), so domain-specific, admin-defined segments rank higher than automated ones when both are relevant.
Automated Segments and Sources
Automated segments are generated on a schedule from sources such as saved reports, analytics data sources, Performance Analytics indicators, saved filters, and application modules. These segments are refreshed by a system job that runs initially at installation and weekly thereafter.
Only active and relevant records are used to create automated segments, based on criteria like recent usage and roles of report creators. There are restrictions on domain-separated instances and configurable recency thresholds for source data validity.
Configuration and Management
- Segments can be disabled globally or by individual source type via system properties, useful for troubleshooting or reducing noise.
- Manual segments can be created and managed by admins and packaged within applications via plugins, enabling domain-specific saved searches that are available immediately after installation.
- The segment scoring process filters out low-relevance segments and ranks results based on boosted similarity scores to optimize query accuracy.
Practical Benefits for ServiceNow Customers
- Segments enable the semantic layer to better interpret natural language queries by associating business terms with precise database filters.
- Manual segments give admins direct control to improve search relevance and query accuracy with domain-specific, user-friendly filters.
- Automated segments ensure ongoing, up-to-date coverage of common filters derived from existing reports and analytics sources, reducing manual effort.
- Configurable scoring and disabling options allow fine-tuning of segment behavior to optimize query generation performance and relevance.
Segments are predefined filter conditions that map business terminology to specific query filters, helping the semantic layer translate natural language questions into accurate database queries.
Segments provide non-obvious context to assist the semantic layer in selecting the correct entity, dimension, and values. For example, in the utterance "How many open emergency change requests are there?", a segment identifies that "open" means "active=true" and "emergency" is a Type, not a Priority.
When a user asks a question, the Query Generation engine searches for matching segments and includes their filters in the LLM prompt so the model can reuse them to construct accurate queries.
There are two types of segments:
- Automated Segments
- System-generated from reports, dashboards, filters, modules, and indicator sources. Names are often technical—for example, "Incidents.Open".
- Manual Segments
- Created by admins via the Manual Segment Config table. Names are user-friendly, for example, "Critical Open Incidents". Manual segments use a two-table data model with automatic synchronization and can be shipped with applications via plugins.
| | Manual Segments | Automated Segments |
|---|---|---|
| Created by | Admin, shipped via update set | System-generated from reports, dashboards, filters, modules |
| Name quality | User-friendly, tuned for search | Often technical, for example, "Incidents.Open" |
| Search priority | 5% boost over automated (adjustable in sn_query_gen.segments.manual_segment_scale_factor) | Standard scoring |
| LLM treatment | Retain all filters unless irrelevant | Critique each filter individually |
| Prompt label | user_defined_segment | automated_segment |
| Lifecycle | Fully controlled by admin | Tied to source record activity/usage |
| Shipped with app | Yes (update set) | No (generated at runtime) |
How segments work
The system uses AI search to find segments that are semantically similar to the user's query. AI search indexes the Name, Description, Entity, and Filter fields in the Segments table, comparing them to the user's query to produce a subset of relevant segments.
In the LLM call, the system passes the Name, Description, Entity, and Filters. The LLM uses the segments as building blocks for generating a new query. For example, if a user asks "Unassigned Incidents located in San Diego" and the segment "Unassigned Incidents" is passed to the LLM, the LLM uses the segment's filter as the starting point and attaches the location filter "San Diego" on top of the segment.
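The "building block" idea can be illustrated with a minimal Python sketch. It is not the product's implementation: the segment record, the `extend_segment` helper, the field names `incident.assigned_to` and `incident.location`, and the `is empty` operator are all assumptions chosen for illustration. It only shows a segment's filter being reused as the base of a query, with the location condition attached on top.

```python
from copy import deepcopy

# Hypothetical segment record; keys mirror the Name, Description, Entity,
# and Filters that the text says are passed to the LLM.
unassigned_incidents = {
    "name": "Unassigned Incidents",
    "description": "Incidents that have no assignee",
    "entity": "incident",
    "filter": {
        "conditions": [
            # Field name and operator are illustrative assumptions.
            {"field": "incident.assigned_to", "operator": "is empty", "value": ""},
        ]
    },
}


def extend_segment(segment, extra_conditions):
    """Build a query that starts from the segment's filter and adds conditions."""
    query = {
        "entity": segment["entity"],
        # Copy so the stored segment is never mutated.
        "conditions": deepcopy(segment["filter"]["conditions"]),
    }
    query["conditions"].extend(extra_conditions)
    return query


# "Unassigned Incidents located in San Diego"
query = extend_segment(
    unassigned_incidents,
    [{"field": "incident.location", "operator": "=", "value": "San Diego"}],
)
print(query)
```

In the real flow the LLM, not a helper function, decides how much of the segment filter to reuse; the sketch only makes the resulting query shape concrete.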
| Step | Purpose | Output |
|---|---|---|
| 1: Input | Capture user's natural language query | Raw query text |
| 2: Search | Find semantically similar prebuilt segments | Subset of relevant segments |
| 3: Scoring | Rank the set of relevant segments based on their semantic similarity scores | The subset of relevant segments, now ranked and sorted |
| 4: Context | Provide segment metadata to LLM | Structured segment data |
| 5: Generate | Combine segment logic with new conditions | Complete executable query |
How manual segments work
Manual segments serve two roles at query time:
- Entity discovery
- On first-time queries with no previous context, a match against a manual segment name helps identify the intended entity by adding or boosting it in the candidate entity list. For example, if a user asks "Show me critical open incidents" and a manual segment named "Critical Open Incidents" exists on the Incident [incident] table, the incident entity gets boosted in the results.
- Filter provision
- Matching segments are formatted into the LLM prompt context. The LLM sees:
    **Related Segments**:
    - **Critical Open Incidents** (user_defined_segment)
      - description : High priority incidents that are open and unresolved
      - entity : incident
      - filter : { conditions : [{"field":"incident.priority","operator":"=","value":"1"}, ...] }
The LLM then decides whether to reuse the segment's filters fully or partially when constructing the query. Manual segments are labeled user_defined_segment in the prompt, which tells the LLM to retain all filters unless completely irrelevant.
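As a rough sketch of how matched segments might be rendered into that prompt context, the Python below builds a "Related Segments" block. The `format_segments` function, the `manual` flag, and the exact spacing of the output are assumptions; only the two labels (`user_defined_segment`, `automated_segment`) and the description, entity, and filter fields come from the text above. The example filter contains just the one condition the documentation shows.

```python
import json


def format_segments(segments):
    """Render matched segments as a prompt-context block (layout is assumed)."""
    lines = ["**Related Segments**:"]
    for seg in segments:
        # Manual segments get the label that tells the LLM to retain their filters.
        label = "user_defined_segment" if seg["manual"] else "automated_segment"
        lines.append(f"- **{seg['name']}** ({label})")
        lines.append(f"  - description : {seg['description']}")
        lines.append(f"  - entity : {seg['entity']}")
        lines.append(f"  - filter : {json.dumps(seg['filter'])}")
    return "\n".join(lines)


segments = [
    {
        "name": "Critical Open Incidents",
        "manual": True,
        "description": "High priority incidents that are open and unresolved",
        "entity": "incident",
        "filter": {"conditions": [
            {"field": "incident.priority", "operator": "=", "value": "1"},
        ]},
    },
    {
        "name": "Incidents.Open",
        "manual": False,
        "description": "Open incidents (hypothetical automated segment)",
        "entity": "incident",
        "filter": {"conditions": [
            {"field": "incident.active", "operator": "=", "value": "true"},
        ]},
    },
]

prompt = format_segments(segments)
print(prompt)
```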
Manual segment scoring boost
Manual segments receive a priority boost. When the engine searches for relevant segments, it scores each result by semantic similarity—how closely the segment's name and description match the user's question. By default, manual segments receive a 5% boost applied on top of their raw similarity score.
The boost factor is configurable via the system property sn_query_gen.segments.manual_segment_scale_factor. Increasing it, for example to 1.10, elevates manual segments more strongly. Setting it to 1.0 removes the boost entirely.
In practice, automated segments often have names that partially match user utterances. For example, a report called "Open Incidents" may score similarly to a manual segment called "Critical Open Incidents". The boost ensures that your handmade, domain-tuned segments surface ahead of system-generated ones when both are close matches.
How segment scoring works
- AI Search returns a raw semantic similarity score (0.0–1.0) for each candidate segment.
- Segments below the match threshold (default 0.70) are discarded.
- Manual segment scores are multiplied by the scale factor (default 1.05).
- Results are sorted by boosted score and capped at the result limit.
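The scoring steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine's code: the `rank_segments` function, the candidate dictionaries, and the `limit=5` default are hypothetical (the documentation does not state the result limit), and the threshold is applied to the raw score before boosting, following the order of the steps. The threshold `0.70` and scale factor `1.05` are the documented defaults.

```python
def rank_segments(candidates, threshold=0.70, scale_factor=1.05, limit=5):
    """Filter, boost, sort, and cap candidate segments.

    `limit` is a placeholder value; the real result limit is not documented here.
    """
    # Step 2: discard segments whose raw score is below the match threshold.
    kept = [c for c in candidates if c["score"] >= threshold]
    # Step 3: multiply manual segment scores by the scale factor.
    boosted = [
        {**c, "boosted": c["score"] * scale_factor if c["manual"] else c["score"]}
        for c in kept
    ]
    # Step 4: sort by boosted score (highest first) and cap the result count.
    boosted.sort(key=lambda c: c["boosted"], reverse=True)
    return boosted[:limit]


candidates = [
    {"name": "Open Incidents", "manual": False, "score": 0.82},
    {"name": "Critical Open Incidents", "manual": True, "score": 0.80},
    {"name": "Closed Problems", "manual": False, "score": 0.55},
]

ranked = rank_segments(candidates)
for seg in ranked:
    print(seg["name"], round(seg["boosted"], 3))
```

With these made-up scores, the manual segment's raw 0.80 is boosted past the automated segment's 0.82, which is the close-match situation the boost is meant to resolve, and the 0.55 candidate is dropped by the threshold.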
Automatic segment sources
The system auto-generates segments from existing data sources on a schedule. The Query Generation Sync Segments job creates segments automatically, running at installation and then weekly by default.
| Source | What it pulls |
|---|---|
| Saved Reports (sys_report) | Report filters from recently viewed reports |
| Report Sources (sys_report_source) | Analytics data source filters |
| PA Indicators (pa_cubes) | Performance Analytics indicator conditions |
| Saved Filters (sys_filter) | Global saved filters only (excludes user-specific and group-specific filters) |
| App Modules (sys_app_module) | Module-level list view filters |
Automated segment rules
To reduce noise from outdated and irrelevant segments, the job follows specific rules. Segments based on reports, report sources, or indicator sources are active only if the records meet certain criteria:
- Reports must be shared, created by a user with an analytics manager role (admin, dashboard_admin, report_admin, pa_admin, or viz_admin), and have run recently (within 180 days by default).
- Report sources must be included in a data visualization or used in a report that has run recently.
- Indicator sources must be linked to indicators with scores that have recently changed.
For reports, "run recently" is defined by the sn_query_gen.segments.reports.last_viewed_threshold_days system property. The default value is 180 days.
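A minimal Python sketch of the report rule, for illustration only. The `report_is_eligible` function and the dictionary keys (`shared`, `creator_roles`, `last_viewed`) are hypothetical stand-ins for whatever the sync job actually reads; the role list and the 180-day default are taken from the text above.

```python
from datetime import datetime, timedelta

# Roles listed in the documentation as analytics manager roles.
ANALYTICS_ROLES = {"admin", "dashboard_admin", "report_admin", "pa_admin", "viz_admin"}


def report_is_eligible(report, now, threshold_days=180):
    """Return True if a report would qualify as a segment source under the stated rules."""
    has_role = bool(ANALYTICS_ROLES & set(report["creator_roles"]))
    is_recent = now - report["last_viewed"] <= timedelta(days=threshold_days)
    return bool(report["shared"]) and has_role and is_recent


now = datetime(2025, 1, 1)
fresh = {"shared": True, "creator_roles": ["report_admin"],
         "last_viewed": now - timedelta(days=30)}
stale = {"shared": True, "creator_roles": ["report_admin"],
         "last_viewed": now - timedelta(days=200)}
wrong_role = {"shared": True, "creator_roles": ["itil"],
              "last_viewed": now - timedelta(days=30)}

print(report_is_eligible(fresh, now))       # shared, right role, viewed 30 days ago
print(report_is_eligible(stale, now))       # viewed 200 days ago, past the threshold
print(report_is_eligible(wrong_role, now))  # creator lacks an analytics manager role
```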
For indicator sources, the time span for "recently changed" depends on the indicator frequency:
- Daily: last 7 days
- Weekly: last 30 days
- Bi-weekly: last 30 days
- Monthly: last 90 days
- Four weeks: last 90 days
- Bi-monthly: last 90 days
- Quarterly: last 180 days
- Fiscal quarterly: last 180 days
- Six months: last 12 months
- Yearly: last 24 months
- Fiscal yearly: last 24 months
You can change the time spans for indicator sources by applying a multiplier using the sn_query_gen.segments.indicator.inactivity_threshold_multiplier system property. The value must be an integer, meaning you can only lengthen the periods, not shorten them.
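To make the effect of the multiplier concrete, here is a small Python sketch. The `inactivity_window` function and the frequency keys are hypothetical, and the sketch assumes the multiplier is a positive integer with 1 meaning "no change"; the base windows are copied from the list above.

```python
# Base "recently changed" windows per indicator frequency, from the list above.
BASE_WINDOWS = {
    "daily": (7, "days"),
    "weekly": (30, "days"),
    "bi-weekly": (30, "days"),
    "monthly": (90, "days"),
    "four weeks": (90, "days"),
    "bi-monthly": (90, "days"),
    "quarterly": (180, "days"),
    "fiscal quarterly": (180, "days"),
    "six months": (12, "months"),
    "yearly": (24, "months"),
    "fiscal yearly": (24, "months"),
}


def inactivity_window(frequency, multiplier=1):
    """Return (amount, unit) for a frequency after applying the multiplier.

    Assumes a positive integer multiplier, so windows can only grow.
    """
    if isinstance(multiplier, bool) or not isinstance(multiplier, int) or multiplier < 1:
        raise ValueError("multiplier must be a positive integer")
    amount, unit = BASE_WINDOWS[frequency]
    return amount * multiplier, unit


print(inactivity_window("daily", 2))    # window doubles from 7 to 14 days
print(inactivity_window("yearly", 2))   # window doubles from 24 to 48 months
```

Because only whole-number multipliers are accepted, a value such as 0.5 that would shorten a window is rejected, which matches the constraint described above.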
Disabling segment sources
You can disable segment creation altogether or for individual source types, for example to troubleshoot or because segments from a source are "noisy." Each source type has a corresponding sn_query_gen.segments.disable.* system property. To disable segments for a source, set its property to true. Once the property is set, all existing segments created from sources of that type are excluded from AI Data Explorer search results, and no new segments of that type are created. During the next Sync Segments job, all segments of that type are deactivated. For more information, see Query Generation properties.