Segments in the Query Generation semantic layer
Segments in the Query Generation semantic layer are predefined filter conditions that connect business language to specific query filters. They help translate natural language queries into precise database queries by providing context that guides selection of the correct entities, dimensions, and values. For instance, a segment can clarify that “open” means “active=true” or distinguish that “emergency” refers to a change request type rather than priority.
When a user submits a question, the Query Generation engine searches for matching segments and incorporates their filters into the prompt for the language model, enabling it to build accurate queries.
Key Features
- Segment Types:
  - Automated Segments: System-generated from existing reports, dashboards, filters, modules, and indicator sources. These segments often have technical names like "Incidents.Open".
  - Manual Segments: Created by administrators with user-friendly names such as "Critical Open Incidents". They are saved searches with filters, managed through a two-table data model, synchronized automatically, and can be shipped via update sets or plugins.
- Search and Scoring Process: The system uses AI search to find semantically similar segments by indexing segment names, descriptions, entities, and filters. Only segments above a similarity threshold are considered. Manual segments receive a configurable priority boost (default 5%) to ensure domain-tuned segments surface ahead of automated ones.
- Integration with LLM: Matching segments are passed to the language model as structured data, which uses segment filters as building blocks to generate complete executable queries. Manual segments are labeled to encourage the model to retain their filters unless irrelevant.
- Automated Segment Generation: Segments are automatically created on a schedule from reports, report sources, Performance Analytics indicators, saved global filters, and app modules. Only recently used or active sources contribute segments, with configurable recency thresholds to maintain relevance.
- Disabling Segment Sources: Administrators can disable segment creation globally or per source type via system properties, useful for troubleshooting or reducing noise.
How Manual Segments Enhance Query Generation
- Entity Discovery: Manual segments help identify or boost relevant entities during initial queries by matching segment names to user intent.
- Filter Provision: Manual segments provide explicit filter conditions in the prompt, guiding the language model to apply precise constraints.
- Configurable Scoring Boost: The priority boost for manual segments is adjustable, allowing admins to tune how prominently these curated segments influence query generation.
Practical Benefits for ServiceNow Customers
- Enable more accurate and context-aware natural language queries by leveraging both system-generated and admin-defined segments.
- Improve query precision and user satisfaction through manual segments tailored to your domain and common business terminology.
- Maintain up-to-date segments automatically while retaining full control over manual segments and their lifecycle.
- Control segment relevance and noise by configuring recency thresholds, disabling sources, and adjusting scoring priorities.
- Deliver domain-specific saved searches with applications by shipping manual segments via plugins, ensuring immediate query enhancement upon app installation.
Segments are predefined filter conditions that map business terminology to specific query filters, helping the semantic layer translate natural language questions into accurate database queries.
Segments provide non-obvious context to assist the semantic layer in selecting the correct entity, dimension, and values. For example, in the utterance "How many open emergency change requests are there?", a segment identifies that "open" means "active=true" and "emergency" is a Type, not a Priority.
When a user asks a question, the Query Generation engine searches for matching segments and includes their filters in the LLM prompt so the model can reuse them to construct accurate queries.
There are two types of segments:
- Automated Segments
  - System-generated from reports, dashboards, filters, modules, and indicator sources. Names are often technical, for example, "Incidents.Open".
- Manual Segments
  - Created by admins via the Manual Segment Config table. Names are user-friendly, for example, "Critical Open Incidents". Manual segments use a two-table data model with automatic synchronization and can be shipped with applications via plugins.
| | Manual Segments | Automated Segments |
|---|---|---|
| Created by | Admin, shipped via update set | System-generated from reports, dashboards, filters, modules |
| Name quality | User-friendly, tuned for search | Often technical, for example, "Incidents.Open" |
| Search priority | 5% boost over automated (adjustable via sn_query_gen.segments.manual_segment_scale_factor) | Standard scoring |
| LLM treatment | Retain all filters unless irrelevant | Critique each filter individually |
| Prompt label | user_defined_segment | automated_segment |
| Lifecycle | Fully controlled by admin | Tied to source record activity/usage |
| Shipped with app | Yes (update set) | No (generated at runtime) |
How segments work
The system uses AI search to find segments that are semantically similar to the user's query. AI search indexes the Name, Description, Entity, and Filter fields in the Segments table, comparing them to the user's query to produce a subset of relevant segments.
In the LLM call, the system passes the Name, Description, Entity, and Filters. The LLM uses the segments as building blocks for generating a new query. For example, if a user asks "Unassigned Incidents located in San Diego" and the segment "Unassigned Incidents" is passed to the LLM, the LLM uses the segment's filter as the starting point and attaches the location filter "San Diego" on top of the segment.
| Step | Purpose | Output |
|---|---|---|
| 1: Input | Capture user's natural language query | Raw query text |
| 2: Search | Find semantically similar prebuilt segments | Subset of relevant segments |
| 3: Scoring | Rank the set of relevant segments based on their semantic similarity scores | The subset of relevant segments, now ranked and sorted |
| 4: Context | Provide segment metadata to LLM | Structured segment data |
| 5: Generate | Combine segment logic with new conditions | Complete executable query |
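The "building block" idea in step 5 can be illustrated with a minimal Python sketch. It is not the engine's implementation: the `Segment` class, the `extend_segment` helper, the segment description, and the `ISEMPTY` operator are all hypothetical, and the condition shape simply mirrors the field/operator/value form used elsewhere on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical in-memory view of a segment record."""
    name: str
    description: str
    entity: str
    conditions: list = field(default_factory=list)


def extend_segment(segment, extra_conditions):
    """Start from the segment's filter and append the new conditions."""
    return {
        "entity": segment.entity,
        # Copy the list so the stored segment itself is never modified.
        "conditions": list(segment.conditions) + list(extra_conditions),
    }


unassigned = Segment(
    name="Unassigned Incidents",
    description="Incidents with no assignee",  # illustrative text only
    entity="incident",
    conditions=[
        {"field": "incident.assigned_to", "operator": "ISEMPTY", "value": ""}
    ],
)

# "Unassigned Incidents located in San Diego": reuse the segment filter
# and attach the location condition on top of it.
query = extend_segment(
    unassigned,
    [{"field": "incident.location", "operator": "=", "value": "San Diego"}],
)
```

In the real flow the LLM performs this combination itself. The sketch only shows the intended result: the segment filter stays intact and the new condition is added to it.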
How manual segments work
Manual segments serve two roles at query time:
- Entity discovery
  - On first-time queries with no previous context, segment matches can add or boost entities in the entity list. A match against a manual segment name helps identify the intended entity by adding or boosting it in the candidate list.
    If a user asks "Show me critical open incidents" and a manual segment named "Critical Open Incidents" exists on the Incident [incident] table, the incident entity gets boosted in the results.
- Filter provision
  - Matching segments are formatted into the LLM prompt context. The LLM sees:
        **Related Segments**:
        - **Critical Open Incidents** (user_defined_segment)
          - description : High priority incidents that are open and unresolved
          - entity : incident
          - filter : { conditions : [{"field":"incident.priority","operator":"=","value":"1"}, ...] }
    The LLM then decides whether to reuse the segment's filters fully or partially when constructing the query. Manual segments are labeled user_defined_segment in the prompt, which tells the LLM to retain all filters unless completely irrelevant.
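As a rough illustration of the filter-provision step, the following Python sketch renders segment records into a prompt block shaped like the excerpt above. The `format_segments` function, the dictionary keys, and the exact indentation are assumptions made for this example. Only the two labels and the field names come from this page. The sample segment carries just the one condition shown in the excerpt, because the remaining conditions are elided in the source.

```python
import json


def format_segments(segments):
    """Render segments as a 'Related Segments' block for an LLM prompt."""
    lines = ["**Related Segments**:"]
    for seg in segments:
        # Manual and automated segments carry different prompt labels.
        label = "user_defined_segment" if seg["manual"] else "automated_segment"
        conditions = json.dumps(seg["conditions"], separators=(",", ":"))
        lines.append(f"- **{seg['name']}** ({label})")
        lines.append(f"  - description : {seg['description']}")
        lines.append(f"  - entity : {seg['entity']}")
        lines.append(f"  - filter : {{ conditions : {conditions} }}")
    return "\n".join(lines)


prompt_block = format_segments([
    {
        "name": "Critical Open Incidents",
        "manual": True,
        "description": "High priority incidents that are open and unresolved",
        "entity": "incident",
        "conditions": [
            {"field": "incident.priority", "operator": "=", "value": "1"}
        ],
    }
])
print(prompt_block)
```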
Manual segment scoring boost
Manual segments receive a priority boost. When the engine searches for relevant segments, it scores each result by semantic similarity—how closely the segment's name and description match the user's question. By default, manual segments receive a 5% boost applied on top of their raw similarity score.
The boost factor is configurable via the system property sn_query_gen.segments.manual_segment_scale_factor. Increasing it, for example to 1.10, elevates manual segments more strongly. Setting it to 1.0 removes the boost entirely.
In practice, automated segments often have names that partially match user utterances. For example, a report called "Open Incidents" may score similarly to a manual segment called "Critical Open Incidents". The boost ensures that your handmade, domain-tuned segments surface ahead of system-generated ones when both are close matches.
How segment scoring works
- AI Search returns a raw semantic similarity score (0.0–1.0) for each candidate segment.
- Segments below the match threshold (default 0.70) are discarded.
- Manual segment scores are multiplied by the scale factor (default 1.05).
- Results are sorted by boosted score and capped at the result limit.
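The scoring steps above can be sketched in a few lines of Python. This is a simplified model, not the engine's code: `rank_segments`, the dictionary keys, and the `limit` default of 5 are hypothetical. The threshold is applied to the raw score before the boost because that is the order in which the steps are listed.

```python
def rank_segments(candidates, threshold=0.70, scale_factor=1.05, limit=5):
    """Filter by raw score, boost manual segments, then sort and cap."""
    # Step 2: discard candidates whose raw similarity is below the threshold.
    kept = [c for c in candidates if c["raw_score"] >= threshold]
    # Step 3: multiply manual segment scores by the scale factor.
    scored = [
        {**c, "score": c["raw_score"] * (scale_factor if c["manual"] else 1.0)}
        for c in kept
    ]
    # Step 4: sort by boosted score and cap at the (assumed) result limit.
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:limit]


candidates = [
    {"name": "Open Incidents", "manual": False, "raw_score": 0.82},
    {"name": "Critical Open Incidents", "manual": True, "raw_score": 0.80},
    {"name": "Closed Changes", "manual": False, "raw_score": 0.40},
]

ranked = rank_segments(candidates)
unboosted = rank_segments(candidates, scale_factor=1.0)
```

With the default factor, the manual segment's 0.80 becomes roughly 0.84 and moves ahead of the automated segment at 0.82. With a factor of 1.0 the automated segment ranks first, which matches the "removes the boost entirely" behavior described earlier.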
Automatic segment sources
The system auto-generates segments from existing data sources on a schedule. The Query Generation Sync Segments job creates segments automatically, running at installation and then weekly by default.
| Source | What it pulls |
|---|---|
| Saved Reports (sys_report) | Report filters from recently viewed reports |
| Report Sources (sys_report_source) | Analytics data source filters |
| PA Indicators (pa_cubes) | Performance Analytics indicator conditions |
| Saved Filters (sys_filter) | Global saved filters only (excludes user-specific and group-specific filters) |
| App Modules (sys_app_module) | Module-level list view filters |
Automated segment rules
To reduce noise from outdated and irrelevant segments, the job follows specific rules. Segments based on reports, report sources, or indicator sources are active only if the records meet certain criteria:
- Reports must be shared, created by a user with an analytics manager role (admin, dashboard_admin, report_admin, pa_admin, or viz_admin), and have run recently (within 180 days by default).
- Report sources must be included in a data visualization or used in a report that has run recently.
- Indicator sources must be linked to indicators with scores that have recently changed.
For reports, "run recently" is defined by the sn_query_gen.segments.reports.last_viewed_threshold_days system property. The default value is 180 days.
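To make the report rule concrete, here is a minimal Python sketch of the eligibility check. The `report_is_eligible` function and the `shared`, `creator_roles`, and `last_run` keys are hypothetical stand-ins, not actual sys_report columns. The role list and the 180-day default are taken from the rule above.

```python
from datetime import datetime, timedelta

# Roles listed in the report rule above.
ANALYTICS_MANAGER_ROLES = {
    "admin", "dashboard_admin", "report_admin", "pa_admin", "viz_admin",
}


def report_is_eligible(report, now, threshold_days=180):
    """Return True if a report may contribute an active segment."""
    created_by_manager = bool(
        ANALYTICS_MANAGER_ROLES & set(report["creator_roles"])
    )
    recently_run = now - report["last_run"] <= timedelta(days=threshold_days)
    return bool(report["shared"]) and created_by_manager and recently_run


now = datetime(2024, 6, 1)
fresh = {
    "shared": True,
    "creator_roles": ["report_admin"],
    "last_run": datetime(2024, 3, 1),
}
stale = dict(fresh, last_run=datetime(2023, 6, 1))

fresh_ok = report_is_eligible(fresh, now)
stale_ok = report_is_eligible(stale, now)
```

Raising `threshold_days` in the sketch corresponds to increasing the sn_query_gen.segments.reports.last_viewed_threshold_days property: older reports stay eligible for longer.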
For indicator sources, the time span for "recently changed" depends on the indicator frequency:
- Daily: last 7 days
- Weekly: last 30 days
- Bi-weekly: last 30 days
- Monthly: last 90 days
- Four weeks: last 90 days
- Bi-monthly: last 90 days
- Quarterly: last 180 days
- Fiscal quarterly: last 180 days
- Six months: last 12 months
- Yearly: last 24 months
- Fiscal yearly: last 24 months
You can change the time spans for indicator sources by applying a multiplier using the sn_query_gen.segments.indicator.inactivity_threshold_multiplier system property. The value must be an integer, meaning you can only lengthen the periods, not shorten them.
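The frequency table and the multiplier can be modeled as a simple lookup. The sketch below is illustrative only: `inactivity_window` and `BASE_WINDOWS` are hypothetical names, and it assumes both that the multiplier scales each base window by plain multiplication and that values below 1 are rejected. The page itself states only that the value must be an integer.

```python
# Base "recently changed" windows per indicator frequency, from the list above.
BASE_WINDOWS = {
    "daily": (7, "days"),
    "weekly": (30, "days"),
    "bi-weekly": (30, "days"),
    "monthly": (90, "days"),
    "four weeks": (90, "days"),
    "bi-monthly": (90, "days"),
    "quarterly": (180, "days"),
    "fiscal quarterly": (180, "days"),
    "six months": (12, "months"),
    "yearly": (24, "months"),
    "fiscal yearly": (24, "months"),
}


def inactivity_window(frequency, multiplier=1):
    """Return (amount, unit) for a frequency after applying the multiplier."""
    # Assumption: only whole numbers of 1 or more make sense here.
    if isinstance(multiplier, bool) or not isinstance(multiplier, int) or multiplier < 1:
        raise ValueError("multiplier must be an integer of at least 1")
    amount, unit = BASE_WINDOWS[frequency.lower()]
    return amount * multiplier, unit


daily_doubled = inactivity_window("Daily", multiplier=2)
yearly_default = inactivity_window("Yearly")
```

Under these assumptions, a multiplier of 2 extends the daily window from 7 to 14 days, and a multiplier of 1 leaves every window at its default.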
Disabling segment sources
You can disable segment creation altogether, or for individual source types. You might disable segment generation to troubleshoot, or if segments from a source are "noisy." Each source type has a corresponding sn_query_gen.segments.disable.* system property. Disable segments for that source by setting the corresponding system property to true. This has the following effects:
- All existing segments created from sources of that type are excluded from AI Data Explorer search results.
- No new segments of that type are created.
- During the next Sync Segments job, all segments of that type are deactivated.
For more information, see Query Generation properties.