Test set creation and management
Summary of Test set creation and management
ServiceNow’s NLU models, used in Virtual Agent and AI Search, come with a default test set for evaluating model performance and accuracy. This test set is initially empty and must be populated with utterances and their expected intents before it can test the model effectively. Managing the test set over time helps maintain and improve the model’s accuracy.
To enable testing capabilities, customers must install the NLU Workbench - Advanced Features application from the ServiceNow Store.
Accessing and Managing the Default Test Set
Customers can access the default test set through the NLU Workbench by navigating to their model’s overview page and selecting the appropriate tabs or tiles such as Build and Train your model or Test Coverage. Alternatively, test sets are accessible via the Multi-model Batch Testing interface.
The test set can be populated in three ways:
- Manual entry: Add utterances and expected intents directly via the interface.
- Import: Upload utterances and intents from CSV files or other models.
- Expert Feedback Loop: Import real user utterances from Virtual Agent chat logs to enhance test relevance.
Test Coverage and Quality
Test Coverage is the percentage of a model's enabled intents that have test utterances in the default test set. At least 60% coverage, with at least 5 test utterances per intent, is required for the system to provide an optimal confidence threshold during batch testing. Higher coverage improves the accuracy of performance results.
It is recommended to include about 10% of utterances marked as “not relevant” to test the model’s ability to handle irrelevant inputs without predicting intents.
Using the Test Set
The default test set supports model testing during the Test and publish your model phase and Multi-model Batch Testing. Proper use of the test set facilitates better confidence in model readiness and deployment decisions.
Characteristics and Maintenance
- Default test sets are automatically created upon instance upgrades for existing models lacking them.
- When duplicating a model, the original test set is copied to the new model.
- Test sets must be in the same language as the model and should not share utterances with the training set.
- Default test sets cannot be deleted independently of their associated models.
- Test sets are available only for Virtual Agent and AI Search NLU models.
Downloading and Moving Test Sets
Customers can download default test sets as CSV files (containing utterances and expected intents but not source information) from the model overview page. Additionally, test sets can be moved between instances using update sets, which include all test utterances, intents, and sources when the NLU model is added to an update set.
Note that exporting models as CSV does not include the test set content.
Use the default test set of your NLU model to test the model's performance and accuracy. Manage your test set over time by building or updating its content in the NLU Workbench.
Access your default test set
- Navigate to . Select the tab for your model's application, then the name of your model from the list. On the model's overview page, find the Build and Train your model card and select its View phase button. Then select the Test set tab.
- Navigate to . Select the tab for your model's application, then the name of your model from the list. On the model's overview page, select the Test Coverage tile.
- Navigate to tab. Find the name of your model. Default test sets are labeled as Default.
Add content to your default test set
Add utterances and their expected intents to build and manage your test set over time. You can add content to the default test set with the following methods:
- Add test utterances and their expected intents manually. From the model's overview page, navigate to tab. Type your input into the Type a test utterance here field, select an appropriate intent, then select the Add button. These test utterances are assigned a source of Manual.
- Import test utterances and their expected intents from a CSV file or from other models. To import content to a default test set, from the model's overview page, navigate to tab. Select Import test utterances. Imported test utterances are assigned a source of Manual.
- The Expert Feedback Loop feature lets you add actual user utterances from Virtual Agent chat logs to the test set. These test utterances are assigned a source of Expert Feedback. For more information, see NLU Expert Feedback Loop.
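If you plan to import test utterances from a CSV file, it can help to clean the data first. The following is a minimal sketch, not a description of the product's import format: the column names `utterance` and `intent` and the helper `build_test_csv` are hypothetical, so check the layout that Import test utterances actually expects before using it. The sketch collapses extra whitespace, skips empty utterances, and drops case-insensitive duplicates.

```python
import csv
import io


def build_test_csv(rows):
    """Return CSV text for (utterance, expected_intent) pairs.

    The header names below are assumptions for illustration only.
    """
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["utterance", "intent"])  # hypothetical column names
    for utterance, intent in rows:
        text = " ".join(utterance.split())  # collapse runs of whitespace
        key = text.casefold()
        if not text or key in seen:
            continue  # skip empty utterances and duplicates
        seen.add(key)
        writer.writerow([text, intent])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        ("Reset my  password", "ResetPassword"),
        ("reset my password", "ResetPassword"),
        ("Order a new laptop", "OrderHardware"),
    ]
    print(build_test_csv(sample))
```

The second sample row is dropped because it matches the first after normalization.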
Test Coverage
The Test Coverage score is the percentage of a model's enabled intents that have test utterances in the default test set. Before testing your model, ensure that there is at least 60% coverage. The higher the Test Coverage score, the more accurate the performance testing results.
Your test coverage needs to be at least 60%, with at least 5 test utterances per intent, in order for the system to provide an optimal confidence threshold during batch testing. For more information about the confidence threshold, see NLU model settings.
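The coverage rule above can be checked offline before you run a batch test. The sketch below assumes you have a list of the model's enabled intents and a list of (utterance, expected intent) pairs, for example read from a downloaded test set. The function name `coverage_report` and the use of `None` for "not relevant" utterances are assumptions made for illustration, and the score shown in the NLU Workbench remains the authoritative value.

```python
from collections import Counter

MIN_COVERAGE_PERCENT = 60.0     # minimum coverage stated in the text
MIN_UTTERANCES_PER_INTENT = 5   # minimum utterances per intent stated in the text


def coverage_report(enabled_intents, test_rows):
    """Summarize test coverage for (utterance, expected_intent) pairs.

    An expected_intent of None stands for "not relevant" (an assumption).
    """
    enabled = set(enabled_intents)
    # Count test utterances per enabled intent; other rows are ignored.
    counts = Counter(intent for _, intent in test_rows if intent in enabled)
    covered = [name for name in enabled if counts[name] > 0]
    score = 100.0 * len(covered) / len(enabled) if enabled else 0.0
    return {
        "score": score,
        "meets_minimum": score >= MIN_COVERAGE_PERCENT,
        "uncovered": sorted(name for name in enabled if counts[name] == 0),
        "thin": sorted(
            name for name in covered
            if counts[name] < MIN_UTTERANCES_PER_INTENT
        ),
    }
```

Here `uncovered` lists enabled intents with no test utterances, and `thin` lists covered intents that have fewer than 5. Adding utterances for both groups raises the score and helps the system provide a confidence threshold.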
Aim to have about 10 percent of a model's test utterances marked as "not relevant", meaning that there is no intent associated. This helps assess how the model handles irrelevant utterances which should not have any intent predicted. For more information about irrelevant utterances, see Irrelevance detection in NLU.
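To see how close a test set is to the 10 percent guideline, you can compute the share of utterances that have no expected intent. This is a small sketch under the assumption that each test utterance is a (utterance, expected intent) pair and that "not relevant" is represented by `None`; the function name `irrelevant_share` is hypothetical.

```python
def irrelevant_share(test_rows):
    """Return the percentage of test utterances marked "not relevant".

    A row counts as "not relevant" when its expected intent is None
    (an assumption about how the data is represented).
    """
    rows = list(test_rows)
    if not rows:
        return 0.0
    irrelevant = sum(1 for _, intent in rows if intent is None)
    return 100.0 * irrelevant / len(rows)
```

For a test set of 200 utterances, a result near 10.0 corresponds to about 20 utterances with no associated intent.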
Use the test set
To use the default test set from the Test and publish your model phase, see Test and publish your model.
To use the test set in Multi-model Batch Testing, see Multi-model Batch Testing.
Characteristics of default test sets
When an instance is upgraded, default test sets are created for any existing models that don't already have them.
When you copy a model using Duplicate this model, the original's default test set is copied into the new model. For more information, see Duplicate an NLU model.
The utterances in the test set shouldn't be the same as the utterances in the training set.
Default test sets can't be deleted separately from their models.
Test set utterances should be in the same language as their model.
Test sets are available for Virtual Agent or AI Search models.
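Because test utterances shouldn't repeat training utterances, a quick overlap check can be useful before adding content. The sketch below compares two plain lists of utterance strings. The helper names `normalize` and `shared_utterances` are hypothetical, and treating utterances as equal after lowercasing and whitespace cleanup is an assumption, not a rule stated by the product.

```python
def normalize(utterance):
    """Collapse whitespace and ignore case so near-identical text matches."""
    return " ".join(utterance.split()).casefold()


def shared_utterances(training_utterances, test_utterances):
    """Return normalized utterances that appear in both sets, sorted."""
    training_keys = {normalize(text) for text in training_utterances}
    test_keys = {normalize(text) for text in test_utterances}
    return sorted(training_keys & test_keys)
```

Any utterance the function returns is a candidate to remove from the test set or to replace with a differently worded example.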
Downloading or moving default test sets
Default test sets can be downloaded or moved as follows.
- Default test sets can be separately downloaded in CSV format. To download the test set, from the model's overview page, navigate to tab. Select Download test set. Note: Test sets downloaded with Download test set contain test utterances and their expected intents, but not the sources.
- Default test sets can be moved with update sets. When you add an NLU model to an update set, its default test set is added, including test utterances, expected intents, and sources. For more information, see Add an NLU model to an update set.
- When using the Export model as CSV function in the All existing models table, the default test set is not included. For more information, see Export an NLU model.