Now Assist Data Kit FAQ

Eliza · ‎11-14-2024

For the latest features, please review our what's new guide for version 3.0.16!

What is Now Assist Data Kit?

Now Assist Data Kit is a new tool that is available as of Xanadu Patch 3. It allows you to perform the following actions:

Create, curate, and maintain datasets that can then be used in various AI tools
Create and evaluate Ground Truths (definition below)
Publish data collections to enable them to be used in the evaluation of custom skills

What are the different terms used in Now Assist Data Kit?

Dataset: A dataset is a set of records imported from a ServiceNow table. Ground truth can be added and reviewed for each of the records. A dataset can also be used to create smaller or derived datasets.
Derived Dataset: A subset of data derived from an existing dataset. Can also have ground truth added.
Ground Truth: The desired output from a generative AI action for a given record. This is used during evaluation when comparing the output of the LLM to the ground truth to determine if the response was successful according to a particular measure.
Data Collection: A data collection is used to combine one or more similar datasets that are filtered and sampled for an AI use case. A data collection needs to be published before its data can be used in Skill Kit.
Now Assist Skill Kit: A feature that allows users to create custom generative AI skills within ServiceNow.
Autoevaluation: A new feature for NASK that allows for users to evaluate their custom skills against metrics. It uses published data collections from NADK to perform the evaluation.

When might I want to use Now Assist Data Kit?

The Now Assist Data Kit's initial release primarily assists with the evaluation feature found within the Now Assist Skill Kit (NASK).

For those unfamiliar, NASK is a feature that allows users to create custom generative AI skills within ServiceNow. In Xanadu Patch 3, we allow for users to evaluate the effectiveness of their custom skills through the feature aptly named “Evaluation”.

You can find a demo of how to conduct an evaluation using Now Assist Data Kit datasets in the video below:

The evaluation feature works by testing the custom skill against a dataset (created and published within NADK) to gather a collection of inputs (prompt + input data) and the responses from the LLM when that input is delivered. These inputs and their responses are then analysed by a “judge” model (which is another LLM) to measure how well it performs against the following metrics:

Faithfulness: Does the output stay true to the source material?
Correctness: Does the output correctly respond to each of the input instructions?

The results are returned to the user as a percentage, indicating how many of the records adhere to the thresholds set for the above metrics. Using this information, users can then iterate upon their prompt to improve the results of evaluation, or simply use positive results to grant an assurance of quality for their custom skill.
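To make the percentage concrete, here is a minimal sketch of how a pass rate against a threshold could be computed for one metric. This is an illustration only, not the product's implementation: the 0 to 1 score scale, the threshold value, and the function name pass_rate are all assumptions.

```python
def pass_rate(scores, threshold):
    """Return the percentage of records whose score meets the threshold.

    `scores` holds one hypothetical judge score per record for a single
    metric (for example faithfulness). The 0-1 scale is an assumption.
    """
    if not scores:
        return 0.0
    passed = sum(1 for score in scores if score >= threshold)
    return 100.0 * passed / len(scores)


# Hypothetical faithfulness scores for four records, threshold of 0.7.
faithfulness_scores = [0.9, 0.4, 0.95, 0.7]
print(pass_rate(faithfulness_scores, 0.7))  # 75.0
```

Three of the four records meet the threshold, so the sketch reports 75%. Lowering a score below the threshold, or raising the threshold, lowers the percentage accordingly.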

As a note, this evaluation feature leverages the ServiceNow OEM Azure OpenAI (GPT-4o) model. This model is not available in the APAC region as of January 2025. Please refer to this page for updates.

How do I get access to Now Assist Data Kit?

To access NADK, you must meet the following criteria:

Have an active license for a Now Assist for [x] product
Have updated the Now Assist for [x] plugins to the latest versions
Have an instance that is on at least Xanadu Patch 3

As a note, you cannot access any Now Assist/generative AI features (and consequently NADK) on personal developer instances (PDIs).

Once you have confirmed the above, you then need to grant your users access. To do so, add the sn_data_kit.admin role to your users. If you still don’t have access, try logging out and in again.

What if I don't have good data in my instance?

You can utilise the synthetic data generation feature. This allows you to generate data for use in your data collections. Do note that synthetically generated data is stored in a table specifically made for use with Data Kit datasets [sn_data_kit_dataset_record], so any data you generate does not get added to the existing records within your instance.

Learn more in this video:

How many assists does Now Assist Data Kit consume?

If you are utilizing the Synthetic Data Generation feature, you will consume a number of assists. For the current consumption rate, refer to this pricing guide.

If you are using Now Assist Skill Kit’s autoevaluation feature, which leverages the datasets created within NADK, you will also consume assists: each test asks the LLM to generate as many responses as there are records in the dataset, and generative AI is also used when judging the output.
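As a rough way to reason about how evaluation cost scales with dataset size, the sketch below counts LLM calls for one evaluation run. It does not compute assists. The assumption that the judge is called once per record per metric is ours and is not stated in this FAQ; estimated_llm_calls is a hypothetical helper.

```python
def estimated_llm_calls(record_count, judged_metrics=2):
    """Estimate LLM calls for one evaluation run.

    Assumes one generation call per record plus one judge call per
    record per metric. The per-metric judge call is an assumption made
    for illustration; actual behavior may differ.
    """
    if record_count < 0 or judged_metrics < 0:
        raise ValueError("counts must be non-negative")
    generation_calls = record_count
    judge_calls = record_count * judged_metrics
    return generation_calls + judge_calls


# A 50-record dataset judged on faithfulness and correctness.
print(estimated_llm_calls(50))  # 150
```

The takeaway is that cost grows linearly with the number of records, which is one reason to evaluate against a smaller derived dataset.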

To find specific details on assist consumption, view the assist overview guide.

How can I ensure that my dataset doesn't contain any sensitive or personally identifiable data?

You can use the Data Privacy plugin to detect and mask that data before using it in an evaluation.

Learn more here:

Data Catalog

What is the relationship between datasets/derived datasets/ground truths/data collections?

We will explain this using an example. In this scenario, you have created a custom skill that uses generative AI to automatically categorize IT knowledge articles based on the text in the article. You wish to evaluate the effectiveness and quality of this skill, and thus you come to Now Assist Data Kit to determine which data you will be evaluating your custom skill against. We want to ensure that the data we select is representative of the records the custom skill will be operating on in a real scenario.

The first step would be to create a dataset. A dataset is a group of records from within your instance, and in our example, the custom skill works on knowledge articles, so we would create a dataset that contains records from the knowledge article table. We add filters to only include knowledge articles records that contain text and are within the IT knowledge base.

However, this dataset contains a large number of records, which will lead to long evaluation times and be resource intensive (and thus expensive). It may also contain records that are not within the expected range of inputs for your custom skill.

To account for this, we create a derived dataset. This is essentially a subset of your original dataset, and is an optional step. When creating the derived dataset, we manually select only the records that we specifically wish to include in the test data.

If so desired, one can create additional derived datasets to account for other scenarios.

Once we have created datasets that reflect the range of records we wish to include in testing, we then move on to adding ground truth. Adding ground truth allows judge models to evaluate the actual output of your custom skill against what you have deemed to be the most correct response. In our scenario, we want to populate the ground truth of each record with the category that we ourselves would assign to the record.

Once we have finished creating our derived datasets and adding ground truths, we can then create a data collection. This is where you combine datasets to form the group of data that you ultimately want to include during the evaluation process. Once your data collection is complete, you then publish it, which will allow for it to be selected by users within Now Assist Skill Kit during the evaluation stage.
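The relationships in this walkthrough can be sketched as a small data model. This is a conceptual illustration only and does not reflect how NADK stores data: every class, field, and method name below is hypothetical, as are the sample article titles and categories.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass
class DatasetRecord:
    source_id: str                      # hypothetical ID of the source record
    text: str                           # content the custom skill operates on
    ground_truth: Optional[str] = None  # desired output, added by a reviewer


@dataclass
class Dataset:
    name: str
    records: list[DatasetRecord] = field(default_factory=list)

    def derive(self, name: str, source_ids: list[str]) -> Dataset:
        """Build a derived dataset from a hand-picked subset of records."""
        wanted = set(source_ids)
        # Records are copied here purely as an illustration choice.
        picked = [replace(r) for r in self.records if r.source_id in wanted]
        return Dataset(name, picked)


@dataclass
class DataCollection:
    name: str
    datasets: list[Dataset] = field(default_factory=list)
    published: bool = False

    def publish(self) -> None:
        """Mark the collection as available for evaluation."""
        self.published = True


# 1. Dataset: IT knowledge articles that contain text.
kb = Dataset("IT knowledge articles", [
    DatasetRecord("KB001", "How to reset a VPN token"),
    DatasetRecord("KB002", "Printer driver installation"),
    DatasetRecord("KB003", "Requesting a replacement laptop"),
])

# 2. Derived dataset: only the records we want in the test data.
sample = kb.derive("IT articles sample", ["KB001", "KB003"])

# 3. Ground truth: the category we would assign ourselves.
sample.records[0].ground_truth = "Network"
sample.records[1].ground_truth = "Hardware"

# 4. Data collection: combine datasets, then publish for evaluation.
collection = DataCollection("KB categorization evaluation", [sample])
collection.publish()
```

The order of the steps mirrors the example above: dataset first, optional derived dataset second, ground truth third, and a published data collection last.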

What do the states mean within the Data Catalog?

In Progress is for when the dataset is in the process of being created
Ready is when the dataset has been finalized and is ready to be consumed
Added to collection is when the dataset has been added to a Data Collection
Error is for when there is an error during the dataset creation process, and the dataset is thus unable to be used

How can I delete datasets?

This functionality is not available today, but is on the roadmap.

Are there any limitations on which tables/fields I can select when creating datasets?

The only restriction is that users can only see the tables they have access to, so dataset creation adheres to the ACLs of the instance.

Ground Truth

What is Ground Truth, and how is it useful?

Ground Truth is the desired output from a generative AI action for a given record. This is used during evaluation when comparing the output of the LLM to the ground truth to determine if the response was successful according to a particular measure.

What are Ground Truth Types?

When selecting a ground truth type, you are selecting the type of generative AI outcome you are creating the ground truth for. For example, if you are building a custom skill focused on a summarization use case within Now Assist Skill Kit, you would want to determine the ground truth of what a "good" summary of this particular record would look like within your custom skill.

What is the Ground Truth Guideline field?

It is a field that allows users to add context as far as how they are determining the Ground Truth. It is helpful for when you have multiple users adding and evaluating the ground truths, and the guidelines provide them with insight into why the values may be a particular way.

Where is the Ground Truth stored in the instance?

You can either navigate to the ground truth within the NADK interface or to the sn_data_kit_value table, which is mapped to the records in the sn_data_kit_dataset_record table.
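If you prefer to inspect the table outside the UI, one option may be the standard ServiceNow REST Table API. The sketch below only builds the request URL and makes no network call. The instance name "example" is a placeholder, and whether sn_data_kit_value is readable over REST on your instance depends on its ACLs and your roles, which this FAQ does not confirm.

```python
from urllib.parse import urlencode


def table_api_url(instance, table, limit=10):
    """Build a Table API URL that returns up to `limit` rows of `table`.

    `instance` is the subdomain of a ServiceNow instance; "example"
    below is a placeholder, not a real instance.
    """
    query = urlencode({"sysparm_limit": limit})
    return f"https://{instance}.service-now.com/api/now/table/{table}?{query}"


print(table_api_url("example", "sn_data_kit_value", 5))
# https://example.service-now.com/api/now/table/sn_data_kit_value?sysparm_limit=5
```

Send the request with your usual authenticated HTTP client. The same helper can be pointed at sn_data_kit_dataset_record to look at the dataset records the values are mapped to.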

Can I select individual records to add ground truth to?

At the moment, you have to iterate through the dataset records to find a specific record. If you want to target specific records, you can create a derived dataset that contains only the records you wish to add ground truth to.

-Brian-Arndt · ‎04-16-2025

Can the NADK tool also be used for out-of-the-box skills? (now or potentially in the future)

If not, are there recommendations for how to evaluate the OOB skills?

Eliza · ‎04-17-2025

Hi @-Brian-Arndt,

You can perform auto-evaluation of certain OOTB skills within Now Assist Skill Kit - not all are present. You can find the list of skills here. This list will be growing each release, so keep an eye on it!

For those that are unavailable, you will have to manually test them for now.

Loknath · ‎06-24-2025

Hi

I am trying to add a data collection that I have created to my Now Assist Skill as a dataset to test my skill. But the dataset creation is not happening, it keeps on loading. Please refer below screenshot and help me to fix the issue.

Screenshot 2025-06-25 at 11.08.20 AM.png

Eliza · ‎06-30-2025

Hi @Loknath,

I would submit a case for our team to review.