Synthetic Data in Now Assist Data Kit

Max Dore · 06-25-2025

Building AI experiences in ServiceNow is exciting, but what about testing them? If you've tried to evaluate custom AI skills without touching real data, you know the dilemma: fake data is often not useful, and real data is off-limits. This is where Synthetic Data comes in: a feature for creating high-quality, context-aware, risk-free test data that’s built to support AI skill development, evaluation, and iteration.

What Is Synthetic Data?

In Now Assist Data Kit, synthetic data is a dataset created entirely by NowLLM, ServiceNow’s large language model. It mimics the structure and content of real records without exposing real information. Once generated, synthetic records are:

Stored in a Now Assist Data Kit-specific table, not in tables such as Incidents, Requests, or Cases
Added to a Data Collection, which is a key input for testing in the Now Assist Skill Kit
Used to auto-evaluate custom skills, with or without “golden truth” answers
Designed to reflect real business scenarios, workflows, and domains

Think of synthetic data as a highly trained actor: it plays the part convincingly, but there are no real records involved.

Why Use Synthetic Data?

Manually creating hundreds of realistic records is a hassle, and using actual data can raise red flags with legal, compliance, and security teams. With synthetic data, you get the best of both worlds:

Rich, relevant content tailored to your test use case
No personal or sensitive information
Fully customizable and scalable
Reliable input for AI testing, evaluation, and demos

Whether you're simulating IT incidents, HR requests, or customer service tickets, synthetic data helps you work faster and more safely.

How Does Synthetic Data Get Generated?

Synthetic data creation is guided by the details you provide. The more context you give, the better the results.

Here’s what the model uses as input:

Context: What’s the environment or domain? (e.g., HR, ITSM, Finance)
Data Definition: What kind of records are we generating? (e.g., incident, change request)
Categories: Keywords that describe the theme or type of data (e.g., email issue, benefits question)
Column Definitions: The fields and structure of the record (e.g., Short Description, Assigned To)
Amount: How many records to generate (1-100)
Seed Data (Optional): Existing examples that help guide tone, format, and accuracy

With this input, NowLLM generates records that look and feel like real data.
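To make these inputs concrete, here is a minimal Python sketch that models a generation request as a plain data structure and checks it before submission, including the 1-100 record limit. The class and field names (`SyntheticDataRequest`, `context`, `data_definition`, and so on) are hypothetical and chosen only to mirror the list above; they are not ServiceNow API names.

```python
from dataclasses import dataclass, field

MIN_RECORDS = 1    # lower limit stated in this article
MAX_RECORDS = 100  # upper limit stated in this article


@dataclass
class SyntheticDataRequest:
    """Hypothetical container mirroring the generation inputs; not a ServiceNow API."""
    context: str                   # environment or domain, e.g. "ITSM"
    data_definition: str           # kind of record, e.g. "incident"
    categories: list[str]          # theme keywords, e.g. ["email issue"]
    column_definitions: list[str]  # fields, e.g. ["Short Description"]
    amount: int                    # number of records to generate
    seed_data: list[dict] = field(default_factory=list)  # optional examples

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the request looks complete."""
        problems = []
        if not self.context.strip():
            problems.append("context is empty")
        if not self.data_definition.strip():
            problems.append("data_definition is empty")
        if not self.categories:
            problems.append("at least one category is needed")
        if not self.column_definitions:
            problems.append("at least one column definition is needed")
        if not MIN_RECORDS <= self.amount <= MAX_RECORDS:
            problems.append(f"amount must be between {MIN_RECORDS} and {MAX_RECORDS}")
        return problems


request = SyntheticDataRequest(
    context="ITSM",
    data_definition="incident",
    categories=["email issue"],
    column_definitions=["Short Description", "Assigned To"],
    amount=50,
)
print(request.validate())  # an empty list means no problems were found
```

Writing the inputs down this way before opening the generation form is one way to check that nothing is missing, since the quality of the output depends on how much context is supplied.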

To learn more about the step-by-step creation process, check out the documentation.

Using Synthetic Data in Your Workflows

Once generated, synthetic data doesn’t just sit around. It is immediately usable in the Now Assist Skill Kit for skill testing.

Here’s how you use it:

Add the records to a published Data Collection
Connect the collection to your Custom Skills
Use Auto-evaluation to test skill accuracy, intent recognition, and more
Batch test across hundreds of examples to validate performance at scale

And if you include golden truth (expected answers or responses), the evaluation becomes even more precise.
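As a rough illustration of why golden truth helps, the sketch below scores a skill's outputs against expected answers using a simple normalized exact match. Everything in it is an assumption made for illustration: the `records` structure, the `run_skill` callable, the `golden_truth` key, and the exact-match metric are hypothetical stand-ins, and this is not how the Skill Kit's auto-evaluation is implemented.

```python
from typing import Callable, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count as misses."""
    return " ".join(text.lower().split())


def exact_match_rate(
    records: list[dict], run_skill: Callable[[dict], str]
) -> Optional[float]:
    """Fraction of records with a golden truth whose skill output matches it.

    Records without a "golden_truth" value are skipped.
    Returns None when no record carries a golden truth.
    """
    scored = [r for r in records if r.get("golden_truth")]
    if not scored:
        return None
    hits = sum(
        normalize(run_skill(r)) == normalize(r["golden_truth"]) for r in scored
    )
    return hits / len(scored)


# Hypothetical synthetic records; the third has no expected answer.
records = [
    {"short_description": "Cannot send email", "golden_truth": "Email"},
    {"short_description": "Question about dental plan", "golden_truth": "Benefits"},
    {"short_description": "Laptop will not boot"},
]


def toy_skill(record: dict) -> str:
    """Stand-in for a custom skill: a naive keyword classifier."""
    return "email" if "email" in record["short_description"].lower() else "Hardware"


print(exact_match_rate(records, toy_skill))  # 0.5
```

Without golden truth, a check like this has nothing to compare against, which is why records that carry an expected answer allow a more precise score.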

Key Takeaways

Here’s what you need to remember:

Feature	Details
LLM Engine	NowLLM (ServiceNow's large language model)
Record Limits	Minimum: 1, Maximum: 100 per generation job
Storage	Dedicated Now Assist Data Kit table
Customization	Fully guided by your input (context, fields, categories, etc.)
Golden Truth Support	Optional but highly recommended
Usage	Supports auto-evaluation and batch testing in Skill Kit
Cost	1 Assist per record generated

Please note that this process consumes assists. For consumption rates, refer to this pricing guide and check for the latest updates.
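For a quick planning estimate, the figures in the table can be turned into simple arithmetic. The sketch below assumes the rate of 1 assist per record and the limit of 100 records per generation job stated above; actual rates may differ, so treat the constants as placeholders to check against the pricing guide. The function name is hypothetical.

```python
import math

ASSISTS_PER_RECORD = 1     # rate stated in this article; verify against the pricing guide
MAX_RECORDS_PER_JOB = 100  # per-job limit stated in this article


def estimate_generation(total_records: int) -> tuple[int, int]:
    """Return (jobs_needed, assists_consumed) for a target number of records."""
    if total_records < 1:
        raise ValueError("total_records must be at least 1")
    jobs = math.ceil(total_records / MAX_RECORDS_PER_JOB)
    assists = total_records * ASSISTS_PER_RECORD
    return jobs, assists


print(estimate_generation(250))  # (3, 250)
```

Under these assumptions, a 250-record test set would take three generation jobs (100 + 100 + 50) and consume 250 assists.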

Watch this walkthrough on YouTube to see the process of generating synthetic data in an instance.