Evaluate a prompt

Release version: Australia

Updated March 12, 2026

Use the evaluation tools in Now Assist Skill Kit to measure how effective your skill prompts are.

Role required: sn_skill_builder.admin

Create a dataset from a table or data collection.

Table 1. Create a dataset
Method	Steps
Create a dataset from a table	1. Give the dataset a name and description. 2. Select Table. 3. Find the table that you want to use. 4. Select the maximum number of records that you want to use. 5. Add conditions. 6. Select Generate Preview. 7. Select the mappings. 8. Select Create.
Create a dataset from a data collection	1. Give the dataset a name and description. 2. Select Data Collection. 3. Select a data collection that you created in Now Assist Data Kit. 4. Select Generate Preview. 5. Select the mappings. 6. Select Create.
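
The choices listed in Table 1 can be pictured as a small record. The following Python sketch is illustrative only and is not a ServiceNow API: the class `DatasetSpec`, its field names, and the validation rules (for example, treating the name, the record limit, and at least one mapping as required) are assumptions made for the example, as are the sample values in the usage note.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DatasetSpec:
    """Hypothetical record of the dataset form choices (not a product API)."""

    name: str
    description: str
    source_type: str                    # "table" or "data_collection"
    source: str                         # table name or data collection name
    max_records: Optional[int] = None   # used for table sources only
    conditions: List[str] = field(default_factory=list)
    mappings: Dict[str, str] = field(default_factory=dict)  # prompt input -> source field

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the spec looks complete."""
        problems = []
        if not self.name.strip():
            problems.append("name is required")
        if self.source_type not in ("table", "data_collection"):
            problems.append("source_type must be 'table' or 'data_collection'")
        # Assumption: a table source needs an explicit, positive record limit.
        if self.source_type == "table" and (self.max_records is None or self.max_records <= 0):
            problems.append("table sources need a positive max_records")
        # Assumption: a dataset is only useful once at least one field is mapped.
        if not self.mappings:
            problems.append("at least one mapping is required")
        return problems
```

As a usage example, `DatasetSpec(name="Incident summaries", description="Sample records", source_type="table", source="incident", max_records=50, conditions=["active=true"], mappings={"input_text": "short_description"}).validate()` returns an empty list, while a spec with a blank name and no mappings reports each missing item. Writing the choices down this way before opening the form can help confirm that every step in the table has a value ready.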

Select the metrics that you want to evaluate.

Table 2. Evaluation metrics
Evaluation method	Metric	Description
Human	Human Feedback	Human evaluation is the default option available for all prompt executions that generate a response. You can rate the response with a thumbs up or thumbs down, based on your satisfaction. You also have the option to provide more detailed feedback to explain your evaluation choice.
Automated	Correctness	The correctness metric assesses the generated response's accuracy, completeness, pertinence, and writing quality relative to the given instruction. This metric helps to check that the text accurately reflects the instruction, covers all important points, remains relevant, and is well written.
Automated	Correctness with Golden Response	The correctness with golden response metric uses a predefined reference to assess the generated response's accuracy, completeness, pertinence, and writing quality relative to the given instruction. This metric helps to check that the text accurately reflects the instruction, covers all important points, remains relevant, and is well written. You should use this metric whenever possible.
Automated	Faithfulness	The faithfulness metric assesses whether a generated response accurately reflects the information and context provided in the given instruction. This metric helps to check that the text contains no hallucinations, fabricated facts, or unsupported conclusions, maintaining alignment with the source material.
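
The guidance in Table 2 can be summarized as a simple selection rule. The sketch below is a hypothetical Python illustration, not product code: the function names are invented, always including Faithfulness is an assumption, and the thumbs-up rate is just one possible way to summarize human feedback that the product documentation does not prescribe.

```python
from typing import List, Optional


def choose_automated_metrics(has_golden_response: bool) -> List[str]:
    """Pick automated metrics, preferring the golden-response variant when a reference exists."""
    metrics = ["Faithfulness"]  # assumption: always check for unsupported content
    if has_golden_response:
        metrics.append("Correctness with Golden Response")
    else:
        metrics.append("Correctness")
    return metrics


def thumbs_up_rate(ratings: List[bool]) -> Optional[float]:
    """Share of thumbs-up ratings (True) among human ratings; None if nothing was rated."""
    if not ratings:
        return None
    return sum(ratings) / len(ratings)
```

For example, `choose_automated_metrics(True)` returns `["Faithfulness", "Correctness with Golden Response"]`, and `thumbs_up_rate([True, True, False, True])` returns `0.75`.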