Components installed with the Evaluation dashboard
Several types of components are part of the Evaluation tab, including scheduled jobs, tables, system properties, and flows.
Scheduled jobs installed
| Scheduled job | Description |
|---|---|
| CE Populate Value Aggregates Chats – Daily | Runs daily and randomly selects 1000 conversations from the previous day's conversations. For each selected conversation, the job extracts the chat duration and classifies the chat as small, medium, or large. It also identifies the chats in which a Knowledge article or catalog item was invoked. For evaluated chats, it also classifies the conversations based on chat performance and populates that data into the Evaluation Value Aggregates table. |
| Evaluation Value Calcuation - Runs Only once after install | Deletes all the records in the Evaluation Value Aggregates table, reruns the calculations, and stores the aggregated values in the Evaluation Value Aggregates table. The data is from the first evaluation date. |
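
The random selection step of the daily job can be illustrated with a minimal Python sketch. This is not the shipped script: the function name `sample_conversations`, the `cap` and `seed` parameters, and the choice to return every conversation when fewer than the cap exist are all assumptions made for illustration.

```python
import random


def sample_conversations(conversation_ids, cap=1000, seed=None):
    """Return up to `cap` distinct conversation IDs chosen at random.

    Hypothetical helper: the real job's sampling logic is not documented here.
    """
    rng = random.Random(seed)
    ids = list(conversation_ids)
    # random.sample raises ValueError when asked for more items than exist,
    # so clamp the sample size to the number of available conversations.
    return rng.sample(ids, min(cap, len(ids)))


yesterday = [f"conv{i}" for i in range(2500)]
picked = sample_conversations(yesterday, cap=1000, seed=1)
print(len(picked))
```

The default cap of 1000 mirrors the default of the sn_na_conv_eval.total_sampled_conv_count property listed under system properties.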
Tables installed
| Label | Name |
|---|---|
| Evaluation | [sn_na_conv_eval_evaluation] |
| Evaluation configurations | [sn_na_conv_eval_evaluation_configurations] |
| Evaluation Metrics | [sn_na_conv_eval_evaluation_metrics] |
| Evaluation Set | [sn_na_conv_eval_evaluation_set] |
| Evaluation Value Aggregates | [sn_na_conv_eval_evaluation_value_aggregates] |
Remote tables installed
| Table | Description |
|---|---|
| Conversation Evaluator Value Calculations [sn_na_conv_eval_st_value_calcs] | For a given query, the definition for this remote table calculates the time savings and efficiency percentage for small, medium, and large chats. It also returns the time savings and efficiency for chats in which a Knowledge article or catalog item was invoked. |
| Conversation weekly calculations [sn_na_conv_eval_weekly_cals] | For a given query, the definition for this remote table calculates the time savings and efficiency percentage for small, medium, and large chats for each week of the selected date range. It also returns the time savings and efficiency for chats in which a Knowledge article or catalog item was invoked, for each week of the selected date range. |
System properties installed
| Property | Description |
|---|---|
| sn_na_conv_eval.errorBandMinRecords | Minimum number of records required to calculate the error band for upper and lower deviation. By default, the value is 30. |
| sn_na_conv_eval.evalWeights | Contains the weight of each evaluation metric for chat evaluation. This property is used to compute total or composite scores for evaluation records. |
| sn_na_conv_eval.maxEvaluateCount | Maximum number of records to evaluate in a day. By default, the value is 200. |
| sn_na_conv_eval.total_sampled_conv_count | Edit this property to control the total number of conversations that can be sampled for value calculations. By default, the value is 1000. |
| sn_na_conv_eval.value_chat_classifier | Edit this property to change the definition of small, medium, and large conversations. By default, the value is 4, 10, where 4 and 10 are thresholds on the total number of inbound messages for a conversation in the sys_cs_message table. A conversation with 4 or fewer inbound messages is small. A conversation with more than 4 and up to 10 inbound messages is medium. A conversation with more than 10 inbound messages is large. |
| sn_na_conv_eval.ce_value_calculation_weights | Value calculation weight values for each type of evaluated chat. |
| sn_na_conv_eval.eval_value_rerun_status | Reruns the value calculations once after installation. This property tracks the status of the Conversation Evaluator Value Rerun. After the rerun has run, the script sets this property to false. |
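
The thresholds in sn_na_conv_eval.value_chat_classifier can be read as two inclusive upper bounds. The following Python sketch shows that reading under stated assumptions: the function names `parse_thresholds` and `classify_conversation` are hypothetical, and the property value is assumed to be a comma-separated pair such as the default `4, 10`.

```python
def parse_thresholds(property_value):
    """Parse a value such as "4, 10" into (small_max, medium_max)."""
    small_max, medium_max = (int(part) for part in property_value.split(","))
    if small_max >= medium_max:
        raise ValueError("the first threshold must be lower than the second")
    return small_max, medium_max


def classify_conversation(inbound_count, property_value="4, 10"):
    """Classify a conversation by its number of inbound messages."""
    small_max, medium_max = parse_thresholds(property_value)
    if inbound_count <= small_max:
        return "small"
    if inbound_count <= medium_max:
        return "medium"
    return "large"


for count in (4, 5, 10, 11):
    print(count, classify_conversation(count))
```

With the default value, 4 inbound messages is small, 5 and 10 are medium, and 11 is large. Changing the property to, for example, `6, 15` would shift both boundaries.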
Business rules installed
| Name | When | Insert | Update | Filter Conditions |
|---|---|---|---|---|
| Add info message for Evaluation set | after | TRUE | TRUE | stateCHANGESTOIn Progress^evaluation_type=conversation^EQ |
| Scale Up labeling metric | before | TRUE | TRUE | metric_type=Labeling^metric_nameINhelpfulness_chat_eval,intent_recognition_chat_eval,slot_filling_chat_eval,forgetfulness_chat_eval,hallucination_chat_eval,redundancy_chat_eval,deadlock_chat_eval,coherence_chat_eval^raw_scoreVALCHANGES^EQ |
| updateLabelingScoresOnEvaluation | after | TRUE | TRUE | metric_type=Labeling^raw_scoreVALCHANGES^metric_nameINhelpfulness_chat_eval,intent_recognition_chat_eval,slot_filling_chat_eval,forgetfulness_chat_eval,hallucination_chat_eval,redundancy_chat_eval,deadlock_chat_eval,coherence_chat_eval^EQ |
| Update deviation scores | before | TRUE | TRUE | metric_type=LLM Generated^scoreVALCHANGES^EQ |
| getAutoEvalCompositeScore | after | FALSE | TRUE | stateCHANGESTOComplete^total_scoreISEMPTY^EQ |
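
The Filter Conditions column uses encoded query strings. As a reading aid, the following Python sketch splits such a string into individual conditions. It assumes that `^` joins conditions with AND and that the trailing `EQ` marks the end of the query, and it recognizes only the operators that appear in the table above. The function name `parse_filter` is hypothetical, and this is not a general encoded query parser.

```python
import re

# Only the operators used in the business rule filters listed above.
# Field names in these filters are lowercase with underscores.
TERM_PATTERN = re.compile(r"([a-z_]+)(VALCHANGES|CHANGESTO|ISEMPTY|IN|=)(.*)")


def parse_filter(encoded):
    """Split an encoded filter into (field, operator, value) tuples."""
    conditions = []
    for term in encoded.split("^"):
        if term in ("", "EQ"):
            continue  # skip empty pieces and the end-of-query marker
        match = TERM_PATTERN.fullmatch(term)
        if match is None:
            raise ValueError(f"unrecognized condition: {term!r}")
        conditions.append(match.groups())
    return conditions


rule = "stateCHANGESTOIn Progress^evaluation_type=conversation^EQ"
for field, operator, value in parse_filter(rule):
    print(field, operator, value)
```

Read this way, the Add info message for Evaluation set rule applies when the state changes to In Progress and the evaluation type is conversation.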
Flows installed
| Flow | Description |
|---|---|
| Execute Evaluation | Performs evaluations when conversations are completed. By default, the Execute Evaluation flow is deactivated. You can use the nightly scheduled job Execute Evaluations to evaluate the chats. If you want to evaluate the chats on chat completion, activate the Execute Evaluation flow. |
| Execute Batch Evaluation | Performs batch evaluations, evaluating up to 100 completed virtual agent conversations. The flow is triggered when the Evaluation set is created or updated and the Evaluation Type is Conversation. |
Flow actions installed
| Flow action | Description |
|---|---|
| Randomize conversations | Randomly selects and returns 100 conversations from the results of a given query. |
| invokeApiDefinition | Invokes OneExtend Capability in the large language model (LLM). |
| Chat Classifier Eval | Gives the title, category, and whether the evaluation should be executed. |
| buildTranscript | Builds the transcript from a conversation. |
| evalExecuteCondition | Checks if the transcript is good enough to be evaluated. |
Script includes installed
| Script includes | Description |
|---|---|
| evalExecuteCondition | Use this script include to update the evaluation condition. |
| evalUtils | Primary utility functions for the Evaluator. |