Minimum number of records for a Similarity model

Johan H · 08-28-2023

We have a use case where we want to predict skills per group for a smaller number of groups, 16 at the moment. The groups are the traditional IT groups such as Network, Server, Voice and so on.

As a ServiceNow developer with less than a year of experience, I assume I need one similarity model per group, each targeting a specific set of skills. And yes, I've created one skill type per team. So, as incidents are dispatched to the team in scope, PI predicts the skill from a subset of our skills based on the short description.

Now, not all groups will have the 10 skills required to set up and train a similarity model. You can change the value through glide.platform_ml.api.min_similarity_window_records, which I'm tempted to do. 5 seems like a more reasonable number from this perspective. With fewer than 5 skills, all group members should be able to cover all types of tickets assigned to them.
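To make the threshold question concrete, here is a minimal Python sketch (not ServiceNow code) that counts skills per group and lists which groups fall below a given minimum. The group names, skill names, and counts are made-up sample data, and `groups_below_threshold` is a hypothetical helper; only the values 10 and 5 come from the post.

```python
from collections import Counter

DEFAULT_MINIMUM = 10   # current minimum mentioned in the post
PROPOSED_MINIMUM = 5   # value being considered

# Hypothetical (group, skill) pairs; in practice these would come from the skill records.
pairs = (
    [("Network", f"network-skill-{i}") for i in range(12)]
    + [("Server", f"server-skill-{i}") for i in range(7)]
    + [("Voice", f"voice-skill-{i}") for i in range(3)]
)


def groups_below_threshold(group_skill_pairs, minimum):
    """Return the sorted names of groups that have fewer than `minimum` skills."""
    counts = Counter(group for group, _skill in group_skill_pairs)
    return sorted(group for group, n in counts.items() if n < minimum)


print(groups_below_threshold(pairs, DEFAULT_MINIMUM))
print(groups_below_threshold(pairs, PROPOSED_MINIMUM))
```

With this sample data, lowering the minimum from 10 to 5 brings the Server group into scope, while Voice still falls short and would be handled without a model.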

But, to the more experienced developers out there: what are the downsides of tweaking this number from 10 to 5? I assume less accurate predictions from the models; anything else?

Johan H · 08-30-2023

Please correct me if I'm wrong, but similarity models don't have a configurable output field.

Our similarity models compare the keywords from cmn_skill to the short_description from incident, which is the test table. Each team involved has its own similarity model, and each model only looks at skills configured with the skill level type related to that team. So each team has its own set of skills.
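As a rough illustration of the per-team setup described above, the Python sketch below ranks one team's skills against a short description using plain token overlap (Jaccard similarity). This is not the algorithm Predictive Intelligence uses; it is a stand-in technique, and the skill names, keywords, and the `rank_skills` helper are all hypothetical.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_skills(short_description, skill_keywords):
    """Rank skills by Jaccard overlap between their keywords and the description."""
    desc = tokens(short_description)
    scores = []
    for skill, keywords in skill_keywords.items():
        kw = tokens(keywords)
        union = desc | kw
        score = len(desc & kw) / len(union) if union else 0.0
        scores.append((skill, score))
    # Highest score first; ties keep their original order.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical skills for one team only, mirroring "one model per team".
network_skills = {
    "Firewall": "firewall rule acl blocked port",
    "VPN": "vpn tunnel remote access",
    "Wireless": "wifi wireless access point ssid",
}

ranked = rank_skills("VPN tunnel drops for remote users", network_skills)
print(ranked[0])
```

Because each team only sees its own skill set, a model with very few skills has very few candidates to rank, which is one way to think about the minimum-record limit.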

The question now is whether I should create a separate word corpus for each team, looking only at closed incidents assigned to that team in the last 6 months, or whether I get the same accuracy using one word corpus for all models, filtering on all closed incidents from the last 6 months.
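The two corpus options can be sketched outside the platform like this. It is a minimal Python illustration, not how ServiceNow builds a word corpus: incidents are plain dictionaries with made-up data, `build_corpus` is a hypothetical helper, and "6 months" is approximated as 182 days.

```python
import re
from collections import Counter
from datetime import date, timedelta


def build_corpus(incidents, as_of, group=None, window_days=182):
    """Count words in short descriptions of incidents closed within the window.

    If `group` is given, only that team's incidents are used (per-team corpus);
    otherwise all teams contribute (shared corpus).
    """
    cutoff = as_of - timedelta(days=window_days)
    corpus = Counter()
    for inc in incidents:
        closed_at = inc["closed_at"]
        if closed_at is None or closed_at < cutoff:
            continue  # still open, or closed before the window
        if group is not None and inc["assignment_group"] != group:
            continue
        corpus.update(re.findall(r"[a-z0-9]+", inc["short_description"].lower()))
    return corpus


# Hypothetical sample incidents.
incidents = [
    {"assignment_group": "Network", "short_description": "VPN tunnel down", "closed_at": date(2023, 8, 1)},
    {"assignment_group": "Network", "short_description": "Switch port down", "closed_at": date(2023, 7, 15)},
    {"assignment_group": "Voice", "short_description": "Desk phone has no dial tone", "closed_at": date(2023, 6, 20)},
    {"assignment_group": "Voice", "short_description": "Old voicemail issue", "closed_at": date(2022, 12, 1)},
    {"assignment_group": "Server", "short_description": "Disk full", "closed_at": None},
]

as_of = date(2023, 8, 30)
shared = build_corpus(incidents, as_of)
network_only = build_corpus(incidents, as_of, group="Network")
print(len(shared), len(network_only))
```

The shared corpus carries vocabulary from every team, while the per-team corpus only contains words that team's tickets actually use. Which one gives better accuracy is not something this sketch can answer; that would need to be compared on real data.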

The linked skill is not stored on the task object but in the task_m2m_skill table.