
Stopword list not working in Predictive Intelligence clustering

GijsBeerens
Tera Contributor

Hi all, 

I am currently working on a clustering solution for incidents. However, much of our incident content is generated from templates, which causes my clustering solution to register the template text as clusters.

Problem: The stopword list I am using does not seem to work, whichever format I enter it in.

Example: 
What I want to exclude: "BSO concerned:"
What I have tried as stopwords in separate stopword configurations:

  • "BSO concerned"
  • "BSO concerned:" 
  • BSO, concerned
  • BSO,concerned
  • BSO, concerned: 

However, these terms still show up in my cluster analysis. Does anyone have an idea what I might be doing wrong?

Looking forward to your replies! 

 

 



1 REPLY

Community Alums
Not applicable

@GijsBeerens This might be a tokenization issue. Text preprocessing in clustering often breaks text down into tokens (words), so "BSO concerned:" may be split into separate tokens such as "BSO" and "concerned". In that case, listing "BSO concerned" as a single stopword will not work. Try this and see if it works for you.
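
To illustrate the tokenization point, here is a minimal Python sketch of a generic word-level stopword filter. It is not ServiceNow's actual preprocessing: the regex tokenizer and the function names `tokenize` and `remove_stopwords` are hypothetical, and the sketch assumes that text is lowercased, punctuation is dropped, and stopwords are compared against single tokens.

```python
import re

def tokenize(text):
    # Assumed tokenizer: lowercase, then keep runs of letters/digits.
    # Punctuation such as ":" is dropped and spaces separate tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stopwords(text, stopwords):
    # Each token is compared against the stopword set (case-insensitive).
    stop = {s.lower() for s in stopwords}
    return [t for t in tokenize(text) if t not in stop]

sample = "BSO concerned: printer offline"

# A multi-word entry never equals a single token, so nothing is removed.
phrase_result = remove_stopwords(sample, ["BSO concerned:"])

# Single-word entries match the individual tokens.
word_result = remove_stopwords(sample, ["BSO", "concerned"])

print(phrase_result)  # ['bso', 'concerned', 'printer', 'offline']
print(word_result)    # ['printer', 'offline']
```

Under these assumptions, a phrase entry like "BSO concerned:" can never match, while one word per entry can. Whether the platform expects one stopword per line, comma-separated values, or some other format is something to confirm in the product documentation.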