
- Post History
- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Printer Friendly Page
- Report Inappropriate Content
09-01-2018 05:45 AM - edited 12-10-2023 02:38 PM
NOTE: MY POSTINGS REFLECT MY OWN VIEWS AND DO NOT NECESSARILY REPRESENT THE VIEWS OF MY EMPLOYER, ACCENTURE.
DIFFICULTY LEVEL: ADVANCED
Assumes having taken the class SSNF and has good advanced level of knowledge and/or familiarity with Scripting in ServiceNow.
So why am I pursuing this, what might seem, very boring topic? Well, this need has come up so often, and it is so inadequately addressed with the platform; that it needs a library of such functions to meet that need!
The idea here is to provide a fuzzy search that will allow the developer the ability to find close matches. Data-being-data you most likely will NOT be in control of what you are asked to process. Thus the need for this sort of thing!
So, the questions that present themselves: How many times have you had to go manually looking for a particular user in the sys_user table and been frustrated by effort? Sometime it takes you a bit to track them down via the source field? Wouldn't it be nice to have something human-like go through and find such things for you and just present you with the results? Quickly?
Answer for me: You bet! Happens all the time!
Okay, getting off my soap-box here. 🙂
In this next-to-last article on string search algorithms (yes, yes, there will be one more) I will present a really great fuzzy search method: Jaro-Winkler Distance.
I searched the internet over (sounds like an old Hee-Haw song doesn't it?), and found several implementations in JavaScript. I converted a few of these, that looked reasonable, into a single Script Include Library, and then created a Fix Script that utilized the same data and results from my previous article for comparison purposes.
Note: I have attached the Fix Script and Script Includes I created to this article for download. You accept the install as-is and unsupported if you decide to use it.
See: Community Code Snippets: Scripted String Matching Techniques - Part 2 (Levenshtein)
In running the Fix Script the results were very interesting! For the "Buddy" matches I used my Levenshtein results as control values.
The results below are from the Fix Script. I edited them to include the control information and the variance (if significant):
*** Script: --->
Jordan Thomas script:
aaapppp: 0
frog: 92.50
fly: 0
elephant: 44.17
hippo: 44.17
hippo: 0
hello: 88.00
ABC : 90.67
D N H Enterprises: 95.25
My Gym 94.20
PENNSYLVANIA: 89.80
Buddy: 66.71
Buddy (flipped): 66.71
Shannan Archer script:
aaapppp: 0
frog: 92.5
fly: 0
elephant: 44.166666666666664
hippo: 44.166666666666664
hippo: 0
hello: 88.0
ABC : 92.22222222222223
D N H Enterprises: 95.25189786059352
My Gym 95.16666666666667
PENNSYLVANIA: 91.501554001554
Buddy: 66.7055167055167
Buddy (flipped): 66.7055167055167
thsig script (Apache StringUtils Java library as control values):
aaapppp: 0
frog: 75.0
fly: 0
elephant: 58.25
hippo: 44.166666666666664
hippo: 0
hello: 88.96000000000001
ABC : 90.91228070175438
D N H Enterprises: 87.82239477891652
My Gym 86.39526508336331
PENNSYLVANIA: 83.47820512820513
Buddy: 90.69427268180065
Buddy (flipped): 87.10764538417449
As you can see the differences from the control values were not that far off! I was especially interested in the Levenshtein accuracy. Overall not to different from the Apache Java library, and more accurate than the Levenshtein algorithm.
Okay, so now leaning toward Jaro-Winkler Distance as the better string matching method.
In my next article I will cover two other algorithms and we will see if they work better than Jaro-Winkler. They are:
- Smith-Waterman
- Needleman-Wunsch
Enjoy!
Steven Bell.
If you find this article helps you, don't forget to log in and mark it as "Helpful"!
Originally published on: 09-01-2018 7:45 AM
I updated the code and brought the article into alignment with my new formatting standard.
- 2,091 Views