drjohnchun
Tera Guru

I've been running a blog series on Data Sampling Techniques where the latest topic was on Statistical Sampling Using Scripted Filter. While the series is targeted towards those interested in data analysis and data quality, I felt a variation of the technique might be of general interest to a wider audience and use cases. So here's a technique on random sampling, that is, randomly selecting GlideRecords using a Script Include and a Scripted Filter. Some use cases might include:

  1. Randomly picking top 3 prize winners for those who responded to Service Desk Satisfaction Survey in the last month, the grand prize being an iPad! This would be a good way to increase response rates to any survey.
  2. As you're launching the new Service Portal, you want to promote the portal and self service by giving away prizes; the more the customers use the portal, the better chances they have at winning the prizes.
  3. You've noticed your Knowledge Base is being underutilized, so you'd like to promote the use by giving away prizes.
  4. With the year-end holidays are approaching, you want to give out prizes to your customers as part of marketing campaign.
  5. An auditor is asking for a random sample of 10 change records for the Accounts Payable system from the last 12 months.
  6. You as Process Manager would like to review 30 incident records from the last month as part of Continual Service Improvement program.

There may be numerous other use cases not listed here and I'd like to hear about yours. For more analytical data sampling techniques, please see my blog series.

Let's add some fun and excitement!

OVERVIEW

Here's a quick overview of what we'll do; more technical details can be found in my other blog. Here, we'll focus more on various use cases.

  1. Create a Script Include with a function we'll call randomSample().
  2. Call randomSample() from Condition Builders using a Scripted Filter.
  3. Retrieve and review the records.

SCRIPT INCLUDE randomSample()

Let's first create a Script Include with the randomSample() function; this is similar to the statSample() function from my other blog, without the statistical part. Here's how the function works:

  1. Takes the table name and encoded query string needed for querying and sample size; if a field other than sys_id is to be returned, specify it.
  2. Query the table and get the row count using .getRowCount().
  3. Pick a random row from the record set and save the specified field value or sys_id; repeat until the sample size number of unique values are collected.
  4. Return the saved field values in an array.

To create a new Script Include,

  1. Log in with admin role.
  2. Navigate to System Definition > Script Includes.
  3. Click on New button to create a new Script Include.
  4. Fill the form as in the screenshot below:
    find_real_file.png
  5. This can be either Global or Scoped; if Scoped, make sure to jot down the API Name to be used later in Scripted Filter.
  6. Since we want to use this from other applications, set Accessible from to All application scopes.
  7. For this to be used as a Scripted Filter, Client callable must be checked.
  8. In the Script field, paste the below script (also attached below as a file).
  9. Finally, click on Submit.

NOTE: I noticed an unexplained behavior that the function is called twice in a row when it's used in a Scripted Filter; the first call generates the list view and the results from the second call are displayed in the Condition Builder's breadcrumb, resulting in different sets of data between the breadcrumb and the list view. This would go unnoticed in most cases because the repeated calls bring back the same results. However, due to the random nature of randomSample(), the return values are different each time the function is called. I added some special handling to the script to ensure the results are identical for all calls by sampling only during the first call. I also ensured the function can be called by other scripts as a Script Include without an issue in case it's used outside of Scripted Filter.

/**

* Performs random sampling against a filtered list of GlideRecords.

* Takes table name and encoded query string then returns an array of specified field or sys_id of sample records.

*

* SCRIPTED FILTER EXAMPLE

*

* EXAMPLE 1: GET RANDOM SAMPLE USING ENCODED QUERY STRING

* [Sys ID] [is] [javascript:randomSample('incident', 'active=1', 30)]

* returns 30 sample records from the population size of 54,939

*

* NOTE: Scripted Filter runs in rhino.sandbox context so not all classes/objects are available for scripting.

* NOTE: The function is run twice in Scripted Filter somehow, so use randomSampleRecords to run only once.

*

* MIT License http://tldrlegal.com/l/mit

* Copyright (c) 2016 John.Chun @snowaid.com

*

* @param {string} tableName - name of table for GlideRecord

* @param {string} encodedQuery - encoded query string

* @param {int} sampleSize - number of records to have in sample

* @param {string} fieldName - name of field whose unique values are to be returned

* @return {string[]} array of sys_id of random sample records

*/

var randomSampleRecords = [];   // this is in rhino.sandbox context in Scripted Filter; otherwise in global

function randomSample(tableName, encodedQuery, sampleSize, fieldName) {

  if (randomSampleRecords.length) return randomSampleRecords;   // in Scripted Filter, force to run only once

  try {

      //var gr = new GlideRecordSecure(tableName);   // enforce ACL; GlideRecordSecure undefined in Scripted Filter in Helsinki

      var isScriptedFilter = !this.GlideRecordSecure;   // use the fact that GlideRecordSecure is undefined in Scripted Filter

      var gr = new GlideRecord(tableName);

      if (!gr.isValid()) throw 'Invalid table name "' + tableName + '".';

      if (!gr.canRead()) throw 'No permission to read from "' + tableName + '".';   // test ACL for table

      fieldName = fieldName || 'sys_id';   // default to sys_id

      if (gr.getElement(fieldName) == null) throw 'Field "' + fieldName + '" not found.';

      if (!(sampleSize > 0)) throw 'Sample size must be a positive integer.';

     

      // get population

      if (encodedQuery) gr.addQuery(encodedQuery);

      gr.query();   // to getRowCount()

      var population = gr.getRowCount();

      if (!population || population < sampleSize) throw 'Total number of rows ' + population + ' is less than sample size ' + sampleSize + '.';

      // throw dice and get a random sample

      var offsets = [], records = [];

      while (records.length < sampleSize) {

          var offset = Math.floor(Math.random() * population);   // 0 <= offset < population

          if (indexOf(offsets, offset) >= 0) continue;   // dupe offset, so rethrow dice

          offsets.push(offset);

          if (offsets.length >= population) break;   // tried entire population

          gr.chooseWindow(offset, offset + 1);   // works in global & scoped

          gr.query();

          if (gr.next()) {

              var value = gr.getElement(fieldName).toString();

              if (indexOf(records, value) < 0) records.push(value);

          }

      }

      if (isScriptedFilter) randomSampleRecords = records;   // in Scripted Filter, save randomSampleRecords

      return records;

  }

  catch(e) {

      return 'ERROR: ' + e;   // return error message

  }

 

  // emulates Array.prototype.indexOf() in older JavaScript

  function indexOf(arr, val) { for (var i = 0; i < arr.length; i++) if (arr[i] == val) return i; return -1; }

}

SERVICE DESK SATISFACTION SURVEY WINNERS

Let's pick three lucky winners among those who responded to Service Desk Satisfaction Survey last month. If someone responded to more than one survey, it increases the odds of winning (if not, they may not be motivated to respond to subsequent surveys). Sent-out surveys are stored in the Survey Instances [asmt_assessment_instance] table (the Survey Responses [asmt_assessment_instance_question] table contains a row for each question answered; unless you want to increase the odds based on the number of questions answered, the Instance table is a better choice). We'll look at only Service Desk Satisfaction Survey and whom they were sent out to, in the user field. We'll also filter the taken_on field to last month only. Since we're selecting people, we'll do all this in a list view for Users. Below is the summary of the parameter values:

ParameterValue
List ViewOrganization > Users
tableNameasmt_assessment_instance
encodedQuerymetric_type.name=Service Desk Satisfaction Survey^taken_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()
fieldNameuser.name
Scripted Filterjavascript:randomSample('asmt_assessment_instance', 'metric_type.name=Service Desk Satisfaction Survey^taken_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 3, 'user.name')

By default, the return value is an array of sys_id. However, you can pick any other field. For example, we can pick user.name, dot-walking to the user record's name field. We need to set the Condition Builder to

[Name] [is] [javascript:randomSample('asmt_assessment_instance', 'metric_type.name=Service Desk Satisfaction Survey^taken_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 3, 'user.name')]

Make sure this is the only filter condition. When you run this, the result is

find_real_file.png

The breadcrumb shows the three winners, in the order they were picked at random. The list view under it shows the three user records, in the sort order you defined, which, in this case, is by Name in descending order. If you have first, second, and third prizes, you'll want to use the breadcrumb. Depending on your rules, you may want to add a few backup winners so if the winners don't claim their prizes within a certain time, the prizes are given to backup winners. You may also want to use backup winners in case Service Desk staff members are picked but disqualified.

Every time you refresh this, you'll get different winners; you may want to make sure to have that one official drawing (refresh) for the prizes.

If your organization has people with the same name, you may want to use User ID instead since it should be unique:

[User ID] [is] [javascript:randomSample('asmt_assessment_instance', 'metric_type.name=Service Desk Satisfaction Survey^taken_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 3, 'user.user_name')]

find_real_file.png

If you really insist on using sys_id, here's what it looks like (user is a reference field that returns sys_id from the sys_user table):

[Sys ID] [is] [javascript:randomSample('asmt_assessment_instance', 'metric_type.name=Service Desk Satisfaction Survey^taken_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 3, 'user')]

find_real_file.png

Notice the breadcrumb is not as useful as before unless you can tell who's who from the sys_ids (you can hover over the User ID column and look at the link displayed at the bottom of the browser, if needed).

If you don't have permission to the Users [sys_user] table, you can run the Scripted Filter from other list views, such as Incident. Simply navigate to Incident > Open and set the Condition Builder as below:

[Number] [is] [javascript:randomSample('asmt_assessment_instance', 'metric_type.name=Service Desk Satisfaction Survey^taken_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 3, 'user.name')]

You can use Number or any other string field. This will show

find_real_file.png

This obviously doesn't show the names in the list view since it's not a Users list view but the names appear in the breadcrumb, as shown in the screenshot above.

KNOWLEDGE BASE USERS

You've noticed your Knowledge Base is being underutilized, so you'd like to promote the use by giving away prizes. You'll pick 3 winners from those who viewed KB articles last month; the more articles they viewed, the higher odds of winning. The data on who viewed which knowledge base article is stored in the Knowledge Use [kb_use] table. We'll use sys_updated_on and user fields to run similar conditions as before:

ParameterValue
List ViewOrganization > Users
tableNamekb_use
encodedQuerysys_updated_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()
fieldNameuser.name
Scripted Filterjavascript:randomSample('kb_use', 'sys_updated_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 3, 'user.name')

We need to set the Condition Builder to

[Name] [is] [javascript:randomSample('kb_use', 'sys_updated_onONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 3, 'user.name')]

Make sure to run this on the first day of the month; if a user viewed the same article last month as well as this month, the sys_updated_on field will only show this month's date, removing the record from the pool.

INCIDENT SAMPLE RECORDS FOR CONTINUAL SERVICE IMPROVEMENT

You as Process Manager would like to review 30 incident records from the last month as part of Continual Service Improvement program. Let's look at only the closed incident tickets and randomly select 30 records:

ParameterValue
List ViewIncident > Open (or any Incident list view)
tableNameincident
encodedQueryclosed_atONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()
fieldName
Scripted Filterjavascript:randomSample('incident', 'closed_atONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 30)

We need to set the Condition Builder to

[Sys ID] [is] [javascript:randomSample('incident', 'closed_atONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 30)]

This returns an array of sys_ids; another option is to return an array of Numbers and match on Number:

[Number] [is] [javascript:randomSample('incident', 'closed_atONLast month@javascript:gs.beginningOfLastMonth()@javascript:gs.endOfLastMonth()', 30, 'number')]

We've looked at a few practical use cases. This should give you ideas on how to use random sampling so you can use it for other cases. Enjoy and have fun!

Please feel free to connect, follow, post feedback / questions / comments, share, like, bookmark, endorse.

John Chun, PhD PMP see John's LinkedIn profile

visit snowaid

ServiceNow Advocate
10 Comments