Jon50
ServiceNow Employee

A step-by-step guide to debugging slow transactions

 

This is a quick walkthrough of one approach to identifying the source of slowness in a ServiceNow transaction.

 

In this example I have deliberately configured a slow query in the service portal and will show how you can use some of our debugging tools to determine the source of the slowness.

 

Start off by trying to recreate the issue. If you can't recreate it, you'll need a different approach, but I am not going to address that in this article.

 

We are going to investigate a slow service portal, so having confirmed that I can recreate the issue, I will try using the script from KB0744521 - How to identify a slow widget on a page, which should help me identify which widget is slow.

 

If you are looking at an issue that is not portal-related, you can simply skip this step and go straight to the Script Debugger.

 

Open the KB and copy the whole script.

 

Log into your instance and go to the portal that you want to investigate, in this case /esc.

 

Employee Service Centre portal

 

You now need to open your browser's developer tools. I am using Chrome, so I will click on the three vertical dots for more options (apparently this is known as the kebab menu....), then select More Tools and Developer Tools.

 

Open developer tools

 

Developer tools will open, typically at the bottom of the window. Click on the Console tab to switch to the script console.

 

Developer Tools Console

 

At the bottom of the window, you should see the console prompt, typically a greater-than sign. Paste in the contents of the script and press Return.

 

Paste the script into the console

 

The script will now run in your browser and you should notice red lines appear around all the widgets on the page. It messes up the display and alignment, but the key thing is that you can now see the individual widget load times. There is also a refresh button next to each load time that allows you to refresh individual widgets.

 

Running the script

 

Looking over the widgets, we can see that most of them load in a few hundred milliseconds; however, the header at the top of the page has taken 4039ms, so that is the most likely cause of our slowness.

 

ESC slow header

 

Note that the portal header is loaded on every page load and refresh. Any slowness in the header will impact the entire portal and result in a very poor user experience.

 

OK, so we've identified that the header is the most likely cause of the slowness. Let's now use the ServiceNow Script Debugger and see if we can find out why.

 

Duplicate the current tab and navigate to a classic (ie non-portal) UI page. In the navigator, type 'debug' and click on Enable All.

 

Enable All Debug

 

This will open the script debugger in a second browser window.

 

Script debugger

 

Navigate back to your original browser tab that is showing the portal and click on the refresh icon next to the slow widget time.

 

Refresh the portal header widget

 

This time it was even worse, at 4.8 seconds. Navigate back to the script debugger window and you should see all the debug output from the transaction.

 

Script debugger after header refresh

 

We've enabled all debug messages, so it tends to be very noisy, but at this stage we just want to see the transaction end statement, which contains the string '*** End' and should be at the bottom of the screen.

 

*** End log entry

Most interactive transactions on a ServiceNow instance will end with this line, which gives you an overview of the time the transaction took and where that time was spent, here it is in more detail:

 

*** End #47039 /api/now/sp/widget/d0009941eb103010ed7966d6475228c1 referer: /esc, user: jong, 
total time: 0:00:04.257,
processing time: 0:00:04.257,
CPU time: 0:00:00.175,
SQL time: 0:00:03.867 (count: 22),
ACL time: 0:00:00.017,
Script time: 0:00:00.003 (count: 4),
Total memory allocated: 26.18KB,
source: 148.139.0.229, type: rest, origin scope: sn_ex_sp, app scope: sn_ex_sp ,
method:POST, api_name:now/sp, resource:now/sp/widget/{sysId},
version:Default, user_id:543cbc6a472ac290b5b8e3cc236d435e,
response_status:201

We can see that the transaction time (on the server) was 4.257 seconds, and of that, 3.867 seconds were spent in SQL time, ie querying the database. This indicates that we need to look at the SQL queries to find the source of the slowness.

 

SQL is a common cause of slowness, either through poor query design or a lack of supporting indexes. Other common causes to look out for are business rules (BR time) and access control lists (ACL time).

 

The ServiceNow platform automatically logs slow activities, such as SQL queries (>100ms), business rules (>100ms) and ACLs (>5ms); these are always logged, without enabling additional debugging.

 

SQL and BR time will also be followed by a count, which indicates the number of queries or rules executed in the transaction. A very high count for either of these can also provide clues to the source of the slowness.

 

In our example, we see SQL time: 0:00:03.867 (count: 22), which gives us an average query time of about 176ms. The volume is low, but the time is high, which typically indicates a slow query or queries.
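If you want to check the arithmetic yourself, the timings in the End line are in h:mm:ss.mmm format. Here is a small snippet (plain JavaScript, runnable in the browser console you already have open) that converts the logged SQL time to milliseconds and computes the per-query average; the duration string and count are taken from the log line above:

```javascript
// Convert a transaction-log duration like "0:00:03.867" (h:mm:ss.mmm) to ms.
function durationToMs(duration) {
  const [h, m, s] = duration.split(":");
  return ((Number(h) * 60 + Number(m)) * 60 + Number(s)) * 1000;
}

const sqlMs = durationToMs("0:00:03.867"); // SQL time from the *** End line
const avgMs = sqlMs / 22;                  // divided by the SQL count
// avgMs is roughly 175.8ms per query
```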

 

Back to the script debugger then. As we enabled ALL debug, there will be a lot of information displayed for the transaction. You can turn this down to focus on the specific area of interest; in our case, we're looking for a slow SQL query that we expect to take more than 100ms, so it should appear in the normal 'Log' output.

 

At the top of the script debugger, you will see a dropdown for Filters. Expand it and deselect all options except Log (the other options are self-explanatory).

 

Debug Output filters

 

Having done that, the main window will only show the Log output.

 

Script debugger log only

 

In our example, this neatly shows us a couple of lines at the start of the transaction, then a single slow SQL query, followed by the End transaction summary.

 

We have now identified the slow SQL query that is causing our slow portal:

 

04:18:22.554 Time: 0:00:03.813 id: empjgparis_1[glide.3] primary_hash=-1769840712 category=homepage_esc_portal_task_count (connpid=964279) for: 
SELECT count(*) AS recordcount
FROM (((task task0 LEFT JOIN task task1 ON task0.`sys_id` = task1.`sys_id` AND task1.`sys_class_name` IN ('incident')) LEFT JOIN sys_user sys_user2 ON task1.`a_ref_6` = sys_user2.`sys_id`) LEFT JOIN sys_user sys_user3 ON task1.`a_ref_4` = sys_user3.`sys_id`)
WHERE (((task0.`short_description` LIKE '%homer%' OR sys_user2.`name` LIKE '%homer%' OR sys_user3.`name` LIKE '%homer%')) OR (task0.`short_description` LIKE '%bart%' AND (sys_user2.`first_name` LIKE '%bart%' OR sys_user2.`last_name` LIKE '%simpson%')))
/* empjgparis001, gs:491995E71b03061072dcca64604bcbc7, tx:fa5c55ef1bcfc21072dcca64604bcba6, hash:-1769840712 */

But where is it coming from?

 

Fortunately, the platform keeps track of database queries in the Slow Queries module. Most queries in the log will be displayed with a primary_hash value. The hash is based on the query structure, without taking into account the specific values used in the filter, which allows the platform to track query performance without having to track every individual query.

 

In our example we can see primary_hash=-1769840712 and we can use this value to lookup the hash in the Slow Queries module.
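To illustrate the idea (this is a conceptual sketch in plain JavaScript, not the platform's actual hashing algorithm), you can think of the primary hash as a hash of the query with its literal values stripped out, so two queries that differ only in their search terms map to the same hash:

```javascript
// Conceptual sketch only: hash the query "shape", ignoring literal values.
function primaryHashOf(sql) {
  const shape = sql.replace(/'[^']*'/g, "?"); // replace string literals
  let hash = 0;                               // simple 32-bit rolling hash
  for (const ch of shape) {
    hash = (hash * 31 + ch.charCodeAt(0)) | 0;
  }
  return hash;
}

const h1 = primaryHashOf("SELECT count(*) FROM task WHERE short_description LIKE '%homer%'");
const h2 = primaryHashOf("SELECT count(*) FROM task WHERE short_description LIKE '%bart%'");
// h1 === h2: same query structure, different search terms
```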

 

Back in the instance, start by disabling debug as we don't need it anymore and it generates a lot of noise.

 

In the classic UI, search the navigator for 'debug' again and this time select Disable All.

 

Disable All Debug

 

You may be prompted to Confirm.

 

Now we want to open the Slow Queries module, so again in the navigator filter, type 'Slow Q' and click on the Slow Queries module.

 

Slow Queries module

This will open the Slow Queries table in list view.

 

Slow Queries list view

 

It's always a good idea to check this list from time to time, sort it on Average Execution time and/or Total Execution time to see what your slowest queries are individually and overall. Also, don't forget to add the Last Sighted column, as sometimes you could end up looking into a slow query that hasn't been seen recently.

 

In our example, we have the hash for the query that we are interested in, so expand the list filter and filter on primary hash = -1769840712.

 

Slow Queries hash filter

 

And there is our query. Click on the Average execution time link to open it.

 

Slow Queries hash detail

 

This will display the details for the query hash.

 

In addition to the query stats, we also log and display an example stack trace and URL from a recent query execution. Note that the same query can be called by different transactions, so the URL and stack trace can change and may not be for the transaction you are looking into.

 

In our example, we can see that the URL matches the widget we are investigating, and so we can be confident that this is the actual stack trace for our portal issue.

 

I'm not going to go into detail on Java stack traces here; suffice it to say that it is a list of the Java method calls that were in progress at the time the query was logged. We can therefore use it to determine the source of the query.

 

Looking through our trace and reading from bottom to top:

 

Default-thread-13
com.glide.db.stats.AbstractMetricStats.presentSampleToListeners(AbstractMetricStats.java:26)
com.glide.db.stats.query.QueryStats.recordSample(QueryStats.java:49)
com.glide.db.stats.query.QueryStats.lambda$recordQuery$0(QueryStats.java:36)
java.base/java.util.Optional.ifPresent(Optional.java:178)
com.glide.db.stats.query.QueryStats.recordQuery(QueryStats.java:26)
com.glide.db.DBI.saveStats(DBI.java:2105)
com.glide.db.DBI.sqlVerbosity(DBI.java:2096)
com.glide.db.DBI.executeStatement(DBI.java:1539)
com.glide.db.DBQueryExecutor.executeStatement(DBQueryExecutor.java:362)
com.glide.db.DBQueryExecutor.executeStatement(DBQueryExecutor.java:349)
com.glide.db.DBQueryExecutor.executeWithCategoryRouting(DBQueryExecutor.java:124)
com.glide.db.DBQueryExecutor.execute(DBQueryExecutor.java:112)
com.glide.db.DBQuery.executeAsResultSet0(DBQuery.java:773)
com.glide.db.DBQuery.executeAndReturnTable(DBQuery.java:686)
com.glide.db.DBAction.executeNormal(DBAction.java:382)
com.glide.db.DBAction.executeAndReturnException(DBAction.java:330)
com.glide.db.RDBMSQueryContext.executeQuery(RDBMSQueryContext.java:42)
com.glide.db.DBQuery.execute(DBQuery.java:3716)
com.glide.db.DBQuery.execute(DBQuery.java:3684)
com.glide.db.meta.Table.queryAggregate(Table.java:499)
com.glide.db.meta.Table.query(Table.java:231)
com.glide.script.GlideRecordITable.query(GlideRecordITable.java:129)
com.glide.script.GlideRecord.query0(GlideRecord.java:3511)
com.glide.script.GlideRecord.query(GlideRecord.java:3209)
com.glide.script.GlideAggregate.query(GlideAggregate.java:710)
com.glide.script.GlideRecord.jsFunction_query(GlideRecord.java:3092)
com.glide.script.fencing.ScopedGlideAggregate.jsFunction_query(ScopedGlideAggregate.java:64)
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.base/java.lang.reflect.Method.invoke(Method.java:568)
org.mozilla.javascript.MemberBox.invoke(MemberBox.java:151)
org.mozilla.javascript.FunctionObject.doInvoke(FunctionObject.java:717)
org.mozilla.javascript.FunctionObject.call(FunctionObject.java:654)
org.mozilla.javascript.ScriptRuntime.doCall(ScriptRuntime.java:2973)
org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:1720)
org.mozilla.javascript.Interpreter.interpret(Interpreter.java:1030)
org.mozilla.javascript.InterpretedFunction.lambda$call$0(InterpretedFunction.java:161)
com.glide.caller.gen.sys_script_include_fb5a924473b7130030f331d7caf6a764_script.call(Unknown Source)
com.glide.script.ScriptCaller.call(ScriptCaller.java:22)
org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:160)
org.mozilla.javascript.ScriptRuntime.doCall2(ScriptRuntime.java:3058)
org.mozilla.javascript.ScriptRuntime.doCall(ScriptRuntime.java:2981)
org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:1720)
org.mozilla.javascript.Interpreter.interpret(Interpreter.java:1030)
org.mozilla.javascript.InterpretedFunction.lambda$call$0(InterpretedFunction.java:161)
com.glide.caller.gen.sp_widget_d0009941eb103010ed7966d6475228c1_script.call(Unknown Source)
com.glide.script.ScriptCaller.call(ScriptCaller.java:22)
org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:160)
org.mozilla.javascript.ScriptRuntime.doCall2(ScriptRuntime.java:3058)
org.mozilla.javascript.ScriptRuntime.doCall(ScriptRuntime.java:2981)
org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:1720)
org.mozilla.javascript.Interpreter.interpret(Interpreter.java:1030)
org.mozilla.javascript.InterpretedFunction.lambda$call$0(InterpretedFunction.java:161)
com.glide.caller.gen.sp_widget_d0009941eb103010ed7966d6475228c1_script.call(Unknown Source)
com.glide.script.ScriptCaller.call(ScriptCaller.java:22)
org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:160)
org.mozilla.javascript.ScriptRuntime.doCall2(ScriptRuntime.java:3058)
org.mozilla.javascript.ScriptRuntime.doCall(ScriptRuntime.java:2981)
org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:1720)
org.mozilla.javascript.Interpreter.interpret(Interpreter.java:1030)
org.mozilla.javascript.InterpretedFunction.lambda$call$0(InterpretedFunction.java:161)
com.glide.caller.gen.sp_widget_d0009941eb103010ed7966d6475228c1_script.call(Unknown Source)
com.glide.script.ScriptCaller.call(ScriptCaller.java:22)
org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:160)
org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:725)
org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:4024)
org.mozilla.javascript.InterpretedFunction.exec(InterpretedFunction.java:173)
com.glide.script.ScriptEvaluator.execute(ScriptEvaluator.java:412)
com.glide.script.ScriptEvaluator.evaluateString(ScriptEvaluator.java:223)
com.glide.script.ScriptEvaluator.evaluateString(ScriptEvaluator.java:141)
com.glide.service_portal.SPScriptEvaluator.eval(SPScriptEvaluator.java:37)
com.glide.service_portal.widget.SPScriptRunner.eval(SPScriptRunner.java:228)
com.glide.service_portal.widget.SPScriptRunner.runScript(SPScriptRunner.java:102)
com.glide.service_portal.widget.SPWidget.get(SPWidget.java:111)
com.glide.service_portal.widget.SPWidget.handleWidget(SPWidget.java:427)
com.glide.service_portal.SPRestService.widget(SPRestService.java:158)
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.base/java.lang.reflect.Method.invoke(Method.java:568)
com.glide.rest.handler.impl.ServiceHandlerImpl.invokeService(ServiceHandlerImpl.java:44)
com.glide.rest.processors.RESTAPIProcessor.process(RESTAPIProcessor.java:345)
com.glide.processors.AProcessor.runProcessor(AProcessor.java:762)
com.glide.processors.AProcessor.processTransaction(AProcessor.java:313)
com.glide.processors.ProcessorRegistry.process0(ProcessorRegistry.java:187)
com.glide.processors.ProcessorRegistry.process(ProcessorRegistry.java:175)
com.glide.ui.GlideServletTransaction.process(GlideServletTransaction.java:58)
com.glide.sys.Transaction.run(Transaction.java:2734)
com.glide.ui.HTTPTransaction.run(HTTPTransaction.java:35)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:840)

What we are looking out for here are ServiceNow Sys IDs, the 32-character hexadecimal strings that are used to identify records in the ServiceNow platform. You won't always find them, but if the query has been generated by a widget, business rule, ACL, script or script include, you will typically see them in the stack and be able to work back to the source.
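Scanning a long trace by eye is error-prone. Since Sys IDs are 32-character hex strings, a short snippet (plain JavaScript, runnable in the browser console) can pull the candidates out of a pasted trace for you:

```javascript
// Extract candidate Sys IDs (32-char hex strings) from a stack trace.
function extractSysIds(stackTrace) {
  const ids = stackTrace.match(/[0-9a-f]{32}/g) || [];
  return [...new Set(ids)]; // dedupe IDs repeated across frames
}

// Two of the generated-class frames from the trace above:
const trace = [
  "com.glide.caller.gen.sys_script_include_fb5a924473b7130030f331d7caf6a764_script.call(Unknown Source)",
  "com.glide.caller.gen.sp_widget_d0009941eb103010ed7966d6475228c1_script.call(Unknown Source)",
  "com.glide.caller.gen.sp_widget_d0009941eb103010ed7966d6475228c1_script.call(Unknown Source)",
].join("\n");

const ids = extractSysIds(trace);
// ids: ["fb5a924473b7130030f331d7caf6a764", "d0009941eb103010ed7966d6475228c1"]
```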

 

In this case, reading from bottom to top, the first one we come to is:

com.glide.caller.gen.sp_widget_d0009941eb103010ed7966d6475228c1_script.call(Unknown Source)

which helpfully indicates what it is and where we can find it.

 

sp_widget, as the name implies, indicates it is a Service Portal Widget and we can find it in the sp_widget table.

 

Rather less helpfully, (Unknown Source) at the end of the line indicates that the calling line number is not known, but hey, at least we have the script.

 

Open the sp_widget table in the classic UI by typing sp_widget.LIST in the navigator filter and then expand the filter and search for Sys ID d0009941eb103010ed7966d6475228c1.

 

SP Widget filtered list

As expected, this shows us that the calling widget is indeed the Employee Center Header, and we can click on that to open the detail.

 

We can see another Sys ID further up the stack trace and that is therefore 'closer' to the query.

 

com.glide.caller.gen.sys_script_include_fb5a924473b7130030f331d7caf6a764_script.call(Unknown Source)

 

This is the last Sys ID we see before the query, and we can see that the class is sys_script_include, ie it is a Script Include. So again, in the navigator filter, enter sys_script_include.LIST and filter on the Sys ID.

 

Sys Script Include filtered list

This shows us that the script include is called todoPageUtils. Click on the name to view the details.

 

todoPageUtils Script Include

 

We have (Unknown Source) in the stack trace again, which doesn't help us identify where in this script the query was called, so the only alternative is to roll up your sleeves and get stuck into the JavaScript.

 

If we go back to the header widget and open the details, we can search the server script for calls to todoPageUtils. This gives us a single match:

 

var myTodosCountObject = new sn_hr_sp.todoPageUtils().getMyTodosCount(false, false, true);
 
This shows us that it is calling getMyTodosCount. We can also see that the result is stored in the data.todoCount variable, and looking in the HTML template, we can see that this is used in the My Tasks indicator at the top right of the page.
 
<li class="main-header-item" ng-if="data.showTodos" aria-label="${My Tasks}, {{data.todoCountTranslation}}" role="menuitem">
     <a href="?id=hrm_todos_page" id="todos_mobile" aria-label="${My Tasks}, {{data.todoCountTranslation}}">
         <span ng-bind-html="'${My Tasks}'" aria-hidden="true"></span>
         <span ng-show="data.todoCount" class="label label-as-badge label-primary">{{data.todoCount}}</span>
     </a>
</li>
 

Going back to the script include, we can review the getMyTodosCount code and see that it is getting the list of active To-Do configurations from the sn_hr_sp_todos_config table, so let's take a look in there.

 

Use sn_hr_sp_todos_config.LIST to open the table.
 
ToDos Configurations list

Now it is a case of trying to match the slow query that we saw with one of these configurations.

 

In this example, we can see that the only match is with the Simpsons Tasks configuration, so let's open that.

 

ToDos Configuration details

 

We finally get to see where the query that is causing our slow portal is being generated.

 

I know, this is an awful example, but it was surprisingly hard to generate a slow query on a limited data set. In this filter, I have broken three cardinal rules of good ServiceNow query performance:

 

  • Using ORs
  • Using CONTAINS
  • Using filters on dot-walked fields

all of which are terrible for performance.
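To see why the first two rules matter, here is a small plain-JavaScript illustration (nothing ServiceNow-specific) of the difference between a CONTAINS-style match, which must examine every row, and a prefix-style match, which can use a sorted index to jump straight to the matching rows:

```javascript
// A sorted array standing in for a database index on a column.
const names = ["bart", "homer", "lisa", "maggie", "marge"];

// CONTAINS ('%term%') cannot use the index: every row must be examined.
function containsScan(rows, term) {
  let comparisons = 0;
  const hits = rows.filter(r => { comparisons++; return r.includes(term); });
  return { hits, comparisons };
}

// STARTSWITH ('term%') can binary-search the index for the first match.
function prefixSeek(rows, prefix) {
  let lo = 0, hi = rows.length, comparisons = 0;
  while (lo < hi) { // find the first row >= prefix
    const mid = (lo + hi) >> 1;
    comparisons++;
    if (rows[mid] < prefix) lo = mid + 1; else hi = mid;
  }
  const hits = [];
  while (lo < rows.length && rows[lo].startsWith(prefix)) hits.push(rows[lo++]);
  return { hits, comparisons };
}
```

On five rows the difference is trivial, but on a task table with millions of rows that full scan is what turns a widget load into seconds of SQL time.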

 

Having identified the problem, what could we potentially do to solve it?

 

Assuming that the ToDo configuration is intentional and necessary, we would need to address the performance of the query itself and/or prevent it from being called in the portal header.

 

For the query, we need to make it performant, and that means addressing whatever is causing the slowness; typically, it's one of the examples given above. If that can't be addressed, the only solution is to remove the query from the header, which will impact the functionality.

 

Clearly, this would need careful consideration, but I've worked with customers who have opted to either remove the My Tasks count indicator entirely or customise it to break the original query into smaller, more performant queries and then use JavaScript to combine the results.
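As a sketch of that second option (plain JavaScript with made-up data, not Glide API code), the idea is to run each condition as its own smaller, index-friendly query and combine the results in script, taking care not to double-count rows that match more than one condition:

```javascript
// Made-up rows standing in for task records.
const tasks = [
  { short_description: "help homer", assigned_to: "lisa" },
  { short_description: "printer jam", assigned_to: "bart" },
  { short_description: "homer again", assigned_to: "bart" },
];

// Each filter stands in for one small server-side query.
const byDescription = tasks.filter(t => t.short_description.includes("homer"));
const byAssignee = tasks.filter(t => t.assigned_to === "bart");

// Combine via a Set of row indexes so rows matching both conditions
// count once; naively summing the two counts would give 4, not 3.
const matched = new Set();
byDescription.forEach(t => matched.add(tasks.indexOf(t)));
byAssignee.forEach(t => matched.add(tasks.indexOf(t)));
const todoCount = matched.size; // 3
```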

 

Closing Thoughts

I have seen this problem, or something very similar, many times over the years.

 

Each time the problem was in production before it was noticed, and was either causing a poor user experience, or in the worst case, service interruption.

 

These problems can easily be identified and avoided in pre-production testing, but you have to have a pre-production instance that has a full copy of your production data set. This is often omitted due to cost and data security implications, but there really is no excuse. You simply cannot test code or configurations on a small subset of your data and expect it to be a valid test.

 

Further reading

There is an excellent series of community articles linked to the @GTSPerformance account that I strongly recommend. Of particular relevance to this article are:

In addition to this, there is a great performance admin guide on our Customer Success site:

 

Support?

If you have ServiceNow support, use it. If you need assistance trying to identify and address a slow transaction, try the steps outlined here. Even if you only get through a few, you will have some additional information that you can provide to support, which will typically lead to a faster resolution.

 

 

Version history
Last update:
‎07-04-2024 02:29 AM