Since Fuji, administrators can add indexes to their own instances as needed. Fellow support agent Bill Tang wrote in detail about how you can "Improve your ServiceNow instance performance by creating database indexes via the User Interface." To expand on that function, I want to focus on what you can do if you have a slow query in the slow query log and want to know how to make that specific query faster.
A note of caution, though: adding too many indexes, even ones that work well for a particular query, can hurt the table's overall performance. Always feel free to consult with your support engineers to determine whether a new index will be beneficial.
Example of a slow query log

First, you need to analyze the Slow Queries log under System Diagnostics, and to do that you need to be able to interpret all the information given in every log entry. I have used my instance as an example of how, and in which order, to read the slow query information.

First, I sort my list descending by Total execution time - this is where I can make the biggest difference, because these slow queries are executed most often. I usually don't worry about queries with an average execution time in the single or double digits, but everything higher than that I analyze more closely. Once I have finished my analysis of those findings, I move on to the queries with the highest average execution time.

Let us look at a single query that needs improvement and analyze it.

Slow Queries Example:

```sql
SELECT ... FROM (sys_dictionary sys_dictionary0
  INNER JOIN sys_metadata sys_metadata0
  ON sys_dictionary0.`sys_id` = sys_metadata0.`sys_id`)
WHERE sys_dictionary0.`choice_table` IS NOT NULL
```

Total execution time: 50 minutes
Execution count: 40,091
Average execution time (ms): 75.72

Example stack trace:

```
glide.scheduler.worker.0
com.glide.db.query_stats.QueryPattern.setStackTrace(QueryPattern.java:59)
com.glide.db.query_stats.QueryStats.handleMetaData(QueryStats.java:111)
com.glide.db.query_stats.QueryStats.addToNursery(QueryStats.java:86)
com.glide.db.query_stats.QueryStats.recordQuery(QueryStats.java:74)
com.glide.db.query_stats.QueryStats.recordQuery(QueryStats.java:63)
com.glide.db.DBI.saveStats(DBI.java:1035)
com.glide.db.DBI.sqlVerbosity(DBI.java:1029)
com.glide.db.DBI.executeStatement(DBI.java:785)
com.glide.db.DBI.executeStatement(DBI.java:766)
com.glide.db.DBQuery.executeAsResultSet0(DBQuery.java:290)
com.glide.db.DBQuery.executeAsResultSet0(DBQuery.java:265)
com.glide.db.DBQuery.executeAndReturnTable(DBQuery.java:231)
com.glide.db.DBAction.executeNormal(DBAction.java:197)
com.glide.db.DBAction.executeAndReturnException(DBAction.java:166)
com.glide.db.RDBMSQueryContext.executeQuery(RDBMSQueryContext.java:46)
com.glide.db.DBQuery.execute(DBQuery.java:1772)
com.glide.choice.ChoiceDependency.<init>(ChoiceDependency.java:39)
com.glide.choice.ChoiceDependency.get(ChoiceDependency.java:28)
com.glide.choice.ChoiceChangeListener.expandMe(ChoiceChangeListener.java:49)
com.glide.choice.ChoiceChangeListener.flushChanges(ChoiceChangeListener.java:37)
com.glide.db.AChangeListener.onExecute(AChangeListener.java:43)
com.glide.db.DBAction.processListeners(DBAction.java:143)
com.glide.db.DBAction.executeAndReturnException(DBAction.java:171)
com.glide.script.GlideRecordITable.delete(GlideRecordITable.java:177)
com.glide.script.GlideRecord.delete(GlideRecord.java:4913)
com.glide.script.system.GlideSystemUtilDB.tableClearByTable(GlideSystemUtilDB.java:523)
com.glide.script.system.GlideSystemUtilDB.cleanupTableReferences(GlideSystemUtilDB.java:504)
com.glide.script.system.GlideSystemUtilDB.dropTable(GlideSystemUtilDB.java:741)
com.glide.script.system.GlideSystemUtilDB.js_dropTable(GlideSystemUtilDB.java:428)
...
org.mozilla.javascript.gen.c5175.call(sys_script_include.77e30b080a00052638e7274157b9b97a:30)
org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1227)
org.mozilla.javascript.gen.c3087.call(sys_script_include.bb1996e80a0a0b0f00499e4a748ddfb7:58)
org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1227)
org.mozilla.javascript.gen.c3088.call(sys_script_include.bb1996e80a0a0b0f00499e4a748ddfb7:67)
org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1227)
org.mozilla.javascript.gen.c5173.call(sysevent_script_action.2227b84a0a0a0b950004868f9ece6509:3)
org.mozilla.javascript.gen.c5173.exec(sysevent_script_action.2227b84a0a0a0b950004868f9ece6509)
com.glide.script.ScriptEvaluator.execute(ScriptEvaluator.java:233)
...
com.glide.script.fencing.GlideScopedEvaluator.evaluateScript(GlideScopedEvaluator.java:192)
com.glide.policy.ScriptActionHandler.executeScriptInScope(ScriptActionHandler.java:144)
com.glide.policy.ScriptActionHandler.process0(ScriptActionHandler.java:62)
com.glide.policy.ScriptActionHandler.process(ScriptActionHandler.java:39)
com.glide.policy.EventProcessor.processEventDuringNormalOperation(EventProcessor.java:164)
...
com.glide.policy.EventManager._process(EventManager.java:160)
com.glide.policy.EventManager.process(EventManager.java:142)
...
org.mozilla.javascript.FunctionObject.doInvoke(FunctionObject.java:597)
org.mozilla.javascript.FunctionObject.call(FunctionObject.java:504)
org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1227)
...
com.glide.schedule.JobExecutor.execute(JobExecutor.java:79)
com.glide.schedule.GlideScheduleWorker.executeJob(GlideScheduleWorker.java:177)
com.glide.schedule.GlideScheduleWorker.process(GlideScheduleWorker.java:124)
com.glide.schedule.GlideScheduleWorker.run(GlideScheduleWorker.java:56)
```

Example URL: events process
First sighting: 2015-09-22 17:10:33

The explain plan:

| Order | Select Type | Table | Type | Possible Keys | Key | Key len | Ref | Rows | Extra | ID |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SIMPLE | sys_dictionary0 | ALL | PRIMARY | | | | 25,629 | Using where | 1 |
| 1 | SIMPLE | sys_metadata0 | eq_ref | PRIMARY | PRIMARY | 96 | empigeist2_2.sys_dictionary0.sys_id | 1 | | 1 |
This is all the information available to you in the Slow Queries log. I will explain it in the order that makes the most sense to me: from where the query was issued to the additional information that helps us understand what is going on.
Interpreting the slow query log information
Query Source
Through the example URL we know that this query was issued by the events process. Another common example URL might be /home.do - meaning the slow query is on a user's home page. The stack trace confirms that this query comes from a worker: com.glide.schedule.GlideScheduleWorker.run(GlideScheduleWorker.java:56).
Following the stack further, I see org.mozilla.javascript.gen.c5175.call(sys_script_include.77e30b080a00052638e7274157b9b97a:30), which refers to a script include called TableDrop. Line 30 of that script calls gs.dropTable(tableName), which, of course, includes querying the dictionary record to determine which table to drop.
Query run times
Now that we know the background of this query: how often did it run, and how long did it take? The answers can be found here:
Total execution time: 50 minutes
Execution count: 40,091
Average execution time (ms): 75.72
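These three numbers hang together: the total is simply the count times the average. A quick back-of-the-envelope check in Python, using the figures from the log above:

```python
# Back-of-the-envelope check: execution count times average duration
# should roughly reproduce the reported total execution time.
execution_count = 40_091
avg_ms = 75.72

total_minutes = execution_count * avg_ms / 1000 / 60
print(f"{total_minutes:.1f} minutes")  # close to the reported 50 minutes
```

This kind of sanity check is also useful the other way around: if the total looks alarming but the count is huge, each individual execution may already be fast, and the fix may lie in calling the query less often rather than indexing it.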
Then the question becomes: why does such a simple query take so long?
Query analysis
The query is:
```sql
SELECT ... FROM (sys_dictionary sys_dictionary0
  INNER JOIN sys_metadata sys_metadata0
  ON sys_dictionary0.`sys_id` = sys_metadata0.`sys_id`)
WHERE sys_dictionary0.`choice_table` IS NOT NULL
```
What does this query do?
It first queries the sys_dictionary table for all records where choice_table is not null. Then it joins the result of that with the information in the sys_metadata table for all records where the sys_ids match.
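As a minimal, runnable sketch of that shape of query, here it is against SQLite stand-ins for the two tables. The column set, sample sys_ids, and values are invented for illustration; the real ServiceNow tables are far wider:

```python
import sqlite3

# Minimal SQLite stand-ins for sys_dictionary and sys_metadata.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sys_dictionary (sys_id TEXT PRIMARY KEY, choice_table TEXT);
    CREATE TABLE sys_metadata   (sys_id TEXT PRIMARY KEY, sys_name TEXT);
    INSERT INTO sys_dictionary VALUES ('a1', 'question_choice'), ('a2', NULL);
    INSERT INTO sys_metadata   VALUES ('a1', 'incident.state'), ('a2', 'incident.priority');
""")

# Same shape as the slow query: filter sys_dictionary on choice_table,
# then join sys_metadata on matching sys_ids.
rows = con.execute("""
    SELECT d.sys_id, d.choice_table, m.sys_name
    FROM sys_dictionary d
    INNER JOIN sys_metadata m ON d.sys_id = m.sys_id
    WHERE d.choice_table IS NOT NULL
""").fetchall()
print(rows)  # only the row whose choice_table is not null survives
```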
Interpreting an explain plan
Why does this take so long? To find that out, we look at the explain plan:
| Order | Select Type | Table | Type | Possible Keys | Key | Key len | Ref | Rows | Extra | ID |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SIMPLE | sys_dictionary0 | ALL | PRIMARY | | | | 25,629 | Using where | 1 |
| 1 | SIMPLE | sys_metadata0 | eq_ref | PRIMARY | PRIMARY | 96 | empigeist2_2.sys_dictionary0.sys_id | 1 | | 1 |
This is the really interesting part - what does all of this mean?
The more common Select Types are:
- SIMPLE; no UNION or subquery
- PRIMARY; outermost SELECT
- UNION; second or later SELECT in a UNION
- SUBQUERY; first SELECT in subquery
- DERIVED; derived table (subquery in the FROM clause)
The most commonly seen types are - in order of efficiency:
| Name | Description |
|---|---|
| const | At most one matching row, treated as a constant, e.g. `active = 1`. |
| eq_ref | Joining or looking up unique index values. The JOIN uses a unique index or key prefix, e.g. joining two tables on the PRIMARY key. |
| ref | Joining or looking up non-unique index values. The JOIN uses a non-unique index or key prefix; indexed fields are compared with `=` or `!=`. The best data-access strategy for non-unique values. |
| index_merge | Several index searches using different keys of the same table are performed, and the results are merged. |
| range | Range index scan. The key column is compared to a constant using operators like BETWEEN, IN, `>`, `>=`, e.g. `sys_created_on > "2015-09-23 00:00:00"`. |
| index | Full index scan. The entire index tree is scanned. |
| ALL | Full table scan. The entire table is scanned. |
Now that we have all this information, what does our explain plan mean?
| Order | Select Type | Table | Type | Possible Keys | Key | Key len | Ref | Rows | Extra | ID |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SIMPLE | sys_dictionary0 | ALL | PRIMARY | | | | 25,629 | Using where | 1 |
| 1 | SIMPLE | sys_metadata0 | eq_ref | PRIMARY | PRIMARY | 96 | empigeist2_2.sys_dictionary0.sys_id | 1 | | 1 |
We are querying two tables, sys_dictionary first and then sys_metadata. On sys_dictionary we perform a full table scan (type ALL): the primary key (sys_id) is listed as a possible key, but no key is actually used, so we scan all 25,629 records and filter them with the WHERE clause. On sys_metadata, we join the results of the first step record by record via the primary key, sys_id.
Why did the dictionary query end up as a full table scan? The answer is simple - there is no index on choice_table, so the optimizer had no matching index to use. If we were to add an index (non-unique, since we do not create unique indexes for query tuning) on choice_table, our explain plan would look something like this:
| Order | Select Type | Table | Type | Possible Keys | Key | Key len | Ref | Rows | Extra | ID |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SIMPLE | sys_dictionary0 | ref | PRIMARY, choice_table | choice_table | 40 | | 5,427 | Using index; Using where | 1 |
| 1 | SIMPLE | sys_metadata0 | eq_ref | PRIMARY | PRIMARY | 96 | empigeist2_2.sys_dictionary0.sys_id | 1 | | 1 |
As you can tell, the Extra column now tells us that an index was found and used. The number of rows we have to comb through to get to our records is significantly smaller, and the query returns far faster than before.
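You can reproduce this before-and-after effect on any database that exposes its query plan. Here is a sketch using SQLite's EXPLAIN QUERY PLAN (its wording differs from MySQL's, and I use an equality predicate for the demonstration, but the principle is the same: full scan without an index, indexed lookup with one):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sys_dictionary (sys_id TEXT PRIMARY KEY, choice_table TEXT)")

def plan(sql):
    # The fourth column of EXPLAIN QUERY PLAN output is the detail text.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT sys_id FROM sys_dictionary WHERE choice_table = 'question_choice'"

before = plan(query)  # no index yet: full table scan (SCAN)
con.execute("CREATE INDEX idx_choice_table ON sys_dictionary (choice_table)")
after = plan(query)   # with the index: indexed lookup (SEARCH ... USING INDEX)

print(before)
print(after)
```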
Speaking of the Extra column, here is a sampling of the values it can contain:
- Using filesort; filtered rows are gone through again for sorting
- Using index; all data is retrieved from a single index (covering index)
- Using index; using where; index is also used for all key lookups
- Using temporary; a temporary table is required to process the query
- Using where; a WHERE clause is used
- Using join buffer; rows are re-used from the join buffer
Finding the best index for your query
How do you know whether an index will be good and actually used? Looking at the query is the first step, but there is also this thing called "cardinality." At some point, the database decides that using the index has no benefit and reverts to a full table scan. You will look at the explain plan, very confused, and ask, "I gave it an index, why would it not take it?"
Imagine the following situation: you are searching for all users named Jim who work in the Finance department. Your company has a total of 30,000 users; of those, 500 are named Jim, there are nearly a thousand different first names in your list of users, and you have 15 departments in your company (meaning about 2,000 users per department).
The story of cardinality is: how can I narrow down my results fastest, so I don't have to search through too much information? In our search for Jim in Finance, are we better off searching for Jim first, or for the Finance department?

If we search (and index) on the name first, we go through an index tree that has nearly a thousand sorted nodes (all the different first names). If we search (and index) on the department first, we go through only 15 nodes. But look at how many records each search leaves behind to inspect: about 500 for the name versus around 2,000 for the department. Searching and indexing on the name first narrows our results down fastest and furthest.
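The arithmetic behind that reasoning, using the numbers from the example above:

```python
# Which predicate narrows the result set down further decides which
# column makes the better leading index column.
total_users = 30_000
jims = 500
departments = 15
users_per_department = total_users // departments  # about 2,000

# Selectivity: the fraction of the table each predicate leaves to inspect.
name_selectivity = jims / total_users                  # ~1.7% of rows remain
dept_selectivity = users_per_department / total_users  # ~6.7% of rows remain

print(f"name: {name_selectivity:.1%}, department: {dept_selectivity:.1%}")
```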
A different situation: you want to find all closed incidents in your system, and 99% of all your incidents are closed. Will an index on state help? Most likely not - there are so many closed incidents relative to the complete data set that the optimizer will very likely revert to a full table scan.
Sometimes you need to change the query, not the index
And of course there are situations where an index cannot be used at all. We all love the "contains" searches that ServiceNow offers. In SQL terms, they translate to LIKE '%xxx%', a double wildcard. Databases cannot use an index for that type of query at all, so it always ends in a full table scan (unless another part of the query can be indexed and comes to the rescue).
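The same SQLite sketch shows the effect: an index on the column does not help a double-wildcard LIKE, while an exact match on the same column can use it. The table and column names here are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE task (number TEXT, short_description TEXT)")
con.execute("CREATE INDEX idx_short_desc ON task (short_description)")

def plan(sql):
    # The fourth column of EXPLAIN QUERY PLAN output is the detail text.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

# A double wildcard defeats the index: the whole table is scanned.
like_plan = plan("SELECT * FROM task WHERE short_description LIKE '%email%'")
# An exact match on the same indexed column can use the index.
eq_plan = plan("SELECT * FROM task WHERE short_description = 'email outage'")

print(like_plan)  # contains SCAN (full table scan)
print(eq_plan)    # contains SEARCH ... USING INDEX idx_short_desc
```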
Another example is an OR query across different fields: WHERE first_name = "Jim" OR last_name = "Smith". In this case, the database has to find all Jims, then all Smiths, and merge the results together - it cannot use one index for both conditions. We generally recommend running those as two separate queries and merging the results through other means.
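One way to express that "two queries, merged" approach directly in SQL is a UNION, where each branch can use its own index. A sketch with invented sample data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sys_user (first_name TEXT, last_name TEXT)")
con.execute("CREATE INDEX idx_first ON sys_user (first_name)")
con.execute("CREATE INDEX idx_last ON sys_user (last_name)")
con.executemany("INSERT INTO sys_user VALUES (?, ?)",
                [("Jim", "Jones"), ("Anna", "Smith"),
                 ("Jim", "Smith"), ("Bob", "Lee")])

# Instead of WHERE first_name = 'Jim' OR last_name = 'Smith', run one
# indexed lookup per condition and let UNION merge (and de-duplicate)
# the results.
rows = con.execute("""
    SELECT first_name, last_name FROM sys_user WHERE first_name = 'Jim'
    UNION
    SELECT first_name, last_name FROM sys_user WHERE last_name = 'Smith'
""").fetchall()
print(sorted(rows))  # [('Anna', 'Smith'), ('Jim', 'Jones'), ('Jim', 'Smith')]
```

Note that UNION (as opposed to UNION ALL) removes duplicates, so a row matching both conditions, like Jim Smith here, appears only once.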
One more common example: if you group by one column and order by another, the same dilemma occurs - which index should the optimizer take? Often it takes neither: GROUP BY first_name, ORDER BY last_name.
---
Special thanks to our fantastic Performance Engineer Scott Nemes at ServiceNow who reviewed and provided some of those great tables that I included in here. Scott, you are the BEST!