Performance Best Practice for Before Query Busines...

GTSPerformance · ‎05-09-2022

Thanks for bringing up this question @ITSM Dev,

I am aware that this advice, that Query BRs are better than ACLs because they "give the data filtering task to the background database", has been given out to folks over the years and I wouldn't be surprised if it was still being given out by ServiceNow trainers. The advice is not necessarily wrong, it is just too simplified to be the right answer all the time.

You mention two advantages of Query BRs, so I'll break my answer into two parts:

1. "Query BRs are executed only once per query"

I think this argument boils down to a comparison of how many executions of a Query BR must be run versus an ACL to accomplish the same goal. The argument kind of assumes that if there are fewer executions of a Query BR as opposed to the corresponding ACL, then the Query BR must mean better performance. For example, if you are doing an list load, then it is true a Query BR might have fewer executions than an ACL that is designed for the same purpose. For a list load there will be a single query to return all the results and then ACLs will be applied against each of the results that end up getting displayed in the UI. If your pagination setting is set to 50 then you would have 50 ACLs applied as opposed to 1 Business Rule. If the ACLs and BR have roughly the same net impact per execution then it would stand to reason that the Query BR would be a better choice, right? In reality, there are a number of ways that this argument breaks down.

There is the simple case that perhaps the ACLs and BR do not have the same net impact. Suppose the ACLs take 10 milliseconds each, for a grand total of 500 milliseconds, and suppose the Business Rule takes 1 second of execution time. Then it doesn't matter that there are fewer ACLs executed, they will be twice as fast as the Business Rule solution (500ms vs. 1,000m). This is not a far fetched proposition. In many cases, ACLs can be constructed in ways that make them incredibly simple and fast (see the list I put at the bottom).

A slightly more complex argument for why execution count does not make ACLs better lies in one of the fundamental differences between ACLs and Query BRs. ACLs apply their conditions against one record at a time. They ask things like, is this record owned by the current user? If we go back to our example before of 50 records displayed on the page, you can imagine how incredibly fast it would be to answer that question 50 times - especially since ACLs do not need to make a subsequent call to the database to answer the question, the app layer already has all the information it needs to answer the question. On the other hand, a Query BR will be adding that condition to the database query that executes against the entire table. The impact is magnified because more records must be checked to see if they are owned by the current user. If you have 5,000,000 records in the table then the database essentially has to run the question against 5,000,000 records.

One oversight regarding the statement "Query BRs execute only once per query" is that it only applies to certain cases - like when the operation is a list load. During a list load only a single query is executed, returning multiple records. However, there is also the possibility that the operation is something like the Activity Formatter where many queries are executed, one for each record returned. In this case, Query BRs lose their assumed advantage since there would be the same number of ACLs executed as there would Query BRs.

Another thing to consider is that Query BRs execute in many circumstances where ACLs do not execute. As mentioned in the above article, Before Query Business Rules get applied to queries originating from other scripts. This means that the added complexity and execution cost is being applied in places where it was never needed or intended. This is probably the most important thing to think about. It can be somewhat mitigated by using the if (gs.isInteractive()) condition, which is understood to exclude things like scheduled jobs, but there are tons of weird exceptions. Like if a scheduled job does impersonation (like in a scheduled report) then it is still considered interactive. Also, gs.isInteractive() doesn't solve for cases where, for instance, an interactive transaction triggers some loop that is being executed as part of a Script Include, thus triggering thousands of GlideRecord calls against a table that has a Query BR. In that case an ACL wouldn't fire, but a Query BR would.

One more thing: table level ACLs will only be evaluated once per list load. Their results get cached and any subsequent execution has practically zero execution time. So if you are comparing Query BRs to table level ACLs then execution count advantage of Query BRs disappear.

2. "[Query BRs] give the data filtering task to the background database."

I think this part of the argument is predicated on the assumption that data filtering is always better handled by the database. I think it is probably one of those things that is mostly true when taken as a whole. However, it breaks down when you try to apply it to specific situations.

In many cases, ACLs can completely avoid doing any expensive processing. If your ACL condition does not contain custom script, then it can be cached and executed incredibly fast. Going back and forth to the database is expensive and the cumulative overhead can add up fast. ACLs have built in app tier caching, while Query BRs do not. Our most frequent approach to improving Query BR performance is to add app tier caching.
ACLs only run in contexts where security is needed, but Before Query Business Rules run everywhere. This is a very important point that I've already explained earlier. There are many things that can be done to mitigate the risk of this issue, but they usually involve complex customizations and deep understanding of custom business logic.
ACLs usually do not add to the unpredictability of database queries. Predictable queries lead to consistent query patterns and consistent query patterns can be identified and optimized.
Query BRs add to the complexity of database queries by adding conditions in various scenarios. As a general rule, the more complex the query, the less likely the database will be able to find an efficient way to execute it.
ACLs are very easy to debug and see where they are coming from (see Session Debugging in our product docs). Query BRs, on the other hand, obfuscate the source of the problem. There is no easy way to tell that a given condition in a query has come from a certain Query BR. Thus, slow queries from Query BRs are likely to go un-diagnosed or misdiagnosed and therefore, unsolved or partially solved. This creates a long-term performance headache.
ACLs have access to the "current" object for free - it is in memory already since the results have been returned from the database. Query BRs have access to a "current" object as well, but that object has not been populated with a result yet. The "current" object in Query BRs is basically a GlideRecord that hasn't had its query() method executed yet. In other words, because ACLs are run after results come back from the database, they can ask completely different questions than Query BRs. This fundamentally changes the way Before Query BRs must be designed. It often requires a Query BR to make queries to other tables to check for conditions and build lists of matching records. Or sometimes it necessitates the use of complex OR and JOIN operations that are difficult to optimize for the database.

Sincerely, your Global Technical Support Performance Team