UI Action exhibiting different behavior in Dev than in Test or Prod

James Bengel · ‎04-22-2022

Update - Resolved: It seems that the HTML Sanitizer (or more accurately a subset of it) was responsible for what we were seeing, despite the fact that all of the offending tags are in the built-in whitelist. The short version is that while you can't modify the property com.glide.security.check_unsanitized_html once it's been set to "enforce", you can delete it in the list view of system properties, which has the same effect. Once we did this, the output from the script returned to normal.

Leaving this here in case anybody else runs into a similar problem.

Update: We did discover that the HTML Sanitizer was set to "enforce" in Dev, but "log_only" in Test. I'm disinclined to blame this setting, however, since the tags that were removed are included in the built-in whitelist -- which is immutable. I suspect that it's something very much like this, though.

Both our Development and Test instances were recently cloned from Production and subsequently upgraded to San Diego in preparation for upgrading our Production instance to San Diego. When we do this, we have a narrow window of time where the state of all three instances is essentially the same, and no additional development happens until everything is on the same release again.

We have a UI Action that generates a knowledge article for a release, listing all of its stories and component releases. The article is generated when the major release goes to Beta, and again when it goes to GA. The script in the UI Action hasn't been modified since September of 2019, and it has been working fine in every release up to Quebec, and is working fine in San Diego as well -- in the Test instance. But it's broken in Dev. Which leads me to the conclusion that there is something configured differently in Dev, but since I haven't worked on ServiceNow since around the time the last changes were made to this script, I don't know where to begin looking for the "what" that's different.

Getting to the specific problem, there's a form button that the release manager clicks to "Create BETA KB Post". The script that runs behind that button hoovers up every sub-release that has this major release as a parent, and every story that has one of the sub-releases as a parent, then iterates over the results to create a table that lists each sub-release and all of its dependent stories in turn. The example only had one story, so it's perhaps not obvious that there's a table there when you look at the post, but in Test it looks like this:

If we look at it in the editor we can see the table:

And if we look at the source code we see what the script hath wrought:

If we do the same thing with the same release in Dev (remember that both of these were cloned from Prod, so we have them in more or less identical states) we see something that isn't at all like what we expect:

The entire table -- headings and all -- has been condensed into one line. Which becomes obvious when we look at it in the editor:

And even more obvious when we look at the source code:

The <table>, <th>, <tr>, and <td> tags were all replaced by a single <p>aragraph tag when the record was created.

I don't know how this happened, or why it's only happening in Dev and not Test. But if I edit the article and manually add the markup to create the table, it doesn't complain about that. I can save the knowledge record with the table in it just fine. But the script can't.
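To make the symptom concrete, here is a minimal sketch. It is not the actual UI Action script, which isn't shown in this thread. It uses plain JavaScript with made-up sample data standing in for the release and story records, builds the kind of table described above, and adds a small helper that reports which table tags are missing from a saved article body. All names (`subReleases`, `buildReleaseTable`, `missingTableTags`) and the record numbers are hypothetical.

```javascript
// Hypothetical sample data; a real script would read these from
// the release and story records instead.
const subReleases = [
  {
    number: 'RLSE0010001',
    name: 'Component A 1.2',
    stories: [{ number: 'STRY0010001', description: 'Fix login <timeout>' }]
  }
];

// Escape field values so they cannot inject markup into the article body.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one table: a header row, then each sub-release followed by its stories.
function buildReleaseTable(releases) {
  const rows = ['<tr><th>Number</th><th>Description</th></tr>'];
  for (const release of releases) {
    rows.push('<tr><td>' + escapeHtml(release.number) + '</td><td>' +
      escapeHtml(release.name) + '</td></tr>');
    for (const story of release.stories) {
      rows.push('<tr><td>' + escapeHtml(story.number) + '</td><td>' +
        escapeHtml(story.description) + '</td></tr>');
    }
  }
  return '<table>' + rows.join('') + '</table>';
}

// Return the table-related tags that do NOT appear in the given HTML.
function missingTableTags(html) {
  return ['table', 'tr', 'th', 'td'].filter(function (tag) {
    return !new RegExp('<' + tag + '[\\s>]', 'i').test(html);
  });
}

const generated = buildReleaseTable(subReleases);
console.log(missingTableTags(generated));                  // nothing missing
console.log(missingTableTags('<p>Number Description</p>')); // all four missing
```

Running a check like `missingTableTags` against the saved article body in each instance would show quickly whether the markup was lost on save rather than in the script's own output.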

The clones of Dev and Test happened a few weeks apart, but they were both cloned from the same source (Production) where the script has been happily generating tables since 2019. And since there wasn't (and still isn't) any problem in Production, I'm forced to conclude that whatever is causing this behavior is unique to the Dev instance -- and originated there, and wasn't overwritten by the clone. Because Prod (Quebec) and Test (San Diego) are doing just fine.

Any ideas -- even strange ones -- are welcome. Because I'm all out of them myself, and this is holding up the Production upgrade because nobody knows what the root cause is.

Thanks in advance!

James Bengel · ‎04-25-2022

It appears to be simpler than that even. Since I really don't work on that side of the house anymore, I'm not always up to date on what our awesome SN admins are up to (I got pulled in because I wrote the script back in 2019). But long story shorter, they were working through the security health check (or whatever SN calls it) and one of the recommendations was to change the setting for "Check Unsanitized Html" from "log_only" to "enforce".

This happened after Dev was cloned, and they didn't make the same change in Test pending evaluation of what effect it would have.

Now given that all of the offending tags that were stripped out of the KB post are included in the built-in whitelist, I wouldn't have thought that this would cause us any heartburn, but apparently I would have been mistaken.

The good news is that while you can't change the property in question (com.glide.security.check_unsanitized_html) you can delete it -- which has the same effect. And as you have probably already guessed, once it was deleted the behavior returned to normal.

And the KB that these go into is used by the local admins at each campus to keep their ERP patches current, so it was important to make sure we didn't break it in Prod. As it turns out the San Diego upgrade (probably) isn't at the root of the problem per se, though the whitelist (apparently) being at least partially ignored could be an indicator of a larger problem.


bammar · ‎04-24-2022

I might post a few times here as I mull this over. First of all, what you're doing is pretty neat.

So the purpose of this is to create a KB that pulls in a list of all the various tasks.

So even though Dev is not formatted right, one can still glean the required data from it. However, the release is delayed out of concern that this is perhaps a bigger issue?

I'll offer some alternative ideas and lines of thinking, though I know you desire the precise answer.

A. You could try copying the XML of all the GOOD KBs in Test and dumping them in Dev to see how those look. Or try just one: if you export the XML of a good KB from Test, import it into Dev, and it looks weird, then you know a script or setting is off. If the imported XML looks good, you could delete the bad ones from Dev and replace them. After all, the whole upgrade is being delayed by something customers would never be aware of, and there are other ways to back up the data if you need it as reference.

I think you may have something in terms of the HTML Sanitizer doing its thing. Check to see if certain components don't get changed in a clone, like security properties. Maybe Dev has a property that Test doesn't?
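One way to act on that idea is to compare system property values between the two instances. The following is a rough sketch in plain JavaScript, assuming the properties from each instance have already been exported into simple name-to-value objects by whatever means is convenient. The function name `diffProperties` and the second property name are hypothetical; the "enforce" and "log_only" values are the ones reported in this thread.

```javascript
// Hypothetical exports of system properties from two instances,
// as plain name -> value maps.
const devProps = {
  'com.glide.security.check_unsanitized_html': 'enforce',
  'example.shared.setting': 'true' // hypothetical property name
};
const testProps = {
  'com.glide.security.check_unsanitized_html': 'log_only',
  'example.shared.setting': 'true'
};

// List every property that is missing on one side or has a different value.
function diffProperties(left, right) {
  const has = function (obj, key) {
    return Object.prototype.hasOwnProperty.call(obj, key);
  };
  const names = new Set(Object.keys(left).concat(Object.keys(right)));
  const differences = [];
  for (const name of Array.from(names).sort()) {
    const inLeft = has(left, name);
    const inRight = has(right, name);
    if (!inLeft || !inRight || left[name] !== right[name]) {
      differences.push({
        name: name,
        left: inLeft ? left[name] : undefined,   // undefined = property absent
        right: inRight ? right[name] : undefined
      });
    }
  }
  return differences;
}

console.log(diffProperties(devProps, testProps));
```

With the sample data, the only entry returned is the one for com.glide.security.check_unsanitized_html, with "enforce" on the Dev side and "log_only" on the Test side. A property that exists in only one instance shows up with `undefined` on the other side.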
