Parsing HTML server-side without DOM

howard8
Tera Contributor

I am trying to work out how to post-process the HTML in submitted knowledge base articles via a business rule. More specifically, I want to parse the elements and automatically insert or update ID attributes on heading tags, so users don't have to know anything about HTML but still get bookmarks generated automatically in the article. The use case is an auto-generated task that can point to a specific step in a work instruction or procedure.

This would be relatively trivial client-side because the DOM is available; on the server side (the Mozilla Rhino JavaScript interpreter), not so much. XML parsers (like XMLHelper) are very strict about HTML, and you have to convert from a string to an object and back again, which often results in elements being re-ordered (thus breaking the HTML entirely).
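For illustration, the heading-ID step can be approximated with plain string processing, which needs neither a DOM nor an XML parser. This is a minimal sketch, not a tested ServiceNow solution: `addHeadingIds` and `slugify` are hypothetical names, the regex assumes well-formed, non-nested `h1` to `h6` tags, and headings that already carry an `id` are left untouched rather than updated. It sticks to ES5 so that it should run under Rhino as well as in a browser.

```javascript
// Sketch only: slugify and addHeadingIds are hypothetical helper names.
// Turns heading text into a lowercase, hyphen-separated identifier.
function slugify(text) {
    return text
        .replace(/<[^>]*>/g, '')         // drop nested tags such as <strong>
        .replace(/&[a-z0-9#]+;/gi, ' ')  // drop HTML entities such as &nbsp;
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')     // collapse everything else to hyphens
        .replace(/^-+|-+$/g, '');        // trim leading and trailing hyphens
}

// Adds an id attribute to every h1..h6 that does not already have one.
function addHeadingIds(html) {
    var used = {};
    var has = Object.prototype.hasOwnProperty;
    // Opening tag with optional attributes, non-greedy content, matching close tag.
    var headingPattern = /<(h[1-6])(\s[^>]*)?>([\s\S]*?)<\/\1\s*>/gi;

    return html.replace(headingPattern, function (match, tag, attrs, inner) {
        attrs = attrs || '';
        var existing = /\sid\s*=\s*["']([^"']*)["']/i.exec(attrs);
        if (existing) {
            used[existing[1]] = true;    // remember it, leave the heading as-is
            return match;
        }
        var base = slugify(inner) || 'section';
        var id = base;
        var n = 2;
        while (has.call(used, id)) {     // de-duplicate repeated heading text
            id = base + '-' + n;
            n++;
        }
        used[id] = true;
        return '<' + tag + ' id="' + id + '"' + attrs + '>' + inner + '</' + tag + '>';
    });
}
```

Because only the opening heading tag is rewritten, the rest of the markup passes through byte for byte, which avoids the re-ordering problem described above. The trade-off is that a regex is not a real HTML parser, so unusual markup (headings inside comments, unquoted `id` values) would need extra handling.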

Env-js is a project to simulate a browser on the server side so that a DOM is available, but it looks dormant since 2011, and even then it needed a newer version of Rhino than ServiceNow runs in 2015!

Has anyone done anything like this with server-side code? If so, can you share your findings please?

4 REPLIES

BenPhillipsSNC
Kilo Guru

Hi Howard,

Why can't you do this procedure using onSubmit client scripts?

Thanks


Hi Ben,

Yeah, that seems to be the only option right now. The downside is that I couldn't do bulk updates of existing articles, meaning that users would have to open each one and save it again. This would also need to be done whenever we added a new feature to the script. It does work for manually entered new articles though, thanks.
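If a string-based transform turns out to be workable server-side, the bulk-update concern could in principle be handled by a loop along these lines. This is only a sketch under assumptions: `reprocessArticles` is a hypothetical name, and `records` is expected to behave like an already-queried GlideRecord, exposing `next()`, `getValue()`, `setValue()` and `update()`.

```javascript
// Sketch only: reprocessArticles is a hypothetical helper name.
// `records` must behave like a queried GlideRecord; `transform` is any
// function that takes an HTML string and returns the processed string.
function reprocessArticles(records, fieldName, transform) {
    var changed = 0;
    while (records.next()) {
        // Coerce to a plain JavaScript string; empty fields become ''.
        var before = String(records.getValue(fieldName) || '');
        var after = transform(before);
        if (after !== before) {          // skip rows the transform leaves alone
            records.setValue(fieldName, after);
            records.update();
            changed++;
        }
    }
    return changed;                      // number of records actually rewritten
}
```

On an instance, this would presumably be called from a background script with a GlideRecord on `kb_knowledge` (after `query()`) and the article body field, assumed here to be `text`. Writing only when the output differs makes the script safe to re-run after new features are added to the transform.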


dmfranko
Kilo Guru

Hey Howard,

Did you ever find a way to do this?


drjohnchun
Tera Guru

Hi Howard - I did something similar, but using a Client Script's onLoad(). For example, I added hyperlinks to ticket numbers embedded in a KB article by searching for the pattern and replacing each match with a link, only in the browser display; this is a common technique used by many websites. Another advantage is that you don't have to modify the articles stored on the server.

This would be viable only if you can reliably search for the patterns you want to replace.
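As a rough illustration of the search-and-replace idea (not the actual script used), the sketch below wraps ticket numbers in links while skipping text that is already inside a tag or an existing link. The function name `linkifyTickets`, the `INC` plus seven digits pattern, and the `baseUrl` parameter are all assumptions made for the example.

```javascript
// Sketch only: linkifyTickets, the INC pattern and baseUrl are assumptions.
function linkifyTickets(html, baseUrl) {
    // Split into tags and text so attribute values are never rewritten.
    var parts = html.split(/(<[^>]+>)/);
    var insideAnchor = false;

    for (var i = 0; i < parts.length; i++) {
        var part = parts[i];
        if (part.charAt(0) === '<') {
            // Track whether we are inside an existing <a>...</a>.
            if (/^<a[\s>]/i.test(part)) {
                insideAnchor = true;
            } else if (/^<\/a\s*>/i.test(part)) {
                insideAnchor = false;
            }
            continue;
        }
        if (!insideAnchor) {
            parts[i] = part.replace(/\bINC\d{7}\b/g, function (num) {
                return '<a href="' + baseUrl + num + '">' + num + '</a>';
            });
        }
    }
    return parts.join('');
}
```

In an onLoad() script the input would be the rendered article's HTML and the result would be written back to the page only, leaving the stored article unchanged. The word-boundary anchors in the pattern are what make the match reliable; a looser pattern would risk linking unrelated text.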



John Chun, PhD PMP