
on 09-17-2022 11:14 AM
Background: For a long time I looked for a reliable way to read HTML tags on the server without using the DOM. Regular expressions can handle many such cases, so this article walks through an example where they fill the gap when no proper solution is available.
Example: I use Microsoft Support search as the example, reading it with regular expressions instead of a Microsoft Graph connector or the Microsoft Graph API. I could not find a solution for this scenario from either the Microsoft Graph API or the community (please comment if one exists).
Microsoft Support website: https://support.microsoft.com
Here I searched for the term "Microsoft Teams".
The search results URL looks like this:
https://support.microsoft.com/en-US/search/results?query=microsoft+teams&isEnrichedQuery=false
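To make the URL pattern concrete, here is a minimal sketch in plain JavaScript (runnable in Node.js, outside the instance) that builds the same URL from a search term. The `buildSearchUrl` name is hypothetical, and the sketch assumes the site accepts `+` for spaces, as the URL above suggests.

```javascript
// Hypothetical helper: builds a search URL shaped like the one shown above.
function buildSearchUrl(term) {
    var base = "https://support.microsoft.com/en-US/search/results";
    // encodeURIComponent escapes spaces as %20; the URL above uses "+",
    // so swap them to mirror it (assumption: the site treats both the same).
    var encoded = encodeURIComponent(term).replace(/%20/g, "+");
    return base + "?query=" + encoded + "&isEnrichedQuery=false";
}

console.log(buildSearchUrl("microsoft teams"));
// https://support.microsoft.com/en-US/search/results?query=microsoft+teams&isEnrichedQuery=false
```

Non-ASCII terms are percent-encoded as UTF-8 by `encodeURIComponent`, so a term such as "créer" becomes `cr%C3%A9er` in the query string.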
If you right-click the page and choose "View page source", you will see the full HTML of the results.
Now let's see how these results can be added as an external search source for portal search in ServiceNow.
Open your instance, go to Portals, open the "SP" portal, and under "Search Sources" create a new search source called "Microsoft Support Search".
On this search source, copy the code below into "Search page template".
Search page template
<div>
<a href="{{item.url}}" class="h4 text-primary m-b-sm block" target="_blank">
<span ng-bind-html="highlight(item.primary, data.q)"></span>
</a>
<span>{{item.short_desc}}</span>
</div>
In "Data fetch script", use the script below. Make sure the "Is scripted source" option is checked, then paste the code and save the source.
Data fetch script
(function(query) {
    var results = [];
    /* Calculate your results here. */
    try {
        var enQuery = GlideStringUtil.urlEncode(query);
        var eURL = "https://support.microsoft.com/en-US/search/results?query=" + enQuery;
        var ws = new sn_ws.RESTMessageV2();
        ws.setHttpMethod("get");
        ws.setEndpoint(eURL);
        var response = ws.execute(); // the response body is HTML, not JSON
        if (response) {
            // Stringify so line breaks and tabs become literal escape sequences, then flatten them
            var responseBody = JSON.stringify(response.getBody());
            responseBody = responseBody.replaceAll('\\r', ' ');
            responseBody = responseBody.replaceAll('\\t', ' ');
            responseBody = responseBody.replaceAll('\\n', ' ');
            responseBody = responseBody.replaceAll('\\', '');
            responseBody = responseBody.replaceAll('"', '');
            // Mark the start of each result link so it can be matched up to data-bi-area
            responseBody = responseBody.replaceAll("<a class=header href=", "data-si-area");
            var allurls = responseBody.match(/data-si-area(.*?)data-bi-area/g);
            if (JSUtil.notNil(allurls)) {
                if (allurls.length > 0) {
                    for (var u = 0; u < allurls.length; u++) {
                        var mm = allurls[u].toString().replaceAll("data-bi-area", "");
                        mm = mm.replaceAll("data-si-area", "");
                        // labelurl[0] is the URL, labelurl[1] is the aria-label text
                        var labelurl = mm.split("aria-label=");
                        if (labelurl.length < 2) {
                            continue; // no aria-label in this fragment, skip it
                        }
                        var sdesc = labelurl[1].toString().trim().replaceAll("<b>", "");
                        try {
                            sdesc = removeTags(sdesc);
                        } catch (ee) {
                            sdesc = labelurl[1].toString().trim().replaceAll("<b>", "");
                        }
                        sdesc = sdesc.replaceAll("</b>", "");
                        var rslt = {
                            "url": labelurl[0].toString().trim(),
                            "target": "_blank",
                            "primary": labelurl[1].toString().trim(),
                            "short_desc": sdesc
                        };
                        results.push(rslt);
                    }
                }
            }
        }
    } catch (e) {
        // Log instead of failing silently; the search still returns what it has
        gs.error("Microsoft Support Search failed: " + e);
    }
    return results;
})(query);

function removeTags(str) {
    // strip HTML tags
    var a = str.replace(/<\/?[^>]+(>|$)/g, "");
    // decode &amp; back to &
    var b = a.replace(/&amp;/g, '&');
    // decode numeric character references such as &#233;
    return b.replace(/&#(\d+);/g, function(match, dec) {
        return String.fromCharCode(dec);
    });
}
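The extraction step in the script above can be tried outside ServiceNow. The following is a minimal sketch in plain JavaScript (for example in Node.js), not the script itself: it matches the link attributes with a single regular expression instead of the marker-and-split approach. The `extractResults` name, the sample markup, and the example.com URLs are all assumptions inferred from the string patterns the script looks for (`class=header`, `href`, `aria-label`, `data-bi-area`); the real page markup may differ.

```javascript
// Sample markup is an assumption inferred from the script's string patterns;
// the URLs and labels are made up for illustration.
var sampleHtml =
    '<a class="header" href="https://example.com/article-one" ' +
    'aria-label="Get started with Teams" data-bi-area="results">One</a>\n' +
    '<a class="header" href="https://example.com/article-two" ' +
    'aria-label="Join a meeting" data-bi-area="results">Two</a>';

// Hypothetical helper: returns objects shaped like the ones the
// search page template expects (url, target, primary, short_desc).
function extractResults(html) {
    var pattern = /<a\s+class="header"\s+href="([^"]*)"\s+aria-label="([^"]*)"/g;
    var results = [];
    var match;
    // exec() with the g flag walks through every match in turn
    while ((match = pattern.exec(html)) !== null) {
        results.push({
            url: match[1],
            target: "_blank",
            primary: match[2],
            short_desc: match[2]
        });
    }
    return results;
}

console.log(extractResults(sampleHtml));
```

Capturing the attributes directly avoids stripping every quote from the response first, but it is just as sensitive to markup changes as the original approach.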
Now go to the SP portal and search for "Microsoft teams", just as you did on the Microsoft Support site. You should see results like these.
Note: The results may vary with how Microsoft ranks its articles at the time, but everything you see comes from the Microsoft Support site, obtained simply by reading the HTML response.
As the code shows, it reads the HTML tags entirely with regular expressions and string functions. If Microsoft Support changes its markup in the future, adjust the patterns accordingly.
Results in Servicenow Portal
Please comment if you know a better way to read HTML tags on the server side.
Thanks,
Narsing

Wonderful! We have been looking for close to 6 months for this!
Thanks a lot!!!!

Hi
what was the reason you put in this code?
function removeTags(str) { var a = str.replace(/<\/?[^>]+(>|$)/g, ""); var b = a.replace(/&amp;/g, '&'); return b.replace(/&#(\d+);/g, function(match, dec) { return String.fromCharCode(dec); }); }
It is fine for English searches, but once you search in French with special characters the result coming back is not good.
Eg. search term: créer un site dans sharepoint
Brings back this url:

Hi,
I am not converting the URL; the script takes it exactly as it appears in the source.
I think it needs a small tweak to convert the characters to Unicode, which should resolve the issue.
As for your other question, "removeTags" makes sure the short description contains no HTML tags. Some tags were still showing up even with the filters, so I added this function to strip them.
The Microsoft Support site automatically converts the hex codes to HTML character codes. The same needs to be done in the code.
I will try it on my end.
Thanks for your input.
Narsing
function removeTags(str) {
    // strip HTML tags
    var a = str.replace(/<\/?[^>]+(>|$)/g, "");
    // decode &amp; back to &
    var b = a.replace(/&amp;/g, '&');
    // decode numeric HTML character references such as &#233;
    return b.replace(/&#(\d+);/g, function(match, dec) {
        return String.fromCharCode(dec);
    });
}
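To see what this function does to a sample string, here is a minimal sketch that can be run outside the instance (for example in Node.js). It assumes the second step is meant to decode `&amp;`, and it decodes numeric references before `&amp;` so that text such as `&amp;#233;` is not decoded twice; that ordering is my choice for the sketch, not something the original function does. The sample string is made up.

```javascript
// Sketch of the same idea: strip tags, then decode a small set of entities.
function removeTags(str) {
    // 1. Drop anything that looks like an HTML tag
    var noTags = str.replace(/<\/?[^>]+(>|$)/g, "");
    // 2. Decode decimal character references such as &#233;
    var decoded = noTags.replace(/&#(\d+);/g, function (match, dec) {
        return String.fromCharCode(parseInt(dec, 10));
    });
    // 3. Decode &amp; last so "&amp;#233;" stays as the literal text "&#233;"
    return decoded.replace(/&amp;/g, "&");
}

console.log(removeTags("<b>Cr&#233;er</b> un site &amp; partager"));
// Créer un site & partager
```

Only decimal references and `&amp;` are handled here; named entities such as `&eacute;` and hexadecimal references pass through unchanged.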

Hi @Narsing1, did you have any luck?

Hi @Job1,
I spent some time on this over the weekend. The characters need to be converted to UTF-8 for the link to redirect to the correct URL. Here is the solution.
Create a Script Include named "utf8Utils" and copy the script below into it.
var utf8Utils = Class.create();
utf8Utils.prototype = {
    initialize: function() {},
    findHexAndConverttoUTF8: function(str) {
        str = str.replace(/&(.*?);/g, function(match) {
            var repl = new utf8Utils().toUTF8(match);
            return repl;
        });
        return str;
    },
    toUTF8: function(str) {
        var utf8 = [];
        for (var i = 0; i < str.length; i++) {
            var charcode = str.charCodeAt(i);
            if (charcode < 0x80) utf8.push(charcode);
            else if (charcode < 0x800) {
                utf8.push(0xc0 | (charcode >> 6),
                    0x80 | (charcode & 0x3f));
            } else if (charcode < 0xd800 || charcode >= 0xe000) {
                utf8.push(0xe0 | (charcode >> 12),
                    0x80 | ((charcode >> 6) & 0x3f),
                    0x80 | (charcode & 0x3f));
            }
            // surrogate pair
            else {
                i++;
                charcode = ((charcode & 0x3ff) << 10) | (str.charCodeAt(i) & 0x3ff);
                utf8.push(0xf0 | (charcode >> 18),
                    0x80 | ((charcode >> 12) & 0x3f),
                    0x80 | ((charcode >> 6) & 0x3f),
                    0x80 | (charcode & 0x3f));
            }
        }
        return utf8.join(",").replaceAll(",", "");
    },
    type: 'utf8Utils'
};
In the data fetch script, use it like this.
Example: (for testing, run this in "Scripts - Background" after saving the Script Include above, and check the returned value)
var s = "https://support.microsoft.com/fr-fr/office/créer-un-site-communautaire-dans-sharepoint-8a890f58-9492-4be1-b6b3-481fb0f9b4a5";
s = new utf8Utils().findHexAndConverttoUTF8(s);
gs.print(s);
Output:
*** Script: https://support.microsoft.com/fr-fr/office/crundefineder-un-site-communautaire-dans-sharepoint-8a890f58-9492-4be1-b6b3-481fb0f9b4a5
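As an alternative way to check the expected result outside the instance, here is a minimal sketch in plain JavaScript (for example in Node.js). It is not the Script Include above: it assumes the URL in the page source carries numeric character references such as `&#233;` or `&#xE9;`, decodes them, and then lets the built-in `encodeURI` produce the UTF-8 percent-encoding. The `decodeEntitiesToEncodedUrl` name and the example.com URL are hypothetical, and `String.fromCodePoint` requires an ES2015-capable engine.

```javascript
// Hypothetical helper: decode numeric character references in a URL,
// then percent-encode the result as UTF-8.
function decodeEntitiesToEncodedUrl(url) {
    var decoded = url
        // hexadecimal references, e.g. &#xE9;
        .replace(/&#x([0-9a-fA-F]+);/g, function (match, hex) {
            return String.fromCodePoint(parseInt(hex, 16));
        })
        // decimal references, e.g. &#233;
        .replace(/&#(\d+);/g, function (match, dec) {
            return String.fromCodePoint(parseInt(dec, 10));
        });
    // encodeURI keeps ":" and "/" intact and encodes non-ASCII characters as UTF-8 bytes
    return encodeURI(decoded);
}

console.log(decodeEntitiesToEncodedUrl("https://example.com/fr-fr/cr&#xE9;er-un-site"));
// https://example.com/fr-fr/cr%C3%A9er-un-site
```

Note that `encodeURI` also escapes `%`, so a URL that is already percent-encoded would be encoded twice; apply it only to URLs that still contain raw characters.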
Thanks,
Narsing

Thanks! Will try it out!