Convert HTML field content into a normal string

Rohit89
Kilo Expert

Hi,

I want to convert the content of an HTML field to a normal string without escaped special characters, that is, get the value as a string exactly as it is displayed in the HTML field. Below is the value it returns for the special characters ？ (a full-width question mark) and &:

find_real_file.png

1 ACCEPTED SOLUTION

Kevin Dugger
Kilo Guru

Good Afternoon Rohit,

I had to do a bit of digging for this, as part of your example uses a full-width question mark, which doesn't easily translate. I have some code below which should convert properly for your purposes, but it may need to be modified depending on your use case.

In my example, I use the unEscapeHTML() method of the GlideStringUtil API to decode the standard HTML character entities in the string. After that, I use a regex to capture the hex value of each remaining hexadecimal character reference (such as &#xff1f;), convert that value to an integer with parseInt(), and turn the resulting integer into a normal character with the String.fromCharCode() method.

I should note that this may well not work for every use case; however, I believe it will work for yours.

Code :

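function decodeEntities(encodedString) {
    // unEscapeHTML() decodes the standard named entities (e.g. &amp;) and returns a Java object;
    // j2js() converts it so that the JavaScript replace() method can be used on it.
    var unescaped = j2js(new GlideStringUtil().unEscapeHTML(encodedString));
    // Convert the remaining hexadecimal character references (e.g. &#xff1f;) to characters.
    return unescaped.replace(/&#x([0-9a-f]+);/gi, function(match, hexDigits) {
        return String.fromCharCode(parseInt(hexDigits, 16));
    });
}

gs.info(decodeEntities("&#xff1f; &amp;"));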

Results :

*** Script: ？ &

 

As for why I wrapped the output of unEscapeHTML() in j2js(): unEscapeHTML() returns a Java object rather than a JavaScript string, so it has to be converted before the subsequent JavaScript string methods can process it.
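For readers who want to try the decoding step outside a ServiceNow instance, here is a minimal sketch in plain JavaScript. It is not the code above and does not use GlideStringUtil: the function name decodeEntitiesSketch and the small NAMED_ENTITIES table are assumptions made for illustration, and only a handful of named entities are covered. It decodes everything in a single pass, so already-decoded text is not decoded a second time, and it uses String.fromCodePoint(), which assumes an ES2015-capable engine, so that code points above U+FFFF are handled as well.

```javascript
// Hypothetical lookup table: only a few common named entities, not the full HTML list.
var NAMED_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: "\"", apos: "'", nbsp: "\u00a0" };

// Hypothetical helper, not part of any ServiceNow API.
function decodeEntitiesSketch(encoded) {
    // One pass over hex references (&#x...;), decimal references (&#...;) and named entities.
    return String(encoded).replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, function (match, body) {
        if (body.charAt(0) === "#") {
            var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
            var codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
            // Values beyond the Unicode range are left untouched instead of throwing.
            return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
        }
        // Unknown names are returned unchanged.
        return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body) ? NAMED_ENTITIES[body] : match;
    });
}
```

Calling decodeEntitiesSketch("&#xff1f; &amp;") returns the full-width question mark followed by a space and an ampersand, matching the result shown above.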

Hope this helps get you started.

 

Thanks,

Kevin


3 REPLIES

sachin_namjoshi
Kilo Patron

 

You will have to use a regex to strip the special characters from the HTML field value.

Use the script below to keep only the letters, digits, and underscores of the string:

 

var str = "1234%&$1230kjsfd.,><?";
var test = str.replace(/[^\d\w]/gi, '');
gs.print("Str: " + str + "\ntest: " + test);
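To make the effect of that regex concrete, here is a small plain-JavaScript sketch. The helper name stripNonWordChars is hypothetical and not from the thread; it uses /\W/g, which matches the same characters as /[^\d\w]/gi because \w already includes digits. Two caveats are worth seeing in action: applied to a value that is still HTML-escaped, the entity names and hex digits survive as plain letters, and non-ASCII characters are removed entirely.

```javascript
// Hypothetical helper for illustration; the name is not from the thread.
function stripNonWordChars(text) {
    // \W matches anything other than ASCII letters, digits, and underscore,
    // which is the same set of characters as [^\d\w].
    return String(text).replace(/\W/g, "");
}
```

With the sample string from the script above, stripNonWordChars("1234%&$1230kjsfd.,><?") returns "12341230kjsfd". With a still-escaped value such as "&#xff1f; &amp;" it returns "xff1famp", so this approach probably only makes sense after the entities have been decoded, and only if losing spaces and punctuation is acceptable.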

 

Regards,

Sachin

 

 

Slava Savitsky
Giga Sage
What are you trying to do with that converted value?
