How to parse an incoming email that is in HTML format

Community Alums
Not applicable

A field in the incoming email arrives like this:

<td valign="top" align="left" class="updates-diff-label" style="border-collapse:collapse; border-spacing:0px; color:#172b4d; padding:0px; max-width:150px; vertical-align:top; color:#5e6c84; text-align:left; padding-right:10px; padding-bottom:5px">
Detected in:</td>

 

As you can see above, there is a "Detected in" field that I need to extract. Its value arrives in a separate <td> element, as below:

<td valign="top" align="left" class="updates-diff-content" style="border-collapse:collapse; border-spacing:0px; color:#172b4d; padding:0px; vertical-align:top; text-align:left; padding-bottom:5px">
Development</td>

 

The value I need to extract is "Development".

I wrote the inbound action script below just to check whether I can extract anything:

var det=email.body.detected_in;

 

If I try to print the value above, nothing is printed. Can you guide me on how to read the values of such fields? All the fields in the incoming email arrive like this.

Please advise.

2 ACCEPTED SOLUTIONS

Shubham_Jain
Mega Sage

Approach

You can use regular expressions to extract both the label (Detected in) and its corresponding value (Development). Since the content is HTML, we can look for the <td> tags that contain these values.

 

 

// Get the email body as a string
var emailBody = email.body_html; // Use body_html to get the HTML content

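// Regular expression to capture the "Detected in" field and its value.
// [^>]* keeps each match inside a single tag, and \s* / [\s\S] tolerate the
// line breaks that follow the opening <td> tags in the sample email.
var detectedInPattern = /<td[^>]*>\s*Detected in:\s*<\/td>\s*<td[^>]*>\s*([\s\S]*?)\s*<\/td>/i;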

// Execute the regex pattern on the email body
var detectedInMatch = detectedInPattern.exec(emailBody);

if (detectedInMatch && detectedInMatch[1]) {
    // Extracted value of "Detected in" field
    var detectedInValue = detectedInMatch[1].trim();
    gs.log('Detected in: ' + detectedInValue);  // This should print "Development"
} else {
    gs.log('No match found for Detected in field');
}
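
Since the question mentions that all fields arrive in this label/value layout, the same idea can be extended to collect every pair in one pass. The following is only a sketch: the function name extractDiffFields is hypothetical, and it assumes every label cell uses class "updates-diff-label" and every value cell uses class "updates-diff-content", as in the sample markup from the question.

```javascript
// Sketch: collect every label/value pair from the HTML body into one object.
// Assumption: label cells carry class "updates-diff-label" and the cell that
// immediately follows carries class "updates-diff-content".
function extractDiffFields(html) {
    var fields = {};
    var pairPattern = /<td[^>]*class="updates-diff-label"[^>]*>\s*([^<]*?)\s*<\/td>\s*<td[^>]*class="updates-diff-content"[^>]*>\s*([\s\S]*?)\s*<\/td>/gi;
    var match;
    while ((match = pairPattern.exec(html)) !== null) {
        // Drop the trailing colon from the label, e.g. "Detected in:" -> "Detected in"
        var label = match[1].replace(/:\s*$/, '').trim();
        // Remove any nested tags (such as <b>) from the value cell
        var value = match[2].replace(/<[^>]+>/g, '').trim();
        fields[label] = value;
    }
    return fields;
}

// In an inbound action this might be called as:
//   var fields = extractDiffFields(String(email.body_html));
//   var detectedIn = fields['Detected in'];
```

Entity references such as &nbsp; are not decoded by this sketch, so values containing them would need an extra replace step.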

 

✔️ If this solves your issue, please mark it as Correct.


✔️ If you found it helpful, please mark it as Helpful.



Shubham Jain


Amit Verma
Kilo Patron

Hi @Community Alums 

 

I would suggest first stripping the HTML tags from the email body using the regex

/<[^>]+>/g

After that, you are left with Detected in:Development. You can then split on : and extract Development. See the snippet below:

var htmlString = '<td valign="top" align="left" class="updates-diff-label" style="border-collapse:collapse; border-spacing:0px; color:#172b4d; padding:0px; max-width:150px; vertical-align:top; color:#5e6c84; text-align:left; padding-right:10px; padding-bottom:5px">Detected in:</td><td valign="top" align="left" class="updates-diff-content" style="border-collapse:collapse; border-spacing:0px; color:#172b4d; padding:0px; vertical-align:top; text-align:left; padding-bottom:5px">Development</td>';
var plainText = htmlString.replace(/<[^>]+>/g,'').trim();
var detectedIn = (plainText.split(':')[1]).trim();
gs.print(detectedIn);

Output: Development
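
When the body contains several fields, or a value that itself contains a colon, split(':')[1] can pick up the wrong piece. A variation on the same tag-stripping idea is sketched below; the function name valueAfterLabel is hypothetical, and it assumes each label and each value sits in its own HTML element, as in the sample from the question.

```javascript
// Sketch: strip tags, then return the text that follows a given label.
// Assumption: the label ("Detected in:") and its value are in separate
// elements, so replacing tags with line breaks puts them on separate lines.
function valueAfterLabel(html, label) {
    var lines = html.replace(/<[^>]+>/g, '\n').split('\n');
    var cleaned = [];
    for (var i = 0; i < lines.length; i++) {
        var text = lines[i].trim();
        if (text) {
            cleaned.push(text); // keep only non-empty lines
        }
    }
    var wanted = (label + ':').toLowerCase();
    for (var j = 0; j < cleaned.length - 1; j++) {
        if (cleaned[j].toLowerCase() === wanted) {
            return cleaned[j + 1]; // the line right after the label
        }
    }
    return null; // label not found
}

// Example call: valueAfterLabel(htmlString, 'Detected in');
```

Because the value is taken as a whole line rather than split on a colon, a value such as a time stamp is returned intact.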

Thanks and Regards

Amit Verma


Please mark this response as correct and helpful if it assisted you with your question.

4 REPLIES

Community Alums
Not applicable

When I try to print the following:

var emailBody = email.body_html;
gs.log('emailbody' + emailBody);

nothing is printed.
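
The thread does not confirm the cause. One thing that may be worth checking is whether body_html is actually populated for this message, falling back to body_text if it is not. The helper below is a hypothetical sketch of that check (the name pickEmailBody is not a platform API); it only assumes the email object exposes body_html and body_text. If the inbound action runs in a scoped application, gs.log is not available there, and gs.info is the usual substitute.

```javascript
// Sketch: report which body variant is populated before trying to parse it.
// Assumption: the inbound action's email object exposes body_html and body_text.
function pickEmailBody(email) {
    if (email.body_html) {
        return { source: 'body_html', content: String(email.body_html) };
    }
    if (email.body_text) {
        return { source: 'body_text', content: String(email.body_text) };
    }
    return { source: 'none', content: '' };
}

// In the inbound action this might be used as:
//   var body = pickEmailBody(email);
//   gs.info('body source=' + body.source + ', length=' + body.content.length);
```

Logging the source and length rather than the whole body keeps the log entry short and makes an empty body obvious.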

Najmuddin Mohd
Mega Sage

Hi @Community Alums ,

Is 'Development' one of the values of a predefined choice field?

If yes, you can check whether the email body contains 'Development' or one of the other choice values.
If it contains 'Development', return Development.
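
A minimal sketch of that check follows. The function name findChoiceInBody and the list of choices are hypothetical; the real list would come from the choice field. Note that this matches the word anywhere in the body, so it can give a false positive if a choice value also appears in unrelated text.

```javascript
// Sketch: return the first known choice value that appears in the email body.
// Assumption: the caller supplies the choice values; the ones in the example
// call below are made up for illustration.
function findChoiceInBody(body, choices) {
    // Strip tags so attribute text is not searched, then compare case-insensitively
    var text = String(body).replace(/<[^>]+>/g, ' ').toLowerCase();
    for (var i = 0; i < choices.length; i++) {
        if (text.indexOf(choices[i].toLowerCase()) !== -1) {
            return choices[i];
        }
    }
    return null; // none of the choices was found
}

// Example call:
//   findChoiceInBody(String(email.body_html), ['Development', 'Test', 'Production']);
```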

Hope this helps.

Regards,
Najmuddin.
