Stripping hyperlinks from inbound email

Brendan Hallida
Kilo Guru

Hi all,

I was looking at the ability to strip hyperlinks from inbound emails.

We use McAfee, and it encrypts the email hyperlinks, so we end up with 1,000-character comments when 100 would do.

Cheers,

Brendan


Chuck Tomasi
Tera Patron

Hi Brendan,



Can you provide an example? A RegEx pattern on the body of the message will likely work; it's just a matter of getting a sample of the encrypted hyperlink.



If it follows the standard <a href="http...">text</a> format, then it shouldn't be a major effort to strip them out. If, however, they are in a different format (hopefully a consistent format between messages), then it's a different pattern.



Examples are appreciated. If they are more than a few lines (you mentioned 1,000 characters), consider putting them in a text file and attaching it.



Thank you


Hi Chuck,



Thanks for your reply.



For example, it looks like this:



www.example.com<http://cp.mcafee.com/d/5fHC54RTdsdesMUSyMU-MdffgfhfaDSSQsxdxMqenDS3tPqd   and then the encrypted link goes on for about 100 more characters.



The one thing I have noticed is that it always uses the same beginning: <http://cp.mcafee.com/d/


Do they all end in a space, a bracket, or something else predictable?


If it's predictable... I use a Script Include with my Inbound Mail Rules to find a string in the middle of a larger string, pull that string out, and save it for later use. You must know the strings before and after the URL you are looking for; in your case, maybe "<http://cp.mcafee.com/d/" and ">".


You could take my script and modify it slightly to save the pieces of text you need instead of the middle string, as I was doing. You would also need to loop through it to get them all.



Here is the code to get you started.



//Example:
//var myVariable = new MagicExtract();
//myVariable.parseAndExtract('stringzzz~test~xxxtext', 'z~', '~'); // returns 'test'

var MagicExtract = Class.create();
MagicExtract.prototype = {

  initialize: function() {
  },

  // Returns the text between the first occurrence of strDelimiterOne and the
  // next occurrence of strDelimiterTwo, or an error string if either is missing.
  parseAndExtract: function(strBlob, strDelimiterOne, strDelimiterTwo) {
    var str = strBlob;
    gs.print(str);

    var first = str.indexOf(strDelimiterOne);
    gs.print(first);
    var second = str.indexOf(strDelimiterTwo, first + strDelimiterOne.length);
    gs.print(second);

    //Verify we found both delimiters or return an error
    var res = '';
    if (first != -1 && second != -1) {
      res = str.substring(first + strDelimiterOne.length, second);

      //Keep commented code below for t-shooting purposes. MLS
      //gs.log('first: ' + first + ' second: ' + second);
      //gs.log('res: ' + res);
      //gs.log('strDelimiterOne.length: ' + strDelimiterOne.length);
    } else {
      res = '<Err:"DELIMITER NOT FOUND ERROR">';
    }
    return res;
  },

  type: 'MagicExtract'
};
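Putting the extract-and-loop idea above together, a minimal sketch in plain JavaScript might look like this. It is untested against a real instance; the delimiters come from Brendan's sample, and the function name is my own invention:

```javascript
// Repeatedly remove every "<http://cp.mcafee.com/d/...>" wrapper from a string.
// Assumes each rewritten link starts with "<http://cp.mcafee.com/d/" and ends
// at the next ">", matching the sample posted above.
function stripMcAfeeLinks(text) {
  var start = '<http://cp.mcafee.com/d/';
  var end = '>';
  var result = text;
  var first = result.indexOf(start);
  while (first !== -1) {
    var second = result.indexOf(end, first + start.length);
    if (second === -1) {
      break; // no closing delimiter; leave the remainder untouched
    }
    // Splice out everything from "<" through ">" inclusive.
    result = result.substring(0, first) + result.substring(second + end.length);
    first = result.indexOf(start);
  }
  return result;
}
```

Inside an inbound email action you would call this on email.body_text before writing the comment.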


Thanks, Brendan.



If it's always 100 characters (no more, no less) and you just want to strip it out, I'd go with something like this (warning: untested):



var str = 'YOUR TEXT';

// Dots are escaped so they match literally; note the domain is cp.mcafee.com.
var patt = /http:\/\/cp\.mcafee\.com\/d\/[0-9A-Za-z!@.,;:'"?-]{100}/g;

var result = str.replace(patt, '');



result now contains your original string, stripped clean of all the long, nasty URLs.
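If the rewritten links turn out to vary in length, a pattern anchored on the angle-bracket wrapper visible in Brendan's sample may be more robust than counting exactly 100 characters. This is a sketch with a made-up sample string, not tested against real McAfee output:

```javascript
// Strip every "<http://cp.mcafee.com/d/...>" span regardless of length.
// [^>]* consumes everything up to (but not including) the closing ">".
var str = 'www.example.com<http://cp.mcafee.com/d/5fHC54RTdSAMPLE> please click';
var patt = /<http:\/\/cp\.mcafee\.com\/d\/[^>]*>/g;
var result = str.replace(patt, '');
```

This leaves the visible text (www.example.com) intact and drops only the wrapped McAfee URL.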