‎09-25-2014 08:08 AM
I would like to validate that an attached file is in fact a valid PDF, and not just a random file renamed to *.pdf.
Does anyone know if this is possible?
‎09-26-2014 06:45 PM
I think it would be pretty challenging to truly validate a PDF with JavaScript, because there are several PDF versions and you'd have to check against each of them. Some tools create more standards-compliant PDFs than others (and some PDF readers will happily load poorly formatted files), which makes true validation even harder.
A quick-and-dirty check, though, is to see whether the file starts with "%PDF-", as that is one thing that should be consistent. The first line of a PDF's header is supposed to be simply that marker followed by the version number (for instance, %PDF-1.4). If a file starts with it, I would think it's likely to be a PDF; if it doesn't, it's unlikely to be a PDF, or at least not a well-formatted one.
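To make the header test concrete, here is a minimal sketch in plain JavaScript. The function name `looksLikePdfHeader` is made up for illustration, and the single-digit "major.minor" version pattern is my assumption about what the version number looks like, so treat it as a starting point rather than a real validator. Note that it needs at least the first eight characters of the file, not just five.

```javascript
// Hypothetical helper: returns true when the text begins with a PDF header
// such as "%PDF-1.4". The digit.digit version pattern is an assumption.
function looksLikePdfHeader(head) {
    // String() coerces non-JavaScript string types before the regex test.
    return /^%PDF-\d\.\d/.test(String(head));
}
```

A renamed ZIP or text file fails this check, while anything that merely begins with a PDF-style header passes, so it only filters out obviously wrong uploads.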
I played around with a Business Rule (which runs after insert on the sys_attachment table) to try to get the first 5 characters of an attachment and the best I could come up with is this:
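// Java utility class exposed to server-side scripts; used here for base64 helpers.
var StringUtil = Packages.com.glide.util.StringUtil;
var sa = new GlideSysAttachment();
// getBytes() returns the attachment's raw bytes; the base64 encode/decode
// round trip turns them into a string so the first five characters can be read.
var head = StringUtil.base64Decode(StringUtil.base64Encode(sa.getBytes(current))).substring(0,5);
gs.log("head: " + head);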
When I attach a PDF to an Incident, I find "head: %PDF-" in the System Log. Rather than write it to the log, of course, you'd probably want to use it in a conditional statement to take whatever action you have in mind based on whether it matches.
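As a rough sketch of what that conditional could look like, the decision can be pulled into a small helper. Everything here is hypothetical: the function name `classifyAttachment` and its return values are made up for illustration, and I'm assuming the attachment's file name is available to pass in alongside `head`.

```javascript
// Hypothetical decision helper; the names and return values are illustrative.
function classifyAttachment(fileName, head) {
    // Only files that claim to be PDFs need the header check.
    var claimsPdf = /\.pdf$/i.test(String(fileName));
    // indexOf(...) === 0 means the text starts with the PDF marker.
    var hasPdfHeader = String(head).indexOf("%PDF-") === 0;
    if (!claimsPdf) {
        return "not-a-pdf-upload";
    }
    return hasPdfHeader ? "accept" : "reject";
}
```

In the Business Rule you would call this with the file name from the attachment record and the `head` value from the script above, then log, notify, or remove the attachment when it returns "reject". Which action makes sense depends on your process.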
Because my script is in a Business Rule for the sys_attachment table, 'current' holds the reference to my attachment. If you run your script somewhere else, you would use the variable you declare for your GlideRecord object (which should also be initialized for the sys_attachment table).
I think there must be a more sensible way to get at the contents of the file, but I haven't found one. I've not been able to find much documentation on the StringUtil package or the GlideSysAttachment object. If someone knows of a better way (or a link to documentation), I hope they share it.
‎09-29-2014 12:09 AM
Thank you very much for your answer. I think your solution will be enough to filter out the majority of wrong uploads. And if my users have trouble uploading real PDFs that get filtered out because of poor formatting, they will probably let me know anyway.