‎02-15-2017 06:17 AM
Our business has a requirement for attachment indexing: a search should not only return the record whose attachment contains the specified text, but also show where within the attachment the text was found.
I can't find any information about where the text indexes are stored or how this could be achieved in ServiceNow. Has anyone come across this before, or does anyone have information that might help?
‎02-15-2017 03:09 PM
Hi Graham,
ServiceNow's text indexing of attachments does not record where in the document a given term occurs. When an attachment is indexed, its content is read through an input stream and, as the stream is read, each term is pulled out into a simple list of "terms". These terms are then connected to a word-stemming database that handles the connections between similar words (e.g. run, runs, running). Each root word found in the "terms" is also linked to a "document" (i.e. whatever record the attachment is attached to). The API that handles all of this is not exposed, and even if it were, there is simply no mechanism for determining where a given term occurs within an attachment.
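To make the indexing flow described above concrete, here is a minimal Python sketch. It is not ServiceNow code and does not reflect the platform's internal API; the names `naive_stem`, `index_attachment`, and `search` are hypothetical, and the stemmer is a toy stand-in for the real word-stemming database. It only illustrates why an index built this way can answer "which record?" but not "where in the attachment?".

```python
import io
import re
from collections import defaultdict


def naive_stem(term: str) -> str:
    """Toy stemmer (assumption): strips a few English suffixes, not a real algorithm."""
    for suffix in ("ning", "ing", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def index_attachment(stream, document_id, index):
    """Read the stream line by line and link each root word to the document.

    Only the (root word -> document) link is stored; the offset of the term
    inside the attachment is discarded as soon as the term is extracted.
    """
    for line in stream:
        for term in re.findall(r"[a-z0-9]+", line.lower()):
            index[naive_stem(term)].add(document_id)


def search(index, query):
    """Return the set of documents linked to the query's root word."""
    return index.get(naive_stem(query.lower()), set())


if __name__ == "__main__":
    index = defaultdict(set)
    attachment = io.StringIO("She runs daily.\nRunning is fun.")
    index_attachment(attachment, "incident:INC001", index)
    print(search(index, "run"))
```

Because the index maps root words to records only, a search for "run" can find the record, but nothing in the stored data says which line or page matched.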
I started to wonder whether there was a way to scan through an attachment's text contents directly. The problem is that attachments are stored in the sys_attachment_doc table as 10 KB, Base64-encoded chunks of binary data. They have to be reassembled to be useful, and there is no API for getting excerpts of an attachment's text contents. Unfortunately, I do not think there is currently any way to do what you want. Perhaps someone has written a custom application.
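For anyone considering the custom route, the reassembly step might look roughly like the Python sketch below. It runs outside the platform and rests on several assumptions: the chunks have already been exported as `(position, base64_text)` pairs, each chunk was Base64-encoded on its own with no additional compression, and the attachment is plain UTF-8 text. The helper names `reassemble` and `find_term` are hypothetical, not a ServiceNow API, and binary formats such as PDF or Word would need a separate text-extraction step first.

```python
import base64
import re


def reassemble(chunks):
    """Join exported chunks back into the original bytes.

    `chunks` is an iterable of (position, base64_text) pairs (assumed layout).
    Each chunk is decoded separately, then concatenated in position order.
    """
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(base64.b64decode(data) for _, data in ordered)


def find_term(text, term, context=20):
    """Return (offset, excerpt) pairs for each case-insensitive match of `term`."""
    hits = []
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = max(0, match.start() - context)
        hits.append((match.start(), text[start : match.end() + context]))
    return hits


if __name__ == "__main__":
    raw = b"The server failed at noon. Restart the server."
    # Simulate two out-of-order chunks as they might be exported.
    chunks = [
        (1, base64.b64encode(raw[10:]).decode("ascii")),
        (0, base64.b64encode(raw[:10]).decode("ascii")),
    ]
    text = reassemble(chunks).decode("utf-8")
    for offset, excerpt in find_term(text, "server"):
        print(offset, repr(excerpt))
```

This only shows the shape of the problem: the offsets come from scanning the reassembled text, not from the text index, so every search would have to re-read the attachment.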
‎02-17-2017 12:49 AM
Hi Matthew,
Thank you for the information. I must admit I thought it was a long shot. I'll look into custom applications made by others to see whether any of them can address my use case.
Thanks again