Finding and deleting orphaned attachments

Janel

Looks like I can't create articles anymore so I guess I'll post it as a question.

 

2025-01-29 Edit: As of Washington, there is now an out-of-box way to delete orphaned attachments...

...which ServiceNow is keeping really quiet about.  Maybe this article will resurface one day.

https://support.servicenow.com/kb?id=kb_article_view&sysparm_article=KB1706592

In the meantime, Mark Roethof was able to cover some of the out-of-box job here: https://www.servicenow.com/community/now-platform-articles/purge-orphan-attachments-custom-solution-...

 

Recently, we got to go through a lovely exercise of drastically reducing our storage footprint.  I'd like to share how we were able to identify orphaned records on sys_attachment and delete them.

 

The biggest issue with orphans on this (or any) table is that orphaning can happen at any time to any attachment record.  A record from 2018 could be orphaned in 2024 and there is no way to know about it via the sys_attachment record.  There is no value change to the record: it still has a table_name and table_sys_id populated, but that table_sys_id points to a deleted record.  Not a chance you could find those in a list view.

 

So how do you know what is legit and what should be deleted? 

Check every record.
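
The core of that check, as a minimal sketch for a single attachment (the sys_id here is a placeholder):

var att = new GlideRecord('sys_attachment');
if (att.get('<attachment_sys_id>') && gs.tableExists(att.getValue('table_name'))) {
    var parent = new GlideRecord(att.getValue('table_name'));
    //If the parent record can't be fetched, the attachment is an orphan.
    if (!(parent.get(att.getValue('table_sys_id')) && parent.isValidRecord())) {
        gs.log(att.getUniqueValue() + ' is orphaned');
    }
}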

 

First, we wanted to know where the orphans were coming from and fix those processes where we could.  What tables are we commonly finding orphans on? 

Since sys_attachment is so big, I manually ran this script in 6-month intervals, and at the end I had it update a record so I had somewhere to hold on to the data.

Unfortunately, because an attachment can be orphaned at any time, I had to keep repeating this from the oldest records to the newest to get a few days' worth of data.

 


//Capture how many attachments we have that have a table_name and table_sys_id populated, but not valid records.
var start_date = '2023-06-27';
var end_date = '2023-07-06';

var valid_attachments = new GlideRecord('sys_attachment');
valid_attachments.addEncodedQuery("table_nameNOT LIKEZZ_YY^table_nameNOT LIKEinvisible^table_name!=placeholder^sys_created_onBETWEENjavascript:gs.dateGenerate('" + start_date + "','00:00:00')@javascript:gs.dateGenerate('" + end_date + "','23:59:59')");
valid_attachments.orderBy('sys_created_on');
valid_attachments.query();

var skipped = 0;
var deleted = 0;
var orphaned_tables = []; //Capture number of times a table appears
var orphaned_ids = ''; //Only use for testing
while (valid_attachments.next()) {
    if (!gs.tableExists(valid_attachments.table_name)) { //If the table_name is invalid, do nothing
        //orphaned_tables.push('!table-' +valid_attachments.table_name); //Or capture it!
        continue;
    }
    var record_check = new GlideRecord(valid_attachments.table_name);
    if (record_check.get(valid_attachments.table_sys_id) && record_check.isValidRecord()) {
        skipped++;
        continue;
    } else {
        deleted++;
        orphaned_tables.push(valid_attachments.getValue('table_name'));
        //orphaned_ids += valid_attachments.sys_id + ','; //Uncomment for a SMALL amount if you want to look at some specific test records.
        //valid_attachments.deleteRecord();  //We chose to delete the records as we went.
    }
}

orphaned_tables.sort();
//We want to know how many times the same tables appear in our orphaned_table list.
var table = null; //The table we want to count on.
var table_count = 0;
var final_list = '';
for (var i = 0; i < orphaned_tables.length; i++) {
    if (orphaned_tables[i] != table) { //If the current element is not equal to the currently set table we're checking on, that means we moved on to the next table and we capture the final count now.
        if (table_count > 0) {
            final_list += table + ': ' + table_count + '\n';
        }
        table = orphaned_tables[i]; //Reset this to the next table to start counting on.
        table_count = 1; //And go back to one
    } else {
        table_count++; //If the current element is the same table, add to the counter.
    }
}
if (table_count > 0) {
    final_list += table + ': ' + table_count + '\n';
}

var text = '\n' + gs.nowDateTime() + " Valid attachment check: " + start_date + " to " + end_date + "\nValid: " + skipped + '\nDeleted: ' + deleted + '\n\n' + final_list + '-------- -------- -------- -------- \n'; // +orphaned_ids;

var updateStory = new GlideRecord('rm_story'); //Instead of trying to log all this, update the Description in STRY0057314
updateStory.get('544229f7c3a2a9d4caa7b1deb00131db');
updateStory.description += text;
updateStory.update();


The above helped us learn where we were seeing orphans come from (mostly from archiving and ml_model_artifact), and we were able to start addressing them.

 

 

Now that we had figured out and fixed where some of them were coming from, we knew this was just going to be a thing going forward and we needed a way to keep addressing it.

 

Next, I made a scheduled job that crawls the sys_attachment table in increments of 200,000 records every hour and verifies whether the record each attachment is linked to is still valid.  If it isn't, the attachment gets deleted.  This is very similar to the above, just running in automated chunks (now that I think about it, I could have used this same setup below to automate the above with dates instead of adjusting them manually).
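
A hedged sketch of that date-driven automation, using a hypothetical property to remember where the last run stopped:

//Pull the window start from a property (hypothetical name), audit one week, then move the property forward.
var start_date = gs.getProperty('orphaned.attachment.audit.start', '2018-01-01');
var gdt = new GlideDateTime(start_date + ' 00:00:00');
gdt.addDaysUTC(7);
var end_date = gdt.getDate().getValue(); //yyyy-MM-dd

//...run the same check/delete logic as the script above with start_date and end_date...

//Remember where we left off so the next scheduled run picks up from here.
gs.setProperty('orphaned.attachment.audit.start', end_date);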

 

Adjust the values for what works for you and as always, test in a sub-prod first!

 

You will need two system properties to keep track of which chunk of records to look at (yes, the window will shift slightly as records are added or deleted through normal means).

Both properties should be integers. 

Property 1 (called "orphaned.attachment.cleanup.window1") should be set to zero

Property 2 (called "orphaned.attachment.cleanup.window2") should be set to the size of the first chunk of records you want to check (I started ours at 200,000).
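
If you'd rather create them from a background script than through the sys_properties list, a minimal sketch:

var names = ['orphaned.attachment.cleanup.window1', 'orphaned.attachment.cleanup.window2'];
var values = ['0', '200000'];
for (var i = 0; i < names.length; i++) {
    var prop = new GlideRecord('sys_properties');
    prop.initialize();
    prop.name = names[i];
    prop.type = 'integer';
    prop.value = values[i];
    prop.insert();
}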

 

Now you'll need the scheduled job.

Name: Delete orphaned attachments.

Run: Periodically

Repeat Interval: 1 hour


//We get many orphaned attachments on sys_attachment where table_sys_id is left with a dead sys_id.  Find these records and attempt to look them up.  If there is no valid record found, delete the attachment.
//We know archiving leaves behind attachments and so does ml_model_artifact (there is a PRB for that).

//When an attachment gets orphaned there is no update to the attachment record so we can't do something simple like, look for updates made yesterday and only check those.  An attachment from 2018 can get orphaned in 2024 and we'd never know about it unless we look at all the attachments.  So we do that hourly and in chunks using chooseWindow to do paging.  This is better than setLimit because setLimit will just look at the same first chunk of records over and over.  We need to paginate.

var attach_total = new GlideAggregate('sys_attachment');
attach_total.addEncodedQuery("table_nameNOT LIKEZZ_YY^table_nameNOT LIKEinvisible^table_name!=placeholder");
attach_total.addAggregate('COUNT');
attach_total.query();
var total = 0;
attach_total.next();
total = attach_total.getAggregate('COUNT'); //We need the total so we know when to reset our window sizes.


var chunky = 200000; //This is the number of records we will do at a time.
var window1 = Number(gs.getProperty('orphaned.attachment.cleanup.window1')); //This will be where we left off from last run
var window2 = Number(gs.getProperty('orphaned.attachment.cleanup.window2')); //This will be the end of the window, set by the last run.


var valid_attachments = new GlideRecord('sys_attachment');
valid_attachments.addEncodedQuery("table_nameNOT LIKEZZ_YY^table_nameNOT LIKEinvisible^table_name!=placeholder"); //ZZ_YY tables are fields that contain images (like catalog item list images), invisible is for OOB things, and placeholder has a ton of records that we have a table cleaner for already.
valid_attachments.chooseWindow(window1, window2);
valid_attachments.orderBy('sys_created_on');
valid_attachments.query();

while (valid_attachments.next()) {
    if (!gs.tableExists(valid_attachments.table_name)) { //If the table doesn't exist, skip it.  How you handle these is up to you.
        continue;
    }
    var record_check = new GlideRecord(valid_attachments.table_name);
    if (record_check.get(valid_attachments.table_sys_id) && record_check.isValidRecord()) { //If the record is valid...
        continue; //Do nothing and continue to the next one.
    } else {
        valid_attachments.deleteRecord();  //If the record is not valid, then delete it.
    }
} //End of while

//When window2 exceeds the total number of records, we reset the system properties.

//We get the properties via GlideRecord so we can use autoSysFields(false) and not flood the versions list with a zillion things since it runs hourly.
//If you don't care that the versions list will be huge, you can instead delete this and just do gs.setProperty('property.name', value);
var prop1 = new GlideRecord('sys_properties');
prop1.get('name', 'orphaned.attachment.cleanup.window1');
var prop2 = new GlideRecord('sys_properties');
prop2.get('name', 'orphaned.attachment.cleanup.window2');

if (total - window2 < 0) { //Once our total number of records minus our largest pagination number is less than zero, we restart everything.
    prop1.value = 0;
    prop2.value = chunky;
} else { //If we're not restarting, then set our properties to the updated windows.
    prop1.value = window2.toFixed(0);
    prop2.value = (window2 + chunky).toFixed(0);
}
//If you don't care about the versions list, you can just use setProperty above instead.
prop1.autoSysFields(false); //Don't track as an update
prop1.update();
prop2.autoSysFields(false); //Don't track as an update
prop2.update();


And this has been working pretty well for us.  200,000 records is quick to get through and we haven't seen any sort of performance impact.

 

I hope this helps you identify and manage orphaned attachments!

15 REPLIES

Nah, you are still doing a GlideRecord query inside of the while loop.  This goes back and forth between the server and the DB on every loop.  Because it is a different table each loop, you would of course need to group the first query by table and keep an array of IDs for each table.  Of course this will still be multiple queries, but you are instead capped at one query per table as a maximum.  This can be done with a GlideAggregate grouped by table name.
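
The grouped aggregate being described might look like this sketch: one query that shows which tables the attachments point at and how many each.

var agg = new GlideAggregate('sys_attachment');
agg.addAggregate('COUNT');
agg.groupBy('table_name');
agg.orderByAggregate('COUNT');
agg.query();
while (agg.next()) {
    gs.log(agg.getValue('table_name') + ': ' + agg.getAggregate('COUNT'));
}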

 

Found some old screenshots of the two examples and execution times:

The slow method.
This is what your script is currently doing, using get() within a loop to fetch data from DB one by one.

[screenshot: execution time of the slow method]



The faster method:

[screenshot: execution time of the faster method]


Both of these methods were run on 10,000 incident records.
They both share this setup code:

[screenshot: the shared setup code]

This is just getting 10k records and adding them to the array.
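
The screenshots don't carry over here, but a rough reconstruction of the two patterns being compared (shape only; the timings were in the screenshots):

//Shared setup: collect the sys_ids of 10,000 incident records.
var ids = [];
var inc = new GlideRecord('incident');
inc.setLimit(10000);
inc.query();
while (inc.next()) {
    ids.push(inc.getUniqueValue());
}

//The slow method: one get() per sys_id, so one DB round trip per record.
for (var i = 0; i < ids.length; i++) {
    var one = new GlideRecord('incident');
    one.get(ids[i]);
}

//The faster method: a single IN query fetches the same records in one round trip.
var many = new GlideRecord('incident');
many.addQuery('sys_id', 'IN', ids.join(','));
many.query();
while (many.next()) {
    //work with the records here
}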

The difference is more or less a 10x speedup 🙂
Hope this helps to understand it better!

Feel free to test it out on your own environment! 

What you're suggesting would be good if you were targeting a specific set of tables, like your examples show.  After figuring out the most common orphaned tables, simplifying the job to delete those specific targets like you're suggesting could be the way to go!

 

For finding the orphans, I think your way is more viable.  If I get some free time I'm going to try to test a few things.  For deleting, I could see it getting risky and running into memory issues.

 

It'd be nice if these were references (yes, I know it isn't a good place for them) instead of document IDs, so we could just dot-walk and check for null instead.

When you get the record IDs, just add them to an object where the table name is the key...

Here is an example of all tables that have records in task.

[screenshot: log output grouped by table name]

 

 

var gaTask = new GlideRecord('task');
gaTask.setLimit(100);
gaTask.query();

var results = {};

while(gaTask.next()){
  var className = gaTask.getValue('sys_class_name');
  
  if(!results[className])
    results[className] = [];
  
  results[className].push(gaTask.getUniqueValue());
}

gs.log(JSON.stringify(results)); //Stringify so the object contents actually show in the log.

 


and then you can loop through the object and fetch all the records with one query per table (a sketch of that step follows the next example).
If you want to skip the object-to-array conversion later, you can also just push an object into an array containing two properties: the table name and the IDs.
(Note: this is a more advanced level of code; it might be hard for others to understand how object references in memory work and to maintain it.)
You can also add more fun stuff like the count, etc...
It would look like this:

[screenshot: log output of the array-of-objects structure]

 

var gaTask = new GlideRecord('task');
gaTask.setLimit(100);
gaTask.query();

var objectRef = {};
var results = [];

while(gaTask.next()){
  var className = gaTask.getValue('sys_class_name');
  
  if(!objectRef[className]){
    objectRef[className] = {
      table: className,
      count: 0,
      idArray: []
    };
    
    results.push(objectRef[className]);
  }
  
  objectRef[className].idArray.push(gaTask.getUniqueValue());
  objectRef[className].count++;
	
}

gs.log(JSON.stringify(results)); //Stringify so the object contents actually show in the log.
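
The follow-up step, fetching everything back with one query per table, might look like this sketch (reusing the results array above and collecting which sys_ids actually exist):

var validIds = {};
for (var i = 0; i < results.length; i++) {
    var entry = results[i];
    if (!gs.tableExists(entry.table))
        continue;
    var check = new GlideRecord(entry.table);
    check.addQuery('sys_id', 'IN', entry.idArray.join(','));
    check.query();
    while (check.next()) {
        validIds[check.getUniqueValue()] = true; //This sys_id still points at a real record.
    }
}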

 

 

I hope people are bookmarking/copying your object example.

 

So after you get everything into an array, grouped by table, and do the big query against all the sys_ids, the issue you're left with is that the final query only returns valid records, when we're looking for the invalid ones.

 

I'd have to either keep track of the valid sys_ids in a new object and compare later, or remove each valid sys_id from the initial array (slice is your friend, but you still need the index) so I'm only left with the invalid ones.

 

Then after all that, I'd still need to take care of the invalid attachments.  Now I have to do another query to go back to sys_attachment, query for the table_sys_ids, then delete the records.  If you grouped them, doing a deleteMultiple() may end up with performance issues (depending on how much there is to do, but it has not been my friend lately), and I've witnessed it leaving orphans on sys_attachment_doc.
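
As a sketch of that whole round trip, assuming the validIds lookup from the sketch above, the invalid set falls out of a simple diff, and each orphan is deleted individually rather than with deleteMultiple():

for (var i = 0; i < results.length; i++) {
    var entry = results[i];
    for (var j = 0; j < entry.idArray.length; j++) {
        if (validIds[entry.idArray[j]])
            continue; //The parent record exists, so keep the attachment.
        //Anything left is an orphan; go back to sys_attachment and delete it.
        var att = new GlideRecord('sys_attachment');
        att.addQuery('table_name', entry.table);
        att.addQuery('table_sys_id', entry.idArray[j]);
        att.query();
        while (att.next()) {
            att.deleteRecord();
        }
    }
}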

 

I still think you're going to run into memory issues trying to hang on to and manipulate that much data.