Archiving large volumes of records
01-04-2019 02:41 AM
We have an extremely large incident table, and we have activated archiving of inactive incidents that have been closed for more than 14 months.
However, when we left the 'max number of batches (Max Iterations) to process when archiver runs' at the default value of 10, with Batch Size = 100, we found that multiple batches were attempting to archive the same incident record. This resulted in far fewer records actually being archived, so we have had to drop Max Iterations to 1 to prevent it. Now the number of records being archived each month (with certain related records) is higher than the number of incidents being created, but not by much. At the current rate, it will take a considerable length of time to catch up.
Has anyone else had this issue and found a way to address it? Does anyone have any strategies to speed up the archiving process?
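To put the catch-up problem in rough numbers, here is a minimal sketch of the arithmetic involved. All figures are hypothetical placeholders, not values from the post; the point is only that when the archive rate barely exceeds the creation rate, the backlog drains very slowly.

```javascript
// Rough catch-up estimate with assumed, illustrative numbers:
// the archiver must clear the existing backlog while newly closed
// incidents keep crossing the 14-month threshold.
const backlog = 500000;         // assumed incidents already eligible for archiving
const archivedPerMonth = 60000; // assumed throughput at Max Iterations = 1
const createdPerMonth = 50000;  // assumed incidents newly becoming eligible each month
const netPerMonth = archivedPerMonth - createdPerMonth;
const monthsToCatchUp = backlog / netPerMonth;
console.log(monthsToCatchUp); // 50 months at these assumed rates
```

With a net gain of only 10,000 records a month against a 500,000-record backlog, catching up takes over four years, which matches the "considerable length of time" described above.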
Labels: Instance Configuration

01-04-2019 09:08 AM
Hello Dan,
We encountered a similar situation with a very large customer that had a huge number of records.
We implemented a custom archiving solution for this requirement.
I used a data pump utility to extract data from all the big tables, with their relationships, to an Oracle DB instance. This way you don't need to maintain the data in ServiceNow, and you can direct users to report on the remote DB instance. This will help you save expensive reporting queries on the ServiceNow DB.
Please see the link below for more details on this solution:
https://sourceforge.net/projects/servicenowpump/
Regards,
Sachin
01-07-2019 01:14 AM
Hi Sachin,
Thanks for replying! I will check it out. I'm not sure whether it's feasible for us but will see.
Regards
Dan
05-08-2020 07:13 AM
I have run into the same issue, Dan. You must leave the batch size at 100 and Max Iterations at 1.
Also, if you have configured related tables, then for each archived incident you actually have 20 or 30 more related records (SLAs, metrics, incident tasks, audit, journal entries, etc.). This can make it difficult to catch up, but you are doing the right thing by cleaning all your tables at the right moment. Once you have archived an incident, all the related records lose meaning, so why keep them?
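The related-record multiplier mentioned above can be put in rough numbers. The figure of 25 related records per incident below is an assumed midpoint of the 20-30 range, purely for illustration:

```javascript
// Each archived incident drags its related records along with it
// (SLAs, metrics, incident tasks, audit, journal entries, ...).
const batchSize = 100;          // the recommended Batch Size
const relatedPerIncident = 25;  // assumed average from the 20-30 range above
const rowsPerBatch = batchSize * (1 + relatedPerIncident);
console.log(rowsPerBatch); // 2600 rows moved per batch of 100 incidents
```

So a single batch of 100 incidents actually moves on the order of a few thousand rows, which is why the schedule optimization below matters.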
The next thing to do is optimize the archiving schedule, since you could be wasting time between batches. The calendar is incredibly tricky, and you must be in GMT to configure it.
The best thing you can do is use the script below on the archive job with an interval of 1 minute. In other words, it will check every minute whether it can run a new batch. If the job is not idle, it will simply move on. The result is that the allotted time is fully utilized to archive batches.
On the calendar, the key is to use each day only ONCE.
Don't use the seconds field: put 11:59:00, NOT 11:59:59.
Configure, for example, Saturday from 12:00:00 AM to 11:59:00 PM.
Hope this helps to improve the throughput.
var arStat = new GlideRecord("sys_status");
arStat.addQuery("name", "glide.db.archiver");
arStat.query();
if (arStat.next()) {
    // If the job is idle then we can run it.
    // If the status is anything other than idle, we need to wait
    // until the previous run completes.
    if (arStat.getValue("value").indexOf("idle") == 0) {
        new GlideArchiver().archive();
    } else {
        var now = new GlideDateTime();
        var lastUpdate = arStat.getValue("sys_updated_on");
        var hours = Math.floor(gs.dateDiff(lastUpdate, now, true) / 60 / 60);
        // If the status hasn't changed for more than 3 hours then the job is not
        // actually running - this happens after the node that was running the
        // archive thread has been restarted. Delete the stale status record
        // and start a new archive thread.
        if (hours > 3) {
            var eml = new GlideRecord("sys_email");
            // Put the addresses of the people you want alerted here.
            eml.recipients = "youremail@acme.com";
            eml.type = "send-ready";
            eml.body = "GlideArchiver job restarted because status has not changed for over 3 hours: " + arStat.getValue("value");
            eml.notification_type = "SMTP";
            eml.subject = "GlideArchiver job restarted";
            eml.insert();
            arStat.deleteRecord();
            new GlideArchiver().archive();
        } else {
            var eml = new GlideRecord("sys_email");
            // Put the addresses of the people you want alerted here.
            eml.recipients = "youremail@acme.com";
            eml.type = "send-ready";
            eml.body = "GlideArchiver job skipped because it is already running: " + arStat.getValue("value");
            eml.notification_type = "SMTP";
            eml.subject = "GlideArchiver job skipped";
            eml.insert();
        }
    }
} else {
    new GlideArchiver().archive();
}
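The staleness test at the heart of the script can be sketched in plain JavaScript, with the Glide date APIs replaced by the standard Date object. This is only an illustration of the logic, not ServiceNow code:

```javascript
// Plain-JS sketch of the staleness check: treat the archiver as dead
// if its sys_status row has not been updated for more than maxHours.
function isStale(lastUpdatedMs, nowMs, maxHours) {
    var hours = Math.floor((nowMs - lastUpdatedMs) / 1000 / 60 / 60);
    return hours > maxHours;
}

// 4h13m elapsed -> floor(4.21) = 4 hours -> stale.
console.log(isStale(Date.parse("2020-05-08T03:00:00Z"),
                    Date.parse("2020-05-08T07:13:00Z"), 3)); // true
```

Note that because of the `Math.floor`, the job is only declared dead once a full fourth hour has elapsed, which gives a slow batch plenty of margin before the status record is deleted and the archiver is restarted.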
If I helped you with your case, please click the Thumb Icon and mark as Correct.
01-20-2025 11:34 PM
Hi, where do I use this script?