How to handle long-running scheduled jobs that may overlap

sndevastik
Tera Contributor

Hi All,

 

I have a scenario where a scheduled job is configured to run every 2 hours to fetch and process millions of records. The challenge is that, due to the large data volume, the job can sometimes take up to 3 hours to complete.

In such cases, there is a possibility of overlapping executions, data loss, or errors.

What is the best way to design or handle this situation in ServiceNow to make sure:

  • Data is not lost

  • Jobs do not overlap

  • Errors are minimized

Any suggestions or best practices would be really helpful.

Thanks in advance!

2 ACCEPTED SOLUTIONS

Mark Manders
Mega Patron

You should evaluate the entire process. Why is it necessary to collect millions of records? Can't you use a delta on changed records? That would already limit the load. 
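For illustration, a delta can be as simple as remembering the last successful run in a system property and only querying records changed since then. This is a rough sketch, assuming the table has a reliable update timestamp; the property name, table name, and processRecord function are placeholders:

var lastRun = gs.getProperty('x_myapp.last_delta_run', '1970-01-01 00:00:00');
var jobStart = new GlideDateTime(); // capture start of this run

var gr = new GlideRecord('u_staging_record'); // placeholder table
gr.addQuery('sys_updated_on', '>=', lastRun);
gr.query();
while (gr.next()) {
    // Only records changed since the previous run are processed
    processRecord(gr); // placeholder for your processing logic
}

// Advance the watermark only after the run has finished successfully
gs.setProperty('x_myapp.last_delta_run', jobStart.getValue());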

Then ask why it is necessary to collect millions of records every two hours. Isn't once a day enough? Or maybe every 3-6 hours? 

As it stands, this job effectively runs continuously (it triggers every two hours but can take longer than that, so it is always importing/processing data). If the answers to those 'why' questions show the load really is necessary and can't be reduced, you will need to change the processing to check the 'updated' timestamp of each imported record. If that timestamp is more than two hours after the current job started, a newer run has already touched the record, so skip it; otherwise you would be overwriting it with old data.
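A rough sketch of that staleness check, wrapped in a helper you could call from the processing loop. The table name, lookup field, and the two-hour cutoff are assumptions for illustration:

// Skip a target record if a newer run has already updated it.
function shouldSkip(externalId, jobStart) {
    var cutoff = new GlideDateTime(jobStart.getValue());
    cutoff.addSeconds(2 * 60 * 60); // two hours after this job started

    var target = new GlideRecord('u_target_record'); // placeholder table
    if (target.get('u_external_id', externalId)) {   // placeholder lookup field
        var updated = new GlideDateTime(target.getValue('sys_updated_on'));
        // If a later run already wrote fresher data, don't overwrite it
        return updated.after(cutoff);
    }
    return false; // no existing record, safe to insert
}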

 


Please mark any helpful or correct solutions as such. That helps others find their solutions.
Mark


Keara1122
Tera Contributor

You can handle this by adding a mutex/flag check before the job starts, so a new run won’t trigger if the previous one is still active. Another option is to use job queues or break the process into smaller batches with checkpoints. This way you avoid overlaps, ensure data integrity, and reduce errors.
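As a sketch of the mutex idea, the scheduled script could check a 'running' flag stored in a system property and exit immediately if the previous run has not cleared it. The property name is a placeholder:

var FLAG = 'x_myapp.job_running'; // placeholder property name

if (gs.getProperty(FLAG, 'false') === 'true') {
    gs.info('Previous run still active, skipping this execution.');
} else {
    gs.setProperty(FLAG, 'true');
    try {
        // ...fetch and process the batch here...
    } catch (e) {
        gs.error('Job failed: ' + e);
    } finally {
        // Always clear the flag so the next run is not blocked forever
        gs.setProperty(FLAG, 'false');
    }
}

In practice you would also want to cover the case where a run dies without clearing the flag, for example by storing a timestamp instead of a plain true/false and treating a very old value as stale.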

View solution in original post

2 REPLIES 2

Mark Manders
Mega Patron

You should evaluate the entire process. Why is it necessary to collect millions of records? Can't you use a delta on changed records? That would already limit the load. 

Then check why it is necessary to collect millions of records every two hours? Isn't once a day enough? Or maybe every 3-6 hours? 

This job runs continuously (every two hours and can take up more than that, means it is always running and importing/processing data). If the 'why' of it results in it being necessary and you can't limit the load, you will need to update the processing to check on the 'updated' of the imported records. If that's over two hours after the current job started, don't update it, because you are updating the record with old data.

 


Please mark any helpful or correct solutions as such. That helps others find their solutions.
Mark

Keara1122
Tera Contributor

You can handle this by adding a mutex/flag check before the job starts, so a new run won’t trigger if the previous one is still active. Another option is to use job queues or break the process into smaller batches with checkpoints. This way you avoid overlaps, ensure data integrity, and reduce errors.