Best practices for updating a large number of records
05-17-2023 06:05 AM
Hi,
I need to update about 2 million records. There's no way to identify which records have already been updated. For the purpose of this post, assume that I'm converting the text of a text field to lowercase (so updateMultiple() isn't an option, since we need to read each record's field value and transform it).
A basic update script would look like this:
var customTableGr = new GlideRecord('u_custom_table');
customTableGr.query();
while (customTableGr.next()) {
    var text = customTableGr.getValue('u_text');
    if (text === null) {
        continue; // getValue() returns null for empty fields; toLowerCase() would throw
    }
    customTableGr.setValue('u_text', text.toLowerCase());
    customTableGr.setWorkflow(false); // don't trigger business rules/workflows on update
    customTableGr.update();
}
I'm a little worried about execution time though, since testing with 100k records already took about 8-10 minutes, which would mean that 2 million records would probably take around 3 hours.
Is there any way to speed up the process, or at least ensure that I won't cripple the whole instance for 3 hours?
I'm wondering if slicing the data would improve performance (or at least instance health). I could slice the data into date intervals based on sys_created_on, so that each GlideRecord query would only cover, say, 5 days' worth of records, but that might still be 300k records in one query if I'm unlucky. A rough sketch of that slicing (the start date, the 5-day window, and the table/field names are just placeholders from my example above) could look like this:
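// Rough sketch: walk the table in 5-day windows of sys_created_on so each
// query stays small. The start date below is a placeholder for the oldest
// record's creation date.
var windowStart = new GlideDateTime('2020-01-01 00:00:00'); // placeholder start
var end = new GlideDateTime(); // now

while (windowStart.before(end)) {
    var windowEnd = new GlideDateTime(windowStart);
    windowEnd.addDaysUTC(5);

    var gr = new GlideRecord('u_custom_table');
    gr.addQuery('sys_created_on', '>=', windowStart.getValue());
    gr.addQuery('sys_created_on', '<', windowEnd.getValue());
    gr.query();
    while (gr.next()) {
        var text = gr.getValue('u_text');
        if (text !== null) {
            gr.setValue('u_text', text.toLowerCase());
            gr.setWorkflow(false); // skip business rules/workflows
            gr.update();
        }
    }
    windowStart = windowEnd;
}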
Does someone here have any experience and tips for updating so many records at once?
Thanks in advance,
Max
05-17-2023 01:45 PM
I did something similar in the past, and the only solution that worked was to split the data into exclusive, non-overlapping sets (e.g. order by a field and give each session its own range of values), and then fire up the background scripts in multiple sessions (e.g. by using different browser profiles, incognito windows, etc.).
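As a rough illustration (the exact split scheme is up to you, this is just one assumption): since sys_ids are 32-character hex strings, filtering on the first character gives 16 naturally exclusive slices, one per session:

// Rough sketch: each session processes one exclusive slice, selected by
// the first hex character of sys_id, so the sets never overlap.
// Run one copy per session with a different SLICE value.
var SLICE = '0'; // '0' in session 1, '1' in session 2, ... 'f' in session 16

var gr = new GlideRecord('u_custom_table');
gr.addQuery('sys_id', 'STARTSWITH', SLICE);
gr.orderBy('sys_id');
gr.query();
while (gr.next()) {
    var text = gr.getValue('u_text');
    if (text !== null) {
        gr.setValue('u_text', text.toLowerCase());
        gr.setWorkflow(false); // skip business rules/workflows
        gr.update();
    }
}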