Reporting on Duplicate records grouped by multiple columns for 2M+ records

tkrishna29
Giga Guru

Hi,

I'm trying to build a report to show the records that have duplicate values across 3 columns. As per my understanding, glideaggregate can only perform grouping for a single field. So, I wrote a script include which can find all the records with multiple grouping (Script referenced below). This runs as expected.

The challenge is I have to run for all records (2M+ records) and get all duplicates. I'm worried this will hang the system or impact performance if i directly use this scriptinclude as a javascript condition in my report. What would be the best practice to have this run as a background script and get the record numbers without impacting system performance? So that they can be utilized in the report?

One idea I have is to create a fix script, run the script and get all the data to a system property (String type) and use that as a reference in my report. Greatly appreciate it if you have any other ideas.

Here, my aggregation is on cmdb_ci, vulnerability & port.

   getDuplicateVITs: function() {
        var vitDuplicates = [];
        var ga = new GlideAggregate('sn_vul_vulnerable_item');
        //        ga.addEncodedQuery('active=true^assigned_toISNOTEMPTY');
        ga.addEncodedQuery('numberINVIT0913626,VIT0913625');
        ga.addAggregate('COUNT', 'cmdb_ci');
        ga.groupBy('cmdb_ci');
        ga.addHaving('COUNT', '>', 1);
        ga.query();
        while (ga.next()) {
            var ga1 = new GlideAggregate('sn_vul_vulnerable_item');
            ga1.addQuery('cmdb_ci', ga.cmdb_ci);
            ga1.addAggregate('COUNT', 'vulnerability');
            ga1.groupBy('vulnerability');
            ga1.addHaving('COUNT', '>', 1);
            ga1.query();
            while (ga1.next()) {
                var ga2 = new GlideAggregate('sn_vul_vulnerable_item');
                ga2.addQuery('cmdb_ci', ga.cmdb_ci);
                ga2.addQuery('vulnerability', ga1.vulnerability);
                ga2.addAggregate('COUNT', 'port');
                ga2.groupBy('port');
                ga2.addHaving('COUNT', '>', 1);
                ga2.query();
                while (ga2.next()) {
//                    gs.info(ga.cmdb_ci.getDisplayValue() + ":" + ga1.vulnerability.getDisplayValue() + ":" + ga2.port);
					var gr = new GlideRecord('sn_vul_vulnerable_item');
					gr.addQuery('cmdb_ci',ga.cmdb_ci);
					gr.addQuery('vulnerability',ga1.vulnerability);
					gr.addQuery('port',ga2.port);
					gr.query();
					while (gr.next()) {
						vitDuplicates.push(gr.number);
					}
                }				
            }
        }
		return vitDuplicates.toString();
    }
Output for the above script: VIT0913626,VIT0913626
1 ACCEPTED SOLUTION

tkrishna29
Giga Guru

Fortunately, I didn't see any performance issues in utilizing this script. I'll mark my post as Complete. Thank you!

View solution in original post

3 REPLIES 3

Yousaf
Giga Sage

Hi Krishna,
Check this link it has a discussion about your question see if it helps answering your question
https://www.reddit.com/r/servicenow/comments/ox6bgs/performance_issues_with_reporting_in_servicenow/

Mark Correct or Helpful if it helps.

Thanks,
Yousaf


***Mark Correct or Helpful if it helps.***

tkrishna29
Giga Guru

Fortunately, I didn't see any performance issues in utilizing this script. I'll mark my post as Complete. Thank you!