Merge Duplicate Records

mballinger
Mega Guru

Hello,

I am trying to write a fix/background script to find all duplicate records and then merge them. I am not sure how to really do this. Records will be uniquely identified using 3 fields from the table and the most recently updated record will be the primary record. 

Here is what I have so far:

var count = new GlideAggregate('cmdb_ci_computer');
count.addAggregate('count', 'name');
//count.addAggregate('count', 'serial_number'); //This doesn't work 
//count.addAggregate('count', 'model_id'); //This doesn't work 
//count.groupBy('serial_number'); //Doesn't print out the name of the record
count.addHaving('count', '>', 1);
count.query();
while(count.next()) {
    var name = count.name;
    var record_count = count.getAggregate('count', 'name');
    gs.info("Duplicates: " + name + ", Count: " + record_count;
}

The code above returns the duplicate record plus the total count. I am also only able to uniquely identify duplicate records by using only 1 field. I need to use name, serial_number, and model_id to uniquely identify records.

The next step would then need to be merging the records by most recently updated/created. I have no idea how to attempt that.

Thanks!

@Ankur Bawiskar @Brad Bowman 

1 ACCEPTED SOLUTION

Hi,

As per your requirement, I have implemented logic below.

var gaComputer = new GlideAggregate('cmdb_ci_computer');
gaComputer.groupBy("name");
gaComputer.groupBy("serial_number");
gaComputer.groupBy("model_id");
gaComputer.addAggregate("COUNT", "name");
gaComputer.query();

while (gaComputer.next()) {
    
    var duplicateCount = gaComputer.getAggregate("COUNT", "name");// This will return count of records where Name, Model ID and Serial Number is unique
    if (duplicateCount > 1) {

        var grComputer = new GlideRecord("cmdb_ci_computer");
        grComputer.addEncodedQuery("name=" + gaComputer.name + "^serial_number=" + gaComputer.serial_number + "^model_id=" + gaComputer.model_id);
        grComputer.orderByDesc("sys_updated_on"); // This will make sure to give most recently updated/created record first.
		grComputer.setLimit(1);// we are getting only primary record here (we will perform operation with secondary records in next while loop)
        grComputer.query(); // This will return number of records with same Name, Serial Number and Model ID

            if(grComputer.next()) {
                
               if(grComputer.getValue("size")=="" || grComputer.getValue("storage")=="" || grComputer.getValue("screen")==""){
				   
					var grComputer2=new GlideRecord("cmdb_ci_computer");
					grComputer2.addEncodedQuery("name=" + gaComputer.name + "^serial_number=" + gaComputer2.serial_number + "^model_id=" + gaComputer.model_id);
					grComputer2.orderByDesc("sys_updated_on"); // this will make sure that we are updating primary record with recent updated/created secondary records if there are more than 2 duplicate reocrds
					grComputer2.chooseWindow(1,duplicateCount); // this will provide us all duplicate records by excluding primary record as initial window starts with 1
					grComputer2.query();
					
					while(grComputer2.next()){
						
					if(grComputer.getValue("size")=="" && grComputer2.getValue("size")!=""){
					grComputer.setValue("size",grComputer2.getValue("size"));
					}
					if(grComputer.getValue("storage")=="" && grComputer2.getValue("storage")!=""){
					grComputer.setValue("size",grComputer2.getValue("storage"));
					}
					if(grComputer.getValue("screen")=="" && grComputer2.getValue("screen")!=""){
					grComputer.setValue("size",grComputer2.getValue("screen"));
					}
					grComputer.update();
				}
            }
        }
}

Please mark this as Correct or helpful if it helps.

Thanks and Regards,
Abhijit

By marking my response as correct or helpful, you contribute to helping future readers with similar issues.
Regards,
Abhijit
ServiceNow MVP

View solution in original post

16 REPLIES 16

I’m struggling to delete the duplicate records now. I passed grComputer2.deleteRecord() right above grComputer.update() and the record is not deleting for some reason. I also changed the while loop for the gaComputer to an if statement and nested it in a for loop to test 5 records.

for(var i=0; i < 5; i++) {
  //pasted rest entire gaComputer block here to test 5 records for testing purposes 
  If(gaComputer.next()) {.......


}

I’m not sure if this portion is breaking it.

Provide whole script please then only I would be able to identify issue.

By marking my response as correct or helpful, you contribute to helping future readers with similar issues.
Regards,
Abhijit
ServiceNow MVP

Sorry was out sick last week. Here is the code. You can see after the gaComputer query, I add a for loop to loop through 5 iterations of records. I change the while loop to an if for gaComputer. The last 2 lines of code, I ran a grComputer2.deleteRecord() to delete the duplicate record after it was copied over to the Primary record. 

Before adding the for loop, I ran the code as is, but I also added the grComputer2.deleteRecord() right before the update, and the dupe records were not deleting.

var gaComputer = new GlideAggregate('cmdb_ci_computer');
gaComputer.groupBy("name");
gaComputer.groupBy("serial_number");
gaComputer.groupBy("model_id");
gaComputer.addAggregate("COUNT", "name");
gaComputer.query();
for (var i = 0; i < 5; i++) {
    if (gaComputer.next()) {

        var duplicateCount = gaComputer.getAggregate("COUNT", "name"); // This will return count of records where Name, Model ID and Serial Number is unique
        if (duplicateCount > 1) {

            var grComputer = new GlideRecord("cmdb_ci_computer");
            grComputer.addEncodedQuery("name=" + gaComputer.name + "^serial_number=" + gaComputer.serial_number + "^model_id=" + gaComputer.model_id);
            grComputer.orderByDesc("sys_updated_on"); // This will make sure to give most recently updated/created record first.
            grComputer.setLimit(1); // we are getting only primary record here (we will perform operation with secondary records in next while loop)
            grComputer.query(); // This will return number of records with same Name, Serial Number and Model ID

            if (grComputer.next()) {

                if (grComputer.getValue("size") == "" || grComputer.getValue("storage") == "" || grComputer.getValue("screen") == "") {

                    var grComputer2 = new GlideRecord("cmdb_ci_computer");
                    grComputer2.addEncodedQuery("name=" + gaComputer.name + "^serial_number=" + gaComputer2.serial_number + "^model_id=" + gaComputer.model_id);
                    grComputer2.orderByDesc("sys_updated_on"); // this will make sure that we are updating primary record with recent updated/created secondary records if there are more than 2 duplicate reocrds
                    grComputer2.chooseWindow(1, duplicateCount); // this will provide us all duplicate records by excluding primary record as initial window starts with 1
                    grComputer2.query();

                    while (grComputer2.next()) {

                        if (grComputer.getValue("size") == "" && grComputer2.getValue("size") != "") {
                            grComputer.setValue("size", grComputer2.getValue("size"));
                        }
                        if (grComputer.getValue("storage") == "" && grComputer2.getValue("storage") != "") {
                            grComputer.setValue("size", grComputer2.getValue("storage"));
                        }
                        if (grComputer.getValue("screen") == "" && grComputer2.getValue("screen") != "") {
                            grComputer.setValue("size", grComputer2.getValue("screen"));
                        }

                        grComputer2.deleteRecord();
                        grComputer.update();

                    }
                }
            }
        }
    }
}

 

Script looks good to me. If its not working then we need to debug further by adding some log info.

Did you add logs to check if it is even going inside if condition?

It will go inside that loop only if primary record is having at least one of the field (storage, screen and size) is empty.

Please add logs and let me know issue further.

Thanks and Regards

Abhijit

By marking my response as correct or helpful, you contribute to helping future readers with similar issues.
Regards,
Abhijit
ServiceNow MVP

@Abhijit - I was able to get everything to work. I re-wrote the script quite a bit. The issue I ran into was not all records were not being deleted and updated, but the re-write fixed everything. Thanks again for all your help!