How to fix duplicate records in CMDB

Amit39
Kilo Expert

Hi Team,

Scenario -: I have added manually (bulk upload through transform map script) some of the servers to the respective CMDB server class (mostly Linux and Windows). After some days once the ACL and firewall got fixed on my discovery network, my discovery MID server is able to discover those target servers which got created manually earlier.

Here is the issue -: While looking into those servers list, I can see duplicate records gets created with the same Host Name.

Example: one showing created by me which added manually earlier and another one showing created by service-now which gets discovered by ServiceNow discovery. Both are having same Host Name, IP and Name information.

May I know how and where to start and fix this duplicate entry creation issue specially for Windows and Linux servers.

Also may I know how to fix this with "CI Identifier" setup in SerivceNow.

Thank you in advance.

Regards,
Amit

5 REPLIES 5

Jon G
Kilo Expert

I was struggling with something similar.  Lot's of weird duplicate devices either from bad imports or manual creation.  We've implemented the Identification & Reconciliation engine on our automated imports, but it still doesn't catch everything.

I wrote a script to help deal with this.  Please note that this is only tested against CMDB_CI although theoretically you can scan and log other tables for duplicate records.  

Note that this only creates the deduplication tasks for CMDB.  It does not process them, so you will need to go through those manually.

This is designed to be run as a background script but I'm sure it could be tailored to run other ways.

//Deduplication script

//This script will identify duplicate records based on the filter, table, and field set below
//Once identified, it can create a deduplication task on reconcile_duplicate_task table

//Variables - Adjust these as needed

//Variable 1 - Add an encoded filter query
var vFilter = "last_discovered<javascript:gs.dateGenerate('2020-01-25','00:00:00')^ORlast_discoveredISEMPTY";

//Variables 2 & 3 - Define the table and field you want to check for duplicates
var vTable = 'cmdb_ci'
var vField = 'serial_number';

//Variable 4 - What IF / Logging only option:
//Set to TRUE if you want to create dedupe tasks. Set to FALSE if you want to review logs without creating tasks
//NOTE - Deduplication tasks are for CMDB Only. A dedupe task will ONLY be created if vTable = CMDB_CI. Use filters instead of table if you want to scan a child table of CMDB_CI
//Theoretically, the logging functions of this script will work fine against other tables but that hasn't been fully tested
var vCreateDeDupeTask = false;

//Variable 5 - Stop after # of duplicate sets found.  Set this to -1 if you want to generate all results without stopping
var vStopAfter = -1;

/****************************************/
/* DO NOT EDIT ANYTHING BELOW THIS LINE */
/****************************************/

//Declarations
var vAggType = 'COUNT';
var vPasses = 0;

//Identify duplicate records based on variable filters above where Aggregate COUNT > 1
var dupeSet = new GlideAggregate(vTable);
dupeSet.addEncodedQuery(vFilter);
dupeSet.addAggregate(vAggType, vField);
dupeSet.orderByAggregate(vAggType, vField);
dupeSet.addHaving(vAggType, vField, '>', 1);
dupeSet.query();

//If duplicates are discovered, identify the individual duplicate records from the aggregate
while (dupeSet.next()) {
    vPasses++;
    var vDupeLog = '';
    var vDuplicateRecordsList = '';
    var vAggCount = dupeSet.getAggregate(vAggType, vField);
    var vAggQuery = dupeSet.getAggregateEncodedQuery();

    vDupeLog += 'Duplicate records found: ' + vAggCount + ' records where ' + vField + ' = ' + dupeSet.getValue(vField);
    
    //Get info from each device in aggregate
    var dupeDevices = new GlideRecord(vTable);
    dupeDevices.addEncodedQuery(vAggQuery);
    dupeDevices.addEncodedQuery(vFilter);
    dupeDevices.orderBy('sys_updated_on');
    dupeDevices.query();

    //Create Log Headers
    vDupeLog += '\n\t' + vField + '\t' + 'Updated On\tSys_ID'
    if (vTable == 'cmdb_ci') {
        vDupeLog += '\t' + 'Serial Number' + '\t' + '\t' + 'Last Discovered';
    }

    //Log individual records w/ details and add the sys_id to the duplicates list
    while (dupeDevices.next()) {

        
        vDupeLog += '\n\t' + dupeDevices.getValue(vField) + '\t' + dupeDevices.sys_updated_on + '\t' + dupeDevices.sys_id;

        if (vTable == 'cmdb_ci') {
            vDupeLog += '\t' + dupeDevices.serial_number + '\t' + '\t' + dupeDevices.last_discovered;
        }

        vDuplicateRecordsList += dupeDevices.sys_id + ',';

    }

    //If enabled, create a Deduplication Task on reconcile_duplicate_task table
    //Creating a de-duplication task will FAIL if one of the sys_ids is already a part of an existing open deduplication task
    if (vCreateDeDupeTask && vTable == 'cmdb_ci') {
        
        //remove trailing comma ','
        vDuplicateRecordsList = vDuplicateRecordsList.replace(/(,$)/g, "");

        var deDupeUtil = new CMDBDuplicateTaskUtils();
        try {
            var deDupeTaskID = deDupeUtil.createDuplicateTask(vDuplicateRecordsList);

            //If a dedupe task has been created, update that task with additional data and log the task info
            if (deDupeTaskID) {
                var deDupeTask = new GlideRecord('reconcile_duplicate_task');
                deDupeTask.get(deDupeTaskID);
                deDupeTask.query();
                deDupeTask.next()
                deDupeTask.short_description = 'Duplicate records found: ' + vAggCount + ' records where ' + vField + ' = ' + dupeSet.getValue(vField);
                deDupeTask.work_notes = vDupeLog;
                deDupeTask.update()

                vDupeLog += '\n' + 'Deduplication task created: ' + deDupeTask.number;
                vDupeLog += '\n' + 'with the following records: ' + vDuplicateRecordsList;
            }
            else {
                vDupeLog += '\n' + 'Deduplication task not created.\nThis can happen when one of the sys_id\'s is already a part of an existing open dedupe task OR the table is not set to cmdb_ci';
            }
        }
        catch (er) {
            vDupeLog += '\n' + 'ERROR CREATING DEDUPE TASK: ' + er;
        }
    }
    else {
        vDupeLog += '\n' + 'Deduplication task creation not enabled.';
    }


    gs.log(vDupeLog,'Duplicate Record Detection Script');

    //DEBUG
  //DEBUG
    if(vStopAfter == -1) { 
        //vStopAfter is disabled - do nothing and continue
    }
    else if (vPasses >= vStopAfter) {
        //The number of passes is greater than the vStopAfter variable
        break;
    }
}
gs.log('Total number of duplicate sets found: ' + vPasses, 'Duplicate Record Detection Script');

 

This is tested and works on Madrid.  I don't think there's anything that will break on newer releases, but ymmv.  Hopefully this helps someone in the future, or anyone who's still  having trouble with this.