Data Discovery API

  • Release version: Washingtondc
  • Updated February 21, 2024

    Summary of Data Discovery API

    The Data Discovery API in ServiceNow allows users to validate input data against regular expression (regex) patterns and keywords, facilitating effective data management and discovery. It includes functionalities for pattern matching, validation, and scanning through input data for specified patterns.


    Key Features

    • DataPatternValidator - matches: Validates if the input matches a given regex pattern. Returns true if there is a match.
    • DataPatternValidator - isValid: Checks if a regex pattern is valid, returning true for valid patterns.
    • DataPatternValidator - keywordMatches: Validates if input matches a regex pattern alongside specified keywords within a defined proximity.
    • DataPatternScanner - scan: Scans input data against multiple data patterns and returns matching results, including positions of matches.

    Key Outcomes

    By utilizing the Data Discovery API, customers can check that specific inputs conform to established patterns and keywords. This helps identify sensitive information and supports compliance and data processing workflows. The scan timeout can be set through the API, while the limits on keyword length, keyword proximity, and input size are fixed.

    Reference for Data Discovery API

    DataPatternValidator - matches(String pattern, String input)

    Validates whether the input matches the regular expression (regex) pattern.
    Table 1. Parameters
    Name Type Description
    pattern String The regex pattern
    input String The input data to be matched
    Table 2. Returns
    Type Description
    Boolean Returns true if the input matches the pattern, otherwise false.

    Code Example

    var datapatternValidatorApi = new sn_data_discovery_api.DataPatternValidator();
    var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
    var input = '09/09/2023';
    var output = datapatternValidatorApi.matches(pattern, input);
    if (output) {
      gs.info('pattern found!');
    } else {
      gs.info('pattern not found');
    }
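
The doubled backslashes in the pattern string are JavaScript string escapes, so the validator receives `\b` at each end of the expression. As a rough illustration of what this date pattern accepts, the sketch below runs the same expression through the built-in JavaScript RegExp object instead of the Data Discovery API. It assumes that both regex engines agree on this simple pattern and that a match anywhere in the input counts; neither point is confirmed on this page.

```javascript
// Illustration only: uses the built-in RegExp, not sn_data_discovery_api.
var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
var dateLike = new RegExp(pattern);

var samples = ['09/09/2023', '9/9/23', '39/39/2023', '2023-09-09'];
var results = {};
for (var i = 0; i < samples.length; i++) {
  // test() reports whether the pattern occurs anywhere in the sample
  results[samples[i]] = dateLike.test(samples[i]);
}
```

Under these assumptions, '39/39/2023' is accepted because the pattern checks the shape of the text, not whether it is a real calendar date.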
    

    DataPatternValidator - isValid(String pattern)

    Validates if the given pattern is a valid regex.
    Table 3. Parameters
    Name Type Description
    pattern String The regex pattern
    Table 4. Returns
    Type Description
    Boolean Returns true if the expression is a valid regex, otherwise false.

    Code Example

    var datapatternValidatorApi = new sn_data_discovery_api.DataPatternValidator();
    var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
    var output = datapatternValidatorApi.isValid(pattern);
    if (output) {
      gs.info('pattern is valid!');
    } else {
      gs.info('pattern is not valid');
    }
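
Because matches is only meaningful for a valid expression, the two calls can be combined. The following is a minimal sketch of such a wrapper; `safeMatches` is a hypothetical helper name, not part of the API, and it relies only on the isValid and matches methods described above.

```javascript
// Hypothetical helper: checks the pattern with isValid before calling matches.
// `validator` is expected to expose isValid(pattern) and matches(pattern, input),
// as DataPatternValidator does.
function safeMatches(validator, pattern, input) {
  if (!validator.isValid(pattern)) {
    // Skip matching entirely when the expression itself is invalid
    return { valid: false, matched: false };
  }
  return { valid: true, matched: validator.matches(pattern, input) };
}
```

On an instance it could be called as `safeMatches(new sn_data_discovery_api.DataPatternValidator(), pattern, input)`. Passing the validator in as an argument also lets the helper be tested with a stand-in object.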

    DataPatternValidator - keywordMatches(String pattern, String input, String keywords, int keywordProximity)

    Validates whether the input data matches the regular expression (regex) pattern and also contains one of the keywords near the match. See Configure Data Discovery patterns for more information about keywords and keyword proximity.
    Table 5. Parameters
    Name Type Description
    pattern String The regex pattern
    input String The input data to be matched
    keywords String The comma-separated keyword values to match
    keywordProximity int How close a keyword must be to the matched pattern
    Table 6. Returns
    Type Description
    Boolean Returns true if the input matches the pattern and one of the keywords is found within the keyword proximity, otherwise false.

    Code Example

    var datapatternValidatorApi = new sn_data_discovery_api.DataPatternValidator();
    var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
    var keywords = 'dob,date of birth';
    var keywordProximity = 20;
    var matchInput = 'dob: 09/09/2023';
    var noMatchInput = '09/09/2023';
    var output = datapatternValidatorApi.keywordMatches(pattern, matchInput, keywords, keywordProximity);
    gs.info('match found for input: ' + matchInput + ' = ' + output);
    output = datapatternValidatorApi.keywordMatches(pattern, noMatchInput, keywords, keywordProximity);
    gs.info('match found for noMatchInput: ' + noMatchInput + ' = ' + output);

    DataPatternScanner - scan(String input)

    Note:
    The DataPatternScanner constructor must be passed an array of data pattern system IDs.
    Table 7. Parameters
    Name Type Description
    input String The input data to be scanned
    Table 8. Returns
    Type Description
    String Serialized JSON string with the following fields:

    hasMatches
    Returns true if at least one pattern match is present.
    finding
    Returns the ID of each pattern with a match and a list of the start and end positions of its matches.
    error
    Contains an error code and message if the API call failed, otherwise empty.
    unprocessedPatterns
    Returns an array of data pattern system IDs that were not processed.

    Code Example

    // System IDs of the data patterns to scan for (Email and SSN)
    var emailSysId = '8e5605bceb0561107977d256385228e6';
    var ssnSysId = '4964417ceb0561107977d256385228b8';
    var dataPatternSysIds = [emailSysId, ssnSysId];
    var dataDiscoveryApi = new sn_data_discovery_api.DataDiscoveryScanner(dataPatternSysIds);

    var input = 'my ssn is 123-45-6789 and email is abcd@company.com';
    var jsonString = dataDiscoveryApi.scan(input);
    var output = JSON.parse(jsonString);

    if (output.hasMatches) {
      gs.info('found matches for patterns in input');
      for (var i = 0; i < output.finding.length; i++) {
        var curFinding = output.finding[i];
        gs.info('first match for ' + curFinding.pattern + ' is (' + curFinding.matches[0]['start'] + ',' + curFinding.matches[0]['end'] + ')');
      }
    }
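
The example above reads only hasMatches and finding. A caller may also want to check the error and unprocessedPatterns fields of the result. The sketch below shows one way to do that; `summarizeScan` is a hypothetical helper, and the exact shape of an empty error value (empty string, empty object, or null) is an assumption, so the code tolerates all three.

```javascript
// Hypothetical helper that summarizes the JSON string returned by scan().
// Field names (hasMatches, finding, error, unprocessedPatterns) come from the
// return description of scan; the shape of `error` is an assumption.
function summarizeScan(jsonString) {
  var output = JSON.parse(jsonString);
  var summary = {
    ok: true,
    matchCounts: {},
    unprocessed: output.unprocessedPatterns || []
  };
  // Treat any non-empty error value as a failed scan
  var error = output.error;
  if (error && (typeof error !== 'object' || Object.keys(error).length > 0)) {
    summary.ok = false;
    summary.error = error;
    return summary;
  }
  // Count the matches reported for each pattern
  var findings = output.finding || [];
  for (var i = 0; i < findings.length; i++) {
    summary.matchCounts[findings[i].pattern] = (findings[i].matches || []).length;
  }
  return summary;
}
```

A non-empty `unprocessed` list in the summary means some patterns were not processed, so a result without matches does not necessarily mean the input is clean.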

    Table 9. Configuration

    Name: Maximum length of keywords (CSV) string
    Configuration mode: Cannot be configured
    Description: Defines the maximum length of the string that can be configured in the DataPattern.keyword field.
    • Default value is 128
    • Maximum value is 128

    Name: Minimum and maximum value for keyword proximity
    Configuration mode: Cannot be configured
    Description: Defines the minimum and maximum value that can be entered in the DataPattern.keyword_proximity field.
    • Minimum of 0
    • Maximum of 64

    Name: Maximum input size for matches and keywordMatches API
    Configuration mode: Cannot be configured
    Description: Defines the maximum input size supported by the DataPatternValidator.matches and DataPatternValidator.keywordMatches APIs.
    • Default value is 2048
    • Maximum value is 2048

    Name: Timeout for scan API
    Configuration mode: DataDiscoveryScanner.setScanTimeout(long timeoutMillis)
    Description: API call to define the maximum time in milliseconds to complete the DataDiscoveryScanner.scan calls.
    • Default value is 20000 (ms)
    • Range is an integer value from 0 to 50000 (ms)
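
Since most of these limits are fixed, a script can check its arguments before calling keywordMatches. The following is a minimal sketch of such a pre-check. `checkKeywordMatchArgs` and `LIMITS` are hypothetical names, the numbers are copied from Table 9, and two points are assumptions: that input size is measured as string length, and that the keyword limits defined for the DataPattern fields also apply to values passed directly to the API.

```javascript
// Limits copied from Table 9. Treating "input size" as string length is an assumption.
var LIMITS = {
  maxKeywordsLength: 128,
  minProximity: 0,
  maxProximity: 64,
  maxInputSize: 2048
};

// Hypothetical pre-check, not part of the Data Discovery API.
// Returns a list of problems; an empty list means the arguments fit the limits.
function checkKeywordMatchArgs(input, keywords, keywordProximity) {
  var problems = [];
  if (input.length > LIMITS.maxInputSize) {
    problems.push('input is longer than ' + LIMITS.maxInputSize + ' characters');
  }
  if (keywords.length > LIMITS.maxKeywordsLength) {
    problems.push('keywords string is longer than ' + LIMITS.maxKeywordsLength + ' characters');
  }
  if (keywordProximity < LIMITS.minProximity || keywordProximity > LIMITS.maxProximity) {
    problems.push('keyword proximity must be between ' + LIMITS.minProximity + ' and ' + LIMITS.maxProximity);
  }
  return problems;
}
```

Returning a list of problems instead of throwing lets the caller decide whether to log, truncate the input, or skip the call.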