Reference for Data Discovery API
DataPatternValidator - matches(String pattern, String input)
Validates whether the input matches the regular expression (regex) pattern.
Table 1. Parameters
| Name | Type | Description |
| pattern | String | The regex pattern |
| input | String | The input data to be matched |
Table 2. Returns
| Type | Description |
| Boolean | Returns true if the input matches the pattern, otherwise false. |
Code Example
// Create the validator from the Data Discovery API.
var datapatternValidatorApi = new sn_data_discovery_api.DataPatternValidator();
// Date pattern; backslashes are doubled because the regex is written as a string literal.
var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
var input = '09/09/2023';
var output = datapatternValidatorApi.matches(pattern, input);
if (output) {
    gs.info('pattern found!');
} else {
    gs.info('pattern not found');
}
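The pattern in the example is written as a JavaScript string literal, so every backslash is doubled: `'\\b'` reaches the regex engine as `\b`. The sketch below illustrates only that escaping, using JavaScript's built-in `RegExp` instead of `DataPatternValidator`. The regex engine behind `matches` may differ in other respects, and `RegExp.test` searches anywhere in the string, which is an assumption about how `matches` behaves.

```javascript
// Illustration only: built-in RegExp, not sn_data_discovery_api.
var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';

// The literal '\\b' is stored as two characters: a backslash followed by 'b'.
var firstTwoChars = pattern.slice(0, 2);

var dateRegex = new RegExp(pattern);
var foundDate = dateRegex.test('09/09/2023');    // four-digit year
var foundShort = dateRegex.test('born 9/9/23');  // single digits, two-digit year
var foundNone = dateRegex.test('no date here');  // nothing date-like
```

Writing the pattern with single backslashes (`'\b...'`) would instead embed a backspace character in the string, and the word-boundary check would be lost.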
DataPatternValidator - isValid(String pattern)
Validates if the given pattern is a valid regex.
Table 3. Parameters
| Name | Type | Description |
| pattern | String | The regex pattern |
Table 4. Returns
| Type | Description |
| Boolean | Returns true if the expression is a valid regex, otherwise false. |
Code Example
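// Create the validator from the Data Discovery API.
var datapatternValidatorApi = new sn_data_discovery_api.DataPatternValidator();
// Pattern to check; backslashes are doubled because the regex is written as a string literal.
var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
var output = datapatternValidatorApi.isValid(pattern);
if (output) {
    gs.info('pattern is valid!');
} else {
    gs.info('pattern is not valid');
}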
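To show what a validity check of this kind does, here is a minimal sketch that compiles the pattern with JavaScript's built-in `RegExp` and treats a compile error as "not valid". The helper name `isValidRegexLocally` is hypothetical, and the syntax accepted by `isValid` on the platform may not be identical to JavaScript's.

```javascript
// Hypothetical helper: approximates a validity check with the built-in RegExp.
function isValidRegexLocally(pattern) {
    try {
        new RegExp(pattern); // throws SyntaxError when the pattern is malformed
        return true;
    } catch (e) {
        return false;
    }
}

var wellFormed = isValidRegexLocally('\\b[0-9]{3}-[0-9]{2}-[0-9]{4}\\b');
var malformed = isValidRegexLocally('[0-9'); // unterminated character class
```

A check like this is useful before storing a user-supplied pattern, so that a malformed expression is rejected up front and never reaches a matching call.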
DataPatternValidator - keywordMatches(String pattern, String input, String keywords, int keywordProximity)
Validates whether the input matches the regular expression (regex) pattern together with the keywords. See
Configure Data Discovery patterns for more information about keywords and keyword proximity.
Table 5. Parameters
| Name | Type | Description |
| pattern | String | The regex pattern |
| input | String | The input data to be matched |
| keywords | String | The comma-separated keyword values to match |
| keywordProximity | int | The keyword proximity from the matched pattern |
Table 6. Returns
| Type | Description |
| Boolean | Returns true if the input matches the pattern and the keywords, otherwise false. |
Code Example
var datapatternValidatorApi = new sn_data_discovery_api.DataPatternValidator();
var pattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
var keywords = 'dob,date of birth'; // comma-separated keywords
var keywordProximity = 20;
var matchInput = 'dob: 09/09/2023'; // date with the keyword 'dob' next to it
var noMatchInput = '09/09/2023'; // date with no keyword
var output = datapatternValidatorApi.keywordMatches(pattern, matchInput, keywords, keywordProximity);
gs.info('match found for input: ' + matchInput + ' = ' + output);
output = datapatternValidatorApi.keywordMatches(pattern, noMatchInput, keywords, keywordProximity);
gs.info('match found for noMatchInput: ' + noMatchInput + ' = ' + output);
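The exact proximity rule is not spelled out here, so the following is only a sketch of one plausible reading: a keyword counts when it appears within `keywordProximity` characters before or after a pattern match, ignoring case. The function `keywordMatchesSketch` is hypothetical, uses JavaScript's built-in `RegExp`, and is not the `keywordMatches` implementation.

```javascript
// Hypothetical sketch of keyword-proximity matching (assumed semantics, see above).
function keywordMatchesSketch(pattern, input, keywords, keywordProximity) {
    var regex = new RegExp(pattern, 'g');
    // Split the comma-separated keywords, trimming blanks and lowercasing.
    var keywordList = keywords.toLowerCase().split(',')
        .map(function (k) { return k.trim(); })
        .filter(function (k) { return k.length > 0; });
    var match;
    while ((match = regex.exec(input)) !== null) {
        // Window of text around the match, widened by the proximity on both sides.
        var start = Math.max(0, match.index - keywordProximity);
        var end = Math.min(input.length, match.index + match[0].length + keywordProximity);
        var windowText = input.slice(start, end).toLowerCase();
        for (var i = 0; i < keywordList.length; i++) {
            if (windowText.indexOf(keywordList[i]) !== -1) {
                return true;
            }
        }
        if (match[0].length === 0) {
            regex.lastIndex++; // avoid looping forever on empty matches
        }
    }
    return false;
}

var datePattern = '\\b[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}\\b';
var nearKeyword = keywordMatchesSketch(datePattern, 'dob: 09/09/2023', 'dob,date of birth', 20);
var noKeyword = keywordMatchesSketch(datePattern, '09/09/2023', 'dob,date of birth', 20);
```

Under this reading, a smaller proximity reduces false positives by requiring the keyword to sit closer to the matched value.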
DataDiscoveryScanner - scan(String input)
Scans the input for matches against the data patterns supplied to the constructor.
Note: The DataDiscoveryScanner constructor must be passed an array of data pattern system IDs.
Table 7. Parameters
| Name | Type | Description |
| input | String | The input data to be scanned |
Table 8. Returns
| Type | Description |
| String | Serialized JSON string with the following properties:
- hasMatches
- Returns true if at least one pattern match is present.
- error
- Contains an error code and message if the API failed, otherwise empty.
- unprocessedPatterns
- Returns an array of data pattern system IDs that were not processed.
- finding
- Returns the ID of each pattern with a match and a list of the start and end positions of its matches.
|
Code Example
var emailSysId = '8e5605bceb0561107977d256385228e6';
var ssnSysId = '4964417ceb0561107977d256385228b8';
var dataPatternSysIds = [emailSysId, ssnSysId]; // Email and SSN
var dataDiscoveryApi = new sn_data_discovery_api.DataDiscoveryScanner(dataPatternSysIds);
var input = 'my ssn is 123-45-6789 and email is abcd@company.com';
var jsonString = dataDiscoveryApi.scan(input);
// scan returns a serialized JSON string, so parse it before reading its properties.
var output = JSON.parse(jsonString);
if (output.hasMatches) {
    gs.info('found matches for patterns in input');
    for (var i = 0; i < output.finding.length; i++) {
        var curFinding = output.finding[i];
        gs.info('first match for ' + curFinding.pattern + ' is (' + curFinding.matches[0]['start'] + ',' + curFinding.matches[0]['end'] + ')');
    }
}
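The example above reads only `hasMatches` and `finding`. The sketch below also checks `error` and `unprocessedPatterns` before trusting the findings. It runs against a hand-written sample string, not a real `scan` call. The helper `summarizeScanResult`, the sample values, and the treatment of an "empty" error as either an empty object or an empty string are all assumptions.

```javascript
// Hypothetical helper: turns the serialized scan result into a small summary object.
function summarizeScanResult(jsonString) {
    var result = JSON.parse(jsonString);
    var summary = { failed: false, skipped: [], spans: {} };
    var error = result.error;
    // Assumption: "empty" means a missing value, an empty string, or an empty object.
    if (error && (typeof error !== 'object' || Object.keys(error).length > 0)) {
        summary.failed = true;
        return summary;
    }
    summary.skipped = result.unprocessedPatterns || [];
    var findings = result.finding || [];
    for (var i = 0; i < findings.length; i++) {
        var curFinding = findings[i];
        // Map each pattern ID to its list of [start, end] positions.
        summary.spans[curFinding.pattern] = (curFinding.matches || []).map(function (m) {
            return [m.start, m.end];
        });
    }
    return summary;
}

// Hand-written sample in the shape described in Table 8 (values are illustrative).
var sampleJson = JSON.stringify({
    hasMatches: true,
    error: {},
    unprocessedPatterns: ['8e5605bceb0561107977d256385228e6'],
    finding: [
        { pattern: '4964417ceb0561107977d256385228b8', matches: [{ start: 10, end: 21 }] }
    ]
});
var summary = summarizeScanResult(sampleJson);
```

Checking `unprocessedPatterns` matters because a result with no findings does not by itself show that every pattern was evaluated.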
Table 9. Configuration
| Name | Configuration Mode | Description |
| Maximum length of keywords (csv) string | Cannot be configured | Defines the maximum length of the string that can be configured in the DataPattern.keyword field.
- Default value is 128
- Maximum value is 128
|
| Minimum and maximum value for keyword proximity | Cannot be configured | Defines the minimum and maximum value that can be entered in the DataPattern.keyword_proximity field.
- Minimum of 0
- Maximum of 64
|
| Maximum input size for matches and keywordMatches API | Cannot be configured | Defines the maximum input size supported by the DataPatternValidator.matches and DataPatternValidator.keywordMatches APIs.
- Default value is 2048
- Maximum value is 2048
|
| Timeout for scan API | DataDiscoveryScanner.setScanTimeout(long timeoutMillis) | API call that defines the maximum time, in milliseconds, for a DataDiscoveryScanner.scan call to complete.
- Default value is 20000 ms
- Range is an integer value between 0 and 50000 ms
|
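The limits in Table 9 can be checked on the caller's side before invoking the API. The sketch below copies the numeric limits from the table. The helper names are hypothetical, and it assumes that input size is measured in characters and that the timeout range is inclusive, neither of which is stated in the table.

```javascript
// Limits copied from Table 9; helper names are hypothetical.
var LIMITS = {
    maxKeywordsLength: 128, // DataPattern.keyword
    minProximity: 0,        // DataPattern.keyword_proximity
    maxProximity: 64,
    maxInputSize: 2048,     // input to matches and keywordMatches
    minTimeoutMs: 0,        // setScanTimeout range
    maxTimeoutMs: 50000
};

// Returns a list of problems; an empty list means every limit is respected.
function findLimitViolations(input, keywords, keywordProximity) {
    var problems = [];
    // Assumption: input size is counted in characters.
    if (input.length > LIMITS.maxInputSize) {
        problems.push('input is longer than ' + LIMITS.maxInputSize);
    }
    if (keywords.length > LIMITS.maxKeywordsLength) {
        problems.push('keywords string is longer than ' + LIMITS.maxKeywordsLength);
    }
    if (keywordProximity < LIMITS.minProximity || keywordProximity > LIMITS.maxProximity) {
        problems.push('keyword proximity is outside ' + LIMITS.minProximity + '-' + LIMITS.maxProximity);
    }
    return problems;
}

// True when the timeout is an integer inside the documented range (assumed inclusive).
function isScanTimeoutInRange(timeoutMillis) {
    return Math.floor(timeoutMillis) === timeoutMillis &&
        timeoutMillis >= LIMITS.minTimeoutMs &&
        timeoutMillis <= LIMITS.maxTimeoutMs;
}

var okArgs = findLimitViolations('dob: 09/09/2023', 'dob,date of birth', 20);
var badArgs = findLimitViolations('dob: 09/09/2023', 'dob,date of birth', 65);
```

In a server-side script, a non-empty result could be logged with `gs.info` and the API call skipped. How the API itself reacts to out-of-range values is not covered here.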