Data privacy / anonymization in practice

AttilaVarga · ‎06-21-2023

It goes without saying that when a new ServiceNow product is implemented (configured) or an existing one is improved / enhanced for customers, development happens on a non-production environment. The bug fixes also should be done on this environment.

Hundreds of regulations describe how to protect data based on their type. Customers want to stay compliant, so they have to take care of this area carefully.

There are a lot of activities (feature improvements, bug investigations), which are difficult or almost impossible without having the proper and / or well structured data content on the sub-production environment.

Long story short, the sensitive data can be available only on production environment in processable format.

Almost all (or all) service providers have already encountered such a demand from customers, that data have to be masked on the non-prod instance after some specific operations, like back-cloning, or 3rd party interface data collection. I'm sure that there are a lot of custom solutions for this problem, like post back-cloning scripts, Fix scripts, or just simple script includes which are responsible for data protection on sub-production instances.

In this article I will introduce a solution to this problem, with the help of out-of-the-box ServiceNow features.

ServiceNow Vault

ServiceNow has a product, which is called ServiceNow Vault. This product consist of several security related solutions, like, Platform encryption, Data anonymization, Secret management or Code signing.

These capabilities give high level data protection for customers, which is expected by authorities.

Data Anonymization

Data Anonymization gives the possibility to mask data on a specific environment and make it impossible to process or be accessed by unauthorised users. In order to be able to do feature implementations or bug fixing, the non-production environment must have data in the database, but sensitive information should be concealed.

The following part of the article is going to describe how it can be configured, used and also highlights its limitations.

So let's get started!

Data classification

In order to use this feature, data has to be classified. Only classified fields can be anonymized. Data classification is a simple process in ServiceNow (it is enabled by default), which gives another layer of categorization or grouping possibilities.

There is a new, user friendly front-end (coming with Tokyo release) which can be used to configure the data privacy settings in the system. This layout cannot be accessed without having the necessary role(s).

Roles

The Data Privacy application gives a security layer to the data. The functionality related to this can be accessed with elevated roles.

There are four roles related to the Data privacy application:

data_privacy_admin
- Can create technique and policy configurations. Cannot create, read or view jobs
data_privacy_auditor
- Can only view jobs
data_privacy_clone_processor
- Can create / modify / remove jobs, related to dataclass-based policies
data_privacy_processor
- Cancreate / modify / remove jobs, related to user-based policies

If a user has the data_privacy_admin role, one new element appears on the Elevate role modal window.

Configure Data Anonymization

The main configuration page is split into three parts:

Overview
- It gives an overall picture about the classified data based on some charts and also contains information about the purpose of data classification / anonymization.

Classification
- This is a simplified layout where the classification actions can be done

Anonymization
- This is the workbench of data anonymization. There are three actions which can be done on this layout:
  - Check / manage anonymization techniques
  - Manage anonymization policies
  - Manage jobs

Anonymization techniques

There are five predefined techniques (methods), which defines how the data will be masked:

Selective Replace
- It can be defined which character(s) of a string will be replaced to a specific one.
Static Replace
- The content of the field will be replaced to the specified string.
Random Replace
- Random characters will be used during the string replacement operation.
Remove
- This technique removes the content of field(s).
No action
- This technique doesn't do any actions on the content of field(s). This action can be useful if a specific field has to be bypassed.

It is possible to create a custom technique, but the options are quite limited. All custom techniques are based on OOTB ones.

When new technique is being created, one of the four base techniques can be selected:

Let's see the possibilities regarding customization.

Static Replace (There are four different types of fields where the anonymization algorithm can be configured)
- Date Time value
- Date value
- Number value
- String value

Selective Replace
- Start index
  - This value defines the position of the character, where the anonymization algorithm will start the character replacement action.
- End index
  - This value defines the position of the character, where the anonymization algorithm stops the character replacement action.
- Exclude character
  - A character can be defined, which will be excluded from the masking
- Replacement char
  - The content of this setting will be the replacement character

Random Replace
- Preserve data length
  - If this is enabled the length of content will be the same like the original one

Anonymization policy

The Anonymization policy defines rules of data masking. There are two types of data which can be masked in this application:

User specific data (fields in the sys_user table)
All other tables

The big difference between these two types is that the user specific policy is capable of selecting user records, which will be part of the process.

Important: Only 10 records can be selected in one policy!

In the other case, all records in the table are affected.

When a new policy is created, the following information has to be added:

Name of the policy
Data class (Only one data class can be selected)

The technique, which will be used on a specific field

In case of field(s) which we don't want to mask, selecting the No Action technique is needed. The Bulk Assign Techniques functionality can be used to select the desired technique for multiple fields.

When the configuration is done, the policy can be saved or published. Only the published policies can be scheduled and used for anonymization.

As it can be seen on the image the policy is published, but it is not possible to schedule it. The following conditions have to be met in order for scheduling to become available:

The current policy has to be active
One of the following roles has to be assigned to the current user:
- data_privacy_clone_processor
- data_privacy_processor
Finally one of the above mentioned roles have to be elevated

Remember: In order to assign data_privacy roles the user must have the security_admin role, and need to elevate the role.

Schedule anonymization job

The anynomization process is executed as a scheduled job. The following information has to be filled out:

Name of job
Start time (format: HH:MM:SS)
End time (format: HH:MM:SS)
- If the job doesn't finish until the end time, the execution will be continued on the next day.
Dry run (preview)
- If this property is true, the masked data won't be committed to the database.
Users
- This field is only visible, if the Data Class in the policy is user related (based on sys_user table)

The dashboard also contains the scheduled jobs:

There are two possible actions on a job:

Cancel job
Pause

After the job is executed, these actions won't be available.

If the Dry run (preview) option is enabled, the result is placed into a field of Data Privacy Job, called Summary (This information can be seen on the platform view):

In case of "accidents"

There can be a cases where the anonymized data has to be rolled back. This is a built-in functionality in Data anonymization.

There is a system property "glide.rollback.expiration_days_redact", which defines the availability of the Rollback option. The default value is 3 days. The function is available on the job view:

Important: A specific job can only be executed once.

Closing notes

To summarize the solution, in order to anyonimize data we need to do the following:

Classify the data
Define or select anonymization technique(s)
Create a policy
Define a job

All of these actions can be done on a modern front-end (based on UI builder), which makes the tasks of responsible people quicker and easier. The functionalities are structured and separated well and can be managed by roles.

Overall I like this solution. It has great potential and I'm looking forward to future improvements.

New features I would like to see:

User limitation lifted
A kind of condition builder (reference qualifier) should be added to the job in order to specify which records have to be anonymized (independently of the data type)
A job can be executed periodically

pradeep reddy · ‎06-22-2023

A very informative article @AttilaVarga #keepgoing

Edward Rosario · ‎06-23-2023

@AttilaVarga Good Job. Have you come across any issues? I attempting now and not all characters in the field I specified are getting anonymized

AttilaVarga · ‎07-02-2023

@Edward Rosario, thanks!

I would rather say that all the "issues", which I faced with are the lack of functionalities, which I have written into the article.

If you would describe here the problems, which you faced with I could add them into the article and that may be useful to the community.

Edward Rosario · ‎07-11-2023

@AttilaVarga

I have a table with a column, where field type = string, that contains personal information.
I Create a Data Privacy job and ran the job but once it completed you see that only a few fields are anonymized.

Looking at this doc, looks like string field types are supported, so I assume all the fields should be anonymized, but they are not

https://docs.servicenow.com/bundle/utah-platform-security/page/administer/security/reference/data-pr...

DiogoCosta · ‎08-21-2023

@AttilaVarga @Edward Rosario should this work in a PDI? I've scheduled jobs but they don't run at the time I set in the "Start time" field...

Edward Rosario · ‎08-21-2023

Yes, I started in my PDI. Make sure you have a privacy technique configuration set up

DiogoCosta · ‎08-22-2023

@Edward Rosario eventually it did ran, but definitely not at the time I had scheduled. It is a bit weird, but manageable I guess.

AttilaVarga · ‎08-22-2023

Hi @DiogoCosta,

Please check the time zone settings, it can be also the problem.

AttilaVarga · ‎08-22-2023

@DiogoCosta, I have scheduled a new Data Privacy Job in my PDI, then I checked the record in the sys_trigger table. Please take a look at the image below:

You can see that the Scheduled job record is created and the value of Next action is the same, which is set in the Privacy Job record.

Please take a look at the times in your PDI.

ncorbin003 · ‎01-19-2024

Is there a way to make the data privacy job run daily?

AttilaVarga · ‎01-19-2024

Hi @ncorbin003,

As far as I know this is not possible currently.

V V Satyanaray1 · ‎02-14-2025

@AttilaVarga Thanks for this!!

I have a doubt it is possible to mask only the PII or sensitive data which is present in the incident description which includes other statements as well?

Just like the real time anonymization, where it can be detected by patterns and can be masked. Where as the data which is already existing in tables uses anonymization techniques to mask PII data.

can we do the same for the existing data as well?

andrewrouch · ‎04-21-2025

Can this be distinguished between anonymizing data as part of cloning prod into sub-prod instance and anonymizing data in the production instance itself?

We are facing into situations where users are entering data into production records that we don't allow by design. While I understand Sensitive Data Handling is being extended to check data is being entered into forms, we still need to search for and anonymize data that has already been stored.

AttilaVarga · ‎05-21-2025

Hi @andrewrouch,

Yes it is possible.

There is a new policy type in the latest version of Data Privacy, called Real time data:

I think you can use this for the purpose, which you mentioned.

You can define regular expression(s) and when the record is created or updated the policy is executed, and based on the pattern and the selected technique, the content is modified. Let me show an example:

You can see a simple data pattern below, which checks the Hungarian License plates:

As a second step, you need to activate it (there is an application menu, where you can do that) and then, you can setup a policy:

Once it is done, the system starts checking the content of the selected field(s) and performs the necessary operation.

Take a look at the example below. I added a custom field to the sys_user table, called Demo sensitive data.

When you update the record, the content - based on the pattern - will be modified. You can see on the image below that the 'ABC-123' was replaced with xxxxxxx, but the rest of the content remained the same.

You can read more detailed information about this feature on the docs.

I hope it helps. 🙂

@V V Satyanaray1, maybe this information is useful for you as well.

andrewrouch · ‎05-29-2025

Thanks but my reading of the Real Time Anonymization is it is for Now Assist only, can it be used for form input in Yokohama?

AttilaVarga · ‎05-30-2025

Hi @andrewrouch,

"my reading of the Real Time Anonymization is it is for Now Assist only"

The example in my previous comment was created on an instance, where the Now Assist is not enabled.

"can it be used for form input in Yokohama?"

If I understand the question correctly, this is exactly, which I showed on my example. When you change something on the form and save or update it, the content check process runs, and the corresponding anonymization action (technique) is executed based on the given pattern(s).

I used Yokohama version in my example.

christophenow · ‎08-12-2025

a limitation I faced for Data Privacy Anonymization for Cloning:

- Not possible to apply specific "Ano. Policy" for specific clone profile

- There is an issue and if you use the new "Select child table filter (Optional)", it will NOT work if the child table is in another scope as the policy. Workaround is to create one policy per Scoped Application if you want to use this option.

Remark: To link cloning profile to Data Ano. Script, go to "clone cleanup scripts", open Apply Data Privacy Policies and add your clone profile to the related list.

christophenow · ‎08-19-2025

Another enoying limitation on data anonymization is that it will only works for current data store in the field. All activity history will remain visible and not anonymized. I've created an idea to ask this enhancement. Thanks for upvote: View Idea Page - Now Support