Introduction
In today's data-driven world, organizations are increasingly focused on protecting sensitive data and ensuring compliance with regulations and industry standards. Data masking is a key component of this effort, as it enables organizations to protect sensitive data by obfuscating it while still maintaining its usefulness for analytics and other purposes. Snowflake Tags provide a powerful way to implement data masking and governance policies in the cloud. By assigning tags to specific datasets, organizations can more easily track and manage sensitive data, ensuring that it is properly masked and protected. In this paper, we provide a comprehensive guide to using Snowflake Tags for data masking and governance, including practical examples and best practices. We also discuss the benefits of using Snowflake Tags for data governance and compliance, and highlight some of the challenges and limitations of this approach.
People analytics –
We process and store sensitive information related to employees, so it is critical to protect that data and ensure compliance with external regulations and industry standards.
People Analytics – How do we store the data? The data we process and store is sensitive in nature and is kept in a Sensitive Snowflake account with robust security measures in place. Access is restricted to a limited set of users who have signed an NDA, is managed through RBAC (role-based access control), and is subject to a rigorous approval process; usage is audited. Data is encrypted at rest and in transit. Even so, we felt the need for an additional security measure as we onboard more users for use cases such as AI/ML and targeted employee-level reporting. We have therefore implemented Snowflake data governance capabilities such as object tagging and data access policies, specifically tag-based data masking policies.
What is a Tag? A Tag is a Snowflake object that can be assigned to other objects such as a Database, Table, or Column.
Dynamic data masking? It is a column-level security feature that obfuscates or masks data at query time.
Tags & Masking Policies -
In the Sensitive Snowflake environment, Tags are set at the object level and at the column level. Masking policies are assigned to the column-level Tags.
Object level tags –
- Tags are used to flag the objects that hold sensitive information.
- Ex – Tag: DATA_CLASSIFICATION etc.
Column level tags –
- Tags are used to flag the columns that hold sensitive data and to implement the role-based masking policies.
- Ex – Tag: COMPENSATION, DOB etc.
Implementation Steps –
How the Tags and their lineage are defined depends on how the data is stored in the warehouse and on the use cases.
Within People Analytics, we are trying to solve for the following:
- A quick and easy way to identify the sensitive objects/columns.
- Mask the sensitive data based on the user/persona.
- Avoid assigning the masking policies directly to the columns.
Data Governance at a Glance
Create the Tags –
A Tag can be created using the 'CREATE TAG' statement, optionally specifying the allowed Tag values.
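As a minimal sketch of this step, the statements below create one object-level Tag and one column-level Tag. The Tag names DATA_CLASSIFICATION and COMPENSATION come from the examples above; the `governance.tags` database and schema and the allowed values are assumptions made for illustration only.

```sql
-- Hypothetical database.schema for governance objects; adjust to your environment.
-- Object-level Tag restricted to a fixed set of (assumed) values.
CREATE TAG IF NOT EXISTS governance.tags.data_classification
  ALLOWED_VALUES 'SENSITIVE', 'INTERNAL', 'PUBLIC'
  COMMENT = 'Object-level Tag that flags how sensitive an object is';

-- Column-level Tag with no value restriction.
CREATE TAG IF NOT EXISTS governance.tags.compensation
  COMMENT = 'Column-level Tag for compensation data';
```

Restricting the values with ALLOWED_VALUES is optional, but it keeps Tag values consistent when many people assign them.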
Create Masking Policy –
With dynamic data masking, the data is stored as plain text at rest and masked when queries are executed.
Decide and define which roles can see the plain text versus a masked value, then create the masking policy.
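A masking policy of this kind might look like the sketch below. The policy name, the `governance.tags` location, and the role name PA_FULL_ACCESS are hypothetical; the actual roles and masked values depend on your own access model.

```sql
-- Sketch only: policy name, schema, and role name are assumptions.
-- The input and return types must match the data type of the masked column.
CREATE MASKING POLICY IF NOT EXISTS governance.tags.compensation_mask
  AS (val NUMBER) RETURNS NUMBER ->
  CASE
    -- Roles allowed to see the plain value.
    WHEN CURRENT_ROLE() IN ('PA_FULL_ACCESS') THEN val
    -- Everyone else gets a masked value (NULL in this sketch).
    ELSE NULL
  END;
```

Because a policy is bound to one data type, a string or date column (for example DOB) would need its own policy with a matching signature.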
Assign the Masking policies on Tags –
Assign the masking policies to the Tags using the ALTER TAG statement.
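For illustration, assuming a Tag named `compensation` and a masking policy named `compensation_mask` already exist in a hypothetical `governance.tags` schema, the assignment could be written as:

```sql
-- Attach the masking policy to the Tag (names are hypothetical).
ALTER TAG governance.tags.compensation
  SET MASKING POLICY governance.tags.compensation_mask;

-- To detach it later:
-- ALTER TAG governance.tags.compensation
--   UNSET MASKING POLICY governance.tags.compensation_mask;
```

A Tag can carry one masking policy per data type, so a number policy and a string policy can sit on the same Tag.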
Assign the Tags to Objects/Columns –
Assign the Tags to objects or columns to enforce the masking policies. Tags can be assigned through a SQL script or through the Snowflake UI.
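Through SQL, the assignment might look like the following sketch. The table `people.hr.employee`, the column `salary`, and the Tag values are hypothetical; the Tags are assumed to live in a `governance.tags` schema.

```sql
-- Object-level Tag: flag the whole table as sensitive.
ALTER TABLE people.hr.employee
  SET TAG governance.tags.data_classification = 'SENSITIVE';

-- Column-level Tag: any masking policy attached to the Tag
-- now applies to this column.
ALTER TABLE people.hr.employee
  MODIFY COLUMN salary
  SET TAG governance.tags.compensation = 'SALARY';
```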
Tag based Masking policy in Action –
When a user with full access queries the table, the masking policy allows the user to view the actual data.
When a user with limited access queries the table, the masking policy kicks in and obfuscates the data at query time.
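The two cases can be tried out with a sketch like the one below. The role names, the table, and its columns are assumptions; it also assumes a policy that returns the stored value for PA_FULL_ACCESS and NULL for every other role.

```sql
-- Full-access persona (hypothetical role): salary should come back as stored.
USE ROLE pa_full_access;
SELECT employee_id, salary
FROM people.hr.employee
LIMIT 5;

-- Limited-access persona (hypothetical role): the same query runs,
-- but salary should come back masked (NULL under the assumed policy).
USE ROLE pa_limited_access;
SELECT employee_id, salary
FROM people.hr.employee
LIMIT 5;
```

The query text is identical for both personas; only the active role changes what is returned.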
Learnings –
- Quick & easy way to identify the sensitive objects/columns
- With the help of the Snowsight UI or by querying Information Schema objects, it is very easy to extract and identify the sensitive objects/columns. This helps us monitor and audit effectively.
- Mask the sensitive data based on the user/persona.
- Masking the data based on user persona helps protect sensitive data by limiting it to authorized users. Even if a user has access to a schema/table, access to its sensitive columns can be restricted further.
- Avoid Assigning the masking policies directly to the columns.
- Assigning and managing a masking policy on every sensitive column is complex and inefficient; a Tag-based masking policy is much easier to manage and does not break the existing workflow. Assign a policy to a Tag, or remove it, once and the change applies to every object that carries the Tag.
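As a sketch of the first learning, tagged objects and columns can be listed with queries like these. The Tag names are the ones used as examples above; the table name `people.hr.employee` is hypothetical.

```sql
-- Account-wide view of where the Tags are applied.
-- ACCOUNT_USAGE views are subject to latency, so recent changes may not appear yet.
SELECT tag_name, tag_value, domain,
       object_database, object_schema, object_name, column_name
FROM snowflake.account_usage.tag_references
WHERE tag_name IN ('DATA_CLASSIFICATION', 'COMPENSATION', 'DOB')
  AND object_deleted IS NULL
ORDER BY object_database, object_schema, object_name, column_name;

-- Tags on every column of a single table, via the Information Schema table function.
SELECT *
FROM TABLE(
  people.information_schema.tag_references_all_columns('people.hr.employee', 'table')
);
```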
What we haven’t discussed –
- Limitations around creating and dropping Tags & Masking policies.
- Role and access requirements to create and set policies.
- Materialized views, Virtual columns, data type requirements etc.
Questions?
- Reach out to the People Analytics team.
- Refer to the product documentation.