Skip to content

You are viewing documentation for Immuta version 2024.1.

For the latest version, view our documentation for Immuta SaaS or the latest self-hosted version.

Snowflake Integration with Snowflake Governance Features Reference Guide

Snowflake Enterprise Edition required

This integration requires the Snowflake Enterprise Edition.

In this integration, Immuta manages access to Snowflake tables by administering Snowflake row access policies and column masking policies on those tables, allowing users to query tables directly in Snowflake while dynamic policies are enforced.

Like with all Immuta integrations, Immuta can inject its ABAC model into policy building and administration to remove policy management burden and significantly reduce role explosion.

Architecture

When an administrator configures the Snowflake integration with Immuta, Immuta creates an IMMUTA database and schemas (immuta_procedures, immuta_policies, and immuta_functions) within Snowflake to contain policy definitions and user entitlements. Immuta then creates a system role and gives that system account the following privileges:

  • APPLY MASKING POLICY
  • APPLY ROW ACCESS POLICY
  • ALL PRIVILEGES ON DATABASE "IMMUTA" WITH GRANT OPTION
  • ALL PRIVILEGES ON ALL SCHEMAS IN DATABASE "IMMUTA" WITH GRANT OPTION
  • USAGE ON FUTURE PROCEDURES IN SCHEMA "IMMUTA".immuta_procedures WITH GRANT OPTION
  • USAGE ON WAREHOUSE
  • OWNERSHIP ON SCHEMA "IMMUTA".immuta_policies TO ROLE "IMMUTA_SYSTEM" COPY CURRENT GRANTS
  • OWNERSHIP ON SCHEMA "IMMUTA".immuta_procedures TO ROLE "IMMUTA_SYSTEM" COPY CURRENT GRANTS
  • OWNERSHIP ON SCHEMA "IMMUTA".immuta_functions TO ROLE "IMMUTA_SYSTEM" COPY CURRENT GRANTS
  • OWNERSHIP ON SCHEMA "IMMUTA".public TO ROLE "IMMUTA_SYSTEM" COPY CURRENT GRANTS

Optional features, like automatic object tagging, native query auditing, etc., require additional permissions to be granted to the Immuta system account, are listed in the supported features section.

Policy enforcement

Snowflake is a policy push integration with Immuta. When Immuta users create policies, they are then pushed into the Immuta database within Snowflake; there, the Immuta system account applies Snowflake row access policies and column masking policies directly onto Snowflake tables. Changes in Immuta policies, user attributes, or data sources trigger webhooks that keep the Snowflake policies up-to-date.

For a user to query Immuta-protected data, they must meet two qualifications:

  1. They must be subscribed to the Immuta data source.
  2. They must be granted SELECT access on the table by the Snowflake object owner or automatically via the Snowflake table grants feature.

After a user has met these qualifications they can query Snowflake tables directly.

Comply with column length and precision requirements in a Snowflake masking policy

When a user applies a masking policy to a Snowflake data source, Immuta truncates masked values to align with Snowflake column length (VARCHAR(X) types) and precision (NUMBER (X,Y) types) requirements.

Consider these columns in a data source that have the following masking policies applied:

  • Column A (VARCHAR(6)): Mask using hashing for everyone
  • Column B (VARCHAR(5)): Mask using a constant REDACTED for everyone
  • Column C (VARCHAR(6)): Mask by making null for everyone
  • Column D (NUMBER(3, 0)): Mask by rounding to the nearest 10 for everyone

Querying this data source in Snowflake would return the following values:

A B C D
5w4502 REDAC 990
6e3611 REDAC 750
9s7934 REDAC 380

Hashing collisions

Hashing collisions are more likely to occur across or within Snowflake columns restricted to short lengths, since Immuta truncates the hashed value to the limit of the column. (Hashed values truncated to 5 characters have a higher risk of collision than hashed values truncated to 20 characters.) Therefore, avoid applying hashing policies to Snowflake columns with such restrictions.

For more details about Snowflake column length and precision requirements, see the Snowflake behavior change release documentation.

Query performance

When a policy is applied to a column, Immuta uses Snowflake memoizable functions to cache the result of the called function. Then, when a user queries a column that has that policy applied to it, Immuta uses that cached result to dramatically improve query performance.

Registering data sources

Best practice

Use a dedicated Snowflake role to register Snowflake tables as Immuta data sources. Then, include this role in the excepted roles/users list.

Register Snowflake data sources using a dedicated Snowflake role. No policies will apply to that role, ensuring that your integration works with the following use cases:

  • Snowflake project workspaces: Snowflake workspaces generate static views with the credentials used to register the table as an Immuta data source. Those tables must be registered in Immuta by an excepted role so that policies applied to the backing tables are not applied to the project workspace views.

  • Using views and tables within Immuta: Because this integration uses Snowflake governance policies, users can register tables and views as Immuta data sources. However, if you want to register views and apply different policies to them than their backing tables, those views must be registered in Immuta by an excepted role; otherwise, the backing table’s policies will be applied to that view.

Snowflake bulk data source creation

Private preview

This feature is only available to select accounts. Reach out to your Immuta representative to enable this feature.

Bulk data source creation is the more efficient process when loading more than 5000 data sources from Snowflake and allows for data sources to be registered in Immuta before running sensitive data discovery or applying policies.

To use this feature, see the Bulk create Snowflake data sources guide.

Resource allocations

Based on performance tests that create 100,000 data sources, the following minimum resource allocations need to be applied to the appropriate pods in your Kubernetes environment for successful bulk data source creation.

Web Database Query Engine
Memory 4Gi 16Gi 8Gi
CPU 2 4 2
Storage 8Gi 24Gi 16Gi

Limitations

  • Performance gains are limited when enabling sensitive data discovery at the time of data source creation.
  • External catalog integrations are not recognized during bulk data source creation. Users must manually trigger a catalog sync for tags to appear on the data source through the data source's health check.

Excepted roles/users

Excepted roles and users are assigned when the integration is installed, and no policies will apply to these users' queries, despite any Immuta policies enforced on the tables they are querying. Credentials used to register a data source in Immuta will be automatically added to this excepted list for that Snowflake table. Consequently, roles and users added to this list and used to register data sources in Immuta should be limited to service accounts.

Immuta excludes the listed roles and users from policies by wrapping all policies in a CASE statement that will check if a user is acting under one of the listed usernames or roles. If a user is, then the policy will not be acted on the queried table. If the user is not, then the policy will be executed like normal. Immuta does not distinguish between role and username, so if you have a role and user with the exact same name, both the user and any user acting under that role will have full access to the data sources and no policies will be enforced for them.

Data flow

  1. An Immuta application administrator configures the Snowflake integration and registers Snowflake warehouse and databases with Immuta.
  2. Immuta creates a database inside the configured Snowflake warehouse that contains Immuta policy definitions and user entitlements.
  3. A data owner registers Snowflake tables in Immuta as data sources.
  4. If Snowflake tag ingestion was enabled during the configuration, Immuta uses the host provided in the configuration and ingests internal tags on Snowflake tables registered as Immuta data sources.
  5. A data owner, data governor, or administrator creates or changes a policy or a user's attributes change in Immuta.
  6. The Immuta web service calls a stored procedure that modifies the user entitlements or policies.
  7. Immuta manages and applies Snowflake governance column and row access policies to Snowflake tables that are registered as Immuta data sources.
  8. If Snowflake table grants is not enabled, Snowflake object owner or user with the global MANAGE GRANTS privilege grants SELECT privilege on relevant Snowflake tables to users. Note: Although they are GRANTed access, if they are not subscribed to the table via Immuta-authored policies, they will not see data.
  9. A Snowflake user who is subscribed to the data source in Immuta queries the corresponding table directly in Snowflake and sees policy-enforced data.

Snowflake Integration

Authentication methods

The Snowflake integration supports the following authentication methods to install the integration and create data sources:

  • Username and password: Users can authenticate with their Snowflake username and password.
  • Key pair: Users can authenticate with a Snowflake key pair authentication.
  • Snowflake External OAuth: Users can authenticate with Snowflake External OAuth when using Snowflake with governance features.

Snowflake External OAuth

Immuta's OAuth authentication method uses the Client Credentials Flow to integrate with Snowflake External OAuth. When a user configures the Snowflake integration or connects a Snowflake data source, Immuta uses the token credentials (obtained using a certificate or passing a client secret) to craft an authenticated access token to connect with Snowflake. This allows organizations that already use Snowflake External OAuth to use that secure authentication with Immuta.

Workflow

  1. An Immuta application administrator configures the Snowflake integration or creates a data source.
  2. Immuta creates a custom token and sends it to the authorization server.
  3. The authorization server confirms the information sent from Immuta and issues an access token to Immuta.
  4. Immuta sends the access token it received from the authorization server to Snowflake.
  5. Snowflake authenticates the token and grants access to the requested resources from Immuta.
  6. The integration is connected and users can query data.

Supported Snowflake feature

The Immuta Snowflake integration supports Snowflake external tables. However, you cannot add a masking policy to an external table column while creating the external table in Snowflake because masking policies cannot be attached to virtual columns.

Supported Immuta features

The Snowflake integration with Snowflake governance features supports the Immuta features outlined below. Click the links provided for more details.

  • Immuta project workspaces: Users can have additional write access in their integration using project workspaces.
  • Tag ingestion: Immuta automatically ingests Snowflake object tags from your Snowflake instance and adds them to the appropriate data sources.
  • User impersonation: Native impersonation allows users to natively query data as another Immuta user. To enable native user impersonation, see the Integration user impersonation page.
  • Native query audit: Immuta audits queries run natively in Snowflake against Snowflake data registered as Immuta data sources.
  • Multiple Snowflake instances
  • Snowflake low row access policy mode: The Snowflake low row access policy mode improves query performance in Immuta's Snowflake integration by decreasing the number of Snowflake row access policies Immuta creates.
  • Snowflake table grants: This feature allows Immuta to manage privileges on your Snowflake tables and views according to the subscription policies on the corresponding Immuta data sources.

Immuta project workspaces

Immuta system account required Snowflake privileges
  • CREATE [OR REPLACE] PROCEDURE
  • DROP ROLE
  • REVOKE ROLE

Users can have additional write access in their integration using project workspaces. For more details, see the Snowflake project workspaces page.

Caveat

To use project workspaces with the Snowflake integration with governance features, the default role of the account used to create data sources in the project must be added to the "Excepted Roles/Users List." If the role is not added, you will not be able to query the equalized view using the project role in Snowflake.

Tag ingestion

Immuta system account required Snowflake privileges
  • GRANT IMPORTED PRIVILEGES ON DATABASE snowflake
  • GRANT APPLY TAG ON ACCOUNT

When configuring a Snowflake integration, you can enable Snowflake tag ingestion as well. With this feature enabled, Immuta will automatically ingest Snowflake object tags from your Snowflake instance into Immuta and add them to the appropriate data sources.

The Snowflake tags' key and value pairs will be reflected in Immuta as two levels: the key will be the top level and the value the second. As Snowflake tags are hierarchical, Snowflake tags applied to a database will also be applied to all of the schemas in that database, all of the tables within those schemas, and all of the columns within those tables. For example: If a database is tagged PII, all of the tables and columns in that database will also be tagged PII.

To enable Snowflake tag ingestion, follow one of the tutorials below:

Caveats

Snowflake has some natural data latency. If you manually refresh the governance page to see all tags created globally, users can experience a delay of up to two hours. However, if you run schema detection or a health check to find where those tags are applied, the delay will not occur because Immuta will only refresh tags for those specific tables.

Native query audit

Immuta system account required Snowflake privileges

  • IMPORTED PRIVILEGES ON DATABASE snowflake

Once this feature has been enabled with the Snowflake integration, Immuta will query Snowflake to retrieve user query histories. These histories provide audit records for queries against Snowflake data sources that are queried natively in Snowflake.

This process will happen automatically every hour by default but can be configured to a different frequency when configuring or editing the integration. Additionally, audit ingestion can be manually requested at any time from the Immuta audit page. When manually requested, it will only search for new queries that were created since the last native query that had been audited. The job is run in the background, so the new queries will not be immediately available.

For details about prompting these logs and the contents of these audit logs, see the Snowflake query audit logs page.

Multiple Snowflake instances

A user can configure multiple integrations of Snowflake to a single Immuta instance and use them dynamically or with workspaces.

Caveats

  • There can only be one integration connection with Immuta per host.
  • The host of the data source must match the host of the integration for the view to be created.
  • Projects can only be configured to use one Snowflake host.

Limitations

  • If there are errors in generating or applying policies natively in Snowflake, the data source will be locked and only users on the excepted roles/users list and the credentials used to create the data source will be able to access the data.

  • Once a Snowflake integration is disabled in Immuta, the user must remove the access that was granted in Snowflake. If that access is not revoked, users will be able to access the raw table in Snowflake.

  • Migration must be done using the credentials and credential method (automatic or bootstrap) used to install the integration.

  • When configuring one Snowflake instance with multiple Immuta instances, the user or system account that enables the integration on the app settings page must be unique for each Immuta instance.

  • A Snowflake table can only have one set of policies enforced at a given time, so creating multiple data sources pointing to the same table is not supported. If this is a use case you need to support, create views in Snowflake and expose those instead.

  • You cannot add a masking policy to an external table column while creating the external table because a masking policy cannot be attached to a virtual column.

  • If you create an Immuta data source from a Snowflake view created using a select * from query, Immuta column detection will not work as expected because Snowflake views are not automatically updated based on backing table changes. To remedy this, you can create views that have the specific columns you want or you can CREATE AND REPLACE the view in Snowflake whenever the backing table is updated and manually run the column detection job on the data source page.

Custom WHERE clause limitations

The Immuta Snowflake integration uses Snowflake governance features to let users query data natively in Snowflake. This means that Immuta also inherits some Snowflake limitations using correlated subqueries with row access policies and column-level security. These limitations appear when writing custom WHERE policies, but do not remove the utility of row-level policies.

Requirements for a custom WHERE policy

  1. All column names must be fully qualified:

    1. Any column names that are unqualified (i.e., just the column name) will default to a column of the data source the policy is being applied to (if one matches the name).
  2. The Immuta system account must have SELECT privileges on all tables/views referenced in a subquery:

    1. The Immuta system role name is specified by the user, and the role is created when the Snowflake instance is integrated.

Subquery limitations

Any subqueries that error in Snowflake will also error in Immuta.

  1. Including one or more subqueries in the Immuta policy condition may cause errors in Snowflake. If an error occurs, it may happen during policy creation or at query-time. To avoid these errors, limit the number of subqueries, limit the number of JOIN operations, and simplify WHERE clause conditions.
  2. For more information on the Snowflake subquery limitations see