Snowflake offers role-based access control (RBAC) as the mechanism to handle authorization of security principals (users, services, etc.) and grant or deny them access to different database objects and operations.
A key piece of a successful Snowflake implementation is the creation of an RBAC configuration and strategy that will meet the following objectives:
- Follow the principle of least privilege (to a certain degree, see the comments on granularity later)
- Can be easily modified incrementally as requirements change
- Easily understood and managed by the Snowflake security administrator
- Integrates transparently with other Snowflake features
With these objectives in mind, the methodology is split into the following steps:
- Determine the granularity of your data access privileges.
- Use Owner-Reader-Writer definitions to grant database-level permissions to create back-end roles. (More on this later.)
- Mix and match back-end roles to create front-end roles. (More on this later as well.)
- Grant Snowflake account-level permissions to front-end roles.
- Assign directory users or groups to the front-end roles.
Let’s break each one of these down in detail.
Data access privilege granularity
The data access privilege granularity is the lowest level of securable that you will use to provide data access. This can theoretically go all the way down to rows and all the way up to full databases.
I usually recommend that people start out with using Schema as their data access securable granularity. Database is usually too broad and you will inevitably have to re-do your roles and table level. Below is too specific to turn it into a general methodology—you would end up with way too many roles. See the FAQ later in this post on how to mix and match granularities if needed.
Once you have the granularity defined, you then create back-end roles at that level.
Back-end roles provide data access privileges at the level of granularity that you decided to support. The idea is that for each data access securable (schema in this case), you will define three levels of access:
- Owner: Can read, write, change the definition and grant access to the object
- Reader: Can read the object
- Writer: Can read and write the object
In the case of a schema for example, the Reader would be able to SELECT from all tables, views, and run functions and stored procedures that only read data. It is your responsibility to properly define the requirements for each schema to stay within the desired boundaries.
You could extend this model to be more complex. For example, you could create backend-roles that allow UPDATES and INSERTS but not DELETES. My recommendation is to take a good look at your requirements so that you don’t end up adding unnecessary complexity.
Assuming that we have decided that our data privilege granularity will be at the schema level and sticking to the three levels of access, we would end up with a matrix of back-end roles like the following:
Creating backend roles should be a mechanical process that is scripted: Every time you create a new schema, you should also create the corresponding Owner-Reader-Writer roles.
These back-end roles are granted access to the data objects and nothing else. They are also not granted directly to directory users or groups. This is where front-end roles come in.
Front-end roles represent actual roles in your organization. They could be finance team members, marketing managers, IT administrators, etc. They are composed through a mix of back-end roles and account-level permissions.
See for example the following definition:
|USAGE||MARKETING VIRTUAL WAREHOUSE|
From the table above, you can glimpse some of the rules that apply to front-end roles:
- The data access privileges are granted through back-end roles exclusively.
- Only front-end roles are granted to directory users and groups.
- Account grants like usage of a virtual warehouse are granted to front-end roles only.
- Unless it is directly related to data access, account permissions are granted to front-end roles.
- For example, you want a manager to be able to adjust a resource quota.
The separation of back-end and front-end roles provides the following advantages:
- Front-end roles are easily identifiable roles that leave no ambiguity as to what they represent—marketing members, IT operators, etc.
- Usage of virtual warehouses is applied at the front-end role level so that users are assigned to a specific warehouse and a clean chargeback and capacity planning model can be established.
- If the requirements of data access change—for example, some new stored procedures are deployed that can modify data—then only the corresponding back-end role needs to change and the appropriate privileges will propagate across the RBAC hierarchy automatically.
- If you need to adjust privileges, the model makes it very clear where the adjustment should be made and who gets impacted.
Frequently asked questions
I want to cover some of the questions I frequently receive when I present the methodology to our clients. Hopefully they will also answer most of the questions that you have as well.
What if many teams have the same data access requirements? Don’t we end up repeating the same front-end role configuration?
Yes, you do end up repeating the initial front-end role configuration. The methodology is not about minimizing the amount of roles to the max, it’s about being easy to manage and extend.
For example, at the beginning of your Snowflake account’s life, you might have a finance member role and a marketing member role with the exact same data access requirements. You need to resist the urge to collapse them into the same front-end role.
Not only does it not make sense logically to have a finance and marketing members role, but the data access they have today may differ from what they have tomorrow. You can easily create a script: simply replace the role name and run it twice.
I used schema-level granularity and now have a requirement for table or row-level access. What do I do?
Create a new back-end role that matches your new lower-level requirements, then adjust the front-end roles that are impacted by the new requirements. Do not go in and start modifying the existing back-end role; just create a new one.
It is better to handle finer-grain exceptions like this as they come up than it is to create a super granular strategy from the beginning.
I want to implement feature X that is not a simple data grant or virtual warehouse usage like in the examples. Where in the model do those grants go?
The rule of thumb: If the feature somehow provides access to data, create a back-end role for it and continue the rest of the methodology. If it’s a feature unrelated to data access then do the grant directly to the front-end role. The idea here is that you can grow and manipulate the change of data access requirements through back-end roles and let the changes there propagate to the front-end ones automatically.
What do I do if I have a front-end role that is a superset of another front-end role (like a team manager)?
In these cases, I lean toward making a front-end role that is granted another front-end role, and you add the new privileges at the higher role level. A marketing manager role would then be granted the marketing member role plus some other added grants, depending on requirements. This is the way to go if we want to be able to say, “Marketing manager is always allowed to do what the marketing members can do plus X, Y and Z”. Then as the marketing member requirements change, the manager’s RBAC definition adjusts automatically because of the nested front-end role.
Let’s assume that only the marketing manager should be able to read and write data from Schema 3 and also do everything the marketing team members can do. The definition would be:
Snowflake provides a complete and robust set of features that allow for a secure, efficient and extensible RBAC system. This same flexibility also allows security administrators to complicate and obfuscate things accidentally by simply granting access organically as needs arise instead of following a proper strategy from the beginning. The methodology outlined here can save security administrators many headaches down the road by providing a concrete set of rules that covers most RBAC configurations.