Automate your validated dataset deployment using Amazon QuickSight and Amazon CloudFormation

{"value":"A lot of the power behind business intelligence (BI) and data visualization tools such as [Amazon QuickSight](https://aws.amazon.com/quicksight) comes from the ability to work interactively with data through a GUI. Report authors create dashboards using GUI-based tools, then in just a few clicks can share the dashboards with business users and decision-makers. This workflow empowers authors to create and manage the QuickSight resources and dashboards they’re responsible for providing.\n\nDeveloper productivity is a great benefit of UI-based development, but enterprise customers often need to consider additional factors in their BI implementation:\n\n- Promoting objects through environments (development, testing, production, and so on)\n- Scaling for hundreds of authors and thousands of users\n- Implementing data security, such as row-level and column-level rules to filter the data elements visible to specific users\n- Regulatory requirements, processes, and compliance controls\n\nApproaches such as source control-backed CI/CD pipelines allow you to address compliance requirements and security gates with automation. For example, a hypothetical build pipeline for a Java Spring Boot application may enable developers to build and deploy freely to a dev environment, but the code must pass tests and vulnerability scans before being considered for promotion to upper environments. A human approval step then takes place before the code is released into production. Processes such as this provide quality, consistency, auditability, and accountability for the code being released.\n\nThe QuickSight API provides functionality for automation pipelines. Pipeline developers can use the API to migrate QuickSight resources from one environment to another. 
The API calls for working with QuickSight datasets enable inspection of the JSON representation of the dataset definition.\n\nThis post presents an example of how a QuickSight administrator can automate data resource management and security validation through use of the QuickSight API and [AWS CloudFormation](http://aws.amazon.com/cloudformation).\n\n#### **Solution overview**\nOur example security model relies on naming conventions for tables and columns as an integral part of its design. Instead of relying on naming conventions, you may want to use a lookup table or similar approach to store the relationships between data tables and security tables.\n\nWe guide you through the following steps:\n1. Create relational database tables to be secured.\n2. Create a QuickSight dataset in your dev account.\n3. Generate a CloudFormation template using a Python script that allows you to enforce row-level and column-level security in each environment. You can customize this script to the needs of your organization.\n4. Use the generated CloudFormation template to deploy through dev, test, and prod using your change management process.\n\nYou can use AWS CloudFormation to manage several types of QuickSight resources, but dataset resources are a critical junction for security, so they are our focus in this post.\n\nTo implement data security rules in a large organization, controls must be in place to agree upon and implement the rules from a process perspective. This post dives deep into using code to validate security aspects of your QuickSight deployment, but data security requires more than code. The approaches put forward are intended as a part of a larger change management process, much of which is based around human review and approval.\n\nIn addition to having a change management process in place, we suggest managing your AWS resources using a CI/CD pipeline. 
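For example, a pipeline stage can run security validation and fail the build when a rule is violated. The following is a minimal sketch of such a gate; the ```validate_data_set``` helper and its rules are hypothetical stand-ins for the checks described later in this post:

```python
def validate_data_set(data_set: dict) -> list[str]:
    """Return a list of security-rule violations (an empty list means the gate passes)."""
    problems = []
    # Rule: row-level security must be applied to the dataset.
    if data_set.get("RowLevelPermissionDataSet") is None:
        problems.append("row-level security is not applied")
    for column in data_set.get("OutputColumns", []):
        # Example rule: sensitive columns require column-level permission rules.
        if column["Name"].endswith("_sensitive") and not data_set.get("ColumnLevelPermissionRules"):
            problems.append(f"sensitive column {column['Name']} has no column-level rule")
    return problems
```

A pipeline stage would feed the ```DataSet``` portion of a ```describe_data_set``` response to a function like this and fail the build (for example, via ```sys.exit(1)```) when the returned list is non-empty, keeping non-compliant datasets from reaching the deployment step.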
The nature of change management and CI/CD processes varies greatly and is outside the scope of this post.\n\n#### **Prerequisites**\nThis post assumes a basic command of the following:\n- Familiarity with the QuickSight UI ([creating datasets](https://docs.aws.amazon.com/quicksight/latest/user/creating-data-sets.html), dashboards, and so on)\n- Python programming (Python 3 preferred)\n- Using the [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) library to interact with AWS\n- [Creating stacks with AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-create-stack.html)\n- SQL\n- A [Postgres database](https://aws.amazon.com/getting-started/hands-on/create-connect-postgresql-db/) running in AWS\n\nWe don’t go into the broader picture of integrating into a full CI/CD process, so an understanding of CI/CD is helpful, but not required.\n\n#### **Security rules for your organization**\n\nBefore we can write a script to confirm security rules have been applied correctly, we need to know what the security rules actually are. This means we need to determine the following:\n\n- **What** – What is the data we are trying to secure? Which fields in the database are sensitive? Which field values will be used to filter access?\n- **Who** – Who are the users and groups that should be provided access to the data and fields we have identified?\n\nIn concrete terms, we need to match identities (users and groups) to actual data values (used in row-level security) and sensitive fields (for column-level security). Identities such as users and groups typically correlate to entities in external systems such as Active Directory, but you can use native QuickSight users and groups.\n\nFor this post, we define the following rules that indicate the relationship between database objects (tables and fields) and how they should be secured. Keep in mind that these example rules may not apply to every organization. 
Security should be developed to match your requirements and processes.\n\n- Any field name with ```_sensitive``` appended to it is identified as containing sensitive data. For example, a column named ```salary_usd_sensitive``` should be ```restricted```. For our scenario, we say that the user should be a member of the QuickSight restricted group in order to access sensitive fields. No other groups are allowed access to these fields.\n- For a given table, a companion table with ```_rls``` appended to the name contains the row-level security rules used to secure the table. In this model, the row-level security rules for the ```employees``` table are found in the ```employees_rls``` table.\n- Row-level security rules must be sourced 100% from the underlying data store. This means that you can’t upload rules via the QuickSight console, or use custom SQL in QuickSight to create the rules. Rules can be provided as views (if supported by the underlying data store) as long as the view definition is managed using a change management process.\n- The dataset name should match the name of the underlying database table.\n\nThese rules rely on a well-structured change management process for the database. If users and developers have access to change database objects in production, the rules won’t carry much weight. For examples of automated schema management using open-source CI/CD tooling, refer to [Deploy, track, and roll back RDS database code changes using open source tools Liquibase and Jenkins](https://aws.amazon.com/blogs/opensource/rds-code-change-deployment/) and [How to Integrate Amazon RDS Schema Changes into CI/CD Pipelines with GitLab and Liquibase](https://aws.amazon.com/blogs/apn/how-to-integrate-amazon-rds-schema-changes-into-ci-cd-pipelines-with-gitlab-and-liquibase/).\n\nFrom the QuickSight perspective, our database becomes the source of the “what” and “who” we discussed earlier. 
QuickSight doesn’t own the security rules; it merely implements the rules as defined in the database.\n\n##### **Security rule management with database objects**\nFor this post, we source data from a Postgres database using a read-only user created for QuickSight.\n\nFirst, we create our schema and a data table with a few rows inserted:\n\n```\ncreate schema if not exists ledger;\n\n--the table we are securing\ndrop table if exists ledger.transactions;\ncreate table if not exists ledger.transactions (\n txn_id integer,\n txn_type varchar(100),\n txn_desc varchar(100),\n txn_amt float,\n department varchar(100),\n discount_sensitive float\n);\n\ninsert into ledger.transactions (\n txn_id,\n txn_type,\n txn_desc,\n txn_amt,\n department,\n discount_sensitive\n) \nvalues\n(1, 'expense', 'commission', -1000.00, 'field sales', 0.0),\n(2, 'revenue', 'widgets', 15000.00, 'field sales', 1000.00),\n(3, 'revenue', 'interest', 1000.00, 'corporate', 0.0),\n(4, 'expense', 'taxes', -1234.00, 'corporate', 0.0),\n(5, 'revenue', 'doodads', 1000.00, 'field sales', 100.0)\n;\n```\n\nNote the field ```discount_sensitive```. In our security model, any field name with ```_sensitive``` appended to it is identified as containing sensitive data. This information is used later when we implement column-level security. In our example, we have the luxury of using naming conventions to tag the sensitive fields, but that isn’t always possible. Other options could involve the use of SQL comments, or creating a table that provides a lookup for sensitive fields. 
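To illustrate the lookup-table approach, a validation script could consult a registry of sensitive columns instead of relying on the ```_sensitive``` suffix. This is a hypothetical sketch; the registry contents and column names are illustrative only:

```python
# Hypothetical registry standing in for a database lookup table
# (or SQL column comments) that tags sensitive columns per table.
SENSITIVE_FIELDS = {
    "ledger.transactions": {"discount_sensitive", "salary_usd"},
}

def sensitive_columns(table: str, columns: list[str]) -> list[str]:
    """Return the subset of `columns` registered as sensitive for `table`."""
    registered = SENSITIVE_FIELDS.get(table, set())
    return [c for c in columns if c in registered]
```

Note that a registry like this can flag columns such as ```salary_usd``` that carry no naming-convention hint, at the cost of keeping the registry itself under change management.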
Which method you choose depends upon your data and requirements, and should be supported by a change management process.\n\n##### **Row-level security table**\nThe following SQL creates a table containing the row-level security rules for the ```ledger.transactions``` table, then inserts rules that match the example discussed earlier:\n\n```\ndrop table if exists ledger.transactions_rls;\ncreate table ledger.transactions_rls (\n groupname varchar(100),\n department varchar(1000)\n);\n\n\ninsert into ledger.transactions_rls (groupname, department) \nvalues\n('restricted', null), --null indicates all values\n('anybody', 'field sales');\n```\n\nFor more information about how to restrict access to a dataset using row-level security, refer to [Using row-level security (RLS) with user-based rules to restrict access to a dataset](https://docs.aws.amazon.com/quicksight/latest/user/restrict-access-to-a-data-set-using-row-level-security.html).\n\nThese rules match the specified QuickSight user groups to values in the ```department``` field of the ```transactions``` table.\n\nOur last step in Postgres is to create a user that has read-only access to our tables. All end-user or SPICE refresh queries from QuickSight are run using this user. See the following code:\n\n```\ndrop role if exists qs_user;\ncreate role qs_user login password 'GETABETTERPASSWORD';\ngrant connect on database quicksight TO qs_user;\ngrant usage on schema ledger to qs_user;\ngrant select on ledger.transactions to qs_user;\ngrant select on ledger.transactions_rls to qs_user;\n```\n#### **Create user groups**\nOur security model provides permissions based on [group membership](https://learnquicksight.workshop.aws/en/admin-workshop/4.user-groups.html). 
Although QuickSight allows for these groups to be sourced from external systems such as Active Directory, our example uses native QuickSight groups.\n\nWe create our groups using the following [AWS Command Line Interface](http://aws.amazon.com/cli) (AWS CLI) commands. Take note of the restricted group we’re creating; this is the group we use to grant access to sensitive data columns.\n\n```\naws quicksight create-group \\\n--aws-account-id YOUR_AWS_ACCOUNT_ID_HERE \\\n--namespace default \\\n--group-name restricted\n\naws quicksight create-group \\\n--aws-account-id YOUR_AWS_ACCOUNT_ID_HERE \\\n--namespace default \\\n--group-name anybody\n```\nYou can also add a user to your group with the following code:\n```\naws quicksight create-group-membership \\\n--aws-account-id YOUR_AWS_ACCOUNT_ID_HERE \\\n--namespace default \\\n--group-name anybody \\\n--member-name quicksightuser@withanemail.com\n```\n#### **The Python script**\nNow that we have set up our database and groups, we switch focus to the [Python script](https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/BDB-1687/QuickSightDataSetToCF.py) used for the following actions:\n- Extracting the definition of a manually created dataset using the QuickSight API\n- Ensuring that the dataset definition meets security standards\n- Restructuring the dataset definition into the format of a CloudFormation template\n- Writing the CloudFormation template to a JSON file\n\nIn the header of the script, you can see the following variables, which you should set to values in your own AWS environment:\n\n```\n# Parameters for the source data set\nregion_name = 'AWS_REGION_NAME'\naws_account_id = \"AWS_ACCOUNT_ID\"\nsource_data_set_id = \"ID_FOR_THE_SOURCE_DATA_SET\"\n\n# Parameters are used when creating the cloudformation template\ntarget_data_set_name = \"DATA_SET_DISPLAY_NAME\"\ntarget_data_set_id = \"NEW_DATA_SET_ID\"\ntemplate_file_name = \"dataset.json\"\n```\nQuickSight datasets have a name and an ID. 
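Because API calls and CloudFormation templates reference datasets by ID rather than by display name, a small helper can resolve a name to its ID from the summaries returned by the Boto3 ```list_data_sets``` call. This is a sketch with minimal error handling:

```python
def data_set_id_for_name(summaries: list[dict], name: str) -> str:
    """Return the DataSetId whose display Name matches exactly.

    Names are not guaranteed unique, so insist on a single match."""
    matches = [s["DataSetId"] for s in summaries if s.get("Name") == name]
    if len(matches) != 1:
        raise ValueError(f"expected one data set named {name!r}, found {len(matches)}")
    return matches[0]
```

In practice, the ```summaries``` argument would be the ```DataSetSummaries``` list from ```client.list_data_sets(AwsAccountId=...)```, paginated via ```NextToken``` if necessary.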
The name is displayed in the QuickSight UI, and the ID is used to reference the dataset behind the scenes. The ID must be unique for a given account and Region, which is why QuickSight uses UUIDs by default, but you can use any unique string.\n\n#### **Create the datasets**\nYou can use the QuickSight GUI or public API to create a dataset for the ```transactions_rls``` and ```transactions``` tables. For instructions, refer to [Creating a dataset from a database](https://docs.aws.amazon.com/quicksight/latest/user/create-a-database-data-set.html). Connect to the database, create the datasets, then apply ```transactions_rls``` as the row-level security for the ```transactions``` dataset. You can use the following ```list-data-sets``` AWS CLI call to verify that your tables were created successfully:\n\n```\n$ aws quicksight list-data-sets --aws-account-id YOURACCOUNT \n{\n \"DataSetSummaries\": [\n {\n \"Arn\": \"arn:aws:quicksight:us-west-2:YOURACCOUNT:dataset/<ID>\",\n \"DataSetId\": \"<ID>\",\n \"Name\": \"transactions\",\n \"CreatedTime\": \"2021-09-15T15:41:56.716000-07:00\",\n \"LastUpdatedTime\": \"2021-09-15T16:38:03.658000-07:00\",\n \"ImportMode\": \"SPICE\",\n \"RowLevelPermissionDataSet\": {\n \"Namespace\": \"default\",\n \"Arn\": \"arn:aws:quicksight:us-west-2:YOURACCOUNT:dataset/<RLS_ID>\",\n \"PermissionPolicy\": \"GRANT_ACCESS\",\n \"FormatVersion\": \"VERSION_1\",\n \"Status\": \"ENABLED\"\n },\n \"RowLevelPermissionTagConfigurationApplied\": false,\n \"ColumnLevelPermissionRulesApplied\": true\n },\n {\n \"Arn\": \"arn:aws:quicksight:us-west-2:YOURACCOUNT:dataset/<RLS_ID>\",\n \"DataSetId\": \"<RLS_ID>\",\n \"Name\": \"transactions_rls\",\n \"CreatedTime\": \"2021-09-15T15:42:37.313000-07:00\",\n \"LastUpdatedTime\": \"2021-09-15T15:42:37.520000-07:00\",\n \"ImportMode\": \"SPICE\",\n \"RowLevelPermissionTagConfigurationApplied\": false,\n \"ColumnLevelPermissionRulesApplied\": false\n }\n ]\n}\n```\n\n#### **Script overview**\n\nOur script is based 
around the ```describe_data_set``` method of the Boto3 QuickSight client. This method returns a Python dictionary containing all the attributes associated with a dataset resource. Our script analyzes these dictionaries, then coerces them into the structure required for dataset creation using AWS CloudFormation. The structure of the ```describe_data_set``` method and the ```AWS::QuickSight::DataSet``` CloudFormation resource are very similar, but not quite identical.\n\nThe following are the top-level fields in the response for the Boto3 QuickSight client ```describe_data_set``` method:\n\n```\n{\n 'DataSet': {\n 'Arn': 'string',\n 'DataSetId': 'string',\n 'Name': 'string',\n 'CreatedTime': datetime(2015, 1, 1),\n 'LastUpdatedTime': datetime(2015, 1, 1),\n 'PhysicalTableMap': {},\n 'LogicalTableMap': {...},\n 'OutputColumns': [...],\n 'ImportMode': 'SPICE'|'DIRECT_QUERY',\n 'ConsumedSpiceCapacityInBytes': 123,\n 'ColumnGroups': [...],\n 'FieldFolders': {...},\n 'RowLevelPermissionDataSet': {...},\n 'ColumnLevelPermissionRules': [...]\n },\n 'RequestId': 'string',\n 'Status': 123\n}\n```\nOur script converts the response from the API to the structure required for creating a dataset using AWS CloudFormation.\n\nThe following are the top-level fields in the ```AWS::QuickSight::DataSet``` CloudFormation resource:\n```\n{\n \"Type\" : \"AWS::QuickSight::DataSet\",\n \"Properties\" : {\n \"AwsAccountId\" : String,\n \"ColumnGroups\" : [ ColumnGroup, ... ],\n \"ColumnLevelPermissionRules\" : [ ColumnLevelPermissionRule, ... ],\n \"DataSetId\" : String,\n \"FieldFolders\" : {Key : Value, ...},\n \"ImportMode\" : String,\n \"IngestionWaitPolicy\" : IngestionWaitPolicy,\n \"LogicalTableMap\" : {Key : Value, ...},\n \"Name\" : String,\n \"Permissions\" : [ ResourcePermission, ... ],\n \"PhysicalTableMap\" : {Key : Value, ...},\n \"RowLevelPermissionDataSet\" : RowLevelPermissionDataSet,\n \"Tags\" : [ Tag, ... 
]\n }\n}\n```\nThe key differences between both JSON structures are as follows:\n- ```describe_data_set``` contains ```Arn```, ```CreatedTime```, and ```LastUpdatedTime```, which are useful fields but only relevant to an existing resource\n- AWS CloudFormation requires ```AwsAccountId``` when creating the resource\n- AWS CloudFormation accepts tags for the dataset, but ```describe_data_set``` doesn’t provide them\n- The AWS CloudFormation ```Permissions``` property allows for assigning AWS Identity and [Access Management](http://aws.amazon.com/iam) (IAM) permissions at the time of creation\n\nOur script is able to selectively choose the top-level properties we want from the ```describe_data_set``` response, then add the fields that AWS CloudFormation requires for resource creation.\n\n#### **Validate security**\nBefore the script creates the CloudFormation template, it performs validations to ensure that our dataset conforms to the defined security rules.\n\nThe following is the snippet from our script that performs validation for row-level security:\n```\nif 'RowLevelPermissionDataSet' in describe_response['DataSet']:\n if describe_response['DataSet']['RowLevelPermissionDataSet'] is None:\n raise Exception(\"row level permissions must be applied!\")\n else:\n # now we look up the rls data set so that we can confirm that it conforms to our rules\n rls_dataset_id = describe_response['DataSet']['RowLevelPermissionDataSet']['Arn'].split('/')[-1]\n rls_response = client.describe_data_set(\n AwsAccountId = aws_account_id,\n DataSetId = rls_dataset_id\n )\n \n rls_table_map = rls_response['DataSet']['PhysicalTableMap']\n\n # rls table must not be custom SQL\n if 'CustomSql' in rls_table_map[list(rls_table_map.keys())[0]]:\n raise Exception(\"RLS data set can not contain custom SQL!\")\n\n # confirm that the database table name is what we expect it to be \n if rls_response['DataSet']['Name'] != describe_response['DataSet']['Name'] + '_rls':\n raise Exception(\"RLS data 
set name must match pattern tablename_rls!\")\n```\n\nThe steps in the code are as follows:\n1. Ensure that any row-level security is applied (this is the bare minimum).\n2. Look up the dataset that contains the row-level security rules using another Boto3 call.\n3. Confirm that the row-level security dataset is not custom SQL.\n4. Confirm that the name of the table is as expected, with ```_rls``` appended to the name of the table being secured.\n\nThe use of custom SQL for sourcing row-level security rules isn’t secure in our case, because a QuickSight developer could use SQL to alter the underlying rules. Because of this, our model requires that a physical table from the dataset is used as the row-level security rule source. Of course, it’s possible to use a view in the database to provide the rules. A view is okay because the definition (in our scenario) is governed by a change management process, as opposed to the custom SQL, which the QuickSight developer can create.\n\nThe rules being implemented for your specific organization will be different. You may need to connect to a database directly from your Python script in order to validate the dataset was created in a secure manner. Regardless of your actual rules, the ```describe_data_set``` API method provides you the details you need to begin validation of the dataset.\n\n#### **Column-level security**\nOur model for column-level security indicates that any database field name that ends in _sensitive should only be accessible to members of a QuickSight group named restricted. Instead of validating that the dataset has the column-level security rules applied correctly, we simply enforce the rules directly in two steps:\n\n1. Identify the sensitive fields.\n2. 
Create a dictionary and add it to our dataset with the key ```ColumnLevelPermissionRules```.\n\nTo identify the sensitive fields, we create a list and iterate through the input columns of the physical table:\n\n```\nsensitive_fields = []\ninput_columns = physical_table_map[list(physical_table_map.keys())[0]][\"RelationalTable\"][\"InputColumns\"]\nfor input_column in input_columns:\n field_name = input_column['Name']\n if field_name.endswith('_sensitive'):\n sensitive_fields.append(field_name)\n```\n\nThe result is a list of sensitive fields. We can then take this list and integrate it into the dataset through the use of a dictionary:\n\n```\nif len(sensitive_fields) > 0:\n data_set[\"ColumnLevelPermissionRules\"] = [\n {\n \"Principals\": [\n {\"Ref\": \"RestrictedUserGroupArn\"}\n ],\n \"ColumnNames\": sensitive_fields\n }\n ]\n```\n\nInstead of hardcoding a specific principal, we reference the CloudFormation template parameter ```RestrictedUserGroupArn```. The ARN for the restricted group is likely to vary, especially if you’re deploying to another AWS account. Using a template parameter allows us to specify the ARN at the time of dataset creation in the new environment.\n\n#### **Access to the dataset QuickSight resources**\nThe ```Permissions``` structure is added to the definition for each dataset:\n\n```\n\"Permissions\": [\n {\n \"Principal\": {\n \"Ref\": \"QuickSightAdminPrincipal\"\n },\n \"Actions\": [\n \"quicksight:DescribeDataSet\",\n \"quicksight:DescribeDataSetPermissions\",\n \"quicksight:PassDataSet\",\n \"quicksight:DescribeIngestion\",\n \"quicksight:ListIngestions\",\n \"quicksight:UpdateDataSet\",\n \"quicksight:DeleteDataSet\",\n \"quicksight:CreateIngestion\",\n \"quicksight:CancelIngestion\",\n \"quicksight:UpdateDataSetPermissions\"\n ]\n }\n]\n```\n\nA value for the ```QuickSightAdminPrincipal``` CloudFormation template parameter is provided at the time of stack creation. 
The preceding structure provides the principal access to manage the QuickSight dataset resource itself. Note that this is not the same as data access (though an admin user could manually remove the row-level security rules). Row-level and column-level security rules indicate whether a given user has access to specific data, whereas these permissions allow for actions on the definition of the dataset, such as the following:\n- Updating or deleting the dataset\n- Changing the security permissions\n- Initiating and monitoring SPICE refreshes\n\nEnd-users don’t require this access in order to use a dashboard created from the dataset.\n\n#### **Run the script**\nOur script requires you to specify the dataset ID, which is not the same as the dataset name. To determine the ID, use the AWS CLI ```list-data-sets``` command.\n\nTo set the script parameters, you can edit the following lines to match your environment:\n```\n# parameters for the source data set\nregion_name = 'us-west-2'\naws_account_id = \"<YOUR_ACCOUNT_ID>\"\nsource_data_set_id = \"<SOURCE_DATA_SET_ID>\"\n\n# parameters for the target data set\ntarget_data_set_name = \"DATA_SET_PRESENTATION_NAME\"\ntarget_data_set_id = \"NEW_DATA_SET_ID\"\n```\nThe following snippet runs the Python script:\n\n```\n$ python3 data_set_to_cf.py \nrow level security validated!\nthe following sensitive fields were found: ['discount_sensitive']\ncloudformation template written to dataset.json\ncli-input-json file written to params.json\n```\n#### **CloudFormation template**\nNow that the security rules have been validated, our script can generate the CloudFormation template. The ```describe_response_to_cf_data_set``` method accepts a ```describe_data_set``` response as input (along with a few other parameters) and returns a dictionary that reflects the structure of an ```AWS::QuickSight::DataSet``` CloudFormation resource. 
Our code uses this method once for the primary dataset, and again for the ```_rls``` rules. This method selects values from the response, prunes some unnecessary items (such as empty tag lists), and replaces a few values with CloudFormation references. These references allow us to provide parameter values to the template, such as QuickSight principals and the data source ARN.\n\nYou can view the template using the ```cat``` command:\n```\n$ cat dataset.json \n{\n \"AWSTemplateFormatVersion\": \"2010-09-09\",\n \"Description\": \"Creates a QuickSight Data Set\",\n \"Parameters\": {\n \"DataSourceArn\": {\n \"Type\": \"String\",\n \"Description\": \"ARN for Postgres data source resource\"\n },\n \"QuickSightAdminPrincipal\": {\n \"Type\": \"String\",\n \"Description\": \"ARN for a QuickSight principal who will be granted API access to the datasets\"\n },\n \"RestrictedUserGroupArn\": {\n \"Type\": \"String\",\n \"Description\": \"ARN for a QuickSight principal who will be granted access to sensitive fields\"\n }\n },\n \"Resources\": {\n \"NewDataSet\": {\n \"Type\": \"AWS::QuickSight::DataSet\",\n \"Properties\": {\n \"DataSetId\": \"NEW_DATA_SET_ID\",\n \"Name\": \"DATA_SET_PRESENTATION_NAME\",\n \"AwsAccountId\": {\n \"Ref\": \"AWS::AccountId\"\n },\n \"Permissions\": [\n {\n \"Principal\": {\n \"Ref\": \"QuickSightAdminPrincipal\"\n },\n \"Actions\": [\n \"quicksight:DescribeDataSet\",\n \"quicksight:DescribeDataSetPermissions\",\n \"quicksight:PassDataSet\",\n \"quicksight:DescribeIngestion\",\n \"quicksight:ListIngestions\",\n \"quicksight:UpdateDataSet\",\n \"quicksight:DeleteDataSet\",\n \"quicksight:CreateIngestion\",\n \"quicksight:CancelIngestion\",\n \"quicksight:UpdateDataSetPermissions\"\n ]\n }\n ],\n \"FieldFolders\": {},\n \"ImportMode\": \"DIRECT_QUERY\",\n \"LogicalTableMap\": {\n \"e2305db4-2c79-4ac4-aff5-224b8c809767\": {\n \"Alias\": \"transactions\",\n \"DataTransforms\": [\n {\n \"ProjectOperation\": {\n 
\"ProjectedColumns\": [\n \"txn_id\",\n \"txn_type\",\n \"txn_desc\",\n \"txn_amt\",\n \"department\",\n \"discount_sensitive\"\n ]\n }\n }\n ],\n \"Source\": {\n \"PhysicalTableId\": \"someguid-2c79-4ac4-aff5-224b8c809767\"\n }\n }\n },\n \"PhysicalTableMap\": {\n \"e2305db4-2c79-4ac4-aff5-224b8c809767\": {\n \"RelationalTable\": {\n \"DataSourceArn\": {\n \"Ref\": \"DataSourceArn\"\n },\n \"Schema\": \"ledger\",\n \"Name\": \"transactions\",\n \"InputColumns\": [\n {\n \"Name\": \"txn_id\",\n \"Type\": \"INTEGER\"\n },\n {\n \"Name\": \"txn_type\",\n \"Type\": \"STRING\"\n },\n {\n \"Name\": \"txn_desc\",\n \"Type\": \"STRING\"\n },\n {\n \"Name\": \"txn_amt\",\n \"Type\": \"DECIMAL\"\n },\n {\n \"Name\": \"department\",\n \"Type\": \"STRING\"\n },\n {\n \"Name\": \"discount_sensitive\",\n \"Type\": \"DECIMAL\"\n }\n ]\n }\n }\n },\n \"RowLevelPermissionDataSet\": {\n \"Namespace\": \"default\",\n \"Arn\": {\n \"Fn::GetAtt\": [\n \"NewDataSetRLS\",\n \"Arn\"\n ]\n },\n \"PermissionPolicy\": \"GRANT_ACCESS\",\n \"FormatVersion\": \"VERSION_1\"\n },\n \"ColumnLevelPermissionRules\": [\n {\n \"Principals\": [\n {\n \"Ref\": \"RestrictedUserGroupArn\"\n }\n ],\n \"ColumnNames\": [\n \"discount_sensitive\"\n ]\n }\n ]\n }\n },\n \"NewDataSetRLS\": {\n \"Type\": \"AWS::QuickSight::DataSet\",\n \"Properties\": {\n \"DataSetId\": \"NEW_DATA_SET_ID_rls\",\n \"Name\": \"DATA_SET_PRESENTATION_NAME_rls\",\n \"AwsAccountId\": {\n \"Ref\": \"AWS::AccountId\"\n },\n \"Permissions\": [\n {\n \"Principal\": {\n \"Ref\": \"QuickSightAdminPrincipal\"\n },\n \"Actions\": [\n \"quicksight:DescribeDataSet\",\n \"quicksight:DescribeDataSetPermissions\",\n \"quicksight:PassDataSet\",\n \"quicksight:DescribeIngestion\",\n \"quicksight:ListIngestions\",\n \"quicksight:UpdateDataSet\",\n \"quicksight:DeleteDataSet\",\n \"quicksight:CreateIngestion\",\n \"quicksight:CancelIngestion\",\n \"quicksight:UpdateDataSetPermissions\"\n ]\n }\n ],\n \"FieldFolders\": {},\n \"ImportMode\": 
\"SPICE\",\n \"LogicalTableMap\": {\n \"someguid-51d7-43c4-9f8c-c60a286b0507\": {\n \"Alias\": \"transactions_rls\",\n \"DataTransforms\": [\n {\n \"ProjectOperation\": {\n \"ProjectedColumns\": [\n \"groupname\",\n \"department\"\n ]\n }\n }\n ],\n \"Source\": {\n \"PhysicalTableId\": \"someguid-51d7-43c4-9f8c-c60a286b0507\"\n }\n }\n },\n \"PhysicalTableMap\": {\n \"someguid-51d7-43c4-9f8c-c60a286b0507\": {\n \"RelationalTable\": {\n \"DataSourceArn\": {\n \"Ref\": \"DataSourceArn\"\n },\n \"Schema\": \"ledger\",\n \"Name\": \"transactions_rls\",\n \"InputColumns\": [\n {\n \"Name\": \"groupname\",\n \"Type\": \"STRING\"\n },\n {\n \"Name\": \"department\",\n \"Type\": \"STRING\"\n }\n ]\n }\n }\n }\n }\n }\n }\n}\n```\nYou can deploy this template directly into AWS via the CloudFormation console. You are required to provide the following parameters:\n\n- **DataSourceArn** – A QuickSight dataset is a reference to a table or other database object. In order for this object to be accessed, we need to specify a QuickSight data source resource that facilitates the connection.\n- **QuickSightAdminPrincipal** – The IAM principal allowing access to the data source resource via AWS API calls. You can exclude the IAM permissions from this script and template if your existing security policies automatically provide access to the appropriate users and groups.\n- **RestrictedUserGroupArn** – The ARN of the QuickSight group that is granted access to the sensitive columns.\n\nYou can also deploy the template using the AWS CLI. Although it’s possible to pass in all the parameters directly via the command line, you may find it a bit clunky when entering long values. 
To simplify this, our script generates a ```params.json``` file structured to capture all the parameters required by the template:\n```\n{\n \"Parameters\": [\n {\n \"ParameterKey\": \"DataSourceArn\",\n \"ParameterValue\": \"YOUR_DATA_SOURCE_ARN_HERE\"\n },\n {\n \"ParameterKey\": \"QuickSightAdminPrincipal\",\n \"ParameterValue\": \"YOUR_ADMIN_GROUP_PRINCIPAL_HERE\"\n },\n {\n \"ParameterKey\": \"RestrictedUserGroupArn\",\n \"ParameterValue\": \"YOUR_RESTRICTED_USER_GROUP_ARN_HERE\"\n }\n ]\n}\n```\nUse the following command to build the stack, with ```params.json``` as input:\n```\naws cloudformation create-stack \\\n--stack-name SecuredDataSet \\\n--template-body file://dataset.json \\\n--cli-input-json file://params.json\n```\nYou can use the AWS CloudFormation console to monitor the stack progress. When the creation is complete, you should see your new dataset in QuickSight!\n\n#### **Conclusion**\nThough the functionality is relatively new, I consider the API and AWS CloudFormation capabilities to be one of QuickSight’s biggest strengths. Automated validation and enforcement of security rules allows for scale and better security. Being able to manage dataset definitions using AWS CloudFormation provides repeatability, and all of this sets you up for automation. The API and AWS CloudFormation provide tooling to customize QuickSight to suit your workflow, bringing BI into your organization’s cloud management strategy.\n\nIf you are looking for related information about dashboard management and migration in QuickSight, refer to [Migrate Amazon QuickSight across AWS accounts](https://aws.amazon.com/blogs/big-data/migrate-amazon-quicksight-across-aws-accounts/).\n\n##### **About the Author**\n\n![image.png](https://dev-media.amazoncloud.cn/a915834b760944bb992feb789b4e20bc_image.png)\n\n**Jeremy Winters** is an Architect in the AWS Data Lab, where he helps customers design and build data applications to meet their business needs. 
Prior to AWS, Jeremy built cloud and data applications for consulting customers across a variety of industries.\n\n\n\n","render":"<p>A lot of the power behind business intelligence (BI) and data visualization tools such as <a href=\"https://aws.amazon.com/quicksight\" target=\"_blank\">Amazon QuickSight</a> comes from the ability to work interactively with data through a GUI. Report authors create dashboards using GUI-based tools, then in just a few clicks can share the dashboards with business users and decision-makers. This workflow empowers authors to create and manage the QuickSight resources and dashboards they’re responsible for providing.</p>\n<p>Developer productivity is a great benefit of UI-based development, but enterprise customers often need to consider additional factors in their BI implementation:</p>\n<ul>\n<li>Promoting objects through environments (development, testing, production, and so on)</li>\n<li>Scaling for hundreds of authors and thousands of users</li>\n<li>Implementing data security, such as row-level and column-level rules to filter the data elements visible to specific users</li>\n<li>Regulatory requirements, processes, and compliance controls.</li>\n</ul>\n<p>Approaches such as source control-backed CI/CD pipelines allow you to address compliance requirements and security gates with automation. For example, a hypothetical build pipeline for a Java Springboot application may enable developers to build and deploy freely to a dev environment, but the code must pass tests and vulnerability scans before being considered for promotion to upper environments. A human approval step then takes place before the code is released into production. Processes such as this provide quality, consistency, auditability, and accountability for the code being released.</p>\n<p>The QuickSight API provides functionality for automation pipelines. Pipeline developers can use the API to migrate QuickSight resources from one environment to another. 
The API calls that facilitate handling QuickSight datasets enable inspection of the JSON representation of the dataset definition.</p>\n<p>This post presents an example of how a QuickSight administrator can automate data resource management and security validation through use of the QuickSight API and <a href=\"http://aws.amazon.com/cloudformation\" target=\"_blank\">AWS CloudFormation</a>.</p>\n<h4><a id=\"Solution_overview_15\"></a><strong>Solution overview</strong></h4>\n<p>The model implements security rules that rely on naming conventions for tables and columns as an integral part of the design. Instead of relying on naming conventions, you may want to use a lookup table or similar approach to store the relationships between data tables and security tables.</p>\n<p>We guide you through the following steps:</p>\n<ol>\n<li>Create relational database tables to be secured.</li>\n<li>Create a QuickSight dataset in your dev account.</li>\n<li>Generate a CloudFormation template using a Python script that allows you to enforce row-level and column-level security in each environment. You can customize this script to the needs of your organization.</li>\n<li>Use the generated CloudFormation template to deploy through dev, test, and prod using your change management process.</li>\n</ol>\n<p>You can use AWS CloudFormation to manage several types of QuickSight resources, but dataset resources are a critical junction for security, so they are our focus in this post.</p>\n<p>To implement data security rules in a large organization, controls must be in place to agree upon and implement the rules from a process perspective. This post dives deep into using code to validate security aspects of your QuickSight deployment, but data security requires more than code. 
The approaches put forward are intended as a part of a larger change management process, much of which is based around human review and approval.</p>\n<p>In addition to having a change management process in place, we suggest managing your AWS resources using a CI/CD pipeline. The nature of change management and CI/CD processes can vary greatly, and are outside the scope of this post.</p>\n<h4><a id=\"Prerequisites_30\"></a><strong>Prerequisites</strong></h4>\n<p>This post assumes a basic command of the following:</p>\n<ul>\n<li>Familiarity with the QuickSight UI (<a href=\"https://docs.aws.amazon.com/quicksight/latest/user/creating-data-sets.html\" target=\"_blank\">creating datasets</a>, dashboards, and so on)</li>\n<li>Python programming (Python 3 preferred)</li>\n<li>Using the <a href=\"https://boto3.amazonaws.com/v1/documentation/api/latest/guide/sqs.html\" target=\"_blank\">Boto3</a> library to interact with AWS</li>\n<li><a href=\"https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-create-stack.html\" target=\"_blank\">Creating stacks with AWS CloudFormation</a></li>\n<li>SQL</li>\n<li>A <a href=\"https://aws.amazon.com/getting-started/hands-on/create-connect-postgresql-db/\" target=\"_blank\">Postgres database</a> running in AWS</li>\n</ul>\n<p>We don’t go into the broader picture of integrating into a full CI/CD process, so an understanding of CI/CD is helpful, but not required.</p>\n<h4><a id=\"Security_rules_for_your_organization_41\"></a><strong>Security rules for your organization</strong></h4>\n<p>Before we can write a script to confirm security rules have been applied correctly, we need to know what the security rules actually are. This means we need to determine the following:</p>\n<ul>\n<li><strong>What</strong> – What is the data we are trying to secure? Which fields in the database are sensitive? 
Which field values will be used to filter access?</li>\n<li><strong>Who</strong> – Who are the users and groups that should be provided access to the data and fields we have identified?</li>\n</ul>\n<p>In concrete terms, we need to match identities (users and groups) to actual data values (used in row-level security) and sensitive fields (for column-level security). Identities such as users and groups typically correlate to entities in external systems such as Active Directory, but you can use native QuickSight users and groups.</p>\n<p>For this post, we define the following rules that indicate the relationship between database objects (tables and fields) and how they should be secured. Keep in mind that these example rules may not apply to every organization. Security should be developed to match your requirements and processes.</p>\n<ul>\n<li>Any field name with <code>_sensitive</code> appended to it is identified as containing sensitive data. For example, a column named <code>salary_usd_sensitive</code> should be <code>restricted</code>. For our scenario, we say that the user should be a member of the QuickSight restricted group in order to access sensitive fields. No other groups are allowed access to these fields.</li>\n<li>For a given table, a companion table with <code>_rls</code> appended to the name contains the row-level security rules used to secure the table. In this model, the row-level security rules for the <code>employees</code> table are found in the <code>employees_rls</code> table.</li>\n<li>Row-level security rules must be sourced 100% from the underlying data store. This means that you can’t upload rules via the QuickSight console, or use custom SQL in QuickSight to create the rules. 
Rules can be provided as views (if supported by the underlying data store) as long as the view definition is managed using a change management process.</li>\n<li>The dataset name should match the name of the underlying database table.</li>\n</ul>\n<p>These rules rely on a well-structured change management process for the database. If users and developers have access to change database objects in production, the rules won’t carry much weight. For examples of automated schema management using open-source CI/CD tooling, refer to <a href=\"https://aws.amazon.com/blogs/opensource/rds-code-change-deployment/\" target=\"_blank\">Deploy, track, and roll back RDS database code changes using open source tools Liquibase and Jenkins</a> and <a href=\"https://aws.amazon.com/blogs/apn/how-to-integrate-amazon-rds-schema-changes-into-ci-cd-pipelines-with-gitlab-and-liquibase/\" target=\"_blank\">How to Integrate Amazon RDS Schema Changes into CI/CD Pipelines with GitLab and Liquibase</a>.</p>\n<p>From the QuickSight perspective, our database becomes the source of the “what” and “who” we discussed earlier. 
QuickSight doesn’t own the security rules, it merely implements the rules as defined in the database.</p>\n<h5><a id=\"Security_rule_management_with_database_objects_61\"></a><strong>Security rule management with database objects</strong></h5>\n<p>For this post, we source data from a Postgres database using a read-only user created for QuickSight.</p>\n<p>First, we create our schema and a data table with a few rows inserted:</p>\n<pre><code class=\"lang-\">create schema if not exists ledger;\n\n--the table we are securing\ndrop table if exists ledger.transactions;\ncreate table if not exists ledger.transactions (\n txn_id integer,\n txn_type varchar(100),\n txn_desc varchar(100),\n txn_amt float,\n department varchar(100),\n discount_sensitive float\n);\n\ninsert into ledger.transactions (\n txn_id,\n txn_type,\n txn_desc,\n txn_amt,\n department,\n discount_sensitive\n) \nvalues\n(1, 'expense', 'commission', -1000.00, 'field sales', 0.0),\n(2, 'revenue', 'widgets', 15000.00, 'field sales', 1000.00),\n(3, 'revenue', 'interest', 1000.00, 'corporate', 0.0),\n(4, 'expense', 'taxes', -1234.00, 'corporate', 0.0),\n(5, 'revenue', 'doodads', 1000.00, 'field sales', 100.0)\n;\n</code></pre>\n<p>Note the field <code>discount_sensitive</code>. In our security model, any field name with <code>_sensitive</code> appended to it is identified as containing sensitive data. This information is used later when we implement column-level security. In our example, we have the luxury of using naming conventions to tag the sensitive fields, but that isn’t always possible. Other options could involve the use of SQL comments, or creating a table that provides a lookup for sensitive fields. 
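The suffix convention is easy to express in code. A minimal sketch of the check (the helper name and sample column names are ours, not part of the post's script):

```python
# Flag any column whose name ends with the "_sensitive" suffix.
# This mirrors the naming-convention rule described above.
def sensitive_columns(column_names):
    return [name for name in column_names if name.endswith("_sensitive")]
```

A lookup-table or SQL-comment approach would replace the `endswith` test with a query against the metadata source.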
Which method you choose depends upon your data and requirements, and should be supported by a change management process.</p>\n<h5><a id=\"Rowlevel_security_table_99\"></a><strong>Row-level security table</strong></h5>\n<p>The following SQL creates a table containing the row-level security rules for the <code>ledger.transactions</code> table, then inserts rules that match the example discussed earlier:</p>\n<pre><code class=\"lang-\">drop table if exists ledger.transactions_rls;\ncreate table ledger.transactions_rls (\n groupname varchar(100),\n department varchar(1000)\n);\n\n\ninsert into ledger.transactions_rls (groupname, department) \nvalues\n('restricted', null), --null indicates all values\n('anybody', 'field sales');\n</code></pre>\n<p>For more information about how to restrict access to a dataset using row-level security, refer to <a href=\"https://docs.aws.amazon.com/quicksight/latest/user/restrict-access-to-a-data-set-using-row-level-security.html\" target=\"_blank\">Using row-level security (RLS) with user-based rules to restrict access to a dataset</a>.</p>\n<p>These rules match the specified QuickSight user groups to values in the <code>department</code> field of the <code>transactions</code> table.</p>\n<p>Our last step in Postgres is to create a user that has read-only access to our tables. All end-user or SPICE refresh queries from QuickSight are run using this user. 
See the following code:</p>\n<pre><code class=\"lang-\">drop role if exists qs_user;\ncreate role qs_user login password 'GETABETTERPASSWORD';\ngrant connect on database quicksight to qs_user;\ngrant usage on schema ledger to qs_user;\ngrant select on ledger.transactions to qs_user;\ngrant select on ledger.transactions_rls to qs_user;\n</code></pre>\n<h4><a id=\"Create_user_groups_130\"></a><strong>Create user groups</strong></h4>\n<p>Our security model provides permissions based on <a href=\"https://learnquicksight.workshop.aws/en/admin-workshop/4.user-groups.html\" target=\"_blank\">group membership</a>. Although QuickSight allows for these groups to be sourced from external systems such as Active Directory, our example uses native QuickSight groups.</p>\n<p>We create our groups using the following <a href=\"http://aws.amazon.com/cli\" target=\"_blank\">AWS Command Line Interface</a> (AWS CLI) commands. Take note of the restricted group we’re creating; this is the group we use to grant access to sensitive data columns.</p>\n<pre><code class=\"lang-\">aws quicksight create-group \\\n--aws-account-id YOUR_AWS_ACCOUNT_ID_HERE \\\n--namespace default \\\n--group-name restricted\n\naws quicksight create-group \\\n--aws-account-id YOUR_AWS_ACCOUNT_ID_HERE \\\n--namespace default \\\n--group-name anybody\n</code></pre>\n<p>You can also add a user to your group with the following code:</p>\n<pre><code class=\"lang-\">aws quicksight create-group-membership \\\n--aws-account-id YOUR_AWS_ACCOUNT_ID_HERE \\\n--namespace default \\\n--group-name anybody \\\n--member-name quicksightuser@withanemail.com\n</code></pre>\n<h4><a id=\"The_Python_script_154\"></a><strong>The Python script</strong></h4>\n<p>Now that we have set up our database and groups, we switch focus to the <a href=\"https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/BDB-1687/QuickSightDataSetToCF.py\" target=\"_blank\">Python script</a> used for the following actions:</p>\n<ul>\n<li>Extracting the 
definition of a manually created dataset using the QuickSight API</li>\n<li>Ensuring that the dataset definition meets security standards</li>\n<li>Restructuring the dataset definition into the format of a CloudFormation template</li>\n<li>Writing the CloudFormation template to a JSON file</li>\n</ul>\n<p>In the header of the script, you can see the following variables, which you should set to values in your own AWS environment:</p>\n<pre><code class=\"lang-\"># Parameters for the source data set\nregion_name = 'AWS_REGION_NAME'\naws_account_id = &quot;AWS_ACCOUNT_ID&quot;\nsource_data_set_id = &quot;ID_FOR_THE_SOURCE_DATA_SET&quot;\n\n# Parameters are used when creating the cloudformation template\ntarget_data_set_name = &quot;DATA_SET_DISPLAY_NAME&quot;\ntarget_data_set_id = &quot;NEW_DATA_SET_ID&quot;\n</code></pre>\n<p>QuickSight datasets have a name and an ID. The name is displayed in the QuickSight UI, and the ID is used to reference the dataset behind the scenes. The ID must be unique for a given account and Region, which is why QuickSight uses UUIDs by default, but you can use any unique string.</p>\n<h4><a id=\"Create_the_datasets_176\"></a><strong>Create the datasets</strong></h4>\n<p>You can use the QuickSight GUI or public API to create a dataset for the <code>transactions_rls</code> and <code>transactions</code> tables. For instructions, refer to <a href=\"https://docs.aws.amazon.com/quicksight/latest/user/create-a-database-data-set.html\" target=\"_blank\">Creating a dataset from a database</a>. Connect to the database, create the datasets, then apply <code>transactions_rls</code> as the row-level security for the <code>transactions</code> dataset. 
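You can also confirm programmatically that row-level security ended up attached to the transactions dataset. This sketch inspects the DataSetSummaries structure returned by the list_data_sets API; the helper itself is illustrative and not part of the downloaded script:

```python
# Given DataSetSummaries from the QuickSight list_data_sets API,
# report whether the named dataset has an RLS dataset attached.
def rls_attached(summaries, name):
    for summary in summaries:
        if summary.get("Name") == name:
            return "RowLevelPermissionDataSet" in summary
    raise Exception(f"data set '{name}' not found")
```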
You can use the following <code>list-data-sets</code> AWS CLI call to verify that your tables were created successfully:</p>\n<pre><code class=\"lang-\">$ aws quicksight list-data-sets --aws-account-id YOURACCOUNT \n{\n &quot;DataSetSummaries&quot;: [\n {\n &quot;Arn&quot;: &quot;arn:aws:quicksight:us-west-2:YOURACCOUNT:dataset/&lt;ID&gt;&quot;,\n &quot;DataSetId&quot;: &quot;&lt;ID&gt;&quot;,\n &quot;Name&quot;: &quot;transactions&quot;,\n &quot;CreatedTime&quot;: &quot;2021-09-15T15:41:56.716000-07:00&quot;,\n &quot;LastUpdatedTime&quot;: &quot;2021-09-15T16:38:03.658000-07:00&quot;,\n &quot;ImportMode&quot;: &quot;SPICE&quot;,\n &quot;RowLevelPermissionDataSet&quot;: {\n &quot;Namespace&quot;: &quot;default&quot;,\n &quot;Arn&quot;: &quot;arn:aws:quicksight:us-west-2: YOURACCOUNT:dataset/&lt;RLS_ID&gt;&quot;,\n &quot;PermissionPolicy&quot;: &quot;GRANT_ACCESS&quot;,\n &quot;FormatVersion&quot;: &quot;VERSION_1&quot;,\n &quot;Status&quot;: &quot;ENABLED&quot;\n },\n &quot;RowLevelPermissionTagConfigurationApplied&quot;: false,\n &quot;ColumnLevelPermissionRulesApplied&quot;: true\n },\n {\n &quot;Arn&quot;: &quot;arn:aws:quicksight:us-west-2: YOURACCOUNT:dataset/&lt;RLS_ID&gt;&quot;,\n &quot;DataSetId&quot;: &quot;&lt;RLS_ID&gt;&quot;,\n &quot;Name&quot;: &quot;transactions_rls&quot;,\n &quot;CreatedTime&quot;: &quot;2021-09-15T15:42:37.313000-07:00&quot;,\n &quot;LastUpdatedTime&quot;: &quot;2021-09-15T15:42:37.520000-07:00&quot;,\n &quot;ImportMode&quot;: &quot;SPICE&quot;,\n &quot;RowLevelPermissionTagConfigurationApplied&quot;: false,\n &quot;ColumnLevelPermissionRulesApplied&quot;: false\n }\n ]\n}\n</code></pre>\n<h4><a id=\"Script_overview_214\"></a><strong>Script overview</strong></h4>\n<p>Our script is based around the <code>describe_data_set</code> method of the Boto3 QuickSight client. This method returns a Python dictionary containing all the attributes associated with a dataset resource. 
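Much of that dictionary can be carried over directly when building a template. As a toy illustration of the reshaping involved (the field list and helper are ours; the actual script does considerably more):

```python
# Read-only metadata returned by describe_data_set that has no place
# in a CloudFormation resource definition.
READ_ONLY = {"Arn", "CreatedTime", "LastUpdatedTime",
             "OutputColumns", "ConsumedSpiceCapacityInBytes"}

def to_cf_properties(data_set, data_set_id):
    # Keep populated, writable fields, then add the two values the
    # API response does not supply.
    props = {k: v for k, v in data_set.items() if k not in READ_ONLY and v}
    props["DataSetId"] = data_set_id
    props["AwsAccountId"] = {"Ref": "AWS::AccountId"}
    return props
```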
Our script analyzes these dictionaries, then coerces them into the structure required for dataset creation using AWS CloudFormation. The structures of the <code>describe_data_set</code> response and the <code>AWS::QuickSight::DataSet</code> CloudFormation resource are very similar, but not quite identical.</p>\n<p>The following are the top-level fields in the response for the Boto3 QuickSight client <code>describe_data_set</code> method:</p>\n<pre><code class=\"lang-\">{\n 'DataSet': {\n 'Arn': 'string',\n 'DataSetId': 'string',\n 'Name': 'string',\n 'CreatedTime': datetime(2015, 1, 1),\n 'LastUpdatedTime': datetime(2015, 1, 1),\n 'PhysicalTableMap': {},\n 'LogicalTableMap': {...},\n 'OutputColumns': [...],\n 'ImportMode': 'SPICE'|'DIRECT_QUERY',\n 'ConsumedSpiceCapacityInBytes': 123,\n 'ColumnGroups': [...],\n 'FieldFolders': {...},\n 'RowLevelPermissionDataSet': {...},\n 'ColumnLevelPermissionRules': [...]\n },\n 'RequestId': 'string',\n 'Status': 123\n}\n</code></pre>\n<p>Our script converts the response from the API to the structure required for creating a dataset using AWS CloudFormation.</p>\n<p>The following are the top-level fields in the <code>AWS::QuickSight::DataSet</code> CloudFormation resource:</p>\n<pre><code class=\"lang-\">{\n &quot;Type&quot; : &quot;AWS::QuickSight::DataSet&quot;,\n &quot;Properties&quot; : {\n &quot;AwsAccountId&quot; : String,\n &quot;ColumnGroups&quot; : [ ColumnGroup, ... ],\n &quot;ColumnLevelPermissionRules&quot; : [ ColumnLevelPermissionRule, ... ],\n &quot;DataSetId&quot; : String,\n &quot;FieldFolders&quot; : {Key : Value, ...},\n &quot;ImportMode&quot; : String,\n &quot;IngestionWaitPolicy&quot; : IngestionWaitPolicy,\n &quot;LogicalTableMap&quot; : {Key : Value, ...},\n &quot;Name&quot; : String,\n &quot;Permissions&quot; : [ ResourcePermission, ... ],\n &quot;PhysicalTableMap&quot; : {Key : Value, ...},\n &quot;RowLevelPermissionDataSet&quot; : RowLevelPermissionDataSet,\n &quot;Tags&quot; : [ Tag, ... 
]\n }\n}\n</code></pre>\n<p>The key differences between both JSON structures are as follows:</p>\n<ul>\n<li><code>describe_data_set</code> contains <code>Arn</code>, <code>CreatedTime</code>, and <code>LastUpdatedTime</code>, which are useful fields but only relevant to an existing resource</li>\n<li>AWS CloudFormation requires <code>AwsAccountId</code> when creating the resource</li>\n<li>AWS CloudFormation accepts tags for the dataset, but <code>describe_data_set</code> doesn’t provide them</li>\n<li>The AWS CloudFormation <code>Permissions</code> property allows for assigning AWS Identity and <a href=\"http://aws.amazon.com/iam\" target=\"_blank\">Access Management</a> (IAM) permissions at the time of creation</li>\n</ul>\n<p>Our script is able to selectively choose the top-level properties we want from the <code>describe_data_set</code> response, then add the fields that AWS CloudFormation requires for resource creation.</p>\n<h4><a id=\"Validate_security_273\"></a><strong>Validate security</strong></h4>\n<p>Before the script creates the CloudFormation template, it performs validations to ensure that our dataset conforms to the defined security rules.</p>\n<p>The following is the snippet from our script that performs validation for row-level security:</p>\n<pre><code class=\"lang-\">if 'RowLevelPermissionDataSet' in describe_response['DataSet']:\n if describe_response['DataSet']['RowLevelPermissionDataSet'] is None:\n raise Exception(&quot;row level permissions must be applied!&quot;)\n else:\n # now we look up the rls data set so that we can confirm that it conforms to our rules\n rls_dataset_id = describe_response['DataSet']['RowLevelPermissionDataSet']['Arn'].split('/')[-1]\n rls_response = client.describe_data_set(\n AwsAccountId = aws_account_id,\n DataSetId = rls_dataset_id\n )\n \n rls_table_map = rls_response['DataSet']['PhysicalTableMap']\n\n # rls table must not be custom SQL\n if 'CustomSql' in rls_table_map[list(rls_table_map.keys())[0]]:\n raise 
Exception(&quot;RLS data set can not contain custom SQL!&quot;)\n\n # confirm that the database table name is what we expect it to be \n if rls_response['DataSet']['Name'] != describe_response['DataSet']['Name'] + '_rls':\n raise Exception(&quot;RLS data set name must match pattern tablename_rls!&quot;)\n</code></pre>\n<p>The steps in the code are as follows:</p>\n<ol>\n<li>Ensure that any row-level security is applied (this is the bare minimum).</li>\n<li>Look up the dataset that contains the row-level security rules using another Boto3 call.</li>\n<li>Confirm that the row-level security dataset is not custom SQL.</li>\n<li>Confirm that the name of the table is as expected, with <code>_rls</code> appended to the name of the table being secured.</li>\n</ol>\n<p>The use of custom SQL for sourcing row-level security rules isn’t secure in our case, because a QuickSight developer could use SQL to alter the underlying rules. Because of this, our model requires that a physical table from the dataset is used as the row-level security rule source. Of course, it’s possible to use a view in the database to provide the rules. A view is okay because the definition (in our scenario) is governed by a change management process, as opposed to the custom SQL, which the QuickSight developer can create.</p>\n<p>The rules being implemented for your specific organization will be different. You may need to connect to a database directly from your Python script in order to validate the dataset was created in a secure manner. Regardless of your actual rules, the <code>describe_data_set</code> API method provides you the details you need to begin validation of the dataset.</p>\n<h4><a id=\"Columnlevel_security_310\"></a><strong>Column-level security</strong></h4>\n<p>Our model for column-level security indicates that any database field name that ends in _sensitive should only be accessible to members of a QuickSight group named restricted. 
Instead of validating that the dataset has the column-level security rules applied correctly, we simply enforce the rules directly in two steps:</p>\n<ol>\n<li>Identify the sensitive fields.</li>\n<li>Create a dictionary and add it to our dataset with the key <code>ColumnLevelPermissionRules</code>.</li>\n</ol>\n<p>To identify the sensitive fields, we create a list and iterate through the input columns of the physical table:</p>\n<pre><code class=\"lang-\">sensitive_fields = []\ninput_columns = physical_table_map[list(physical_table_map.keys())[0]][&quot;RelationalTable&quot;][&quot;InputColumns&quot;]\nfor input_column in input_columns:\n field_name = input_column['Name']\n if field_name[-10:len(field_name)] == '_sensitive':\n sensitive_fields.append(field_name)\n</code></pre>\n<p>The result is a list of sensitive fields. We can then take this list and integrate it into the dataset through the use of a dictionary:</p>\n<pre><code class=\"lang-\">if len(sensitive_fields) &gt; 0:\n data_set[&quot;ColumnLevelPermissionRules&quot;] = [\n {\n &quot;Principals&quot;: [\n {&quot;Ref&quot;: &quot;RestrictedUserGroupArn&quot;}\n ],\n &quot;ColumnNames&quot;: sensitive_fields\n }\n ]\n</code></pre>\n<p>Instead of specifying a specific principal, we reference the CloudFormation template parameter <code>RestrictedUserGroupArn</code>. The ARN for the restricted group is likely to vary, especially if you’re deploying to another AWS account. 
Using a template parameter allows us to specify the ARN at the time of dataset creation in the new environment.</p>\n<h4><a id=\"Access_to_the_dataset_QuickSight_resources_343\"></a><strong>Access to the dataset QuickSight resources</strong></h4>\n<p>The <code>Permissions</code> structure is added to the definition for each dataset:</p>\n<pre><code class=\"lang-\">&quot;Permissions&quot;: [\n {\n &quot;Principal&quot;: {\n &quot;Ref&quot;: &quot;QuickSightAdminPrincipal&quot;\n },\n &quot;Actions&quot;: [\n &quot;quicksight:DescribeDataSet&quot;,\n &quot;quicksight:DescribeDataSetPermissions&quot;,\n &quot;quicksight:PassDataSet&quot;,\n &quot;quicksight:DescribeIngestion&quot;,\n &quot;quicksight:ListIngestions&quot;,\n &quot;quicksight:UpdateDataSet&quot;,\n &quot;quicksight:DeleteDataSet&quot;,\n &quot;quicksight:CreateIngestion&quot;,\n &quot;quicksight:CancelIngestion&quot;,\n &quot;quicksight:UpdateDataSetPermissions&quot;\n ]\n }\n]\n</code></pre>\n<p>A value for the <code>QuickSightAdminPrincipal</code> CloudFormation template parameter is provided at the time of stack creation. The preceding structure provides the principal access to manage the QuickSight dataset resource itself. Note that this is not the same as data access (though an admin user could manually remove the row-level security rules). Row-level and column-level security rules indicate whether a given user has access to specific data, whereas these permissions allow for actions on the definition of the dataset, such as the following:</p>\n<ul>\n<li>Updating or deleting the dataset</li>\n<li>Changing the security permissions</li>\n<li>Initiating and monitoring SPICE refreshes</li>\n</ul>\n<p>End-users don’t require this access in order to use a dashboard created from the dataset.</p>\n<h4><a id=\"Run_the_script_375\"></a><strong>Run the script</strong></h4>\n<p>Our script requires you to specify the dataset ID, which is not the same as the dataset name. 
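If you prefer to resolve the ID in code rather than reading CLI output, a small helper over the list_data_sets summaries works. It assumes display names are unique in your account and Region, which QuickSight itself does not guarantee:

```python
# Resolve a dataset's ID from its display name. Uniqueness of the
# display name is an assumption of this sketch, not a QuickSight rule.
def data_set_id_for_name(summaries, name):
    matches = [s["DataSetId"] for s in summaries if s.get("Name") == name]
    if len(matches) != 1:
        raise Exception(f"expected exactly one data set named '{name}', found {len(matches)}")
    return matches[0]
```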
To determine the ID, use the AWS CLI <code>list-data-sets</code> command.</p>\n<p>To set the script parameters, you can edit the following lines to match your environment:</p>\n<pre><code class=\"lang-\"># parameters for the source data set\nregion_name = 'us-west-2'\naws_account_id = &quot;&lt;YOUR_ACCOUNT_ID&gt;&quot;\nsource_data_set_id = &quot;&lt;SOURCE_DATA_SET_ID&gt;&quot;\n\n# parameters for the target data set\ntarget_data_set_name = &quot;DATA_SET_PRESENTATION_NAME&quot;\ntarget_data_set_id = &quot;NEW_DATA_SET_ID&quot;\n</code></pre>\n<p>The following snippet runs the Python script:</p>\n<pre><code class=\"lang-\">quicksight_security % python3 data_set_to_cf.py \nrow level security validated!\nthe following sensitive fields were found: ['discount_sensitive']\ncloudformation template written to dataset.json\ncli-input-json file written to params.json\n</code></pre>\n<h4><a id=\"CloudFormation_template_398\"></a><strong>CloudFormation template</strong></h4>\n<p>Now that the security rules have been validated, our script can generate the CloudFormation template. The <code>describe_response_to_cf_data_set</code> method accepts a <code>describe_data_set</code> response as input (along with a few other parameters) and returns a dictionary that reflects the structure of an <code>AWS::QuickSight::DataSet</code> CloudFormation resource. Our code uses this method once for the primary dataset, and again for the <code>_rls</code> rules. This method selects values from the response, prunes some unnecessary items (such as empty tag lists), and replaces a few values with CloudFormation references. 
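The substitution step is mechanical. Here is a sketch of the data source case, using the same key names as the generated template (the function is ours, not the script's):

```python
# Replace the environment-specific data source ARN captured from dev
# with a CloudFormation parameter reference, so the same template can
# deploy against the test and prod data sources.
def parameterize_data_source(physical_table_map):
    out = {}
    for table_id, table in physical_table_map.items():
        table = dict(table)                               # shallow copies keep
        relational = dict(table["RelationalTable"])       # the input untouched
        relational["DataSourceArn"] = {"Ref": "DataSourceArn"}
        table["RelationalTable"] = relational
        out[table_id] = table
    return out
```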
These references allow us to provide parameter values to the template, such as QuickSight principals and the data source ARN.</p>\n<p>You can view the template using the <code>cat</code> command:</p>\n<pre><code class=\"lang-\">quicksight_security % cat dataset.json \n{\n &quot;AWSTemplateFormatVersion&quot;: &quot;2010-09-09&quot;,\n &quot;Description&quot;: &quot;Creates a QuickSight Data Set&quot;,\n &quot;Parameters&quot;: {\n &quot;DataSourceArn&quot;: {\n &quot;Type&quot;: &quot;String&quot;,\n &quot;Description&quot;: &quot;ARN for Postgres data source resource&quot;\n },\n &quot;QuickSightAdminPrincipal&quot;: {\n &quot;Type&quot;: &quot;String&quot;,\n &quot;Description&quot;: &quot;ARN for a QuickSight principal who will be granted API access to the datasets&quot;\n },\n &quot;RestrictedUserGroupArn&quot;: {\n &quot;Type&quot;: &quot;String&quot;,\n &quot;Description&quot;: &quot;ARN for a QuickSight principal who will be granted access to sensitive fields&quot;\n }\n },\n &quot;Resources&quot;: {\n &quot;NewDataSet&quot;: {\n &quot;Type&quot;: &quot;AWS::QuickSight::DataSet&quot;,\n &quot;Properties&quot;: {\n &quot;DataSetId&quot;: &quot;NEW_DATA_SET_ID&quot;,\n &quot;Name&quot;: &quot;DATA_SET_PRESENTATION_NAME&quot;,\n &quot;AwsAccountId&quot;: {\n &quot;Ref&quot;: &quot;AWS::AccountId&quot;\n },\n &quot;Permissions&quot;: [\n {\n &quot;Principal&quot;: {\n &quot;Ref&quot;: &quot;QuickSightAdminPrincipal&quot;\n },\n &quot;Actions&quot;: [\n &quot;quicksight:DescribeDataSet&quot;,\n &quot;quicksight:DescribeDataSetPermissions&quot;,\n &quot;quicksight:PassDataSet&quot;,\n &quot;quicksight:DescribeIngestion&quot;,\n &quot;quicksight:ListIngestions&quot;,\n &quot;quicksight:UpdateDataSet&quot;,\n &quot;quicksight:DeleteDataSet&quot;,\n &quot;quicksight:CreateIngestion&quot;,\n &quot;quicksight:CancelIngestion&quot;,\n &quot;quicksight:UpdateDataSetPermissions&quot;\n ]\n }\n ],\n &quot;FieldFolders&quot;: {},\n &quot;ImportMode&quot;: 
&quot;DIRECT_QUERY&quot;,\n &quot;LogicalTableMap&quot;: {\n &quot;e2305db4-2c79-4ac4-aff5-224b8c809767&quot;: {\n &quot;Alias&quot;: &quot;transactions&quot;,\n &quot;DataTransforms&quot;: [\n {\n &quot;ProjectOperation&quot;: {\n &quot;ProjectedColumns&quot;: [\n &quot;txn_id&quot;,\n &quot;txn_type&quot;,\n &quot;txn_desc&quot;,\n &quot;txn_amt&quot;,\n &quot;department&quot;,\n &quot;discount_sensitive&quot;\n ]\n }\n }\n ],\n &quot;Source&quot;: {\n &quot;PhysicalTableId&quot;: &quot;someguid-2c79-4ac4-aff5-224b8c809767&quot;\n }\n }\n },\n &quot;PhysicalTableMap&quot;: {\n &quot;e2305db4-2c79-4ac4-aff5-224b8c809767&quot;: {\n &quot;RelationalTable&quot;: {\n &quot;DataSourceArn&quot;: {\n &quot;Ref&quot;: &quot;DataSourceArn&quot;\n },\n &quot;Schema&quot;: &quot;ledger&quot;,\n &quot;Name&quot;: &quot;transactions&quot;,\n &quot;InputColumns&quot;: [\n {\n &quot;Name&quot;: &quot;txn_id&quot;,\n &quot;Type&quot;: &quot;INTEGER&quot;\n },\n {\n &quot;Name&quot;: &quot;txn_type&quot;,\n &quot;Type&quot;: &quot;STRING&quot;\n },\n {\n &quot;Name&quot;: &quot;txn_desc&quot;,\n &quot;Type&quot;: &quot;STRING&quot;\n },\n {\n &quot;Name&quot;: &quot;txn_amt&quot;,\n &quot;Type&quot;: &quot;DECIMAL&quot;\n },\n {\n &quot;Name&quot;: &quot;department&quot;,\n &quot;Type&quot;: &quot;STRING&quot;\n },\n {\n &quot;Name&quot;: &quot;discount_sensitive&quot;,\n &quot;Type&quot;: &quot;DECIMAL&quot;\n }\n ]\n }\n }\n },\n &quot;RowLevelPermissionDataSet&quot;: {\n &quot;Namespace&quot;: &quot;default&quot;,\n &quot;Arn&quot;: {\n &quot;Fn::GetAtt&quot;: [\n &quot;NewDataSetRLS&quot;,\n &quot;Arn&quot;\n ]\n },\n &quot;PermissionPolicy&quot;: &quot;GRANT_ACCESS&quot;,\n &quot;FormatVersion&quot;: &quot;VERSION_1&quot;\n },\n &quot;ColumnLevelPermissionRules&quot;: [\n {\n &quot;Principals&quot;: [\n {\n &quot;Ref&quot;: &quot;RestrictedUserGroupArn&quot;\n }\n ],\n &quot;ColumnNames&quot;: [\n &quot;discount_sensitive&quot;\n ]\n }\n ]\n }\n },\n 
&quot;NewDataSetRLS&quot;: {\n &quot;Type&quot;: &quot;AWS::QuickSight::DataSet&quot;,\n &quot;Properties&quot;: {\n &quot;DataSetId&quot;: &quot;NEW_DATA_SET_ID_rls&quot;,\n &quot;Name&quot;: &quot;DATA_SET_PRESENTATION_NAME_rls&quot;,\n &quot;AwsAccountId&quot;: {\n &quot;Ref&quot;: &quot;AWS::AccountId&quot;\n },\n &quot;Permissions&quot;: [\n {\n &quot;Principal&quot;: {\n &quot;Ref&quot;: &quot;QuickSightAdminPrincipal&quot;\n },\n &quot;Actions&quot;: [\n &quot;quicksight:DescribeDataSet&quot;,\n &quot;quicksight:DescribeDataSetPermissions&quot;,\n &quot;quicksight:PassDataSet&quot;,\n &quot;quicksight:DescribeIngestion&quot;,\n &quot;quicksight:ListIngestions&quot;,\n &quot;quicksight:UpdateDataSet&quot;,\n &quot;quicksight:DeleteDataSet&quot;,\n &quot;quicksight:CreateIngestion&quot;,\n &quot;quicksight:CancelIngestion&quot;,\n &quot;quicksight:UpdateDataSetPermissions&quot;\n ]\n }\n ],\n &quot;FieldFolders&quot;: {},\n &quot;ImportMode&quot;: &quot;SPICE&quot;,\n &quot;LogicalTableMap&quot;: {\n &quot;someguid-51d7-43c4-9f8c-c60a286b0507&quot;: {\n &quot;Alias&quot;: &quot;transactions_rls&quot;,\n &quot;DataTransforms&quot;: [\n {\n &quot;ProjectOperation&quot;: {\n &quot;ProjectedColumns&quot;: [\n &quot;groupname&quot;,\n &quot;department&quot;\n ]\n }\n }\n ],\n &quot;Source&quot;: {\n &quot;PhysicalTableId&quot;: &quot;someguid-51d7-43c4-9f8c-c60a286b0507&quot;\n }\n }\n },\n &quot;PhysicalTableMap&quot;: {\n &quot;someguid-51d7-43c4-9f8c-c60a286b0507&quot;: {\n &quot;RelationalTable&quot;: {\n &quot;DataSourceArn&quot;: {\n &quot;Ref&quot;: &quot;DataSourceArn&quot;\n },\n &quot;Schema&quot;: &quot;ledger&quot;,\n &quot;Name&quot;: &quot;transactions_rls&quot;,\n &quot;InputColumns&quot;: [\n {\n &quot;Name&quot;: &quot;groupname&quot;,\n &quot;Type&quot;: &quot;STRING&quot;\n },\n {\n &quot;Name&quot;: &quot;department&quot;,\n &quot;Type&quot;: &quot;STRING&quot;\n }\n ]\n }\n }\n }\n }\n }\n }\n}\n</code></pre>\n<p>You can deploy this template 
directly into AWS via the CloudFormation console. You must provide the following parameters:</p>\n<ul>\n<li><strong>DataSourceArn</strong> – A QuickSight dataset references a table or other database object. For that object to be accessed, you must specify a QuickSight data source resource that facilitates the connection.</li>\n<li><strong>QuickSightAdminPrincipal</strong> – The ARN of the QuickSight principal (user or group) that is granted API access to the dataset resources. You can exclude these permissions from the script and template if your existing security policies automatically grant access to the appropriate users and groups.</li>\n<li><strong>RestrictedUserGroupArn</strong> – The ARN of the QuickSight group that is granted access to the sensitive columns.</li>\n</ul>\n<p>You can also deploy the template using the AWS CLI. Although it’s possible to pass in all the parameters directly via the command line, entering long values that way is clunky. To simplify this, our script generates a <code>params.json</code> file structured to capture all the parameters required by the template:</p>\n<pre><code class=\"lang-\">{\n &quot;Parameters&quot;: [\n {\n &quot;ParameterKey&quot;: &quot;DataSourceArn&quot;,\n &quot;ParameterValue&quot;: &quot;YOUR_DATA_SOURCE_ARN_HERE&quot;\n },\n {\n &quot;ParameterKey&quot;: &quot;QuickSightAdminPrincipal&quot;,\n &quot;ParameterValue&quot;: &quot;YOUR_ADMIN_GROUP_PRINCIPAL_HERE&quot;\n },\n {\n &quot;ParameterKey&quot;: &quot;RestrictedUserGroupArn&quot;,\n &quot;ParameterValue&quot;: &quot;YOUR_RESTRICTED_USER_GROUP_ARN_HERE&quot;\n }\n ]\n}\n</code></pre>\n<p>Use the following command to build the stack, with <code>params.json</code> as input:</p>\n<pre><code class=\"lang-\">aws cloudformation create-stack \\\n--stack-name SecuredDataSet \\\n--template-body file://dataset.json \\\n--cli-input-json file://params.json\n</code></pre>\n<p>You can use the AWS CloudFormation console to monitor the stack 
progress. When the creation is complete, you should see your new dataset in QuickSight!</p>\n<h4><a id=\"Conclusion_642\"></a><strong>Conclusion</strong></h4>\n<p>Though the functionality is relatively new, I consider the API and AWS CloudFormation capabilities to be one of QuickSight’s biggest strengths. Automated validation and enforcement of security rules allow you to scale while improving security. Being able to manage dataset definitions using AWS CloudFormation provides repeatability, and all of this sets you up for automation. The API and AWS CloudFormation provide tooling to customize QuickSight to suit your workflow, bringing BI into your organization’s cloud management strategy.</p>\n<p>If you are looking for related information about dashboard management and migration in QuickSight, refer to <a href=\"https://aws.amazon.com/blogs/big-data/migrate-amazon-quicksight-across-aws-accounts/\" target=\"_blank\">Migrate Amazon QuickSight across AWS accounts</a>.</p>\n<h5><a id=\"About_the_Author_647\"></a><strong>About the Author</strong></h5>\n<p><img src=\"https://dev-media.amazoncloud.cn/a915834b760944bb992feb789b4e20bc_image.png\" alt=\"image.png\" /></p>\n<p><strong>Jeremy Winters</strong> is an Architect in the AWS Data Lab, where he helps customers design and build data applications to meet their business needs. Prior to AWS, Jeremy built cloud and data applications for consulting customers across a variety of industries.</p>\n"}