Dynamically modify Amazon DMS endpoints for flexible intermittent data migrations to Amazon

{"value":"[AWS Database Migration Service](https://aws.amazon.com/dms/) (AWS DMS) is a cloud service that makes it easy to migrate relational databases, data warehouses, NoSQL databases, and other types of data stores. You can use AWS DMS to migrate your data to the AWS Cloud or between combinations of cloud and on-premises setups. With AWS DMS, you can perform one-time migrations, and you can replicate ongoing changes to keep sources and targets in sync.\n\nAt a high level, when using AWS DMS, you do the following:\n- Create a replication instance\n- Create source and target endpoints that have connection information about your data stores\n- Create one or more migration tasks to migrate data between the source and target data stores\n\nWhen configuring your endpoints, you may find that you need connection settings to be dynamic depending on contextual information like when the task is running. For example, for full load migration tasks loading data into [Amazon Simple Storage Service](https://aws.amazon.com/s3/) (Amazon S3), you may want the target path in your S3 bucket to reflect the load date so that you can keep a full history of data loads. This is not natively supported, but you can modify the target S3 endpoint with the appropriate target path before running the migration task to ensure the data is partitioned by date in the S3 bucket.\n\nIn this post, we walk you through a solution for configuring a workflow—run on a cron-like schedule—that modifies an AWS DMS endpoint and starts a full load migration task. The AWS DMS endpoint modification code in the example updates the S3 target endpoint’s ```BucketPath``` with the current date, but you can modify it to fit other use cases where dynamic modification of AWS DMS endpoints is needed. 
For example, you may also want the prefix to be updated before every full load to make cataloging your tables easier.\n\nAlthough AWS DMS doesn’t support date-based partitioning for full load migration tasks, it does support date-based partitioning for full load plus replicating data and replicating changes only tasks. For more information, refer to [Using Amazon S3 as a target for AWS Database Migration Service](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html).\n\n#### **Solution walkthrough**\nThe following diagram shows the high-level architecture of the solution. We use the following AWS services:\n\n- [AWS DMS](https://aws.amazon.com/dms/) – Contains the configuration for the migration components (such as endpoints and tasks) and compute environment\n- [Amazon EventBridge](https://aws.amazon.com/eventbridge/) – Triggers the workflow to run on a cron-like schedule\n- [AWS Lambda](https://aws.amazon.com/lambda/) – Provides the serverless compute environment for running AWS SDK calls\n- [AWS Step Functions](https://aws.amazon.com/step-functions) – Orchestrates the workflow steps\n\n![image.png](https://dev-media.amazoncloud.cn/4cb1ca60092f4f5592690cdcb737aa2d_image.png)\n\nA Step Functions state machine orchestrates the invocation of three Lambda functions: ```ModifyEndpoint```, ```DescribeConnections```, and ```StartReplicationTask```. The purpose of each Lambda function is as follows:\n\n- **ModifyEndpoint** – Runs the [DMS.ModifyEndpoint SDK](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dms.html#DatabaseMigrationService.Client.modify_endpoint) call. This function contains the logic for modifying the endpoint according to our requirements. 
In this example, we modify the target S3 endpoint’s BucketPath to contain the current date to achieve date partitioning in our target S3 bucket.\n- **DescribeConnections** – Runs the [DMS.DescribeConnections](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dms.html#DatabaseMigrationService.Client.describe_connections) SDK call. The DMS.ModifyEndpoint SDK call made in the ModifyEndpoint function triggers an automatic connection test on the modified endpoint, and a task that references that endpoint can’t be started until the connection test is successful. This function simply checks the target endpoint’s connection status.\n- **StartReplicationTask** – Runs the [DMS.StartReplicationTask](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dms.html#DatabaseMigrationService.Client.start_replication_task) SDK call. This function kicks off the data migration process by starting the AWS DMS task with the updated endpoint.\n\nThe state machine orchestrates the invocation of these functions and handles retries and waits when necessary. The following image shows a visual representation of the workflow.\n\n![image.png](https://dev-media.amazoncloud.cn/0b115b0d5d8840e3bb426a1dbfa997af_image.png)\n\nThe state machine includes a step to wait for 60 seconds after invoking the ```ModifyEndpoint``` Lambda function because it can take 30–40 seconds for the automatic connection test to succeed. 
You can lower this wait time, but doing so may result in additional invocations of the ```DescribeConnections``` function.\n\nDepending on the outcome of ```DescribeConnections```, the state machine takes different actions:\n- If the function returns a ```testing``` status, the state machine returns to the 60-second wait step and retries until a successful or failure status is returned\n- If the function returns a ```successful``` status, the state machine advances to the ```StartReplicationTask``` Lambda function invocation\n- If the function returns a failure status (```failed``` or ```deleting```), the state machine fails immediately and the ```StartReplicationTask``` function never gets invoked\n\nAfter the state machine has been deployed and AWS DMS has done multiple full loads of the database to Amazon S3, you’ll have an S3 path for each export with the current year, month, and day appended. The following screenshot shows that we completed two full loads of our database and the AWS DMS endpoint was modified for each respective day.\n\n![image.png](https://dev-media.amazoncloud.cn/18e3ddd655a3475f90360d7dcb65c7f0_image.png)\n\n#### **Prerequisites**\nBefore you get started, you should have the following in place:\n\n- An AWS account with access to create EventBridge rules, Step Functions state machines, Lambda functions, and [AWS Identity and Access Management](https://aws.amazon.com/iam/) (IAM) roles\n- A working full load [AWS DMS task](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.html) with an S3 target endpoint\n\n#### **Deploy the solution in your AWS account**\nYou can deploy the solution to your AWS account via [AWS CloudFormation](https://aws.amazon.com/cloudformation/). Complete the following steps to deploy the CloudFormation template:\n\n1. 
Choose **Launch Stack**: ![image.png](https://dev-media.amazoncloud.cn/095315bc6da8460c889fdf35889c9052_image.png)\n[![image.png](https://dev-media.amazoncloud.cn/97c5716203b04a54b1e218def9d2f538_image.png)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/DBBLOG-1911/cloudformation_template.yaml)\n2. For **Stack name**, enter a name, such as ```dms-endpoint-mod-stack```.\n3. For **DMSTask**, enter the ARN of the AWS DMS replication task to start after modifying the S3 target endpoint. You can find the ARN on the AWS DMS console. Choose **Database migration tasks** in the navigation pane and open the task to view its details.\n4. For **EventRuleScheduleExpression**, enter the schedule on which the AWS DMS task should start. The default is daily at 2:00 AM GMT, but you can modify it to meet your needs. See the [EventBridge documentation](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-create-rule-schedule.html) for the schedule expression syntax.\n5. For **S3Endpoint**, enter the ARN of the S3 target endpoint for dynamic modification. To locate the ARN, on the AWS DMS console, choose **Endpoints** and then open the target endpoint you want to modify. After filling out all of the fields, the **Specify stack details** screen should look something like this:\n\n![image.png](https://dev-media.amazoncloud.cn/fc4a7e509be444fab8572251c7b51440_image.png)\n\n6. Choose **Next**.\n7. Enter any tags you want to assign to the stack.\n8. Choose **Next**.\n9. Verify the parameters are correct and choose **Create stack**.\n\nThe stack should take 3–5 minutes to deploy.\n\nOnce the stack deployment is complete, the Step Functions state machine that orchestrates the workflow runs at the next scheduled interval, according to the supplied EventBridge schedule expression. 
Alternatively, you can run the state machine manually by navigating to the Step Functions console, choosing the state machine beginning with ```TaskWorkflowStateMachine```, and then choosing **Start execution**.\n\n#### **Cost of solution and cleanup**\nRefer to the pricing for [EventBridge](https://aws.amazon.com/eventbridge/pricing/), [Step Functions](https://aws.amazon.com/step-functions/pricing/), and [Lambda](https://aws.amazon.com/lambda/pricing/) to determine the cost of this solution for your own use cases. The [AWS Pricing Calculator](https://calculator.aws/#/) is also a useful way to estimate the cost of a solution.\n\nDeploying the CloudFormation template is free, but you’re charged for the resources that the template deploys. To avoid incurring extra cost, delete the [CloudFormation stack](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-delete-stack.html) you deployed when going through this post.\n\n#### **Summary**\nIn this post, we covered how to dynamically modify an AWS DMS endpoint for full load AWS DMS tasks using EventBridge, Step Functions, and Lambda. We also provided a CloudFormation template that you can modify and deploy for your own workloads and use cases, like date-based folder partitioning for AWS DMS full load tasks.\n\nFor more information about the services in this solution, refer to the [AWS DMS User Guide](https://docs.aws.amazon.com/dms/latest/userguide/Welcome.html) and [AWS Step Functions User Guide](https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html). Additionally, please leave a comment with any questions or feedback!\n\n##### **About the Authors**\n\n![image.png](https://dev-media.amazoncloud.cn/aadf2e870ec94b0196335619c3cd178c_image.png)\n\n**Jeff Gardner** is a Solutions Architect with Amazon Web Services (AWS). In his role, Jeff helps enterprise customers through their cloud journey, leveraging his experience with application architecture and DevOps practices. 
Outside of work, Jeff enjoys watching and playing sports and chasing around his three young children.\n\n![image.png](https://dev-media.amazoncloud.cn/9b851cf914ef4dd0bd9e9338d5d74cd9_image.png)\n\n**Michael Hamilton** is an Analytics Specialist Solutions Architect who enjoys working with customers to solve their complex needs when it comes to data on AWS. He enjoys spending time with his wife and kids outside of work and has recently taken up mountain biking!"}
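The ModifyEndpoint and DescribeConnections steps from the solution walkthrough can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code shipped in the CloudFormation template: the function names, the event keys (```EndpointArn```, ```BaseFolder```), the ```YYYY/MM/DD``` folder layout, and the step labels returned by ```next_step``` are all hypothetical. It also assumes the date is written to the ```BucketFolder``` key of boto3's ```S3Settings```, which may or may not be the setting the post refers to as ```BucketPath```.

```python
from datetime import date, datetime, timezone


def dated_bucket_folder(base_folder: str, load_date: date) -> str:
    """Append year/month/day to a base folder (layout is an assumption)."""
    parts = [base_folder.strip("/"), f"{load_date:%Y/%m/%d}"]
    # Skip an empty base folder so the result never starts with a slash.
    return "/".join(p for p in parts if p)


def next_step(connection_status: str) -> str:
    """Map a DMS connection test status to the workflow's next action."""
    if connection_status == "successful":
        return "StartReplicationTask"
    if connection_status == "testing":
        return "Wait"  # go back to the wait step and check again
    # 'failed', 'deleting', or anything unexpected stops the workflow.
    return "Fail"


def modify_endpoint_handler(event, context):
    """Hypothetical Lambda handler that points the S3 endpoint at today's folder."""
    import boto3  # imported lazily so the helpers above run without the AWS SDK

    folder = dated_bucket_folder(
        event["BaseFolder"], datetime.now(timezone.utc).date()
    )
    boto3.client("dms").modify_endpoint(
        EndpointArn=event["EndpointArn"],
        S3Settings={"BucketFolder": folder},
    )
    return {"BucketFolder": folder}
```

Keeping the path construction and the status routing as plain functions means they can be checked locally without AWS credentials; only the handler touches the DMS API. Whether other S3 settings need to be re-sent with the call depends on how the endpoint is configured and is not covered here.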