Chaos experiments on Amazon RDS using AWS Fault Injection Simulator

Performing controlled chaos experiments on your [Amazon Relational Database Service (RDS)](https://aws.amazon.com/rds/) database instances and validating the application behavior is essential to making sure that your application stack is resilient. How does the application behave when there is a database failover? Will the connection pooling solution or tools being used reconnect gracefully after a database failover succeeds? Will there be a cascading failure if the database node gets rebooted for a few seconds? These are some of the fundamental questions that you should consider when evaluating the resiliency of your database stack. Chaos engineering is a way to effectively answer these questions.

Traditionally, database failure conditions, such as a failover or a node reboot, are triggered using scripts or third-party tools. However, at scale, these external dependencies often become a bottleneck and are hard to maintain and manage. Scripts and third-party tools can fail when called, whereas a managed web service is highly available. They also tend to require elevated permissions, which adds management overhead and runs counter to a least-privilege access model. This is where [AWS Fault Injection Simulator (FIS)](https://aws.amazon.com/fis/) comes to the rescue.

AWS Fault Injection Simulator (AWS FIS) is a fully managed service for running fault injection experiments on AWS that makes it easier to improve an application’s performance, observability, and resiliency. Fault injection experiments are used in chaos engineering, which is the practice of stressing an application in testing or production environments by creating disruptive events, such as a sudden increase in CPU or memory consumption or a database failover, observing how the system responds, and implementing improvements.

The key phases of chaos engineering are identifying the steady state of the workload, defining a hypothesis, running the experiment, verifying the results, and making improvements based on those results. These phases ensure that you inject failures in a controlled environment through well-planned experiments, building confidence that your workloads and tools can withstand turbulent conditions.

![image.png](https://dev-media.amazoncloud.cn/ad9971d016d04a6ca437e0505c10ce1f_image.png)

**Example:**

- Baseline: we have a managed database with a replica and automatic failover enabled.
- Hypothesis: failure of a single database instance or replica may slow down a few requests but will not adversely affect our application.
- Run experiment: trigger a DB failover.
- Verify: confirm or refute the hypothesis by looking at KPIs for the application (e.g., via a CloudWatch metric or alarm).

##### **Methodology and Walkthrough**

Let’s look at how you can configure AWS FIS to inject failure conditions into your RDS database instances. For this walkthrough, we’ll inject a cluster failover for [Amazon Aurora PostgreSQL](https://aws.amazon.com/rds/aurora/features/). You can use an existing Aurora PostgreSQL cluster, or you can launch a new cluster by following the steps in the [Create an Aurora PostgreSQL DB Cluster](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/CHAP_GettingStartedAurora.CreatingConnecting.AuroraPostgreSQL.html) documentation.

**Step 1: Select the Aurora Cluster.**

The Aurora PostgreSQL cluster that we’ll use for this walkthrough is provisioned in us-east-1 (N. Virginia) and has two instances: one writer instance and one reader instance (Aurora replica). The cluster is named chaostest, the writer instance is named chaostest-instance-1, and the reader is named chaostest-instance-1-us-east-1a.

![image.png](https://dev-media.amazoncloud.cn/733912a35b7247af879c76e9317fab62_image.png)

The goal is to simulate a failover for this Aurora PostgreSQL cluster so that the existing reader instance, chaostest-instance-1-us-east-1a, is promoted to writer, and the existing writer, chaostest-instance-1, becomes the reader.

**Step 2: Navigate to the AWS FIS console.**

We will now navigate to the AWS FIS [console](https://us-east-1.console.aws.amazon.com/fis/home?region=us-east-1#Home) to create an experiment template. Select Create experiment template.

![image.png](https://dev-media.amazoncloud.cn/106cfd3fe9d34b8f8818facfbcae0372_image.png)

**Step 3: Complete the AWS FIS template pre-requisites.**

Enter a Description and a Name, and select the AWS [IAM Role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) for the experiment template.

![image.png](https://dev-media.amazoncloud.cn/f00456c9d35946fb985d90d7208ce56c_image.png)

The IAM role selected above was pre-created. To use AWS FIS, you must [create an IAM role](https://docs.aws.amazon.com/fis/latest/userguide/getting-started-iam-service-role.html) that grants AWS FIS the permissions required so that the service can run experiments on your behalf. The role follows the least privilege model and includes only the permissions needed to act on your database clusters, such as triggering a failover. AWS FIS only uses the permissions that have been delegated explicitly to the role. To learn more about how to create an IAM role with the required permissions for AWS FIS, refer to the [FIS documentation](https://docs.aws.amazon.com/fis/latest/userguide/getting-started-iam-service-role.html).

**Step 4: Navigate to the Actions, Target, Stop Condition section of the template.**

The next key section of the AWS FIS template covers the Action, Target, and Stop Condition.

![image.png](https://dev-media.amazoncloud.cn/fc4f3888af3d469c8b6e003faff14227_image.png)

**Action:** An action is an activity that AWS FIS performs on an AWS resource during an experiment. AWS FIS provides a set of pre-configured actions based on the AWS resource type. Each action runs for a specified duration during an experiment, or until you stop the experiment. Actions can run sequentially or in parallel.

For our experiment, the Action will be aws:rds:failover-db-cluster.

**Target:** A target is one or more AWS resources on which AWS FIS performs an action during an experiment. You can choose specific resources or select a group of resources based on specific criteria, such as tags or state.

For our experiment, the target will be the chaostest Aurora PostgreSQL cluster.

**Stop Condition:** AWS FIS provides the controls and guardrails that you need to run experiments safely on your AWS workloads. A stop condition is a mechanism to stop an experiment if it reaches a threshold that you define as an [Amazon CloudWatch](https://aws.amazon.com/cn/cloudwatch/?trk=cndc-detail) alarm. If a [stop condition](https://docs.aws.amazon.com/fis/latest/userguide/stop-conditions.html) is triggered while the experiment is running, then AWS FIS stops the experiment.

For our experiment, we won’t define a stop condition because this simple experiment contains only one action. Stop conditions are especially useful for experiments with a series of actions, to prevent them from continuing if something goes wrong.

**Step 5: Configure Action.**

Now, let’s configure the Action and Target for our experiment template. Under the Actions section, select Add action to open the New action window.

![image.png](https://dev-media.amazoncloud.cn/dea6c9e1d18a48b4af1541e41bc9b6ac_image.png)

Enter a Name and a Description, and select the Action type aws:rds:failover-db-cluster. Start after is an optional setting that lets you specify an action that must complete before the one we are currently configuring starts.

![image.png](https://dev-media.amazoncloud.cn/db53ff6fdaf946c9a81acda8d0dea818_image.png)

**Step 6: Configure Target.**

Note that a Target has been automatically created with the name Clusters-Target-1. Select Save to save the action.

Next, edit the Clusters-Target-1 target to select the Aurora PostgreSQL cluster.

![image.png](https://dev-media.amazoncloud.cn/0bf63af2fb9d43dda39b52a33c3f8f18_image.png)

Set Target method to Resource IDs, and select the chaostest cluster. To select a group of resources instead, choose the Resource tags, filters and parameters option.

![image.png](https://dev-media.amazoncloud.cn/478d8684a27d46e6b52375814fc41f9d_image.png)

**Step 7: Create the experiment template to complete this stage.**

We will wrap up the process by selecting Create experiment template.

![image.png](https://dev-media.amazoncloud.cn/e4186cd5ada947a6a5b1f5fe28257501_image.png)

We will get a warning stating that a stop condition isn’t defined. Enter create in the provided field to create the template.

![image.png](https://dev-media.amazoncloud.cn/6d3e1277c79f431f99b849296143eed2_image.png)

If the entries are correct, the template is created and a success message appears.

![image.png](https://dev-media.amazoncloud.cn/363fcc5955304aafbdc3f38730e7fa5a_image.png)

**Step 8: Verify the Aurora Cluster.**

Before we run the experiment, let’s double-check the chaostest Aurora Cluster to confirm which instance is the writer and which is the reader.

![image.png](https://dev-media.amazoncloud.cn/3e3025e6e90743c9bfa359a21fbb90ed_image.png)

We confirmed that chaostest-instance-1 is the writer and chaostest-instance-1-us-east-1a is the reader.

**Step 9: Run the AWS FIS experiment.**

Now we’ll run the FIS experiment. Select Actions, and then select Start for the experiment template.

![image.png](https://dev-media.amazoncloud.cn/b526e73ea8764f568bfa14b4296af753_image.png)

Select Start experiment, and you’ll get another warning asking you to confirm that you really want to start this experiment. Confirm by entering start and selecting Start experiment.

![image.png](https://dev-media.amazoncloud.cn/dc94209f3133437bb4d08219ad261a03_image.png)

**Step 10: Observe the various stages of the experiment.**

The experiment moves through the initiating and running states and eventually reaches the completed state.

![image.png](https://dev-media.amazoncloud.cn/1af60bcc69f34daba96f4f130f078cde_image.png)

**Step 11: Verify the Aurora Cluster to confirm failover.**

Now let’s look at the chaostest Aurora PostgreSQL cluster to check the state. Note that a failover was indeed triggered by FIS: chaostest-instance-1-us-east-1a is the newly promoted writer, and chaostest-instance-1 is now the reader.

![image.png](https://dev-media.amazoncloud.cn/6dd17c5ac6ec4c08b725a44a79797208_image.png)

**Step 12: Verify the Aurora Cluster logs.**

We can also confirm the failover action by looking at the Logs and events section of the Aurora Cluster.

![image.png](https://dev-media.amazoncloud.cn/e1af00fdc74e431bb132eb0a68a095d3_image.png)

##### **Clean up**

If you created a new Aurora PostgreSQL cluster for this walkthrough, you can delete the cluster to avoid unnecessary costs by following the steps in the [Deleting an Aurora DB cluster](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_DeleteCluster.html) documentation.

You can also delete the AWS FIS experiment template by following the steps in the [Delete an experiment template](https://docs.aws.amazon.com/fis/latest/userguide/working-with-templates.html) documentation.

##### **Further Reading**

You can refer to the AWS [FIS documentation](https://docs.aws.amazon.com/fis/latest/userguide/getting-started.html) to learn more about the service. If you want to know more about chaos engineering, check out the AWS re:Invent session [Testing resiliency using chaos engineering](https://www.youtube.com/watch?v=OlobVYPkxgg&ab_channel=AWSEvents) and [The Chaos Engineering Collection](https://medium.com/the-cloud-architect/the-chaos-engineering-collection-5e188d6a90e2). Finally, check out the [FIS Workshop](https://workshops.aws/card/FIS) for a deeper dive into using FIS, this [GitHub repo](https://github.com/aws-samples/aws-fault-injection-simulator-samples) for additional example experiments, and the [AWS Cloud Development Kit](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_fis-readme.html) ([AWS CDK](https://aws.amazon.com/cn/cdk/?trk=cndc-detail)) documentation to see how you can work with AWS FIS from code.

##### **Conclusion**

In this walkthrough, you learned how you can use AWS FIS to inject failures into your RDS instances. To get started with [AWS Fault Injection Service](https://aws.amazon.com/cn/fis/?trk=cndc-detail) for [Amazon RDS](https://aws.amazon.com/cn/rds/?trk=cndc-detail), refer to the service [documentation](https://docs.aws.amazon.com/fis/latest/userguide/what-is.html).

##### **Author:**

![image.png](https://dev-media.amazoncloud.cn/550d177128644c5d9edbe07ed96f0c81_image.png)

##### **Anup Sivadas**
Anup Sivadas is a Principal Solutions Architect at Amazon Web Services and is based out of Arlington, Virginia. With 18+ years in technology, Anup enjoys working with AWS customers and helps them craft highly scalable, performant, resilient, secure, sustainable, and cost-effective cloud architectures. Outside work, Anup’s passion is to travel and explore nature with his family.

TAGS: [AWS Fault Injection Simulator,](https://aws.amazon.com/blogs/devops/tag/aws-fault-injection-simulator/) [FIS](https://aws.amazon.com/blogs/devops/tag/fis/)
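The experiment template assembled in the console in Steps 4 through 7 can also be written down as a JSON document. The following is a minimal sketch, not the author's method: it builds a dictionary shaped like the input to the FIS `CreateExperimentTemplate` API for the aws:rds:failover-db-cluster action against the Clusters-Target-1 target. The account ID, role name, action key (`failover-chaostest`), description strings, and tag are hypothetical placeholders, and the optional alarm ARN is an assumption added to show where a stop condition would go.

```python
import json
from typing import Optional


def build_failover_template(
    cluster_arn: str, role_arn: str, alarm_arn: Optional[str] = None
) -> dict:
    """Build a request body shaped like the FIS CreateExperimentTemplate input."""
    # With no alarm, mirror the walkthrough: an explicit "none" stop condition.
    if alarm_arn:
        stop_conditions = [{"source": "aws:cloudwatch:alarm", "value": alarm_arn}]
    else:
        stop_conditions = [{"source": "none"}]

    return {
        "description": "Fail over the chaostest Aurora PostgreSQL cluster",
        "roleArn": role_arn,
        "targets": {
            # Same target name the console generated in Step 6.
            "Clusters-Target-1": {
                "resourceType": "aws:rds:cluster",
                "resourceArns": [cluster_arn],
                "selectionMode": "ALL",
            }
        },
        "actions": {
            # Hypothetical action name; the actionId is the one from Step 5.
            "failover-chaostest": {
                "actionId": "aws:rds:failover-db-cluster",
                "description": "Trigger an Aurora cluster failover",
                "targets": {"Clusters": "Clusters-Target-1"},
            }
        },
        "stopConditions": stop_conditions,
        "tags": {"Name": "chaostest-failover"},  # hypothetical tag
    }


if __name__ == "__main__":
    template = build_failover_template(
        # Placeholder account ID and role name, for illustration only.
        cluster_arn="arn:aws:rds:us-east-1:111122223333:cluster:chaostest",
        role_arn="arn:aws:iam::111122223333:role/fis-rds-experiment-role",
    )
    print(json.dumps(template, indent=2))
```

The printed JSON could be saved to a file and passed to `aws fis create-experiment-template --cli-input-json file://template.json`, assuming the role and cluster ARNs are replaced with real values from your account.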
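Step 3 relies on a pre-created IAM role that AWS FIS assumes. As a rough illustration of what a least-privilege role for this single experiment might contain, the sketch below generates two policy documents: a trust policy that lets the FIS service principal assume the role, and a permissions policy limited to `rds:FailoverDBCluster` on one cluster. This is an assumption about a reasonable minimal setup rather than the exact role used in the walkthrough; the account ID and the `Sid` value are placeholders, and the FIS documentation linked in Step 3 remains the reference for the required permissions.

```python
import json


def fis_trust_policy() -> dict:
    """Trust policy allowing the AWS FIS service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "fis.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def failover_permissions_policy(cluster_arn: str) -> dict:
    """Permissions policy scoped to failing over a single DB cluster."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowClusterFailover",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": ["rds:FailoverDBCluster"],
                "Resource": [cluster_arn],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID, for illustration only.
    arn = "arn:aws:rds:us-east-1:111122223333:cluster:chaostest"
    print(json.dumps(fis_trust_policy(), indent=2))
    print(json.dumps(failover_permissions_policy(arn), indent=2))
```

Scoping `Resource` to the one cluster ARN keeps the role from failing over other clusters in the account, which matches the least privilege intent described in Step 3.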
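The before-and-after checks in Steps 8 and 11 can be automated instead of being read off the console. The sketch below assumes the cluster description has the shape returned by the RDS `DescribeDBClusters` API, where each entry in `DBClusterMembers` carries a `DBInstanceIdentifier` and an `IsClusterWriter` flag. The two sample dictionaries are hand-written stand-ins for real API responses, and the helper names are hypothetical.

```python
from typing import List, Tuple


def writer_and_readers(cluster: dict) -> Tuple[str, List[str]]:
    """Return the writer instance ID and the sorted reader IDs of a cluster."""
    members = cluster["DBClusterMembers"]
    writers = [m["DBInstanceIdentifier"] for m in members if m["IsClusterWriter"]]
    readers = sorted(
        m["DBInstanceIdentifier"] for m in members if not m["IsClusterWriter"]
    )
    if len(writers) != 1:
        raise ValueError(f"expected exactly one writer, found {len(writers)}")
    return writers[0], readers


def failover_happened(before: dict, after: dict) -> bool:
    """True when the writer role moved to a different instance."""
    return writer_and_readers(before)[0] != writer_and_readers(after)[0]


# Hand-written samples mirroring the cluster state in Step 8 and Step 11.
BEFORE = {
    "DBClusterIdentifier": "chaostest",
    "DBClusterMembers": [
        {"DBInstanceIdentifier": "chaostest-instance-1", "IsClusterWriter": True},
        {"DBInstanceIdentifier": "chaostest-instance-1-us-east-1a", "IsClusterWriter": False},
    ],
}
AFTER = {
    "DBClusterIdentifier": "chaostest",
    "DBClusterMembers": [
        {"DBInstanceIdentifier": "chaostest-instance-1", "IsClusterWriter": False},
        {"DBInstanceIdentifier": "chaostest-instance-1-us-east-1a", "IsClusterWriter": True},
    ],
}

if __name__ == "__main__":
    print("writer before:", writer_and_readers(BEFORE)[0])
    print("writer after: ", writer_and_readers(AFTER)[0])
    print("failover happened:", failover_happened(BEFORE, AFTER))
```

Against a live account, the input would typically come from `boto3.client("rds").describe_db_clusters(DBClusterIdentifier="chaostest")["DBClusters"][0]`, captured once before starting the experiment and once after it completes.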