Integrating ANSYS CFD Computation with Amazon ParallelCluster 3

Amazon Web Services
Amazon Batch
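{"value":"### **Introduction**\n\n#### **Amazon ParallelCluster**\n\nAmazon ParallelCluster is an open-source cluster management tool supported by Amazon Web Services that helps you deploy and manage high-performance computing (HPC) clusters. It is built on the open-source CfnCluster project and lets you stand up an HPC environment quickly, automatically provisioning the required compute resources and shared file system. Amazon ParallelCluster supports the Amazon Batch and Slurm schedulers; earlier versions of ParallelCluster also supported PBS and SGE.\n\nAmazon ParallelCluster makes it easy to launch both proof-of-concept and production deployments. You can also build higher-level workflows on top of it, such as CFD high-performance computing.\n\nAmazon ParallelCluster can use several Amazon HPC services, such as NICE DCV for remote graphics and FSx for Lustre as a high-performance file system. DCV is useful for CFD pre- and post-processing: in a typical scenario, an engineer uses CFD-Post over DCV to open the final computed model for review and validation, or uses ICEM for pre-processing. FSx for Lustre provides the bandwidth and latency that HPC workloads require.\n\n#### **NICE DCV**\n\nNICE DCV is a high-performance remote display protocol that gives customers a secure way to stream remote desktops and applications from any cloud or data center to any device under varying network conditions. With NICE DCV and Amazon EC2, customers can run graphics-intensive applications remotely on EC2 instances and stream the results to client machines, removing the need for expensive dedicated workstations. Customers across a wide range of HPC workloads use NICE DCV to meet their remote visualization needs. Using NICE DCV on [Amazon EC2](https://aws.amazon.com/cn/ec2/?trk=cndc-detail) incurs no additional charge; you pay only for the EC2 resources used to run and store your workloads.\n\n#### **FSx for Lustre**\n\nFSx for Lustre makes it easy and cost-effective to launch and run the popular high-performance Lustre file system. You can use Lustre for workloads such as [machine learning](https://aws.amazon.com/cn/machine-learning/?trk=cndc-detail), high-performance computing (HPC), video processing, and financial modeling.\n\nThe open-source Lustre file system is designed for applications that need fast storage. Lustre was built to solve the problem of processing the world's ever-growing datasets quickly and cheaply. It is a widely used file system designed for the world's fastest computers, and it delivers sub-millisecond latencies, up to hundreds of GB/s of throughput, and up to millions of IOPS.\n\nAs a fully managed service, Amazon FSx makes it quick to use Lustre for workloads where storage speed matters. FSx for Lustre removes the traditional complexity of setting up and managing Lustre file systems, so you can launch a high-performance file system in minutes. It also offers multiple deployment options so you can optimize cost for your needs.\n\nFSx for Lustre is POSIX-compliant, so you can use your existing Linux-based applications without any changes. It behaves like any other file system on a Linux operating system, provides read-after-write consistency, and supports file locking.\n\n#### **ANSYS Fluent**\n\nANSYS Fluent is an internationally popular commercial CFD package, with a reported market share of 60% in the United States. It can be used in any industry involving fluids, heat transfer, or chemical reactions. It offers rich physical models, advanced numerical methods, and powerful pre- and post-processing capabilities, and it is widely used in aerospace, automotive design, oil and gas, and turbomachinery design.\n\n#### **Slurm**\nParallelCluster 3 integrates the Slurm and Amazon Batch job schedulers; Slurm is well suited to CFD job scheduling. Slurm (Simple Linux Utility for Resource Management, [http://slurm.schedmd.com/](http://slurm.schedmd.com/)) is an open-source, fault-tolerant, and highly scalable resource management and job scheduling system for Linux clusters and supercomputers. Supercomputing systems can use Slurm 
to manage resources and jobs so that jobs do not interfere with one another and overall efficiency improves. Every job, whether for debugging or production computation, can be submitted with the interactive parallel command srun, the batch command sbatch, or the allocation command salloc, and its status can then be queried with the related commands.\n\n![97f5737ea31e451f4936bffe372e814d.png](https://dev-media.amazoncloud.cn/da677eece27c4e23bd03a1365f92c192_97f5737ea31e451f4936bffe372e814d.png)\n\n### **Solution Deployment**\n\n#### **Install ParallelCluster**\n\n##### **Prerequisites**\n\nAmazon ParallelCluster requires Python 3.6 or later. If it is not installed, first download a compatible version from [https://www.python.org/downloads/](https://www.python.org/downloads/) and install it.\n```\\n\$ python3\\n\\nPython 3.7.10 (default, Jun 3 2021, 00:02:01) \\n[GCC 7.3.1 20180712 (Red Hat 7.3.1-13)] on linux\\nType \\"help\\", \\"copyright\\", \\"credits\\" or \\"license\\" for more information.\\n>>>\\n```\n##### **Install virtualenv**\n\n```\\n\$ python3 -m pip install --upgrade pip\\n\\nDefaulting to user installation because normal site-packages is not writeable\\nCollecting pip\\n Downloading pip-22.2.1-py3-none-any.whl (2.0 MB)\\n |████████████████████████████████| 2.0 MB 44.7 MB/s \\nInstalling collected packages: pip\\nSuccessfully installed pip-22.2.1\\n\\n\$ python3 -m pip install --user --upgrade virtualenv\\n\\nCollecting virtualenv\\n Downloading virtualenv-20.16.2-py2.py3-none-any.whl (8.8 MB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 89.1 MB/s eta 0:00:00\\nCollecting distlib<1,>=0.3.1\\n Downloading distlib-0.3.5-py2.py3-none-any.whl (466 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 467.0/467.0 kB 71.2 MB/s eta 0:00:00\\nCollecting importlib-metadata>=0.12\\n Downloading importlib_metadata-4.12.0-py3-none-any.whl (21 kB)\\nCollecting platformdirs<3,>=2\\n Downloading platformdirs-2.5.2-py3-none-any.whl (14 kB)\\nCollecting filelock<4,>=3.2\\n Downloading filelock-3.7.1-py3-none-any.whl (10 kB)\\nCollecting typing-extensions>=3.6.4\\n Downloading typing_extensions-4.3.0-py3-none-any.whl (25 kB)\\nCollecting zipp>=0.5\\n Downloading zipp-3.8.1-py3-none-any.whl (5.6 kB)\\nInstalling collected packages: distlib, zipp, typing-extensions, platformdirs, 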
filelock, importlib-metadata, virtualenv\\nSuccessfully installed distlib-0.3.5 filelock-3.7.1 importlib-metadata-4.12.0 platformdirs-2.5.2 typing-extensions-4.3.0 virtualenv-20.16.2 zipp-3.8.\\n```\n##### **Create and Name a virtualenv**\n```\\n\$ python3 -m virtualenv ~/apc-ve\\n\\ncreated virtual environment CPython3.7.10.final.0-64 in 850ms\\n creator CPython3Posix(dest=/home/ec2-user/apc-ve, clear=False, no_vcs_ignore=False, global=False)\\n seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/ec2-user/.local/share/virtualenv)\\n added seed packages: pip==22.2.1, setuptools==63.2.0, wheel==0.37.1\\n activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator\\n```\nThis creates the apc-ve directory under your home directory.\n\n##### **Activate the New virtualenv**\n```\\n\$ source ~/apc-ve/bin/activate\\n```\n##### **Install Amazon ParallelCluster in the Virtual Environment**\n```\\n\$ python3 -m pip install --upgrade \\"aws-parallelcluster\\"\\n\\nCollecting aws-parallelcluster\\n Downloading aws_parallelcluster-3.2.0-py3-none-any.whl (424 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 425.0/425.0 kB 37.8 MB/s eta 0:00:00\\nCollecting aws-cdk.aws-batch!=1.153.0,~=1.137\\n Downloading aws_cdk.aws_batch-1.167.0-py3-none-any.whl (333 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 333.6/333.6 kB 52.3 MB/s eta 0:00:00\\nCollecting jmespath~=0.10\\n Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB)\\nCollecting aws-cdk.aws-cloudwatch!=1.153.0,~=1.137\\n Downloading aws_cdk.aws_cloudwatch-1.167.0-py3-none-any.whl (379 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 379.1/379.1 kB 44.9 MB/s eta 0:00:00\\nCollecting aws-cdk.core!=1.153.0,~=1.137\\n Downloading aws_cdk.core-1.167.0-py3-none-any.whl (1.4 MB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 95.1 MB/s eta 0:00:00\\n\\n……\\n\\nCollecting certifi>=2017.4.17\\n Downloading certifi-2022.6.15-py3-none-any.whl (160 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 160.2/160.2 kB 
41.5 MB/s eta 0:00:00\\nCollecting exceptiongroup\\n Downloading exceptiongroup-1.0.0rc8-py3-none-any.whl (11 kB)\\nCollecting six>=1.5\\n Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)\\nInstalling collected packages: publication, zipp, urllib3, typing-extensions, typeguard, tabulate, six, PyYAML, pyrsistent, pyparsing, pkgutil-resolve-name, MarkupSafe, jmespath, itsdangerous, inflection, idna, exceptiongroup, charset-normalizer, certifi, attrs, werkzeug, requests, python-dateutil, packaging, jinja2, importlib-resources, importlib-metadata, cattrs, marshmallow, jsonschema, jsii, click, botocore, s3transfer, flask, constructs, clickclick, aws-cdk.region-info, aws-cdk.cloud-assembly-schema, connexion, boto3, aws-cdk.cx-api, aws-cdk.core, aws-cdk.aws-signer, aws-cdk.aws-sam, aws-cdk.aws-imagebuilder, aws-cdk.aws-iam, aws-cdk.aws-codestarnotifications, aws-cdk.aws-acmpca, aws-cdk.assets, aws-cdk.aws-kms, aws-cdk.aws-events, aws-cdk.aws-codeguruprofiler, aws-cdk.aws-cloudwatch, aws-cdk.aws-autoscaling-common, aws-cdk.aws-ssm, aws-cdk.aws-sqs, aws-cdk.aws-s3, aws-cdk.aws-ecr, aws-cdk.aws-applicationautoscaling, aws-cdk.aws-sns, aws-cdk.aws-s3-assets, aws-cdk.aws-ecr-assets, aws-cdk.aws-logs, aws-cdk.aws-codecommit, aws-cdk.aws-stepfunctions, aws-cdk.aws-kinesis, aws-cdk.aws-ec2, aws-cdk.aws-fsx, aws-cdk.aws-elasticloadbalancing, aws-cdk.aws-efs, aws-cdk.aws-lambda, aws-cdk.aws-sns-subscriptions, aws-cdk.aws-secretsmanager, aws-cdk.aws-cloudformation, aws-cdk.custom-resources, aws-cdk.aws-codebuild, aws-cdk.aws-route53, aws-cdk.aws-globalaccelerator, aws-cdk.aws-dynamodb, aws-cdk.aws-certificatemanager, aws-cdk.aws-elasticloadbalancingv2, aws-cdk.aws-cognito, aws-cdk.aws-cloudfront, aws-cdk.aws-servicediscovery, aws-cdk.aws-autoscaling, aws-cdk.aws-apigateway, aws-cdk.aws-route53-targets, aws-cdk.aws-autoscaling-hooktargets, aws-cdk.aws-ecs, aws-cdk.aws-batch, aws-parallelcluster\\nSuccessfully installed MarkupSafe-2.1.1 PyYAML-5.4.1 attrs-21.4.0 
aws-cdk.assets-1.167.0 aws-cdk.aws-acmpca-1.167.0 aws-cdk.aws-apigateway-1.167.0 aws-cdk.aws-applicationautoscaling-1.167.0 aws-cdk.aws-autoscaling-1.167.0 aws-cdk.aws-autoscaling-common-1.167.0 aws-cdk.aws-autoscaling-hooktargets-1.167.0 aws-cdk.aws-batch-1.167.0 aws-cdk.aws-certificatemanager-1.167.0 aws-cdk.aws-cloudformation-1.167.0 aws-cdk.aws-cloudfront-1.167.0 aws-cdk.aws-cloudwatch-1.167.0 aws-cdk.aws-codebuild-1.167.0 aws-cdk.aws-codecommit-1.167.0 aws-cdk.aws-codeguruprofiler-1.167.0 aws-cdk.aws-codestarnotifications-1.167.0 aws-cdk.aws-cognito-1.167.0 aws-cdk.aws-dynamodb-1.167.0 aws-cdk.aws-ec2-1.167.0 aws-cdk.aws-ecr-1.167.0 aws-cdk.aws-ecr-assets-1.167.0 aws-cdk.aws-ecs-1.167.0 aws-cdk.aws-efs-1.167.0 aws-cdk.aws-elasticloadbalancing-1.167.0 aws-cdk.aws-elasticloadbalancingv2-1.167.0 aws-cdk.aws-events-1.167.0 aws-cdk.aws-fsx-1.167.0 aws-cdk.aws-globalaccelerator-1.167.0 aws-cdk.aws-iam-1.167.0 aws-cdk.aws-imagebuilder-1.167.0 aws-cdk.aws-kinesis-1.167.0 aws-cdk.aws-kms-1.167.0 aws-cdk.aws-lambda-1.167.0 aws-cdk.aws-logs-1.167.0 aws-cdk.aws-route53-1.167.0 aws-cdk.aws-route53-targets-1.167.0 aws-cdk.aws-s3-1.167.0 aws-cdk.aws-s3-assets-1.167.0 aws-cdk.aws-sam-1.167.0 aws-cdk.aws-secretsmanager-1.167.0 aws-cdk.aws-servicediscovery-1.167.0 aws-cdk.aws-signer-1.167.0 aws-cdk.aws-sns-1.167.0 aws-cdk.aws-sns-subscriptions-1.167.0 aws-cdk.aws-sqs-1.167.0 aws-cdk.aws-ssm-1.167.0 aws-cdk.aws-stepfunctions-1.167.0 aws-cdk.cloud-assembly-schema-1.167.0 aws-cdk.core-1.167.0 aws-cdk.custom-resources-1.167.0 aws-cdk.cx-api-1.167.0 aws-cdk.region-info-1.167.0 aws-parallelcluster-3.2.0 boto3-1.24.44 botocore-1.27.44 cattrs-22.1.0 certifi-2022.6.15 charset-normalizer-2.1.0 click-8.1.3 clickclick-20.10.2 connexion-2.13.1 constructs-3.4.58 exceptiongroup-1.0.0rc8 flask-2.2.0 idna-3.3 importlib-metadata-4.12.0 importlib-resources-5.9.0 inflection-0.5.1 itsdangerous-2.1.2 jinja2-3.1.2 jmespath-0.10.0 jsii-1.63.2 jsonschema-4.9.0 marshmallow-3.17.0 packaging-21.3 
pkgutil-resolve-name-1.3.10 publication-0.0.3 pyparsing-3.0.9 pyrsistent-0.18.1 python-dateutil-2.8.2 requests-2.28.1 s3transfer-0.6.0 six-1.16.0 tabulate-0.8.10 typeguard-2.13.3 typing-extensions-4.3.0 urllib3-1.26.11 werkzeug-2.2.1 zipp-3.8.\\n```\n##### **Install Node Version Manager and Node.js**\n```\\n# AWS Cloud Development Kit (AWS CDK) template generation uses Node Version Manager and Node.js.\\n\\n\$ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.38.0/install.sh | bash\\n\\n % Total % Received % Xferd Average Speed Time Time Time Current\\n Dload Upload Total Spent Left Speed\\n100 14926 100 14926 0 0 469k 0 --:--:-- --:--:-- --:--:-- 485k\\n=> Downloading nvm as script to '/home/ec2-user/.nvm'\\n\\n=> Appending nvm source string to /home/ec2-user/.bashrc\\n=> Appending bash_completion source string to /home/ec2-user/.bashrc\\n=> Close and reopen your terminal to start using nvm or run the following to use it now:\\n\\nexport NVM_DIR=\\"\$HOME/.nvm\\"\\n[ -s \\"\$NVM_DIR/nvm.sh\\" ] && \\\\. \\"\$NVM_DIR/nvm.sh\\" # This loads nvm\\n[ -s \\"\$NVM_DIR/bash_completion\\" ] && \\\\. 
\\"\$NVM_DIR/bash_completion\\" # This loads nvm bash_completion\\n\\n\\n\$ chmod ug+x ~/.nvm/nvm.sh\\n\\n\$ source ~/.nvm/nvm.sh\\n\\n\$ nvm install --lts\\n\\nInstalling latest LTS version.\\nDownloading and installing node v16.16.0...\\nDownloading https://nodejs.org/dist/v16.16.0/node-v16.16.0-linux-x64.tar.xz...\\n################################################################################################################################################################################## 100.0%\\nComputing checksum with sha256sum\\nChecksums matched!\\nNow using node v16.16.0 (npm v8.11.0)\\nCreating default alias: default -> lts/* (-> v16.16.0)\\n\$ node --version\\n```\n\n##### **Verify the Amazon ParallelCluster Installation**\n\nActivate the new virtualenv\n```\\n\$ source ~/apc-ve/bin/activate\\n\\n\$ pcluster version\\n\\n{\\n \\"version\\": \\"3.2.0\\"\\n}\\n```\n##### **Configure Amazon ParallelCluster**\n\n```\\n\$ aws configure\\n\\nAWS Access Key ID [None]: AKIA5OZOUQ4F2T4IMAOS\\nAWS Secret Access Key [None]: XXX\\nDefault region name [None]: cn-northwest-1 \\nDefault output format [None]: \\n\\n\$ pcluster configure --config cluster-config.yaml\\n\\nINFO: Configuration file cluster-config.yaml will be written.\\nPress CTRL-C to interrupt the procedure.\\n\\n\\nAllowed values for AWS Region ID:\\n1. cn-north-1\\n2. cn-northwest-1\\nAWS Region ID [cn-northwest-1]: \\nAllowed values for EC2 Key Pair Name:\\n1. LL-K2\\nEC2 Key Pair Name [LL-K2]: \\nAllowed values for Scheduler:\\n1. slurm\\n2. awsbatch\\nScheduler [slurm]: \\nAllowed values for Operating System:\\n1. alinux2\\n2. centos7\\n3. ubuntu1804\\n4. ubuntu2004\\nOperating System [alinux2]: alinux2\\nHead node instance type [t2.micro]: c5.large\\nNumber of queues [1]: \\nName of queue 1 [queue1]: \\nNumber of compute resources for queue1 [1]: \\nCompute instance type for compute resource 1 in queue1 [t2.micro]: c5.xlarge\\nMaximum instance count [10]: \\nAutomate VPC creation? 
(y/n) [n]: \\nAllowed values for VPC ID:\\n # id name number_of_subnets\\n--- --------------------- ----------------- -------------------\\n 1 vpc-003630feddf7d2417 EKS 2\\n 2 vpc-013d1e62cfa405b8e ECS 2\\n 3 vpc-0252e11202ae27e51 2\\n 4 vpc-9b64d8f2 HPC 3\\nVPC ID [vpc-003630feddf7d2417]: vpc-9b64d8f2 \\nAutomate Subnet creation? (y/n) [y]: \\nAllowed values for Availability Zone:\\n1. cn-northwest-1a\\n2. cn-northwest-1b\\n3. cn-northwest-1c\\nAvailability Zone [cn-northwest-1a]: \\nAllowed values for Network Configuration:\\n1. Head node in a public subnet and compute fleet in a private subnet\\n2. Head node and compute fleet in the same public subnet\\nNetwork Configuration [Head node in a public subnet and compute fleet in a private subnet]: \\nCreating CloudFormation stack...\\nDo not leave the terminal until the process has finished.\\nStack Name: parallelclusternetworking-pubpriv-20220729030718 (id: arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/parallelclusternetworking-pubpriv-20220729030718/b846e230-0eeb-11ed-979c-0a9d1a8a4fe6)\\nStatus: parallelclusternetworking-pubpriv-20220729030718 - CREATE_COMPLETE \\nThe stack has been created.\\nConfiguration file written to cluster-config.yaml\\nYou can edit your configuration file or simply run 'pcluster create-cluster --cluster-configuration cluster-config.yaml --cluster-name cluster-name --region cn-northwest-1' to create your cluster.\\n```\n#### **Create the CFD Cluster**\n\n##### **Configuration File**\n\nEdit cluster-config.yaml for the HPC/CFD workload: add DCV remote visualization for pre- and post-processing, and the FSx for Lustre high-performance file system needed for fluid computation.\n\n**1. NICE DCV**\n```\\nDcv:\\n  Enabled: true\\n```\n\n**2. FSx for Lustre**\n\n```\\nSharedStorage:\\n  - MountDir: /fsx\\n    Name: ParallelFileSystem\\n    StorageType: FsxLustre\\n    FsxLustreSettings:\\n      StorageCapacity: 1200\\n      DeploymentType: PERSISTENT_1\\n      ImportedFileChunkSize: 1024\\n      ExportPath: s3://plljdi-fs1/export\\n      ImportPath: s3://plljdi-fs1\\n      PerUnitStorageThroughput: 200\\n```\n\nANSYS Fluent currently supports CentOS 7; Amazon 
Linux 2 is not on the list of operating systems officially certified by ANSYS.\n\n##### **Create the Cluster**\n```\\n\$ pcluster create-cluster --cluster-name cfd-cluster --cluster-configuration cfd-cluster-config.yaml\\n\\n{\\n \\"cluster\\": {\\n \\"clusterName\\": \\"cfd-cluster\\",\\n \\"cloudformationStackStatus\\": \\"CREATE_IN_PROGRESS\\",\\n \\"cloudformationStackArn\\": \\"arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/test-cluster/348e1c40-0eed-11ed-b3f5-0a96b85a5424\\",\\n \\"region\\": \\"cn-northwest-1\\",\\n \\"version\\": \\"3.1.4\\",\\n \\"clusterStatus\\": \\"CREATE_IN_PROGRESS\\"\\n }\\n}\\n```\n\n##### **Query Cluster Information**\n\n```\\n\$ pcluster describe-cluster --cluster-name cfd-cluster\\n\\n{\\n \\"creationTime\\": \\"2022-07-29T10:31:33.608Z\\",\\n \\"headNode\\": {\\n \\"launchTime\\": \\"2022-07-29T10:40:14.000Z\\",\\n \\"instanceId\\": \\"i-0e3c4967953c806a7\\",\\n \\"publicIpAddress\\": \\"52.83.49.88\\",\\n \\"instanceType\\": \\"c5.large\\",\\n \\"state\\": \\"running\\",\\n \\"privateIpAddress\\": \\"172.31.48.96\\"\\n },\\n \\"version\\": \\"3.1.4\\",\\n \\"clusterConfiguration\\": {\\n \\"url\\": \\"https://parallelcluster-02fb13f6f8ec970c-v1-do-not-delete.s3.cn-northwest-1.amazonaws.com.cn/parallelcluster/3.1.4/clusters/cfd-cluster-7p51jnbemquummo3/configs/cluster-config.yaml?versionId=sf6OxDbpIGYPjmrRfSSArCU5YRUHzCqo&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5OZOUQ4F2T4IMAOS%2F20220805%2Fcn-northwest-1%2Fs3%2Faws4_request&X-Amz-Date=20220805T021305Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=f7bd1e1e31bdcc9d3bf7b260d68f418e39a7239fdf4baf0983cb1e399cdea35e\\"\\n },\\n \\"tags\\": [\\n {\\n \\"value\\": \\"3.1.4\\",\\n \\"key\\": \\"parallelcluster:version\\"\\n }\\n ],\\n \\"cloudFormationStackStatus\\": \\"CREATE_COMPLETE\\",\\n \\"clusterName\\": \\"cfd-cluster\\",\\n \\"computeFleetStatus\\": \\"RUNNING\\",\\n \\"cloudformationStackArn\\": \\"arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/cfd-cluster/9dc591c0-0f29-11ed-a5cd-02357b891a1c\\",\\n 
\\"lastUpdatedTime\\": \\"2022-07-29T10:31:33.608Z\\",\\n \\"region\\": \\"cn-northwest-1\\",\\n \\"clusterStatus\\": \\"CREATE_COMPLETE\\"\\n}\\n\\n\$ pcluster list-clusters --query 'clusters[?clusterName==`cfd-cluster`]'\\n\\n[\\n {\\n \\"clusterName\\": \\"cfd-cluster\\",\\n \\"cloudformationStackStatus\\": \\"CREATE_IN_PROGRESS\\",\\n \\"cloudformationStackArn\\": \\"arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/cfd-cluster/f7316cd0-1464-11ed-8f62-0aa55a928096\\",\\n \\"region\\": \\"cn-northwest-1\\",\\n \\"version\\": \\"3.1.4\\",\\n \\"clusterStatus\\": \\"CREATE_IN_PROGRESS\\"\\n }\\n]\\n```\n##### **Log In to the Cluster**\n```\\n\$ pcluster ssh --cluster-name cfd-cluster -i ~/LL-K2.pem\\n```\n##### **Check the Slurm Cluster Status**\n\n```\\nsinfo\\n\\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST\\nqueue1* up infinite 10 idle~ queue1-dy-c5xlarge-[1-10]\\n\\nsinfo -l\\n\\nFri Aug 05 02:56:34 2022\\nPARTITION AVAIL TIMELIMIT JOB_SIZE ROOT OVERSUBS GROUPS NODES STATE NODELIST\\nqueue1* up infinite 1-infinite no NO all 10 idle~ queue1-dy-c5xlarge-[1-10]\\n\\nsqueue\\n\\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\\n\\nsrun -n4 -l hostname\\n\\n0: queue1-dy-c5xlarge-1\\n2: queue1-dy-c5xlarge-1\\n3: queue1-dy-c5xlarge-1\\n1: queue1-dy-c5xlarge-1\\n```\n##### **Log In with DCV**\n```\\n# pcluster dcv-connect parameters\\npcluster dcv-connect [-h]\\n --cluster-name CLUSTER_NAME \\n [--debug]\\n [--key-path KEY_PATH]\\n [--region REGION]\\n [--show-url]\\n\\n\$ pcluster dcv-connect --cluster-name cfd-cluster --key-path ~/LL-K2.pem --show-url\\n```\nPlease use the following one-time URL in your browser within 30 
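seconds:\n\n[https://52.83.49.88:8443?authToken=Xh92zh9pJ3bWK1Sn_2gzdUEnf4GwjYWYyMmh2bWSq4n8Pm4jUWWbqCOuBG6CdWBLFpPwZLmi7WC8PM7t44DWwL9Lr85Cu_QWTaEg-A9tywg3TjA2waXRzQhhI8-URnDWfTpC8l6Od5IkaUyiAjqybRfK2a41yYHNYSYUc3uWL_UNKYgjjoqCjvwFyBpKa0WGo88mODGpLkyWNhU6dqiWTK-BMqbSXl3SttPQOgge6YIwvSyKB28rmP0JoyC4SkvN#8DWPj4h0HXiKPbh1yZ69](https://52.83.49.88:8443?authToken=Xh92zh9pJ3bWK1Sn_2gzdUEnf4GwjYWYyMmh2bWSq4n8Pm4jUWWbqCOuBG6CdWBLFpPwZLmi7WC8PM7t44DWwL9Lr85Cu_QWTaEg-A9tywg3TjA2waXRzQhhI8-URnDWfTpC8l6Od5IkaUyiAjqybRfK2a41yYHNYSYUc3uWL_UNKYgjjoqCjvwFyBpKa0WGo88mODGpLkyWNhU6dqiWTK-BMqbSXl3SttPQOgge6YIwvSyKB28rmP0JoyC4SkvN#8DWPj4h0HXiKPbh1yZ69)\n\nOpen a browser and use the link to log in to the cluster head node. CFD pre- and post-processing can be done on the head node through DCV, and you can choose a GPU instance type if those stages need it.\n\n![20d106a33849a22ffd4cb03d997d60c6.png](https://dev-media.amazoncloud.cn/dcffe57e3b5b45f0bf327395e4cba8e1_20d106a33849a22ffd4cb03d997d60c6.png)\n\n##### **Install Fluent**\n\nObtain the installation media and license file from ANSYS, log in to the head node through DCV, and install the software in the shared FSx for Lustre directory so that every compute node can run the Fluent components. Then follow the installer prompts.\n\n![ffc706cb8ace27c80c5d31f24f3d715e.png](https://dev-media.amazoncloud.cn/fddd772245424c2ba44802d7e42572de_ffc706cb8ace27c80c5d31f24f3d715e.png)\n\n![a478dbb55dc75920cefdfdd18fc706bd.png](https://dev-media.amazoncloud.cn/081da05ab1ba4529925177ad1f6262f4_a478dbb55dc75920cefdfdd18fc706bd.png)\n\n![d2cf1e0cf59a8d3d9e9c48111ba1785a.png](https://dev-media.amazoncloud.cn/c4dc84bbc6e148a5b8aab04ac9a6c9e1_d2cf1e0cf59a8d3d9e9c48111ba1785a.png)\n\nAfter installation, configure the license server ports: edit the ansyslmd.ini file and add the following two entries.\n```\\nSERVER=1055@licenseServer\\nANSYSLI_SERVERS=2325@licenseServer\\n```\n##### **Run Fluent and CFD-Post**\n\n**Run Fluent**\n\nLog in through NICE DCV, then run /fsx/apps/ansys_inc/v195/fluent/bin/fluent\n\n![3d4fd41d76254dd9e1e6e524c95b81eb.png](https://dev-media.amazoncloud.cn/4022d47ad4a94a63b3c1ab34798aba88_3d4fd41d76254dd9e1e6e524c95b81eb.png)\n\nYou can run CFD simulations with Fluent. Because the Fluent GUI does not currently support Slurm scheduling, submit Fluent jobs to Slurm with sbatch through a wrapper script.\n\n**Run CFD-Post**\n\nOn Amazon Linux 2, set the 
LD_LIBRARY_PATH environment variable correctly, because some libraries must be specified explicitly in the runtime environment.\n```\\nexport LD_LIBRARY_PATH=/fsx/apps/ansys_inc/v195/commonfiles/CFX/support/fluentio/lib/linx64/:\$LD_LIBRARY_PATH\\n```\nRun /fsx/apps/ansys_inc/v195/CFD-Post/bin/cfdpost\nand use CFD-Post to view the simulation results,\nfor example the perf_IndyCar.res result file.\n\n![68ee418981ce794243d7c5f91feb97a4.png](https://dev-media.amazoncloud.cn/0903852a02ef45ca8fd5e43886fc70cb_68ee418981ce794243d7c5f91feb97a4.png)\n### **Resource Cleanup**\nWhen you no longer need the compute environment, delete the CFD cluster.\n```\\npcluster delete-cluster --region cn-northwest-1 --cluster-name cfd-cluster\\n```\nThen, in the Amazon Console, delete the CloudFormation networking stack.\nIf the VPC was newly created, delete it as well.\n\n### **About the Author**\n\n\n**Lin Lei**\nSenior expert in the high-performance computing and SaaS industries. Graduated from the University of Science and Technology of China and the Institute of Software, Chinese Academy of Sciences. Worked at IBM and ANSYS China before joining Amazon Web Services, leading the construction of multiple supercomputing, EDA, and CAE high-performance systems. Took part in developing the CAE Workspace platform (scheduling system) as a product manager. Participated during graduate studies in a distributed cryptographic computing project supported by the National Natural Science Foundation of China.\n\n[Read the original article](https://aws.amazon.com/cn/blogs/china/aws-parallelcluster-3-integrated-ansys-cfd-calculation/)","render":"<h3><a id=\\"_0\\"></a><strong>Introduction</strong></h3>\\n<h4><a id=\\"Amazon_ParallelCluster_2\\"></a><strong>Amazon ParallelCluster</strong></h4>\\n<p>Amazon ParallelCluster is an open-source cluster management tool supported by Amazon Web Services that helps you deploy and manage high-performance computing (HPC) clusters. It is built on the open-source CfnCluster project and lets you stand up an HPC environment quickly, automatically provisioning the required compute resources and shared file system. Amazon ParallelCluster supports the Amazon Batch and Slurm schedulers; earlier versions of ParallelCluster also supported PBS and SGE.</p>\n<p>Amazon ParallelCluster makes it easy to launch both proof-of-concept and production deployments. You can also build higher-level workflows on top of it, such as CFD high-performance computing.</p>\n<p>Amazon ParallelCluster can use several Amazon HPC services, such as NICE DCV for remote graphics and FSx for Lustre as a high-performance file system. DCV is useful for CFD pre- and post-processing: in a typical scenario, an engineer uses CFD-Post over DCV to open the final computed model for review and validation, or uses ICEM for pre-processing. FSx for Lustre provides the bandwidth and latency that HPC workloads require.</p>\n<h4><a id=\\"NICE_DCV_10\\"></a><strong>NICE DCV</strong></h4>\\n<p>NICE DCV is a high-performance remote display protocol that gives customers a secure way to stream remote desktops and applications from any cloud or data center to any device under varying network conditions. With NICE DCV and Amazon EC2, customers can run graphics-intensive applications remotely on EC2 instances and stream the results to client machines, removing the need for expensive dedicated workstations. Customers across a wide range of HPC workloads use NICE DCV to meet their remote visualization needs. Using NICE DCV on Amazon EC2 incurs no additional charge; you pay only for the EC2 
resources used to run and store your workloads.</p>\n<h4><a id=\\"FSx_for_Lustre_14\\"></a><strong>FSx for Lustre</strong></h4>\\n<p>FSx for Lustre makes it easy and cost-effective to launch and run the popular high-performance Lustre file system. You can use Lustre for workloads such as machine learning, high-performance computing (HPC), video processing, and financial modeling.</p>\n<p>The open-source Lustre file system is designed for applications that need fast storage. Lustre was built to solve the problem of processing the world's ever-growing datasets quickly and cheaply. It is a widely used file system designed for the world's fastest computers, and it delivers sub-millisecond latencies, up to hundreds of GB/s of throughput, and up to millions of IOPS.</p>\n<p>As a fully managed service, Amazon FSx makes it quick to use Lustre for workloads where storage speed matters. FSx for Lustre removes the traditional complexity of setting up and managing Lustre file systems, so you can launch a high-performance file system in minutes. It also offers multiple deployment options so you can optimize cost for your needs.</p>\n<p>FSx for Lustre is POSIX-compliant, so you can use your existing Linux-based applications without any changes. It behaves like any other file system on a Linux operating system, provides read-after-write consistency, and supports file locking.</p>\n<h4><a id=\\"ANSYS_Fluent_24\\"></a><strong>ANSYS Fluent</strong></h4>\\n<p>ANSYS Fluent is an internationally popular commercial CFD package, with a reported market share of 60% in the United States. It can be used in any industry involving fluids, heat transfer, or chemical reactions. It offers rich physical models, advanced numerical methods, and powerful pre- and post-processing capabilities, and it is widely used in aerospace, automotive design, oil and gas, and turbomachinery design.</p>\n<h4><a id=\\"Slurm_28\\"></a><strong>Slurm</strong></h4>\\n<p>ParallelCluster 3 integrates the Slurm and Amazon Batch job schedulers; Slurm is well suited to CFD job scheduling. Slurm (Simple Linux Utility for Resource Management, <a href=\\"http://slurm.schedmd.com/\\" target=\\"_blank\\">http://slurm.schedmd.com/</a>) is an open-source, fault-tolerant, and highly scalable resource management and job scheduling system for Linux clusters and supercomputers. Supercomputing systems can use Slurm to manage resources and jobs so that jobs do not interfere with one another and overall efficiency improves. Every job, whether for debugging or production computation, can be submitted with the interactive parallel command srun, the batch command sbatch, or the allocation command salloc, and its status can then be queried with the related commands.</p>\\n<p><img src=\\"https://dev-media.amazoncloud.cn/da677eece27c4e23bd03a1365f92c192_97f5737ea31e451f4936bffe372e814d.png\\" alt=\\"97f5737ea31e451f4936bffe372e814d.png\\" /></p>\n<h3><a id=\\"_33\\"></a><strong>Solution Deployment</strong></h3>\\n<h4><a id=\\"_ParallelCluster_35\\"></a><strong>Install ParallelCluster</strong></h4>\\n<h5><a id=\\"_37\\"></a><strong>Prerequisites</strong></h5>\\n<p>Amazon ParallelCluster requires Python 3.6 or later. If it is not installed, first download a compatible version from <a href=\\"https://www.python.org/downloads/\\" target=\\"_blank\\">https://www.python.org/downloads/</a> and install it.</p>\\n<pre><code class=\\"lang-\\">\$ python3\\n\\nPython 3.7.10 (default, Jun 3 2021, 00:02:01) \\n[GCC 7.3.1 20180712 (Red Hat 7.3.1-13)] on linux\\nType &quot;help&quot;, 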
&quot;copyright&quot;, &quot;credits&quot; or &quot;license&quot; for more information.\\n&gt;&gt;&gt;\\n</code></pre>\\n<h5><a id=\\"_virtualenv_48\\"></a><strong>Install virtualenv</strong></h5>\\n<pre><code class=\\"lang-\\">\$ python3 -m pip install --upgrade pip\\n\\nDefaulting to user installation because normal site-packages is not writeable\\nCollecting pip\\n Downloading pip-22.2.1-py3-none-any.whl (2.0 MB)\\n |████████████████████████████████| 2.0 MB 44.7 MB/s \\nInstalling collected packages: pip\\nSuccessfully installed pip-22.2.1\\n\\n\$ python3 -m pip install --user --upgrade virtualenv\\n\\nCollecting virtualenv\\n Downloading virtualenv-20.16.2-py2.py3-none-any.whl (8.8 MB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 89.1 MB/s eta 0:00:00\\nCollecting distlib&lt;1,&gt;=0.3.1\\n Downloading distlib-0.3.5-py2.py3-none-any.whl (466 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 467.0/467.0 kB 71.2 MB/s eta 0:00:00\\nCollecting importlib-metadata&gt;=0.12\\n Downloading importlib_metadata-4.12.0-py3-none-any.whl (21 kB)\\nCollecting platformdirs&lt;3,&gt;=2\\n Downloading platformdirs-2.5.2-py3-none-any.whl (14 kB)\\nCollecting filelock&lt;4,&gt;=3.2\\n Downloading filelock-3.7.1-py3-none-any.whl (10 kB)\\nCollecting typing-extensions&gt;=3.6.4\\n Downloading typing_extensions-4.3.0-py3-none-any.whl (25 kB)\\nCollecting zipp&gt;=0.5\\n Downloading zipp-3.8.1-py3-none-any.whl (5.6 kB)\\nInstalling collected packages: distlib, zipp, typing-extensions, platformdirs, filelock, importlib-metadata, virtualenv\\nSuccessfully installed distlib-0.3.5 filelock-3.7.1 importlib-metadata-4.12.0 platformdirs-2.5.2 typing-extensions-4.3.0 virtualenv-20.16.2 zipp-3.8.\\n</code></pre>\\n<h5><a id=\\"_virtualenv_81\\"></a><strong>Create and Name a virtualenv</strong></h5>\\n<pre><code class=\\"lang-\\">\$ python3 -m virtualenv ~/apc-ve\\n\\ncreated virtual environment CPython3.7.10.final.0-64 in 850ms\\n creator CPython3Posix(dest=/home/ec2-user/apc-ve, clear=False, 
no_vcs_ignore=False, global=False)\\n seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/ec2-user/.local/share/virtualenv)\\n added seed packages: pip==22.2.1, setuptools==63.2.0, wheel==0.37.1\\n activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator\\n</code></pre>\\n<p>This creates the apc-ve directory under your home directory.</p>\n<h5><a id=\\"_virtualenv_93\\"></a><strong>Activate the New virtualenv</strong></h5>\\n<pre><code class=\\"lang-\\">\$ source ~/apc-ve/bin/activate\\n</code></pre>\\n<h5><a id=\\"_Amazon_ParallelCluster_97\\"></a><strong>Install Amazon ParallelCluster in the Virtual Environment</strong></h5>\\n<pre><code class=\\"lang-\\">\$ python3 -m pip install --upgrade &quot;aws-parallelcluster&quot;\\n\\nCollecting aws-parallelcluster\\n Downloading aws_parallelcluster-3.2.0-py3-none-any.whl (424 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 425.0/425.0 kB 37.8 MB/s eta 0:00:00\\nCollecting aws-cdk.aws-batch!=1.153.0,~=1.137\\n Downloading aws_cdk.aws_batch-1.167.0-py3-none-any.whl (333 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 333.6/333.6 kB 52.3 MB/s eta 0:00:00\\nCollecting jmespath~=0.10\\n Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB)\\nCollecting aws-cdk.aws-cloudwatch!=1.153.0,~=1.137\\n Downloading aws_cdk.aws_cloudwatch-1.167.0-py3-none-any.whl (379 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 379.1/379.1 kB 44.9 MB/s eta 0:00:00\\nCollecting aws-cdk.core!=1.153.0,~=1.137\\n Downloading aws_cdk.core-1.167.0-py3-none-any.whl (1.4 MB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 95.1 MB/s eta 0:00:00\\n\\n……\\n\\nCollecting certifi&gt;=2017.4.17\\n Downloading certifi-2022.6.15-py3-none-any.whl (160 kB)\\n ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 160.2/160.2 kB 41.5 MB/s eta 0:00:00\\nCollecting exceptiongroup\\n Downloading exceptiongroup-1.0.0rc8-py3-none-any.whl (11 kB)\\nCollecting six&gt;=1.5\\n Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)\\nInstalling 
collected packages: publication, zipp, urllib3, typing-extensions, typeguard, tabulate, six, PyYAML, pyrsistent, pyparsing, pkgutil-resolve-name, MarkupSafe, jmespath, itsdangerous, inflection, idna, exceptiongroup, charset-normalizer, certifi, attrs, werkzeug, requests, python-dateutil, packaging, jinja2, importlib-resources, importlib-metadata, cattrs, marshmallow, jsonschema, jsii, click, botocore, s3transfer, flask, constructs, clickclick, aws-cdk.region-info, aws-cdk.cloud-assembly-schema, connexion, boto3, aws-cdk.cx-api, aws-cdk.core, aws-cdk.aws-signer, aws-cdk.aws-sam, aws-cdk.aws-imagebuilder, aws-cdk.aws-iam, aws-cdk.aws-codestarnotifications, aws-cdk.aws-acmpca, aws-cdk.assets, aws-cdk.aws-kms, aws-cdk.aws-events, aws-cdk.aws-codeguruprofiler, aws-cdk.aws-cloudwatch, aws-cdk.aws-autoscaling-common, aws-cdk.aws-ssm, aws-cdk.aws-sqs, aws-cdk.aws-s3, aws-cdk.aws-ecr, aws-cdk.aws-applicationautoscaling, aws-cdk.aws-sns, aws-cdk.aws-s3-assets, aws-cdk.aws-ecr-assets, aws-cdk.aws-logs, aws-cdk.aws-codecommit, aws-cdk.aws-stepfunctions, aws-cdk.aws-kinesis, aws-cdk.aws-ec2, aws-cdk.aws-fsx, aws-cdk.aws-elasticloadbalancing, aws-cdk.aws-efs, aws-cdk.aws-lambda, aws-cdk.aws-sns-subscriptions, aws-cdk.aws-secretsmanager, aws-cdk.aws-cloudformation, aws-cdk.custom-resources, aws-cdk.aws-codebuild, aws-cdk.aws-route53, aws-cdk.aws-globalaccelerator, aws-cdk.aws-dynamodb, aws-cdk.aws-certificatemanager, aws-cdk.aws-elasticloadbalancingv2, aws-cdk.aws-cognito, aws-cdk.aws-cloudfront, aws-cdk.aws-servicediscovery, aws-cdk.aws-autoscaling, aws-cdk.aws-apigateway, aws-cdk.aws-route53-targets, aws-cdk.aws-autoscaling-hooktargets, aws-cdk.aws-ecs, aws-cdk.aws-batch, aws-parallelcluster\\nSuccessfully installed MarkupSafe-2.1.1 PyYAML-5.4.1 attrs-21.4.0 aws-cdk.assets-1.167.0 aws-cdk.aws-acmpca-1.167.0 aws-cdk.aws-apigateway-1.167.0 aws-cdk.aws-applicationautoscaling-1.167.0 aws-cdk.aws-autoscaling-1.167.0 aws-cdk.aws-autoscaling-common-1.167.0 
aws-cdk.aws-autoscaling-hooktargets-1.167.0 aws-cdk.aws-batch-1.167.0 aws-cdk.aws-certificatemanager-1.167.0 aws-cdk.aws-cloudformation-1.167.0 aws-cdk.aws-cloudfront-1.167.0 aws-cdk.aws-cloudwatch-1.167.0 aws-cdk.aws-codebuild-1.167.0 aws-cdk.aws-codecommit-1.167.0 aws-cdk.aws-codeguruprofiler-1.167.0 aws-cdk.aws-codestarnotifications-1.167.0 aws-cdk.aws-cognito-1.167.0 aws-cdk.aws-dynamodb-1.167.0 aws-cdk.aws-ec2-1.167.0 aws-cdk.aws-ecr-1.167.0 aws-cdk.aws-ecr-assets-1.167.0 aws-cdk.aws-ecs-1.167.0 aws-cdk.aws-efs-1.167.0 aws-cdk.aws-elasticloadbalancing-1.167.0 aws-cdk.aws-elasticloadbalancingv2-1.167.0 aws-cdk.aws-events-1.167.0 aws-cdk.aws-fsx-1.167.0 aws-cdk.aws-globalaccelerator-1.167.0 aws-cdk.aws-iam-1.167.0 aws-cdk.aws-imagebuilder-1.167.0 aws-cdk.aws-kinesis-1.167.0 aws-cdk.aws-kms-1.167.0 aws-cdk.aws-lambda-1.167.0 aws-cdk.aws-logs-1.167.0 aws-cdk.aws-route53-1.167.0 aws-cdk.aws-route53-targets-1.167.0 aws-cdk.aws-s3-1.167.0 aws-cdk.aws-s3-assets-1.167.0 aws-cdk.aws-sam-1.167.0 aws-cdk.aws-secretsmanager-1.167.0 aws-cdk.aws-servicediscovery-1.167.0 aws-cdk.aws-signer-1.167.0 aws-cdk.aws-sns-1.167.0 aws-cdk.aws-sns-subscriptions-1.167.0 aws-cdk.aws-sqs-1.167.0 aws-cdk.aws-ssm-1.167.0 aws-cdk.aws-stepfunctions-1.167.0 aws-cdk.cloud-assembly-schema-1.167.0 aws-cdk.core-1.167.0 aws-cdk.custom-resources-1.167.0 aws-cdk.cx-api-1.167.0 aws-cdk.region-info-1.167.0 aws-parallelcluster-3.2.0 boto3-1.24.44 botocore-1.27.44 cattrs-22.1.0 certifi-2022.6.15 charset-normalizer-2.1.0 click-8.1.3 clickclick-20.10.2 connexion-2.13.1 constructs-3.4.58 exceptiongroup-1.0.0rc8 flask-2.2.0 idna-3.3 importlib-metadata-4.12.0 importlib-resources-5.9.0 inflection-0.5.1 itsdangerous-2.1.2 jinja2-3.1.2 jmespath-0.10.0 jsii-1.63.2 jsonschema-4.9.0 marshmallow-3.17.0 packaging-21.3 pkgutil-resolve-name-1.3.10 publication-0.0.3 pyparsing-3.0.9 pyrsistent-0.18.1 python-dateutil-2.8.2 requests-2.28.1 s3transfer-0.6.0 six-1.16.0 tabulate-0.8.10 typeguard-2.13.3 typing-extensions-4.3.0 
urllib3-1.26.11 werkzeug-2.2.1 zipp-3.8.\\n</code></pre>\\n<h5><a id=\\"_Node_Version_Manager__Nodejs_128\\"></a><strong>Install Node Version Manager and Node.js</strong></h5>\\n<pre><code class=\\"lang-\\"># AWS Cloud Development Kit (AWS CDK) template generation uses Node Version Manager and Node.js.\\n\\n\$ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.38.0/install.sh | bash\\n\\n % Total % Received % Xferd Average Speed Time Time Time Current\\n Dload Upload Total Spent Left Speed\\n100 14926 100 14926 0 0 469k 0 --:--:-- --:--:-- --:--:-- 485k\\n=&gt; Downloading nvm as script to '/home/ec2-user/.nvm'\\n\\n=&gt; Appending nvm source string to /home/ec2-user/.bashrc\\n=&gt; Appending bash_completion source string to /home/ec2-user/.bashrc\\n=&gt; Close and reopen your terminal to start using nvm or run the following to use it now:\\n\\nexport NVM_DIR=&quot;\$HOME/.nvm&quot;\\n[ -s &quot;\$NVM_DIR/nvm.sh&quot; ] &amp;&amp; \\\\. &quot;\$NVM_DIR/nvm.sh&quot; # This loads nvm\\n[ -s &quot;\$NVM_DIR/bash_completion&quot; ] &amp;&amp; \\\\. 
&quot;\$NVM_DIR/bash_completion&quot; # This loads nvm bash_completion\\n\\n\\n\$ chmod ug+x ~/.nvm/nvm.sh\\n\\n\$ source ~/.nvm/nvm.sh\\n\\n\$ nvm install --lts\\n\\nInstalling latest LTS version.\\nDownloading and installing node v16.16.0...\\nDownloading https://nodejs.org/dist/v16.16.0/node-v16.16.0-linux-x64.tar.xz...\\n################################################################################################################################################################################## 100.0%\\nComputing checksum with sha256sum\\nChecksums matched!\\nNow using node v16.16.0 (npm v8.11.0)\\nCreating default alias: default -&gt; lts/* (-&gt; v16.16.0)\\n\$ node --version\\n</code></pre>\\n<h5><a id=\\"_Amazon_ParallelCluster__165\\"></a><strong>Verify the Amazon ParallelCluster installation</strong></h5>\\n<p>Activate the new virtualenv:</p>\n<pre><code class=\\"lang-\\">\$ source ~/apc-ve/bin/activate\\n\\n\$ pcluster version\\n\\n{\\n &quot;version&quot;: &quot;3.2.0&quot;\\n}\\n</code></pre>\\n<h5><a id=\\"_Amazon_ParallelCluster_177\\"></a><strong>Configure Amazon ParallelCluster</strong></h5>\\n<pre><code class=\\"lang-\\">\$ aws configure\\n\\nAWS Access Key ID [None]: AKIA5OZOUQ4F2T4IMAOS\\nAWS Secret Access Key [None]: XXX\\nDefault region name [None]: cn-northwest-1 \\nDefault output format [None]: \\n\\n\$ pcluster configure --config cluster-config.yaml\\n\\nINFO: Configuration file cluster-config.yaml will be written.\\nPress CTRL-C to interrupt the procedure.\\n\\n\\nAllowed values for AWS Region ID:\\n1. cn-north-1\\n2. cn-northwest-1\\nAWS Region ID [cn-northwest-1]: \\nAllowed values for EC2 Key Pair Name:\\n1. LL-K2\\nEC2 Key Pair Name [LL-K2]: \\nAllowed values for Scheduler:\\n1. slurm\\n2. awsbatch\\nScheduler [slurm]: \\nAllowed values for Operating System:\\n1. alinux2\\n2. centos7\\n3. ubuntu1804\\n4. 
ubuntu2004\\nOperating System [alinux2]: alinux2\\nHead node instance type [t2.micro]: c5.large\\nNumber of queues [1]: \\nName of queue 1 [queue1]: \\nNumber of compute resources for queue1 [1]: \\nCompute instance type for compute resource 1 in queue1 [t2.micro]: c5.xlarge\\nMaximum instance count [10]: \\nAutomate VPC creation? (y/n) [n]: \\nAllowed values for VPC ID:\\n # id name number_of_subnets\\n--- --------------------- ----------------- -------------------\\n 1 vpc-003630feddf7d2417 EKS 2\\n 2 vpc-013d1e62cfa405b8e ECS 2\\n 3 vpc-0252e11202ae27e51 2\\n 4 vpc-9b64d8f2 HPC 3\\nVPC ID [vpc-003630feddf7d2417]: vpc-9b64d8f2 \\nAutomate Subnet creation? (y/n) [y]: \\nAllowed values for Availability Zone:\\n1. cn-northwest-1a\\n2. cn-northwest-1b\\n3. cn-northwest-1c\\nAvailability Zone [cn-northwest-1a]: \\nAllowed values for Network Configuration:\\n1. Head node in a public subnet and compute fleet in a private subnet\\n2. Head node and compute fleet in the same public subnet\\nNetwork Configuration [Head node in a public subnet and compute fleet in a private subnet]: \\nCreating CloudFormation stack...\\nDo not leave the terminal until the process has finished.\\nStack Name: parallelclusternetworking-pubpriv-20220729030718 (id: arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/parallelclusternetworking-pubpriv-20220729030718/b846e230-0eeb-11ed-979c-0a9d1a8a4fe6)\\nStatus: parallelclusternetworking-pubpriv-20220729030718 - CREATE_COMPLETE \\nThe stack has been created.\\nConfiguration file written to cluster-config.yaml\\nYou can edit your configuration file or simply run 'pcluster create-cluster --cluster-configuration cluster-config.yaml --cluster-name cluster-name --region cn-northwest-1' to create your cluster.\\n</code></pre>\\n<h4><a id=\\"_CFD__243\\"></a><strong>Create the CFD cluster</strong></h4>\\n<h5><a id=\\"_245\\"></a><strong>Configuration file</strong></h5>\\n<p>Edit cluster-config.yaml to suit the HPC/CFD workload: add NICE DCV remote visualization for pre- and post-processing, and the FSx for Lustre
high-performance file system for the fluid computation.</p>\n<p><strong>1. NICE DCV</strong></p>\\n<pre><code class=\\"lang-\\">Dcv:\\n  Enabled: true\\n</code></pre>\\n<p><strong>2. FSx for Lustre</strong></p>\\n<pre><code class=\\"lang-\\">SharedStorage:\\n  - MountDir: /fsx\\n    Name: ParallelFileSystem\\n    StorageType: FsxLustre\\n    FsxLustreSettings:\\n      StorageCapacity: 1200\\n      DeploymentType: PERSISTENT_1\\n      ImportedFileChunkSize: 1024\\n      ExportPath: s3://plljdi-fs1/export\\n      ImportPath: s3://plljdi-fs1\\n      PerUnitStorageThroughput: 200\\n</code></pre>\\n<p>ANSYS Fluent currently supports CentOS 7; Amazon Linux 2 is not among the operating systems officially certified by ANSYS.</p>\n<h5><a id=\\"_273\\"></a><strong>Create the cluster</strong></h5>\\n<pre><code class=\\"lang-\\">\$ pcluster create-cluster --cluster-name cfd-cluster --cluster-configuration cfd-cluster-config.yaml\\n\\n{\\n &quot;cluster&quot;: {\\n &quot;clusterName&quot;: &quot;cfd-cluster&quot;,\\n &quot;cloudformationStackStatus&quot;: &quot;CREATE_IN_PROGRESS&quot;,\\n &quot;cloudformationStackArn&quot;: &quot;arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/test-cluster/348e1c40-0eed-11ed-b3f5-0a96b85a5424&quot;,\\n &quot;region&quot;: &quot;cn-northwest-1&quot;,\\n &quot;version&quot;: &quot;3.1.4&quot;,\\n &quot;clusterStatus&quot;: &quot;CREATE_IN_PROGRESS&quot;\\n }\\n}\\n</code></pre>\\n<h5><a id=\\"_289\\"></a><strong>Query cluster information</strong></h5>\\n<pre><code class=\\"lang-\\">\$ pcluster describe-cluster --cluster-name cfd-cluster\\n\\n{\\n &quot;creationTime&quot;: &quot;2022-07-29T10:31:33.608Z&quot;,\\n &quot;headNode&quot;: {\\n &quot;launchTime&quot;: &quot;2022-07-29T10:40:14.000Z&quot;,\\n &quot;instanceId&quot;: &quot;i-0e3c4967953c806a7&quot;,\\n &quot;publicIpAddress&quot;: &quot;52.83.49.88&quot;,\\n &quot;instanceType&quot;: &quot;c5.large&quot;,\\n &quot;state&quot;: &quot;running&quot;,\\n &quot;privateIpAddress&quot;: &quot;172.31.48.96&quot;\\n },\\n &quot;version&quot;: &quot;3.1.4&quot;,\\n &quot;clusterConfiguration&quot;: {\\n &quot;url&quot;: 
&quot;https://parallelcluster-02fb13f6f8ec970c-v1-do-not-delete.s3.cn-northwest-1.amazonaws.com.cn/parallelcluster/3.1.4/clusters/cfd-cluster-7p51jnbemquummo3/configs/cluster-config.yaml?versionId=sf6OxDbpIGYPjmrRfSSArCU5YRUHzCqo&amp;X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Credential=AKIA5OZOUQ4F2T4IMAOS%2F20220805%2Fcn-northwest-1%2Fs3%2Faws4_request&amp;X-Amz-Date=20220805T021305Z&amp;X-Amz-Expires=3600&amp;X-Amz-SignedHeaders=host&amp;X-Amz-Signature=f7bd1e1e31bdcc9d3bf7b260d68f418e39a7239fdf4baf0983cb1e399cdea35e&quot;\\n },\\n &quot;tags&quot;: [\\n {\\n &quot;value&quot;: &quot;3.1.4&quot;,\\n &quot;key&quot;: &quot;parallelcluster:version&quot;\\n }\\n ],\\n &quot;cloudFormationStackStatus&quot;: &quot;CREATE_COMPLETE&quot;,\\n &quot;clusterName&quot;: &quot;cfd-cluster&quot;,\\n &quot;computeFleetStatus&quot;: &quot;RUNNING&quot;,\\n &quot;cloudformationStackArn&quot;: &quot;arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/cfd-cluster/9dc591c0-0f29-11ed-a5cd-02357b891a1c&quot;,\\n &quot;lastUpdatedTime&quot;: &quot;2022-07-29T10:31:33.608Z&quot;,\\n &quot;region&quot;: &quot;cn-northwest-1&quot;,\\n &quot;clusterStatus&quot;: &quot;CREATE_COMPLETE&quot;\\n}\\n\\n\$ pcluster list-clusters --query 'clusters[?clusterName==`cfd-cluster`]'\\n\\n[\\n {\\n &quot;clusterName&quot;: &quot;cfd-cluster&quot;,\\n &quot;cloudformationStackStatus&quot;: &quot;CREATE_IN_PROGRESS&quot;,\\n &quot;cloudformationStackArn&quot;: &quot;arn:aws-cn:cloudformation:cn-northwest-1:925126395659:stack/cfd-cluster/f7316cd0-1464-11ed-8f62-0aa55a928096&quot;,\\n &quot;region&quot;: &quot;cn-northwest-1&quot;,\\n &quot;version&quot;: &quot;3.1.4&quot;,\\n &quot;clusterStatus&quot;: &quot;CREATE_IN_PROGRESS&quot;\\n }\\n]\\n</code></pre>\\n<h5><a id=\\"_336\\"></a><strong>Log in to the cluster</strong></h5>\\n<pre><code class=\\"lang-\\">\$ pcluster ssh --cluster-name cfd-cluster -i ~/LL-K2.pem\\n</code></pre>\\n<h5><a id=\\"_Slurm__340\\"></a><strong>Check the Slurm cluster status</strong></h5>\\n<pre><code 
class=\\"lang-\\">sinfo\\n\\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST\\nqueue1* up infinite 10 idle~ queue1-dy-c5xlarge-[1-10]\\n\\nsinfo -l\\n\\nFri Aug 05 02:56:34 2022\\nPARTITION AVAIL TIMELIMIT JOB_SIZE ROOT OVERSUBS GROUPS NODES STATE NODELIST\\nqueue1* up infinite 1-infinite no NO all 10 idle~ queue1-dy-c5xlarge-[1-10]\\n\\nsqueue\\n\\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\\n\\nsrun -n4 -l hostname\\n\\n0: queue1-dy-c5xlarge-1\\n2: queue1-dy-c5xlarge-1\\n3: queue1-dy-c5xlarge-1\\n1: queue1-dy-c5xlarge-1\\n</code></pre>\\n<h5><a id=\\"DCV__365\\"></a><strong>DCV login</strong></h5>\\n<pre><code class=\\"lang-\\">DCV dcv-connect parameters\\npcluster dcv-connect [-h]\\n --cluster-name CLUSTER_NAME \\n [--debug]\\n [--key-path KEY_PATH]\\n [--region REGION]\\n [--show-url]\\n\\n\$ pcluster dcv-connect --cluster-name cfd-cluster --key-path ~/LL-K2.pem --show-url\\n</code></pre>\\n<p>Please use the following one-time URL in your browser within 30 seconds:</p>\n<p><a href=\\"https://52.83.49.88:8443?authToken=Xh92zh9pJ3bWK1Sn_2gzdUEnf4GwjYWYyMmh2bWSq4n8Pm4jUWWbqCOuBG6CdWBLFpPwZLmi7WC8PM7t44DWwL9Lr85Cu_QWTaEg-A9tywg3TjA2waXRzQhhI8-URnDWfTpC8l6Od5IkaUyiAjqybRfK2a41yYHNYSYUc3uWL_UNKYgjjoqCjvwFyBpKa0WGo88mODGpLkyWNhU6dqiWTK-BMqbSXl3SttPQOgge6YIwvSyKB28rmP0JoyC4SkvN#8DWPj4h0HXiKPbh1yZ69\\" target=\\"_blank\\">https://52.83.49.88:8443?authToken=Xh92zh9pJ3bWK1Sn_2gzdUEnf4GwjYWYyMmh2bWSq4n8Pm4jUWWbqCOuBG6CdWBLFpPwZLmi7WC8PM7t44DWwL9Lr85Cu_QWTaEg-A9tywg3TjA2waXRzQhhI8-URnDWfTpC8l6Od5IkaUyiAjqybRfK2a41yYHNYSYUc3uWL_UNKYgjjoqCjvwFyBpKa0WGo88mODGpLkyWNhU6dqiWTK-BMqbSXl3SttPQOgge6YIwvSyKB28rmP0JoyC4SkvN#8DWPj4h0HXiKPbh1yZ69</a></p>\\n<p>Open the link in a browser to log in to the cluster head node. CFD pre- and post-processing can be done on the head node through DCV; depending on the resources these stages need, you can choose a GPU instance type for the head node.</p>\n<p><img src=\\"https://dev-media.amazoncloud.cn/dcffe57e3b5b45f0bf327395e4cba8e1_20d106a33849a22ffd4cb03d997d60c6.png\\" alt=\\"20d106a33849a22ffd4cb03d997d60c6.png\\" /></p>\n<h5><a id=\\"_Fluent__385\\"></a><strong>Install Fluent</strong></h5>\\n
<p>Obtain the installation media and license file from ANSYS, log in to the head node through DCV, and install the software under the shared FSx for Lustre directory so that every compute node can run the Fluent components. Then follow the installer prompts.</p>\n<p><img src=\\"https://dev-media.amazoncloud.cn/fddd772245424c2ba44802d7e42572de_ffc706cb8ace27c80c5d31f24f3d715e.png\\" alt=\\"ffc706cb8ace27c80c5d31f24f3d715e.png\\" /></p>\n<p><img src=\\"https://dev-media.amazoncloud.cn/081da05ab1ba4529925177ad1f6262f4_a478dbb55dc75920cefdfdd18fc706bd.png\\" alt=\\"a478dbb55dc75920cefdfdd18fc706bd.png\\" /></p>\n<p><img src=\\"https://dev-media.amazoncloud.cn/c4dc84bbc6e148a5b8aab04ac9a6c9e1_d2cf1e0cf59a8d3d9e9c48111ba1785a.png\\" alt=\\"d2cf1e0cf59a8d3d9e9c48111ba1785a.png\\" /></p>\n<p>After installation, configure the license server ports by adding the following two entries to the ansyslmd.ini file.</p>\n<pre><code class=\\"lang-\\">SERVER=1055@licenseServer\\nANSYSLI_SERVERS=2325@licenseServer\\n</code></pre>\\n<h5><a id=\\"_Fluent__CFDPost__400\\"></a><strong>Run Fluent and CFD-Post</strong></h5>\\n<p><strong>Run Fluent</strong></p>\\n<p>Log in through NICE DCV, then run /fsx/apps/ansys_inc/v195/fluent/bin/fluent</p>\n<p><img src=\\"https://dev-media.amazoncloud.cn/4022d47ad4a94a63b3c1ab34798aba88_3d4fd41d76254dd9e1e6e524c95b81eb.png\\" alt=\\"3d4fd41d76254dd9e1e6e524c95b81eb.png\\" /></p>\n<p>Users can run CFD simulations with Fluent. Because the Fluent GUI does not currently support Slurm scheduling, Fluent jobs can instead be submitted to Slurm with sbatch through a wrapper script.</p>\n<p><strong>Run CFD-Post</strong></p>\\n<p>On Amazon Linux 2, set the LD_LIBRARY_PATH environment variable correctly, because some libraries the runtime needs may not be on the default search path.</p>\n<pre><code class=\\"lang-\\">export LD_LIBRARY_PATH=/fsx/apps/ansys_inc/v195/commonfiles/CFX/support/fluentio/lib/linx64/:\$LD_LIBRARY_PATH\\n</code></pre>\\n<p>Run /fsx/apps/ansys_inc/v195/CFD-Post/bin/cfdpost,<br />\\nand use CFD-Post to view the simulation results,<br />\\nfor example the perf_IndyCar.res result file.</p>\n<p><img src=\\"https://dev-media.amazoncloud.cn/0903852a02ef45ca8fd5e43886fc70cb_68ee418981ce794243d7c5f91feb97a4.png\\" alt=\\"68ee418981ce794243d7c5f91feb97a4.png\\" /></p>\n<h3><a id=\\"_421\\"></a><strong>Resource cleanup</strong></h3>\\n<p>When the compute environment is no longer needed, delete the CFD cluster.</p>\n<pre><code 
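class=\\"lang-\\">pcluster delete-cluster --region cn-northwest-1 --cluster-name cfd-cluster\\n</code></pre>\\n<p>In the Amazon console, delete the CloudFormation networking stack.<br />\\nIf the VPC was newly created for this cluster, delete it as well.</p>\n<h3><a id=\\"_429\\"></a><strong>About the author</strong></h3>\\n<p><strong>林磊</strong><br />\\nSenior expert in the high-performance computing and SaaS industries. Graduated from the University of Science and Technology of China and the Institute of Software, Chinese Academy of Sciences. Before joining Amazon Web Services, worked at IBM and ANSYS China and led the construction of several supercomputing, EDA, and CAE high-performance systems. As a product manager, took part in developing the CAE Workspace platform (scheduling system). During graduate studies, participated in a distributed cryptographic computing project funded by the National Natural Science Foundation of China.</p>\n<p><a href=\\"https://aws.amazon.com/cn/blogs/china/aws-parallelcluster-3-integrated-ansys-cfd-calculation/\\" target=\\"_blank\\">Read the original article</a></p>\n"}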
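The article notes that, because the Fluent GUI cannot yet drive Slurm, Fluent jobs can be handed to sbatch through a script. A minimal sketch of such a wrapper follows, under stated assumptions: the function name write_fluent_job, the file names fluent-job.sbatch, run.jou and hosts.txt, and the node and task counts are all hypothetical. The partition queue1 and the Fluent install path are taken from the outputs shown in the article. The Fluent options used (3ddp, -g, -t, -cnf, -i) are the commonly documented batch-mode flags and should be checked against your ANSYS release.

```shell
#!/bin/bash
# Sketch: generate a Slurm batch script that runs Fluent without the GUI.
# Hypothetical names: write_fluent_job, fluent-job.sbatch, run.jou, hosts.txt.
set -euo pipefail

# Install path used in the article (shared FSx for Lustre mount).
FLUENT_BIN=/fsx/apps/ansys_inc/v195/fluent/bin/fluent

# write_fluent_job OUT_FILE NODES TASKS_PER_NODE JOURNAL
write_fluent_job() {
  local out_file=$1 nodes=$2 tasks_per_node=$3 journal=$4
  cat > "$out_file" <<EOF
#!/bin/bash
#SBATCH --job-name=fluent-cfd
#SBATCH --partition=queue1
#SBATCH --nodes=${nodes}
#SBATCH --ntasks-per-node=${tasks_per_node}
#SBATCH --output=fluent-%j.log

# One hostname per allocated node, passed to Fluent through -cnf.
scontrol show hostnames "\$SLURM_JOB_NODELIST" > hosts.txt

# 3ddp = 3D double precision, -g = no GUI, -t = process count, -i = journal file.
${FLUENT_BIN} 3ddp -g -t"\$SLURM_NTASKS" -cnf=hosts.txt -i ${journal}
EOF
}

# Example values only: 2 nodes with 4 tasks each, journal file run.jou.
write_fluent_job fluent-job.sbatch 2 4 run.jou
```

On the head node the generated file would then be submitted with `sbatch fluent-job.sbatch` and monitored with `squeue`. Generating the script rather than hard-coding it keeps node counts and journal names adjustable per case.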