Improve ML developer productivity with Weights & Biases: A computer vision example on Amazon SageMaker

{"value":"*This post is co-written with Thomas Capelle at Weights & Biases.*\n\nAs more organizations use deep learning techniques such as computer vision and natural language processing, the machine learning (ML) developer persona needs scalable tooling around experiment tracking, lineage, and collaboration. Experiment tracking includes metadata such as operating system, infrastructure used, library, and input and output datasets—often tracked on a spreadsheet manually. Lineage involves tracking the datasets, transformations, and algorithms used to create an ML model. Collaboration includes ML developers working on a single project and also ML developers sharing their results across teams and to business stakeholders—a process commonly done via email, screenshots, and PowerPoint presentations.\n\nIn this post, we train a model to identify objects for an autonomous vehicle use case using Weights & Biases (W&B) and [Amazon SageMaker](https://aws.amazon.com/sagemaker/). We showcase how the joint solution reduces manual work for the ML developer, creates more transparency in the model development process, and enables teams to collaborate on projects.\n\n![image.png](1)\n\nWe run this example on [Amazon SageMaker Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html) for you to try out for yourself.\n\n#### **Overview of Weights & Biases**\n\nWeights & Biases helps ML teams build better models faster. With just a few lines of code in your SageMaker notebook, you can instantly debug, compare, and reproduce your models—architecture, hyperparameters, git commits, model weights, GPU usage, datasets, and predictions—all while collaborating with your teammates.\n\n![image.png](2)\nW&B is trusted by more than 200,000 ML practitioners from some of the most innovative companies and research organizations in the world. 
To try it for free, sign up at [Weights & Biases](https://wandb.ai/site), or visit the [W&B AWS Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-guj5ftmaeszay).\n\n#### **Getting started with SageMaker Studio**\n\nSageMaker Studio is the first fully integrated development environment (IDE) for ML. Studio provides a single web-based interface where ML practitioners and data scientists can build, train, and deploy models with a few clicks, all in one place.\n\nTo get started with Studio, you need an AWS account and an [AWS Identity and Access Management](http://aws.amazon.com/iam) (IAM) user or role with permissions to create a Studio domain. Refer to [Onboard to Amazon SageMaker Domain](https://docs.aws.amazon.com/sagemaker/latest/dg/gs-studio-onboard.html) to create a domain, and the [Studio documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html) for an overview of the Studio visual interface and notebooks.\n\n#### **Set up the environment**\n\nFor this post, we’re interested in running our own code, so let’s import some notebooks from GitHub. We use the following [GitHub repo](https://github.com/wandb/SageMakerStudioLab) as an example, so let’s load [this notebook](https://github.com/wandb/SageMakerStudioLab/blob/main/Intro_to_Weights_%26_Biases.ipynb).\n\nYou can clone a repository either through the terminal or the Studio UI. To clone a repository through the terminal, open a system terminal (on the **File** menu, choose **New**, then **Terminal**) and enter the following command:\n\n```\ngit clone https://github.com/wandb/SageMakerStudioLab\n```\n\nTo clone a repository from the Studio UI, see [Clone a Git Repository in SageMaker Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-tasks-git.html).\n\nTo get started, choose the [01_data_processing.ipynb](https://studiolab.sagemaker.aws/import/github/wandb/SageMakerStudioLab/blob/main/01_data_processing.ipynb) notebook. You’re prompted to choose a kernel. 
This example uses PyTorch, so we can choose the pre-built **PyTorch 1.10 Python 3.8 GPU optimized** image to start our notebook. You can see the app starting, and when the kernel is ready, the instance type and kernel appear at the top right of your notebook.\n\nOur notebook needs some additional dependencies, which the repository lists in a requirements.txt file. Run the first cell to install them:\n\n```\n%pip install -r requirements.txt\n```\n\nYou can also create a lifecycle configuration to automatically install the packages every time you start the PyTorch app. See [Customize Amazon SageMaker Studio using Lifecycle Configurations](https://aws.amazon.com/blogs/machine-learning/customize-amazon-sagemaker-studio-using-lifecycle-configurations/) for instructions and a sample implementation.\n\n#### **Use Weights & Biases in SageMaker Studio**\n\nWeights & Biases (```wandb```) is a standard Python library. Once it’s installed, you only need to add a few lines of code to your training script to start logging experiments. We have already installed it through our requirements.txt file. You can also install it manually with the following code:\n\n```\n! pip install wandb\n```\n\n#### **Case study: Autonomous vehicle semantic segmentation**\n\n**Dataset**\n\nWe use the [Cambridge-driving Labeled Video Database](http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/) (CamVid) for this example. It contains a collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes. We can version our dataset as a [wandb.Artifact](https://docs.wandb.ai/guides/artifacts/artifacts-core-concepts) so that we can reference it later. 
See the following code:\n\n```\nimport wandb\n\n# `path` (the local dataset directory) and `class_labels` are assumed to be defined earlier in the notebook\nwith wandb.init(project=\"sagemaker_camvid_demo\", job_type=\"upload\"):\n    artifact = wandb.Artifact(\n        name='camvid-dataset',\n        type='dataset',\n        metadata={\n            \"url\": 'https://s3.amazonaws.com/fast-ai-imagelocal/camvid.tgz',\n            \"class_labels\": class_labels\n        },\n        description=\"The Cambridge-driving Labeled Video Database (CamVid) is the first collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes.\"\n    )\n    artifact.add_dir(path)  # add every file under `path` to the artifact\n    wandb.log_artifact(artifact)\n```\n\nYou can follow along in the [01_data_processing.ipynb](https://studiolab.sagemaker.aws/import/github/wandb/SageMakerStudioLab/blob/main/01_data_processing.ipynb) notebook.\n\nWe also log a [table](https://docs.wandb.ai/guides/data-vis) of the dataset. Tables are rich and powerful DataFrame-like entities that enable you to query and analyze tabular data. You can understand your datasets, visualize model predictions, and share insights in a central dashboard.\n\nWeights & Biases tables support many rich media formats, such as images, audio, and waveforms. For a full list of media formats, refer to [Data Types](https://docs.wandb.ai/ref/python/data-types).\n\nThe following screenshot shows a table of raw images alongside their ground truth segmentations. You can also view an [interactive version of this table](http://wandb.me/aws_studiolab).\n\n![1.gif](3)\n\n#### **Train a model**\n\nWe can now create a model and train it on our dataset. We use [PyTorch](https://pytorch.org/docs/stable/index.html) and [fastai](https://docs.fast.ai/) to quickly prototype a baseline and then use W&B sweeps to optimize our hyperparameters. Follow along in the [02_semantic_segmentation.ipynb](https://studiolab.sagemaker.aws/import/github/wandb/SageMakerStudioLab/blob/main/02_semantic_segmentation.ipynb) notebook. 
When prompted for a kernel on opening the notebook, choose the same kernel as in our first notebook, **PyTorch 1.10 Python 3.8 GPU optimized**. Your packages are already installed because you’re using the same app.\n\nThe model learns a per-pixel annotation of a scene captured from the point of view of the autonomous agent. It needs to categorize, or segment, each pixel of a given scene into one of 32 relevant categories, such as road, pedestrian, sidewalk, or car. You can choose any of the segmented images in the table to open an interactive view of the segmentation results and categories.\n\nBecause the [fastai](http://docs.fast.ai/) library integrates with ```wandb```, you can simply pass the ```WandbCallback``` to the Learner:\n\n```\nfrom fastai.callback.wandb import WandbCallback\nfrom fastai.learner import Learner\nfrom fastai.losses import FocalLossFlat\n\n# SegmentationModel, data_loader, metrics, and the hyperparameters are defined in the notebook\nloss_func = FocalLossFlat(axis=1)\nmodel = SegmentationModel(backbone, hidden_dim, num_classes=num_classes)\nwandb_callback = WandbCallback(log_preds=True)  # also log sample predictions to W&B\nlearner = Learner(\n    data_loader,\n    model,\n    loss_func=loss_func,\n    metrics=metrics,\n    cbs=[wandb_callback],\n)\n\nlearner.fit_one_cycle(TRAIN_EPOCHS, LEARNING_RATE)\n```\n\nFor the baseline experiments, we decided to use a simple architecture inspired by the [UNet](https://arxiv.org/abs/1505.04597) paper with different backbones from [timm](https://github.com/rwightman/pytorch-image-models). We trained our models with [Focal Loss](https://paperswithcode.com/method/retinanet) as the criterion. With Weights & Biases, you can easily create dashboards with summaries of your experiments to quickly analyze training results, as shown in the following screenshot. You can also [view this dashboard interactively](http://wandb.me/aws_studiolab).\n\n![image.png](4)\n\n#### **Hyperparameter search with sweeps**\n\nTo improve the performance of the baseline model, we need to select the best model and the best set of hyperparameters to train. 
W&B makes this easy for us using [sweeps](https://docs.wandb.ai/guides/sweeps/quickstart).\n\nWe perform a [Bayesian hyperparameter search](https://docs.wandb.ai/guides/sweeps/configuration#method) with the goal of maximizing the foreground accuracy of the model on the validation dataset. To perform the sweep, we define the configuration file sweep.yaml. Inside this file, we specify the search method (bayes) along with the parameters and their corresponding values to search. In our case, we try different backbones, batch sizes, and loss functions. We also explore different optimization parameters like learning rate and weight decay. Because these are continuous values, we sample from a distribution. There are multiple [configuration options available for sweeps](https://docs.wandb.ai/guides/sweeps/configuration).\n\n```\nprogram: train.py\nproject: sagemaker_camvid_demo\nmethod: bayes\nmetric:\n  name: foreground_acc\n  goal: maximize\nearly_terminate:\n  type: hyperband\n  min_iter: 5\nparameters:\n  backbone:\n    values: [\"mobilenetv2_100\",\"mobilenetv3_small_050\",\"mobilenetv3_large_100\",\"resnet18\",\"resnet34\",\"resnet50\",\"vgg19\"]\n  batch_size:\n    values: [8, 16]\n  image_resize_factor:\n    value: 4\n  loss_function:\n    values: [\"categorical_cross_entropy\", \"focal\", \"dice\"]\n  learning_rate:\n    distribution: uniform\n    # 1e-5 and 1e-2 in decimal form; some YAML parsers read exponent notation without a dot as a string\n    min: 0.00001\n    max: 0.01\n  weight_decay:\n    distribution: uniform\n    min: 0.0\n    max: 0.05\n```\n\nAfterwards, launch the sweep from a terminal using the [wandb command line](https://docs.wandb.ai/ref/cli):\n\n```\n$ wandb sweep sweep.yaml --project=\"sagemaker_camvid_demo\"\n```\n\nThen launch a sweep agent on this machine with the following code:\n\n```\n$ wandb agent <sweep_id>\n```\n\nWhen the sweep has finished, we can use a parallel coordinates plot to explore the performance of the models with various backbones and different sets of hyperparameters. 
Based on that, we can see which model performs the best.\n\nThe following screenshot shows the results of the sweeps, including a parallel coordinates chart and parameter correlation charts. You can also view this [sweeps dashboard interactively](http://wandb.me/aws_studiolab).\n\nWe can derive the following key insights from the sweep:\n\n- Lower learning rate and lower weight decay result in better foreground accuracy and Dice scores.\n- Batch size has a strong positive correlation with the metrics.\n- The [VGG-based backbones](https://rwightman.github.io/pytorch-image-models/models/#vgg-vggpy) might not be a good option to train our final model because they’re prone to [vanishing gradients](https://en.wikipedia.org/wiki/Vanishing_gradient_problem). (They’re filtered out because the loss diverged.)\n- The [ResNet](https://rwightman.github.io/pytorch-image-models/models/resnet/) backbones result in the best overall performance with respect to the metrics.\n- The ResNet34 or ResNet50 backbone should be chosen for the final model due to their strong performance in terms of metrics.\n\n#### **Data and model lineage**\n\nW&B artifacts were designed to make it effortless to version your datasets and models, regardless of whether you want to store your files with W&B or whether you already have a bucket you want W&B to track. After you track your datasets or model files, W&B automatically logs each modification, giving you a complete and auditable history of changes to your files.\n\nIn our case, the dataset, models, and different tables generated during training are logged to the workspace. You can quickly view and visualize this lineage by going to the **Artifacts** page.\n\n![ML8885image010.gif](5)\n\n#### **Interpret model predictions**\n\nWeights & Biases is especially useful when assessing model performance by using the power of [wandb.Tables](https://docs.wandb.ai/guides/data-vis) to visualize where our model is doing badly. 
In this case, we’re particularly interested in correctly detecting vulnerable road users such as cyclists and pedestrians.\n\nWe logged the predicted masks along with the per-class Dice score coefficient into a table. We then filtered by rows containing the desired classes and sorted in ascending order of Dice score.\n\nIn the following table, we first filter by choosing where the Dice score is positive (pedestrians are present in the image). Then we sort in ascending order to identify our worst-detected pedestrians. Keep in mind that a Dice score of 1 means the pedestrian class is segmented perfectly. You can also [view this table interactively](https://wandb.me/aws_studiolab).\n\n![3.gif](6)\n\nWe can repeat this analysis with other vulnerable classes, such as bicycles or traffic lights.\n\nThis feature is a very good way of identifying images that aren’t labeled correctly and tagging them to re-annotate.\n\n#### **Conclusion**\n\nThis post showcased the Weights & Biases MLOps platform, how to set up W&B in SageMaker Studio, and how to run an introductory notebook on the joint solution. We then ran through an autonomous vehicle semantic segmentation use case and demonstrated tracking training runs with W&B experiments, hyperparameter optimization using W&B sweeps, and interpreting results with W&B tables.\n\nIf you’re interested in learning more, you can access the live [W&B report](http://wandb.me/aws_studiolab). To try it for free, sign up at [Weights & Biases](https://wandb.ai/site), or visit the [W&B AWS Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-guj5ftmaeszay).\n\n#### **About the Authors**\n\n![image.png](7)\n\n**Thomas Capelle** is a Machine Learning Engineer at Weights & Biases. He is responsible for keeping the www.github.com/wandb/examples repository live and up to date. He also builds content on MLOps, applications of W&B to industries, and fun deep learning in general. 
Previously, he used deep learning to solve short-term forecasting for solar energy. He has a background in Urban Planning, Combinatorial Optimization, Transportation Economics, and Applied Math.\n\n![image.png](8)\n\n**Durga Sury** is an ML Solutions Architect in the Amazon SageMaker Service SA team. She is passionate about making machine learning accessible to everyone. In her 3 years at AWS, she has helped set up AI/ML platforms for enterprise customers. When she isn’t working, she loves motorcycle rides, mystery novels, and hikes with her four-year-old husky.\n\n![image.png](9)\n\n**Karthik Bharathy** is the product leader for Amazon SageMaker with over a decade of product management, product strategy, execution and launch experience."}
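
As a supplement to the "Interpret model predictions" section, the filter-and-sort step on per-class Dice scores can be sketched in plain Python. This is a minimal illustration, not the notebook's implementation: the `dice_score` helper, the class id `PEDESTRIAN = 1`, the tiny masks, and the image names are all hypothetical, and the convention of scoring 0.0 when the class is absent from both masks is an assumption chosen so that "Dice score is positive" selects images where the class was at least partly detected.

```python
def dice_score(pred, target, class_id):
    """Dice coefficient for one class between two same-shaped 2D label masks."""
    pred_pixels = target_pixels = overlap = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_pixels += p == class_id
            target_pixels += t == class_id
            overlap += p == class_id and t == class_id
    total = pred_pixels + target_pixels
    # Assumed convention: a class absent from both masks scores 0.0
    return 2 * overlap / total if total else 0.0


PEDESTRIAN = 1  # hypothetical class id, for illustration only

# Ground truth: four pedestrian pixels on the right of a 2x3 image
target = [[0, 1, 1],
          [0, 1, 1]]
pred_good = [[0, 1, 1],
             [0, 1, 1]]   # every pedestrian pixel recovered
pred_poor = [[0, 1, 0],
             [0, 0, 0]]   # only one of four pedestrian pixels recovered
empty = [[0, 0, 0],
         [0, 0, 0]]       # image without pedestrians

rows = [
    ("img_a", dice_score(pred_good, target, PEDESTRIAN)),
    ("img_b", dice_score(pred_poor, target, PEDESTRIAN)),
    ("img_c", dice_score(empty, empty, PEDESTRIAN)),
]

# Keep rows with a positive score, then sort ascending so the worst come first
worst_first = sorted((row for row in rows if row[1] > 0), key=lambda row: row[1])
for name, score in worst_first:
    print(f"{name}: {score:.2f}")
```

In the W&B UI the same filtering and sorting is done interactively on the logged table; the sketch only makes the arithmetic behind the score explicit.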