Open source news and updates #147

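Overseas Selections brings together high-quality technical content about Amazon Web Services from around the world. Note that "AWS", where it appears in this content, is an abbreviation of "Amazon Web Services" and is not displayed as a trademark on this site.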
## March 6th, 2023 - Instalment #147

**Welcome**

Welcome to edition #147 of the Amazon Web Services open source newsletter, featured in the [latest episode of Build on Open Source](

This week we have new projects such as "metahub" and "savings-estimator" that we looked at in closer detail on the Build on Open Source livestream, "Amazon-iot-core-credential-provider-session-helper" a Python library to help simplify working with Amazon Web Services IoT, "traffic-inspection-architectures-Amazon-cloud-wan" code that provides examples of different network architectures and how to do traffic inspection, "neptune-export" a tool to help you export your data in Amazon Neptune, "Amazon-organizations-tool" a command line tool to help you configure Amazon Web Services Organizations, "sagemaker-external-repo-access" a nice reference architecture for Amazon SageMaker, "Amazon-cdk-cfn-hook" a Python CDK app that will get you up and running quickly working with CloudFormation template hooks, and more!

Also covered in this edition is content on a number of popular open source technologies, including Kubernetes, OpenSearch, FreeRTOS, Amazon Web Services Lambda Powertools, Debezium, Apache Kafka, Kafka Connect, Apache Spark, Apache Hudi, DeltaStreamer, Apicurio Registry, Apache Iceberg, FFmpeg, Prometheus, Babelfish for Amazon Aurora PostgreSQL, Redis, Mastodon, and more!

**Feedback**

Please please please take 1 minute to [complete this short survey]( and get some exclusive content as a thank you.

### Celebrating open source contributors

The articles and projects shared in this newsletter are only possible thanks to the many contributors in open source. I would like to shout out and thank those folks who really do power open source and enable us all to learn and build on top of what they have created.
So thank you to the following open source heroes: Gary Stafford, Dhiraj Thakur, Rajdip Chaudhuri, Corry Haines, Tulip Gupta, Dennis Calhoun, Amir Khairalomoum, James McIntyre, Daniel Gross, Jimmy Ray, Emin Alemdar, and Toni de la Fuente.

### Latest open source projects

*The great thing about open source projects is that you can review the source code. If you like the look of these projects, make sure that you take a look at the code, and if it is useful to you, get in touch with the maintainer to provide feedback, suggestions or even submit a contribution.*

#### Tools

**metahub**

[metahub]( is an Amazon Web Services Security Finding Format (ASFF) security context enrichment and command line utility for Amazon Web Services Security Hub. Using MetaHub, you can enrich your security findings with your own context, and then use that context for filtering, deduplicating, grouping, reporting, automating, suppressing, or updating and enriching findings directly in Amazon Web Services Security Hub. MetaHub can read from and write to the Amazon Web Services Security Hub API, or work directly with ASFF files. You can combine these sources as you want to enrich your findings further.

![image.png]( "image.png")

Check out more details from [this Reddit thread](

**savings-estimator**

[savings-estimator]( this is a native desktop application written in Go that allows you to estimate the cost savings you can achieve in your Amazon Web Services account by converting your AutoScaling Groups to Spot instances. You can simulate various scenarios, such as keeping some of your instances as OnDemand in each group (maybe covered by Reserved Instances or Savings Plans), or only converting some of your AutoScaling Groups to Spot as part of a gradual rollout. You may use any mechanism to adopt Spot, such as applying the configuration yourself group by group as per your simulation. They also provide a helpful video to show you how you can get started.
<video src="" class="bytemdVideo" controls="controls"></video>

**Amazon-iot-core-credential-provider-session-helper**

[Amazon-iot-core-credential-provider-session-helper]( this package provides an easy way to create a refreshable Boto3 Session using the Amazon Web Services IoT Core credential provider. It requires Python 3.8 to 3.11. Some of the features include:

* Automatic refresh of Boto3 credentials through requests to the Amazon Web Services IoT Core credential provider. No need to manage or maintain refresh times.
* Uses the underlying Amazon Web Services CRT Python bindings for querying the credential provider instead of the Python standard library. This provides support for both certificate and private keys as files or as environment variables.
* Extensible to using other TLS methods such as PKCS#11 hardware security modules (see Advanced section).
* Only requires four function calls to create a session helper, Boto3 session, Boto3 client, and then client API calls.

**traffic-inspection-architectures-Amazon-cloud-wan**

[traffic-inspection-architectures-Amazon-cloud-wan]( this repository contains code (in Amazon Web Services CloudFormation and Terraform) to deploy several inspection architectures using Amazon Web Services Cloud WAN - with Amazon Web Services Network Firewall as the inspection solution. The use cases covered are 1/ Centralized Outbound, 2/ East/West traffic, with both Spoke VPCs and Inspection VPCs attached to Amazon Web Services Cloud WAN, 3/ East/West traffic, with both Spoke VPCs and Inspection VPCs attached to Amazon Web Services Transit Gateway and peered with Amazon Web Services Cloud WAN, and 4/ East/West traffic, with Spoke VPCs attached to a peered Amazon Web Services Transit Gateway and Inspection VPCs attached to Amazon Web Services Cloud WAN. The documentation provides nice architectural diagrams that outline each of these use cases, a description of what is inspected, and some sample output.
![image.png]( "image.png")

**neptune-export**

[neptune-export]( is a command line tool that exports Amazon Neptune property graph data to CSV or JSON, or RDF graph data to Turtle. The repo provides details of how you can also deploy this as a service within your environment.

**Amazon-organizations-tool**

[Amazon-organizations-tool]( orgtool is a configuration management tool set for Amazon Web Services Organizations written in Python. This tooling enables the configuration and management of Amazon Web Services Organizations with code. This might be useful as you transition from ClickOps to automation via infrastructure as code. Check the docs for more details on how you might do this.

### Demos, Samples, Solutions and Workshops

**sagemaker-external-repo-access**

[sagemaker-external-repo-access]( the goal of this solution is to demonstrate the deployment of Amazon Web Services CodeSuite Services (i.e., CodeBuild, CodePipeline) to orchestrate secure MLOps access to external package repositories in a data science environment configured with multi-layer security. The repo includes detailed documentation as well as links to a supporting blog post you can read to help you get started.

**mlops-sagemaker-github-actions**

[mlops-sagemaker-github-actions]( similar to the previous project, this repo is an example of an MLOps implementation using Amazon SageMaker and GitHub Actions. The code helps you to build a solution that automates a model-build pipeline that includes steps for data preparation, model training, model evaluation, and registration of that model in the SageMaker Model Registry. The resulting trained ML model is deployed from the model registry to staging and production environments upon approval.

![image.png]( "image.png")

**Amazon-cdk-cfn-hook**

[Amazon-cdk-cfn-hook]( this solution demonstrates how to create, update, and deploy CloudFormation hooks through a CI/CD pipeline using Amazon Web Services Cloud Development Kit as Infrastructure as Code.
It leverages Amazon Web Services CDK (Python) to deploy: an Amazon Web Services CodeCommit source repo that contains hook handlers, hook schema and other parameters to create the hook and its related configuration; an Amazon Web Services CodeBuild stage to package and deploy the hook; and an Amazon Web Services CodePipeline.

![image.png]( "image.png")

**Amazon-textract-cdk-commercial-acord**

[Amazon-textract-cdk-commercial-acord]( This repo contains all the code required to build an intelligent document processing (IDP) solution on Amazon Web Services, from document splitting and classification to extraction. The repo uses a sample commercial ACORD data set.

**rails-lambda-handler**

[rails-lambda-handler]( This repository includes an Amazon Web Services Lambda handler function that launches a Ruby on Rails (or other Rack compliant) application, as well as a sample application so you can see how this works.

### Amazon Web Services and Community blog posts

**Open Source Data Lakes**

Back with another epic blog post, Gary Stafford provides a gloriously detailed walk through, together with supporting sample code, that takes a look at how you can combine a number of open source projects with Amazon Web Services services to create a real time transactional data lake. From the post:

> Red Hat’s Debezium, Apache Kafka, and Kafka Connect will be used for change data capture (CDC). In addition, Apache Spark, Apache Hudi, and Hudi’s DeltaStreamer will be used to manage the data lake.

Gary's posts are essential reading, so head on over to [Building Data Lakes on Amazon with Kafka Connect, Debezium, Apicurio Registry, and Apache Hudi]( and dive right on into the lake!
\[hands on]

![image.png]( "image.png")

**Apache Iceberg**

In the post, [Build a real-time GDPR-aligned Apache Iceberg data lake](, Dhiraj Thakur and Rajdip Chaudhuri show you how you can use the Iceberg table format on Athena to implement GDPR use cases like data deletion and data upserts as required, when streaming data is being generated and ingested through Amazon Web Services Glue streaming jobs in Amazon S3. \[hands on]

![image.png]( "image.png")

**Mastodon**

If you have been looking for a way to host your own Mastodon instance, then why not take a look at this blog post from Corry Haines. In [Deploying Mastodon on Amazon Web Services]( he shares the lessons he learned on the way to hosting his own instance. Essential reading this week.

**FFmpeg**

FFmpeg is an open source tool commonly used by media technology companies to encode and transcode video and audio formats. In [Run Open Source FFMPEG at Lower Cost and Better Performance on a VT1 Instance for VOD Encoding Workloads](, Tulip Gupta and Dennis Calhoun share how you can optimise how you run FFmpeg on Amazon Web Services using Amazon EC2 VT1 instances.
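
If you have not used FFmpeg before, here is a minimal sketch of what a transcoding job looks like when driven from Python. It is not taken from the post: it assumes the common software encoder `libx264` rather than the VT1 hardware encoders the post covers, and the helper name `build_transcode_command`, the file names and the 4M bitrate are all hypothetical. It only builds and prints the command, it does not run it.

```python
import shlex


def build_transcode_command(source, target, video_codec="libx264", video_bitrate="4M"):
    """Return an ffmpeg argument list that re-encodes video and copies audio.

    Assumption: a software encoder (libx264) stands in for whatever
    encoder you actually use; this is not the VT1-specific setup.
    """
    return [
        "ffmpeg",
        "-i", source,            # input file
        "-c:v", video_codec,     # video encoder to use
        "-b:v", video_bitrate,   # target video bitrate
        "-c:a", "copy",          # pass the audio stream through untouched
        target,                  # output file
    ]


# Hypothetical file names, printed as a shell-ready command line.
command = build_transcode_command("input.mp4", "output.mp4")
print(shlex.join(command))
```

To actually run the job you would pass the list to `subprocess.run(command, check=True)` on a machine that has FFmpeg installed. Keeping the command as a list rather than a single string avoids shell quoting problems with file names.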

**Other posts and quick reads**

* [Build a GNN-based real-time fraud detection solution using the Deep Graph Library without using external graph storage]( provides a step-by-step process for training and evaluating a Relational Graph Convolutional Network (RGCN) model for real-time fraud detection using the open source Deep Graph Library \[hands on]
* [Reduce Amazon EMR cluster costs by up to 19% with new enhancements in Amazon EMR Managed Scaling]( provides an overview of enhancements in EMR Managed Scaling which show improved cluster utilisation (by up to 15 percent) and a reduction in cluster costs \[hands on]
  ![image.png]( "image.png")
* [Using OpsWatch to Create a Single Pane of Prometheus Metrics from Multiple Non-Native Sources]( explores how to integrate CloudWatch and other Amazon Web Services data sources into a typical container and Prometheus-based monitoring world
  ![image.png]( "image.png")
* [Automated Application Failover Across Availability Zones with Floating/Virtual IP on Amazon EKS]( looks at design patterns that let you fail over your EKS-based applications seamlessly to another AZ while using the same IP addresses, in an automated way, with no change needed in the application code \[hands on]
* [SaaS Data Isolation with Dynamic Credentials Using HashiCorp Vault in Amazon EKS]( examines a solution for implementing multi-tenant SaaS data isolation with dynamic credentials using HashiCorp Vault in an Amazon EKS environment \[hands on]
  ![image.png]( "image.png")

**Case Studies**

* [How Wiz used Amazon ElastiCache to improve performance and reduce costs]( is a great case study on how they were able to improve overall application performance, reduce pressure on their database, and then right-size the database instances to lower their overall TCO.
  ![image.png]( "image.png")
* [Rebura: Accelerate SQL Server database modernization with Babelfish for Amazon Aurora PostgreSQL]( explores how Rebura helps customers modernise their SQL Server on Amazon Web Services using Babelfish for Amazon Aurora PostgreSQL.
  ![image.png]( "image.png")

### Quick updates

**Amazon Web Services Lambda Powertools**

Amazon Web Services Lambda Powertools, an open-source developer library, now supports .NET to help you incorporate Well-Architected Serverless best practices into your .NET Lambda function code as early and as fast as possible. Lambda Powertools for .NET is used when developing code for the .NET 6 Lambda runtime. To dive deeper into the GA announcement, read the post from Amir Khairalomoum, [Introducing Amazon Web Services Lambda Powertools for .NET](

![image.png]( "image.png")

**OpenSearch**

OpenSearch 2.6.0 is now available, with a new data schema built to OpenTelemetry standards that unlocks an array of future capabilities for analytics and observability use cases. This release also delivers upgrades for index management, improves threat detection for security analytics workloads, and adds functionality for visualization tools, machine learning (ML) models, and more. Read the full announcement from James McIntyre, [Introducing OpenSearch 2.6](

In related news, Krishna Kondaka, Asif Sohail Mohammed, and David Venable collaborated on the post, [Announcing Data Prepper 2.1.0]( that shared news about the latest release of Data Prepper, an open source tool that accepts, filters, transforms, enriches, and routes data into your OpenSearch environment.

**PostgreSQL**

Amazon Relational Database Service (Amazon RDS) for PostgreSQL now supports the latest major version PostgreSQL 15.
New features in PostgreSQL 15 include the SQL standard "MERGE" command for conditional SQL queries, performance improvements for both in-memory and disk-based sorting, and support for two-phase commit and row/column filtering for logical replication. The PostgreSQL 15 release also adds support for the new extension pg_walinspect, and server-side compression with Gzip, LZ4, or Zstandard (zstd) using pg_basebackup.

**MySQL**

Amazon Aurora MySQL-Compatible Edition 3 (with MySQL 8.0 compatibility) now supports MySQL 8.0.26. In addition to several security enhancements and bug fixes, MySQL 8.0.26 includes several changes, such as enhanced tablespace file segment page configuration and new aliases for certain identifier names. For more details, please review the Aurora MySQL 3 and MySQL 8.0.26 release notes.

**MariaDB**

Amazon Relational Database Service (Amazon RDS) for MariaDB now supports MariaDB minor versions 10.6.12, 10.5.19, 10.4.28 and 10.3.38. We recommend that you upgrade to the latest minor versions to fix known security vulnerabilities in prior versions of MariaDB, and to benefit from the numerous bug fixes, performance improvements, and new functionality added by the MariaDB community.

**Amazon Web Services SAM**

Serverless application developers can now define multiple destinations when integrating Amazon Web Services services with Amazon Web Services Serverless Application Model (Amazon Web Services SAM) connectors. Previously, SAM customers needed to create a SAM connector definition for every source and destination pair. For example, if an AWS::Serverless::Function needed to interact with three SNS topics using identical permissions, three SAM connectors would need to be defined, one for each connection from the function to a topic.

### Videos of the week

**FreeRTOS**

Join Daniel Gross, Developer Advocate within the Amazon Web Services IoT team, as he shows you how you can debug FreeRTOS within VSCode using QEMU.
Check out the links so you can follow along with the blog post and supporting code. Very cool.

<video src="" class="bytemdVideo" controls="controls"></video>

**Improving Secret Management in K8s with ESO**

Managing secrets in Kubernetes can be a challenge. One of the optimal approaches to storing and making use of sensitive data in your clusters is to incorporate the use of external centralized secrets managers. Centralized secrets managers usually offer encryption of data at rest and expose an API for lifecycle management operations of your secrets. So how do you integrate secrets from external providers and securely expose them in your cluster? In this episode, your host Jimmy Ray is joined by Emin Alemdar, who walks through the External Secrets Operator (ESO) and some good practices for managing secrets in Kubernetes.

<video src="" class="bytemdVideo" controls="controls"></video>

**OpenSearch**

This short video covers how to install and configure OpenSearch 2.5.0 with OpenSearch Dashboards on Ubuntu.

<video src="" class="bytemdVideo" controls="controls"></video>

**Build on Open Source**

Episode two of Build on Open Source streamed live last Friday, 3rd March. Derek and I covered newsletters #146 and this one, #147. We had a special guest, Toni de la Fuente, who walked us through his open source project, prowler. **[prowler](** helps keep your Amazon Web Services environments secure by running audits against well-known security compliance checks, and Toni walked us through a demo of it in action. I highly recommend you watch it if you can, as this project is awesome! You can watch it on [replay here](

For those unfamiliar with this show, Build on Open Source is where we go over this newsletter and then invite special guests to dive deep into their open source project. Expect plenty of code, demos and hopefully laughs. We have put together a playlist so that you can easily access all (eight) of the episodes of the Build on Open Source show.
[Build on Open Source playlist](

### Events for your diary

If you are planning any events in 2023, either virtual, in person, or hybrid, get in touch as I would love to share details of your event with readers.

**Build on Open Source**\
**March 17th,**

The third episode of Build on Open Source features special guest Amazon Web Services Community Builder John Preston, who will be showing us compose-x, an open source tool to help you deploy applications using Amazon ECS (and other Amazon services). Really looking forward to this one as John is a superstar open source developer. See you there on [](, Friday 3rd at 9am GMT, 10am CET.

**Power Up your Kubernetes**\
**March 15th, Amazon Web Services Office Zurich, Switzerland**

If you want to improve the architecture, scaling and monitoring of your applications that run on Amazon Elastic Kubernetes Service, this event is for you. During this event you will learn to scale Kubernetes applications with Karpenter, monitor your workloads, and build SaaS architectures for Kubernetes. Find out more and save your place by heading over to the registration page, [Power up your Kubernetes on Amazon Web Services](

**Everything Open**\
**March 14-15th, Melbourne, Australia**

A new event for the fine folks in Australia. Everything Open is running for the first time, and the organisers (Linux Australia) have decided to run this event to provide a space for a cross-section of the open technologies communities to come together in person. Check out the [event details here]( The CFP is currently open, so why not take a look and submit something if you can.

**FOSSASIA**\
**April 13th-15th, Singapore**

FOSSASIA Summit 2023 returns as an in-person and online event, taking place from Thursday 13th April to Saturday 15th April at the Lifelong Learning Institute in Singapore.
If you are interested in attending in person, or virtually, find out more about the event at the [FOSSASIA Summit 2023 page](

**Cortex**\
**Every other Thursday, next one 16th February**

The Cortex community call happens every two weeks on Thursday, alternating at 1200 UTC and 1700 UTC. For more details, check out the GitHub project and go to the [Community Meetings]( section. The community calls keep a rolling doc of previous meetings, so you can catch up on the previous discussions. Check the [Cortex Community Meetings Notes]( for more info.

**OpenSearch**\
**Every other Tuesday, 3pm GMT**

This regular meet-up is for anyone interested in OpenSearch & Open Distro. All skill levels are welcome, and the sessions cover and welcome talks on topics including search, logging, log analytics, and data visualisation. Sign up to the next session, [OpenSearch Community Meeting](

### Stay in touch with open source at Amazon Web Services

Remember to check out the [Open Source homepage](\&opensource-all.sort-order=asc) to keep up to date with all our activity in open source, and follow us on [@AWSOpen](