Amazon Web Services open source newsletter, #180

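This curated international content brings together high-quality Amazon Web Services technical content from around the world. Note that "AWS" as used in this content is an abbreviation of "Amazon Web Services" and is not displayed as a trademark on this site.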
## November 20th, 2023 - Instalment #180

Welcome to #180 of the Amazon Web Services open source newsletter, the place for all your Amazon Web Services and open source needs. As we ramp up to re:Invent, it is good to see that pre:Invent is giving us plenty of open source goodies. In this week's newsletter, we have some of those in the form of new projects such as **res** and **aws-iatk**, but we also have lots of really great content too. In this edition, we have a curated set of observability content which I think you will really enjoy, as well as posts that cover projects like Ragna, ezsmdeploy, SnapStart, GraalVM, Amazon Corretto, Amazon EMR, Apache Airflow, LangChain, Amazon Copilot, Karpenter, Grafana, Prometheus, Kubernetes, Apache Flink, Apache Kafka, Avro, Apache Cassandra, Apache Spark, Amazon Amplify, Redis, MySQL, PostgreSQL, eksctl, MapLibre, Overture Maps, and more!

As always, do not skip the end, where videos and events await your gaze. Speaking of events, I am heading to Vilnius later this week to speak at Big Data Europe. I have a couple of sessions related to Apache Airflow, so if you are going to be attending, get in touch; I would love to meet up.

Before diving into the newsletter, make sure you check out the following launch from last week, which I was super excited about.

**Open source security**

I was very happy to see two important new resources launched this week. First up was news that we now have a dedicated page for open source security. You can view the page, [Amazon Open Source Security]( which shares our approach to open source security together with some useful examples, data points, and supporting articles. If that was not enough, we also launched [Open Source Cryptography]( that looks at some of the great open source innovations that Amazon Web Services has produced in the area of cryptography. Bookmark both of these resources and let us know what you think.
**Feedback**

Please please please take 1 minute to [complete this short survey]( I have Amazon Web Services credit codes for the first 50 as a thank you!

### Celebrating open source contributors

The articles and projects shared in this newsletter are only possible thanks to the many contributors in open source. I would like to shout out and thank those folks who really do power open source and enable us all to learn and build on top of what they have created.

So thank you to the following open source heroes: Haoween You, Joe Dahlquist, Chetan Patwal, Ali Alemi, Kiran Singh, Chirag Dave, Vinodkrishna Gopalan, Yoni Shalom, Sashi Varanasi, Roberto Luna Rojas, Steven Hancz, Elad Bernstein, Karthik Konaparthi, Erik Hanchett, Ashish Nanda, Pascal Vogel, Rakshith Rao, John Jackson, Rachel Leekin, Michael (Mike) Masaaud, Jihed Mselmi, Shreyas Subramanian, Ray Khorsandi, Hemanth Vemulapalli, Abhishek Gupta, Ruchi Mishra, Naga Gaddamu, Vadym Kazulkin, Harry at Data Smiles, Nowsath, Brendan Bouffler, Dan Fox, and Brian Krygsman.

### Latest open source projects

*The great thing about open source projects is that you can review the source code. If you like the look of these projects, make sure that you take a look at the code, and if it is useful to you, get in touch with the maintainer to provide feedback, suggestions or even submit a contribution. The projects mentioned here do not represent any formal recommendation or endorsement, I am just sharing for greater awareness as I think they look useful and interesting!*

#### Tools

**aws-iatk**

[aws-iatk]( is the Amazon Integrated Application Test Kit (IATK), a new open-source test library that makes it easier for developers to create tests for cloud applications with increased speed and accuracy. With Amazon IATK, developers can quickly write tests that exercise their code and its Amazon Web Services integrations against an environment in the cloud, making it easier to catch mistakes early in the development process.
IATK includes utilities to generate test events, to validate event delivery and structure in Amazon EventBridge Event Bus, and to assert on call flow using Amazon X-Ray traces. The [Amazon IATK]( is available for Python 3.8+. To help you get started, check out the supporting blog post from Dan Fox and Brian Krygsman, [Introducing the Amazon Integrated Application Test Kit (IATK)](

**res**

[res]( Research and Engineering Studio on Amazon Web Services (RES) is an open source, easy-to-use web-based portal for administrators to create and manage secure cloud-based research and engineering environments. Using RES, scientists and engineers can visualise data and run interactive applications without the need for cloud expertise. With just a few clicks, scientists and engineers can create and connect to Windows and Linux virtual desktops that come with the pre-installed applications, shared data, and collaboration tools they need. With RES, administrators can define permissions, set budgets, and monitor resource utilisation through a single web interface. RES virtual desktops are powered by Amazon EC2 instances and NICE DCV. RES is available at no additional charge. You pay only for the Amazon Web Services resources needed to run your applications.

Dive deeper by reading [New: Research and Engineering Studio on Amazon Web Services](, where Brendan Bouffler explains what RES is, how it works, and how to deploy it in your own Amazon Web Services account and get it ready for your users.

**aws-cdk-stack-builder-tool**

[aws-cdk-stack-builder-tool]( or Amazon CDK Builder, is a browser-based tool designed to streamline bootstrapping of Infrastructure as Code (IaC) projects using the Amazon Cloud Development Kit (CDK). Equipped with a dynamic visual designer and instant TypeScript code generation capabilities, the CDK Builder simplifies the construction and deployment of CDK projects.
It stands as a resource for all CDK users, providing a platform to explore a broad array of CDK constructs. Very cool indeed, and you can deploy it on Amazon Cloud9, so this project is on my weekend to-do list.

**terraform-aws-ecr-watch**

[terraform-aws-ecr-watch]( is a project from the folks at Porsche. When they are not busy designing super fast cars, their engineers are busy creating useful open source tools for folks to use. This project is a Terraform module to configure an Amazon ECR Usage Dashboard based on Amazon CloudWatch Logs Insights queries with data fetched from Amazon CloudTrail.

### Demos, Samples, Solutions and Workshops

**ragna**

[ragna]( this is a repo I put together to show you how you can add Amazon Bedrock models from Anthropic and Meta within the Ragna tool. I blogged about this last week in [#179]( but I have put together this repo to show the actual code, as I had received quite a few DMs. As a bonus, I have also added the recently announced Llama2 13B model from Meta. To help with this, a new blog post, [Adding Amazon Bedrock Llama2 as an assistant in Ragna]( will help you get this all up and running. There is also lots of useful info in the project README.

**aws-external-package-security**

[aws-external-package-security]( provides code to set up a solution that demonstrates how you can deploy Amazon Code Services (e.g., Amazon CodePipeline, Amazon CodeBuild, Amazon CodeGuru Security, Amazon CodeArtifact) to orchestrate secure access to external package repositories from an Amazon SageMaker data science environment configured with multi-layer security. The solution can also be extended to cover general developer workflows, where developers use external package dependencies.
### Amazon Web Services and Community blog posts

**Community round up**

Starting strongly in this week's community round up, we have Amazon Web Services Community Builder Vadym Kazulkin, who has put together a great read for those building modern Java apps and looking for ways to optimise how these work, in his post [Reducing Cold Starts on Amazon Lambda with Java Runtime - Future Ideas about SnapStart, GraalVM and Co](

Amazon Web Services Community Builder Nowsath has a quick, handy post for those working with Amazon EMR who might be encountering the occasional error (after all, who creates stuff that works first time... yeah, exactly, no one!). Check out [Most common errors when setting up Amazon EMR]( the next time you are planning to work on Amazon EMR; you never know, it might help save you some time. I am also a big fan of including errors and issues in my blog posts. The number of messages I have received from people who were searching for a fix and came across my blog posts is enough validation for me.

Next up we have Harry at Data Smiles with [Mastering Apache Airflow: My Essential Best Practices for Robust Data Orchestration](, covering good practices when working with Apache Airflow. There is a lot of sense in that post, so whether you are new or experienced, it is well worth checking out.

To finish up, a few posts from some of my colleagues. First up we have Abhishek Gupta with [Use Amazon Bedrock and LangChain to build an application to chat with web pages](, who shows you how you can build and package a conversational application using Amazon CDK, LangChain, and of course Go. Next up we have a longer tutorial from Hemanth Vemulapalli, [Break a Monolithic Application Into Microservices With Amazon Web Services Migration Hub Refactor Spaces, Amazon Copilot]( which walks you through the process of decomposing a monolith into microservices, leveraging the strangler fig pattern using Refactor Spaces and Amazon Copilot.
To finish this week's community round up, we have [How to launch Ray Clusters on Amazon Web Services]( where Ruchi Mishra and Naga Gaddamu look at how you can get started with Ray on Amazon Web Services. Ray is an open source, general purpose library for distributed computing, and it offers an ecosystem of native libraries to scale ML workloads.

**ezsmdeploy**

I first covered ezsmdeploy back in [#25](, an open source Python package that helps you easily deploy machine learning models and provides a variety of options such as passing one or more model files, automatic selection of instances, and autoscaling. In [Deploy Large Language Models Easily with the New ezsmdeploy Python SDK](, Shreyas Subramanian and Ray Khorsandi cover the new features of the ezsmdeploy 2.0 SDK, providing some code examples to demonstrate how to launch and interact with popular foundation models (FMs). This is a really nice update, so make sure you check this out if you are exploring the world of generative AI and looking for open source tools to help make your life easier.

**Karpenter**

Karpenter is an open-source, high-performance Kubernetes cluster autoscaler that automatically provisions new nodes in response to unschedulable pods. In the post, [Deliver Namespace as a Service multi tenancy for Amazon EKS using Karpenter]( Rachel Leekin, Michael (Mike) Masaaud, and Jihed Mselmi show how you can use Karpenter to provision nodes, and to scale the cluster up and down per tenant without impacting other tenants. \[hands on]

**Apache Airflow**

For folks who have been using Managed Workflows for Apache Airflow (MWAA) for a while, you may be familiar with a long-requested ask around how to deploy MWAA into more customised Amazon VPC configurations. Specifying customer-managed endpoints is sometimes a requirement for customers to meet strict security policies by explicitly restricting VPC resource access to just those needed by their Amazon MWAA environments.
It was great to read [Introducing shared VPC support on Amazon MWAA](, where John Jackson provides a hands on guide on how to automate environment creation with shared VPC support in Amazon MWAA. This gives you the ability to manage your own endpoints within your VPC, adding compatibility with shared, or otherwise restricted, VPCs.

**Observability**

Last week saw some great content covering all things related to observability, so here are my picks:

* [Extend your Amazon Managed Grafana experience with Grafana community plugins]( looks at a new self-service plugin management experience for Grafana community plugins that enables you to unify data from a wider variety of data sources with visualisations tailored to analyse your unique datasets \[hands on]
* [.NET Observability with OpenTelemetry – Part 1: Metrics using Amazon Managed Prometheus and Grafana]( is the first in a series that looks at the implementation of OpenTelemetry in an ASP.NET website to capture application metrics \[hands on]
* [Analyzing Amazon Lex conversation log data with Amazon Managed Grafana]( covers how to enable and use conversation logs, and how to use Grafana to visualize and report on key metrics \[hands on]
* [Announcing Amazon CloudWatch Container Insights with Enhanced Observability for Amazon EKS on EC2]( probably has more graphs than I have ever seen in a post before, and looks at the various features introduced as part of Amazon CloudWatch Container Insights with enhanced observability for Amazon Elastic Kubernetes Service - expect lots of graphs!
\[hands on]
* [Monitoring MongoDB Atlas with Amazon Web Services Managed Grafana and Amazon Managed Service for Prometheus]( describes how to use Amazon Managed Service for Prometheus (AMP) and Amazon Managed Grafana (AMG) for monitoring MongoDB Atlas Clusters \[hands on]

**Other posts and quick reads**

* [Implement Apache Flink near-online data enrichment patterns]( covers how you can implement data enrichment for near-online streaming events with Apache Flink and how you can optimize performance \[hands on]
* [Converting Apache Kafka events from Avro to JSON using EventBridge Pipes]( shows how to reliably consume, validate, convert, and send Avro events from Kafka to Amazon Web Services and third-party services using EventBridge Pipes, allowing you to reduce custom deserialization logic in downstream services \[hands on]
* [Announcing frozen collections in Amazon Keyspaces]( looks at support for frozen collections in Amazon Keyspaces, and discusses various use cases for frozen collections and how to use the three different types of collections: MAP, LIST, and SET \[hands on]
* [Simplifying data processing at Capitec with Amazon Redshift integration for Apache Spark]( is a case study on how the Capitec team successfully implemented the Amazon Redshift integration for Apache Spark to simplify their feature computation workflows \[hands on]
* [Run Amazon EKS on RHEL Worker Nodes with IPVS Networking]( provides details of how to create Amazon EKS worker nodes that run on RHEL 8 and RHEL 9 Amazon EC2 instances and how to run your Amazon EKS clusters in IPVS networking mode \[hands on]
* [Build trust and safety for generative AI applications with Amazon Comprehend and LangChain]( explores how you can use Amazon Comprehend ContentModerationChain to add trust and safety features to any LLM workflow, including Retrieval Augmented Generation
(RAG) workflows implemented in LangChain \[hands on]

### Quick updates

**Amazon Linux 2023**

Amazon Lambda now supports Amazon Linux 2023 as both a managed runtime and a container base image. This runtime has a significantly smaller deployment footprint than Amazon Linux 2 runtimes, and provides updated versions of common libraries such as glibc. The Amazon Linux 2023 runtime will also be used as the basis for future Lambda runtime releases, such as Node.js 20, Python 3.12, Java 21, and .NET 8.

The Amazon Linux 2023 runtime provides an OS-only execution environment for Lambda functions. It is based on the minimal container image release of AL2023. An OS-only Lambda environment is useful in three scenarios: when using languages which are compiled to native code, such as Go or Rust; when using third-party runtimes such as Bref for PHP; or when using custom runtimes. Amazon Web Services will automatically apply updates and security patches to the managed runtime and container base image as they become available.

Rakshith Rao provides more details in the post, [Introducing the Amazon Linux 2023 runtime for Amazon Lambda](

**Node.js**

Amazon Lambda now supports creating serverless applications using Node.js 20. Developers can use Node.js 20 as both a managed runtime and a container base image, and Amazon Web Services will automatically apply updates to the managed runtime and base image as they become available. You can use Node.js 20 with Lambda\@Edge, allowing you to customize low-latency content delivered through Amazon CloudFront. Powertools for Amazon Lambda (TypeScript), a developer toolkit to implement serverless best practices and increase developer velocity, also supports Node.js 20. The Lambda Node.js 20 runtime is built on the new Amazon Linux 2023 runtime, which is based on the AL2023 minimal container image.
This provides a significantly smaller deployment footprint than earlier Amazon Linux 2-based runtimes, updated versions of common libraries such as glibc, and a new package manager. The Node.js 20 runtime also provides configurable certificate loading for faster cold starts, as well as supporting new Node.js 20 language features.

Read Pascal Vogel's post to find out more, [Node.js 20.x runtime now available in Amazon Lambda]( \[hands on]

**Amazon Amplify**

Amazon Amplify JavaScript Library v6 is now available. This includes reduced bundle sizes, richer TypeScript support, and integrations with Next.js server-side features. The Amazon Amplify JavaScript Library enables frontend developers to connect their web and React Native apps to Amazon Web Services cloud backends.

In this release, Amplify JavaScript now offers richer TypeScript support, enhancing developer productivity and reducing runtime errors. Additionally, apps using this new release will be served with smaller bundle sizes. Amplify JavaScript v6 also introduces an integration with Next.js server-side features such as App Router, Middleware, API routes, and server functions.

Developers can now build web apps leveraging improved TypeScript interfaces for a more intuitive development experience, including syntax highlighting, code completion, and type checking. We have improved app load times by reducing the bundle sizes served with Amplify JavaScript. This version also enables developers to use Amplify JavaScript in Next.js applications client-side or server-side, including in Middleware, API routes, and with the new App Router.

Want more info? In that case, why not dive into this post, [Building fast Next.js apps using TypeScript and Amazon Amplify JavaScript v6](, where Erik Hanchett and Ashish Nanda happily oblige.

**Redis**

Amazon ElastiCache for Redis version 7.1 is now generally available.
This release contains performance improvements which enable workloads to drive higher throughput and lower operation latencies. ElastiCache customers can achieve over 1 million requests per second per node on r7g.4xlarge or larger. Amazon ElastiCache for Redis version 7.1 can achieve up to 100% more throughput and 50% lower P99 latency, compared to ElastiCache for Redis version 7.0.

Customers choose ElastiCache to power some of their most performance-sensitive, real-time applications that require fast data access, fast response time and cost reduction. As they scale their applications, they have a growing need for high throughput, while keeping latency under 1ms per request. Today, customers can double the performance by upgrading from ElastiCache for Redis 7.0 to 7.1, and can scale to 500 million requests per second (RPS) with microsecond response time.

Also published this week, on an unrelated note, was a great post to help guide you in sizing your Redis clusters using good practices. Elad Bernstein and Karthik Konaparthi are your hosts in [Best practices for sizing your Amazon ElastiCache for Redis clusters]( If that was not enough, Sashi Varanasi, Roberto Luna Rojas, and Steven Hancz collaborated on [Optimize cost and boost performance of RDS for MySQL using Amazon ElastiCache for Redis](, which very nicely details how you can optimise your relational database costs with in-memory caching using Amazon ElastiCache.

**MySQL**

Amazon RDS for MySQL now supports MySQL Innovation Release 8.1 in the Amazon RDS Database Preview Environment, allowing you to evaluate the latest Innovation Release on Amazon RDS for MySQL. You can deploy MySQL 8.1 in the Amazon RDS Database Preview Environment, which has the benefits of a fully managed database, making it simpler to set up, operate, and monitor databases. MySQL 8.1 is the first Innovation Release from the MySQL community.
MySQL Innovation releases include bug fixes, security patches, and new features. MySQL Innovation releases are supported by the community until the next major & minor release, whereas MySQL Long Term Support (LTS) Releases, such as MySQL 8.0, are supported by the community for up to eight years.

**PostgreSQL**

Amazon RDS Proxy is a fully managed, highly available database proxy for Amazon Aurora and RDS databases. RDS Proxy allows customers to gracefully scale applications by efficiently reusing database connections. For applications using PostgreSQL Extended Query Protocol, RDS Proxy can now reuse database connections, resulting in efficient use of database resources.

Many applications and database drivers use PostgreSQL Extended Query Protocol for improved security and performance. Prior to this launch, when RDS Proxy encountered an extended query protocol message, it automatically “pinned” the database connection, which meant applications could not benefit from efficient database connection reuse. Now, RDS Proxy will continue to pool and share database connections when it detects an extended query protocol message, improving your database efficiency and application scalability. This functionality is available for existing as well as new RDS Proxy customers by default.

It would be great if you could get more details, right? Well, why not try [Amazon RDS Proxy multiplexing support for PostgreSQL Extended Query Protocol](, where Kiran Singh, Chirag Dave, Vinodkrishna Gopalan, and Yoni Shalom provide a detailed hands on guide on getting started with this.

**Apache Kafka**

Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless now supports writes and reads from Kafka clients written in all programming languages. Administrators can simplify and standardise access control to Kafka resources using Amazon Identity and Access Management (IAM).
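
To make the IAM access control a little more concrete, here is a minimal Python sketch of how a non-Java client might be configured. It is not taken from the launch posts: it assumes the `kafka-python` client and the open-sourced `aws-msk-iam-sasl-signer-python` helper library, with the client authenticating over SASL/OAUTHBEARER. The `MSKTokenProvider`, `build_client_config`, and `make_producer` names are my own, the bootstrap address and region in the usage note are hypothetical, and depending on your `kafka-python` version you may need to subclass the library's abstract token provider rather than use a plain class.

```python
"""Sketch: IAM (SASL/OAUTHBEARER) settings for a Python Kafka client on Amazon MSK."""


class MSKTokenProvider:
    """Token provider in the shape kafka-python looks for: an object with token()."""

    def __init__(self, region):
        self.region = region

    def token(self):
        # Imported lazily so this module still loads where the signer is not installed.
        # Assumes: pip install aws-msk-iam-sasl-signer-python
        from aws_msk_iam_sasl_signer import MSKAuthTokenProvider

        # generate_auth_token returns (token, expiry_ms); only the token is used here.
        auth_token, _expiry_ms = MSKAuthTokenProvider.generate_auth_token(self.region)
        return auth_token


def build_client_config(bootstrap_servers, region):
    """Return keyword arguments for kafka-python's KafkaProducer or KafkaConsumer."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "security_protocol": "SASL_SSL",   # TLS plus SASL authentication
        "sasl_mechanism": "OAUTHBEARER",   # the mechanism MSK IAM support is based on
        "sasl_oauth_token_provider": MSKTokenProvider(region),
    }


def make_producer(bootstrap_servers, region):
    """Create a producer using the IAM settings (needs kafka-python and AWS credentials)."""
    from kafka import KafkaProducer  # Assumes: pip install kafka-python

    return KafkaProducer(**build_client_config(bootstrap_servers, region))
```

With credentials available in the environment, something like `make_producer("<your bootstrap string>:9098", "eu-west-1").send("example-topic", b"hello")` would be the next step. The posts linked in this section remain the place to go for the supported setup in each language.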
Amazon MSK’s IAM support is based on SASL/OAUTHBEARER, an open standard for authorisation and authentication.

Amazon MSK Serverless is a cluster type for Amazon MSK that allows you to run Apache Kafka without having to manage and scale cluster capacity. MSK Serverless automatically provisions and scales compute and storage resources, so you can use Apache Kafka on demand. Developers can now build applications on Amazon MSK Serverless with minimal code changes using Amazon MSK’s open-sourced client helper libraries and code samples for popular languages, including Java, Python, Go, JavaScript, and .NET. With the multiple language support on Amazon MSK Serverless, customers can also use standard IAM access controls, such as temporary role-based credentials and precisely scoped permission policies, more broadly.

Check out [Amazon MSK Serverless now supports Kafka clients written in all programming languages]( where Ali Alemi provides a hands on guide showing how you can connect your applications to MSK Serverless using these helper libraries and code samples. Ali also published [Amazon MSK IAM authentication now supports all programming languages](, so make sure you check that out too.

### Videos of the week

**eksctl - the CLI for EKS**

The topic for this Weaveworks Office Hours session is eksctl, the official CLI for Amazon EKS. The hosts for this session include Joe Dahlquist, VP Product Marketing, and Chetan Patwal, Senior Software Engineer. Learn about eksctl, an open source tool that simplifies the creation and management of clusters on Amazon EKS.

**Open Source with Amazon Geospatial**

Haoween You from Amazon Geospatial talks about recent open-source developments at Amazon Web Services. This presentation was given at the Open Visualization Collaborator Summit 2023.
### Events for your diary

If you are planning any events in 2023, either virtual, in person, or hybrid, get in touch as I would love to share details of your event with readers.

**Big Data Europe**\
**21st-24th November, Online/Vilnius, Lithuania**

I will be speaking at the Big Data Europe event, talking about how you can shift left and apply modern developer approaches to manage your Apache Airflow workflows. This builds upon a lot of the other work I have done in this space, so I am really looking forward to doing this talk. Check out the [event page]( for registration details, as well as to check out all the other sessions, many of which feature open source projects and technologies.

**re:Invent**\
**27th November-1st December, Las Vegas, USA**

The annual must-attend conference for all Amazon Web Services developers is back, with another strong line up of open source sessions, chalk talks, builder sessions, workshops and more. There will be a super cool open source booth, with a line up of great demos. I have taken a sneak peek, so make sure you check out the demo schedule at the booth. Find out more by checking out the event page, [re:Invent 2023](

The Amazon EKS team have published a post to help ensure that you do not miss the best Kubernetes related sessions. Go check it out at [Amazon EKS and Kubernetes sessions at Amazon Web Services re:Invent 2023](

**Cortex**\
**Every other Thursday, next one 16th February**

The Cortex community call happens every two weeks on Thursday, alternating at 1200 UTC and 1700 UTC. For more details, check out the GitHub project and go to the [Community Meetings]( section. The community calls keep a rolling doc of previous meetings, so you can catch up on the previous discussions. Check the [Cortex Community Meetings Notes]( for more info.

**OpenSearch**\
**Every other Tuesday, 3pm GMT**

This regular meet-up is for anyone interested in OpenSearch & Open Distro.
All skill levels are welcome, and the sessions welcome talks on topics including search, logging, log analytics, and data visualisation. Sign up to the next session, [OpenSearch Community Meeting](

### Stay in touch with open source at Amazon Web Services

Remember to check out the [Open Source homepage](\&opensource-all.sort-order=asc?trk=cndc-detail) and keep up to date with all our activity in open source by following us on [@AWSOpen](