{"value":"##### **November 1st, 2021 - Instalment #88**\nNewsletter #88.\n\nNumber 88 symbolises fortune and good luck in Chinese culture (I hope that is correct, so please let me know if it is not) and I hope you will feel all the luckier for chancing upon this week's selection of open source projects and posts.\n\nThis week, the big news was the publishing of the Babelfish repository. Check out the launch post as well as additional content and links to this interesting open source project. We have plenty of other new open source projects such as nixtla, an open source time series forecasting library ready to roll in your Amazon Web Services environment, shrimp and s3sha256sum, tools that will help you work with large files over slow or unreliable connections, Amazon Web Services-sso-cli, a project to help you configure SSO within your command line, and many more.\n\nWe have community and Amazon content covering OpenSearch, PyDeequ, ConsoleMe, Apache DistCp, Apache Airflow, Delta Sharing, Apache Solr, Synchro Charts, Amazon SAM, Bottlerocket and more.
We also have a bumper set of videos; this week's topics include building Open Data Lakes, contributing to Terraform, Hugging Face and how to get started with GitHub Actions.\n\nBefore diving in, make sure you check out the Bug Bust article just below. My team are also hiring a couple of open source folks, so check out the job descriptions and feel free to DM me if you want a chat or to get any more details - could you be the one?\n\n##### **Celebrating open source contributors**\n\nThe articles posted in this series are only possible thanks to contributors and project maintainers and so I would like to shout out and thank those folks who really do power open source and enable us all to build on top of what they have created.\n\nSo thank you to the following open source heroes: Stefan Sundin, Aaron Turner, Saul Magnusson, Sarat Vemulapalli, Robert Hafner, Tyler Lynch, Drew Mullen, Richard Boyd, Sheetal Joshi, Frank Munz, Sebastien Stormacq, Andrew Lee, Hammad Ausaf, Eric Johnson, Mahendar Gajula, Nitin Srivastava, Sean Tracey, Brian Diehr, Andrea Di Simone, Samuel Passman, Kinnar Kumar Sen, Anjan Biswas, Ananth Raghavendra, Gary Stafford, Guilherme Sena Zuza, Suman Kumar Gangopadhyay, Sebastian Bille and Anton Rubin\n\nMake sure you find and follow these builders and keep up to date with their open source projects and contributions.\n\n##### **Community noticeboard**\n###### **Bug Bust**\n\nAs re:Invent fast approaches, it was interesting to read about the Amazon Web Services BugBust re:Invent Challenge at this year’s Amazon re:Invent conference. It will be an attempt to create a new World Record for “Largest Bug Fixing Competition” as recognised by Guinness World Records. As part of this, Amazon Web Services will be including a myriad of open source projects that developers will be able to patch and contribute to throughout the event. Bugs can range from security issues, to duplicate code, to resource leaks and more.
To find out more, and how you might be able to register interest for your open source projects, read on in ++[Help Make BugBusting History at Amazon Web Services re:Invent 2021](https://aws-oss.beachgeek.co.uk/11c)++\n\n###### **Jobs**\n\nA couple of openings have come up in the open source team, so if you are looking for a change, and want to work with a really awesome bunch of people, then check these out. Feel free to ping me directly if you want to discuss in more details or ask any other question about the role, the team or working at Amazon Web Services in general.\n\n###### **Principal Evangelist, Open Source, Open Source Strategy & Marketing**\n\nThe OSS Technical Evangelist will be responsible for defining, leading and contributing to the open source and community engagement strategy for service teams across Amazon Web Services. You will combine your passion and enthusiasm for cloud technology and open source with your unmatched creativity to generate grass-roots attention and support for Amazon Web Services among key open source communities, industry opinion makers and technologists. You will work closely with the product marketing leadership to translate the business priorities of the service teams into solid engagements with key open source communities, foundations and open source technology partners. You will also be the voice of the community, building a feedback loop to teams across Amazon to help them understand the dynamics and needs of the open source community and build mechanisms to measure impact of our open source engagements. ++[Read more here](https://aws-oss.beachgeek.co.uk/11h)++.\n\n\n###### **Sr. Open Source Program Mgr, Open Source Strategy & Marketing**\n\nAmazon Web Services Open Source is seeking an experienced Program Manager to join the team. 
The successful candidate will be a key member of the Open Source Strategy and Marketing team, which is responsible for driving high-visibility, strategic programs that directly impact customer experience and perception. The ideal candidate will have a software development background as well as Program management experience and will own and execute complex projects and drive key operational process improvement activities. You will be comfortable in a fast-paced multi-tasked environment, with the ability to drive the program’s strategy and roadmap, collaborate with business and development teams across the company to analyse the cost/benefit of project selection, and manage all aspects of the project execution. ++[Read more here](https://aws-oss.beachgeek.co.uk/11i)++.\n\n##### **Latest open source projects**\n###### **Babelfish**\n\n++[Babelfish](https://aws-oss.beachgeek.co.uk/115)++ last week saw the availability of the ++[Babelfish for PostgreSQL](https://aws-oss.beachgeek.co.uk/114)++ open source project. As a reminder, Babelfish for PostgreSQL is an open source project available under the Apache 2.0 and PostgreSQL licenses that provides the capability for PostgreSQL to understand queries from applications written for Microsoft SQL Server. You can get started by reading my colleague Sebastien Stormacq's post, ++[Goodbye Microsoft SQL Server, Hello Babelfish](https://aws-oss.beachgeek.co.uk/117)++. We also had Chandra sekhar Pathivada and Yogi Barot share this post, ++[Modify SSIS packages from SQL Server to Babelfish for Aurora PostgreSQL](https://aws-oss.beachgeek.co.uk/11r)++, showing you how to modify your existing Microsoft SSIS package connection from SQL Server to Babelfish.\n\n![image.png](https://dev-media.amazoncloud.cn/4c9623080da54317a866567df012a366_image.png)\n\n###### **nixtla**\n\n++[nixtla](https://aws-oss.beachgeek.co.uk/11u)++ this open source library from Nixtla looks interesting. It is an open source time series forecasting library.
You can use the service that Nixtla provide, or use the documentation within the README to host it yourself on Amazon Web Services. Either way, if you are exploring your choices for a time series forecasting library, you should add this to your list.\n\n![image.png](https://dev-media.amazoncloud.cn/0f6ea47af43546b9bfb2f026d5604c86_image.png)\n\nStefan Sundin has been busy, putting together a couple of open source projects and a blog post to show you how to use them, ++[Introducing shrimp and s3sha256sum](https://aws-oss.beachgeek.co.uk/11v)++\n\n###### **shrimp**\n\n++[shrimp](https://aws-oss.beachgeek.co.uk/11x)++ this open source tool helps you upload large files to Amazon S3 over slow connections. Shrimp is optimised for this use case.\n\n###### **s3sha256sum**\n\n++[s3sha256sum](https://aws-oss.beachgeek.co.uk/11w)++ this open source tool is a small program that calculates SHA256 checksums of objects stored on Amazon S3. Use it to verify the integrity of your objects. Perfect for checking that those files have copied successfully (large files that you may have copied using Shrimp perhaps!)\n\nNice work, Stefan!\n\n###### **Amazon Web Services-sso-cli**\n\n++[amazon-sso-cli](https://aws-oss.beachgeek.co.uk/11s)++ this open source, GPL 3 project from Aaron Turner is a secure replacement for using the Amazon configure sso wizard with a focus on security and ease of use for organisations with many Amazon Web Services Accounts and/or users with many IAM Roles to assume. Detailed docs and examples, worth checking out.\n\n###### **amazon-cdk-go-example-static-site**\n\n++[amazon-cdk-go-example-static-site](https://aws-oss.beachgeek.co.uk/11t)++ if you are looking for an example of how to deploy static sites via Amazon Web Services CDK, then check out this project from Saul Magnusson.
This example launches a secure static site hosted in an S3 bucket, distributed by CloudFront, protected by an ACM certificate, and with URIs automatically rewritten by a CloudFront Function.\n\n###### **ds-dashboard**\n\n++[ds-dashboard](https://aws-oss.beachgeek.co.uk/11g)++ Andrea Di Simone and Samuel Passman have put together this project that helps you to deploy and operate a simple data collection and distribution mechanism for your data science projects, which can be used for use cases such as generating dashboards. For a detailed walk through, check out the blog post, ++[Implementing a hub and spoke dashboard for multi-account data science projects](https://aws-oss.beachgeek.co.uk/11f)++\n\n###### **amazon-opensearch-service-monitor**\n\n++[amazon-opensearch-service-monitor](https://aws-oss.beachgeek.co.uk/11o)++ if you are using OpenSearch, then check out this repository that contains a step-by-step demonstration of how to set up a monitoring stack for Amazon OpenSearch Service domains across all specified regions. This example uses Amazon Web Services CDK and Python.\n\n![image.png](https://dev-media.amazoncloud.cn/6c342cc073e14d12a79ed90f2986da18_image.png)\n\n##### **Amazon Web Services and Community blog posts**\n###### **Consoleme**\n\nIn the post ++[Achieving least-privilege at FollowAnalytics with Repokid, Aardvark and ConsoleMe](https://aws-oss.beachgeek.co.uk/11k)++, Guilherme Sena Zuza shares how he used a number of open source tools to help get closer to the principle of least privilege, and how he deployed these tools at FollowAnalytics. [hands on]\n\n![image.png](https://dev-media.amazoncloud.cn/7bb219e0b89f49c29acc3dcef537bc56_image.png)\n\n###### **OpenSearch**\n\nA couple of OpenSearch-related posts this week.\n\nFirst up we have Anton Rubin from Eliatra, who explains some of the core concepts of OpenSearch security in the post, ++[Partner Highlight: Eliatra presents OpenSearch Security Concepts](https://aws-oss.beachgeek.co.uk/11n)++.
The post links to other, deeper dives if you want to gain even more understanding.\n\n![image.png](https://dev-media.amazoncloud.cn/58f0fd74d6e844809610dccf8c08f28d_image.png)\n\nFollowing that we have the post ++[Moving from open source Elasticsearch to OpenSearch](https://aws-oss.beachgeek.co.uk/11q)++, where Sarat Vemulapalli shares an overview of how to move from open source Elasticsearch to OpenSearch.\n\n###### **Apache Airflow**\n\nI missed this the first time around, but caught it last week. Suman Kumar Gangopadhyay provides a step-by-step guide to help set up a pipeline that automates running Spark jobs from an edge node to a Spark cluster, in the blog post ++[Airflow, Spark & S3, stitching it all together](https://aws-oss.beachgeek.co.uk/11l)++ [hands on]\n\n###### **Apache DistCp**\n\nIn the post, ++[Copy large datasets from Google Cloud Storage to Amazon S3 using Amazon EMR](https://aws-oss.beachgeek.co.uk/11a)++ Andrew Lee and Hammad Ausaf demonstrate how to configure an EMR cluster to use open source tools such as Apache DistCp and S3DistCp, and compare the performance of doing large file copies. [hands on]\n\n###### **Delta Sharing**\n\nDelta Sharing is a Linux Foundation open source framework that uses an open protocol to secure the real-time exchange of large datasets and enables secure data sharing across products for the first time. It was great, therefore, to read ++[Delta Sharing on Amazon Web Services](https://aws-oss.beachgeek.co.uk/118)++ from former colleague Frank Munz, who is now a Developer Advocate at Databricks. [hands on]\n\n![image.png](https://dev-media.amazoncloud.cn/bdbb32debd5148a3bbe64ca8a8f0d495_image.png)\n\n###### **Apache Solr**\n\nApache Solr is an open source enterprise search platform built on Apache Lucene.
Kinnar Kumar Sen, Anjan Biswas, and Ananth Raghavendra have collaborated on this post, ++[Deploying and scaling Apache Solr on Kubernetes](https://aws-oss.beachgeek.co.uk/11j)++ that shows you how to deploy a highly available, scalable, and fault-tolerant enterprise-grade search platform with Apache Solr using Amazon Elastic Kubernetes Service (Amazon EKS). [hands on]\n\n![image.png](https://dev-media.amazoncloud.cn/99ae7a4a9d5e4114bcffc35579f9a933_image.png)\n\n###### **PyDeequ**\n\nPyDeequ is an open-source Python wrapper over Deequ (an open-source tool developed and used at Amazon, to help data scientists and engineers with data quality and testing capabilities from Python and PySpark). In the post, ++[Accelerate large-scale data migration validation using PyDeequ](https://aws-oss.beachgeek.co.uk/11b)++ Mahendar Gajula and Nitin Srivastava, show you how you can use this and walk through a step-by-step process to validate large datasets. [hands on]\n\n![image.png](https://dev-media.amazoncloud.cn/f506794dbfd349d19251b2c58c01979f_image.png)\n\n\n###### **Synchro Charts**\n\nSynchro Charts, is a new open source project that is a front-end component library that provides a collection of components to visualise time-series data for application developers with a focus on monitoring, root cause analysis, and analytics. In the post, ++[Visualizing time series data with the open source Synchro Charts](https://aws-oss.beachgeek.co.uk/11d)++ Brian Diehr walks you through some of the features which you can check out at ++[synchrocharts.com](https://aws-oss.beachgeek.co.uk/11e)++\n\n###### **Bottlerocket**\n\n++[Amazon EKS adds native support for Bottlerocket in Managed Node Groups](https://aws-oss.beachgeek.co.uk/116)++ from Sheetal Joshi takes a look at how to set up an Amazon EKS cluster and launch a Bottlerocket managed node group. Bottlerocket is a Linux-based open-source operating system that is purpose-built by Amazon. 
It focuses on security and maintainability, and provides a reliable, consistent, and safe platform for container-based workloads. Dive deeper into what this means by checking out this post. [hands on]\n\n###### **Serverless Framework**\n\nAmazon Web Services Community Builder Sebastian Bille takes a look at two open source projects and how you can combine these when developing and deploying your serverless application in the blog post, ++[Combining Serverless Framework & Amazon Web Services CDK](https://aws-oss.beachgeek.co.uk/11m)++ [hands on]\n\n###### **Amazon Web Services SAM**\n\nEric Johnson shares with you a new capability of Amazon Web Services SAM that aims to increase infrastructure accuracy for testing with sam sync, incremental builds, and aggregated feedback for developers. Amazon Web Services SAM Accelerate brings the developer to the cloud and not the cloud to the developer. In the post, ++[Accelerating serverless development with Amazon Web Services SAM Accelerate](https://aws-oss.beachgeek.co.uk/119)++ he shows you how to bypass most local emulation by testing serverless applications in the cloud against production services using Amazon Web Services SAM Accelerate. When I saw a demo of this, it was one of the most exciting things I have seen in a while. If you are a serverless developer, you should definitely check this out. [hands on]\n\n![image.png](https://dev-media.amazoncloud.cn/e296e788b36e4bf5b37cf53f653a5189_image.png)\n\n###### **GitHub Actions**\n\nIf you are looking to use GitHub Actions to build your open source projects and need to integrate with Amazon Web Services, then the recent release of GitHub Actions OpenID Connect will be good news for you. In this blog post, ++[Using Github Actions OpenID Connector to push to Amazon ECR without Credentials](https://aws-oss.beachgeek.co.uk/11p)++ Robert Hafner walks you through setting this up, with an example of using Terraform to push images to Amazon ECR.
Also, check out the videos where we have Richard Boyd walking you through how you can set this up.\n\n\n##### **Quick updates**\n###### **Bottlerocket**\n\nAmazon Elastic Kubernetes Service (EKS) now adds native support for Bottlerocket in EKS managed node groups in all commercial Amazon regions. Most EKS customers today deploy their applications on worker nodes backed by operating systems that are designed for a variety of use cases. Amazon Web Services launched Bottlerocket, a minimal, Linux-based open source operating system that is purpose built and optimised to run containers. When combined, EKS managed node groups and Bottlerocket give customers a simple way to provision and manage compute capacity using the latest best practices for running containers in production. Bottlerocket is now included as a built-in AMI choice for managed node groups, enabling customers to provision container optimised worker nodes with a single click.\n\nEKS customers can easily migrate their applications to run on Bottlerocket based worker nodes and benefit from an improved node security posture. By moving to worker nodes that include only the minimal set of packages needed to run containers, customers benefit from a reduced attack surface, decreased node provisioning time and improved efficiency as more node resources are allocated to applications. This improves cluster utilisation and scale. Managed node groups provides notifications when newer EKS Bottlerocket AMIs are available, enabling customers to more easily update nodes to the latest versions of the software.\n\n###### **PostgreSQL**\n\nA couple of quick updates this week:\n\nFollowing the announcement of updates to the PostgreSQL database by the open source community, we have updated Amazon Aurora PostgreSQL-Compatible Edition to support PostgreSQL 13.4, 12.8, 11.13, and 10.18. These releases contain bug fixes and improvements by the PostgreSQL community. 
As a reminder, Amazon Aurora PostgreSQL 9.6 will reach end of life on January 31, 2022.\n\nFollowing that, we saw news that Amazon Aurora PostgreSQL-Compatible Edition now supports PostGIS major version 3.1. This new version of PostGIS is available on PostgreSQL versions 13.4, 12.8, 11.13, 10.18, and higher. PostGIS allows you to store, query and analyze geospatial data within a PostgreSQL database. PostGIS 3.1 significantly improves the performance of operations such as spatial joins, which now run up to [N]X faster on PostgreSQL 13. As an example, you could use a spatial join to count the number of people living in an area defined by the reception of mobile phones from radio towers. PostGIS 3.1 is the new default version on PostgreSQL 10 and higher starting with the new minor versions. However, you can still create older versions of PostGIS in your PostgreSQL database, e.g., if you require version stability.\n\n##### **Video of the week**\nWe have a bumper selection this week, all great and well worth watching.\n\n###### **Building Open Data Lakes**\n\nFrom blog posts to YouTube, there is no place where Gary Stafford will not go in order to share his knowledge and expertise on this topic.
In this video (grab a cup of your favourite hot beverage) he shows you how you can build a simple open data lake on Amazon Web Services using a combination of open-source software, including Debezium for change data capture (CDC), Apache Kafka, Kafka Connect, Apache Hive, Apache Spark, and Apache Hudi and Hudi's DeltaStreamer.\n\n<video src=\"https://dev-media.amazoncloud.cn/1fa078c20cda4698b7d91d41c8fbeb99_Building%20Open%20Data%20Lakes%EF%BC%9A%20Debezium%2C%20Apache%20Kafka%2C%20Hudi%2C%20Spark%2C%20and%20Hive%20on%20AWS%20%5BRaw%20Cut%5D.mp4\" class=\"manvaVedio\" controls=\"controls\" style=\"width:160px;height:160px\"></video>\n\n###### **Terraform**\n\nFresh from the Amazon Web Services Summit in DC, we have Tyler Lynch and Drew Mullen talking about the process of contributing upstream to the Amazon Web Services provider for Terraform. Dive deep on overcoming imposter syndrome, understanding the process, validating the need, working on the code, acceptance tests, and making a pull request.\n\n###### **Deploying to Amazon Web Services with GitHub Actions**\n\nThis talk from one of my colleagues, Richard Boyd, walks you through how you can use recently announced capabilities (a new OIDC provider as well as some new Amazon Web Services developed custom GitHub actions) to deploy a serverless application using Amazon Web Services SAM.\n\n<video src=\"https://dev-media.amazoncloud.cn/0f3fb49edc3441a1a5a89d81cf07df77_Deploying%20to%20AWS%20with%20GitHub%20Actions%20-%20GitHub%20Universe%202021.mp4\" class=\"manvaVedio\" controls=\"controls\" style=\"width:160px;height:160px\"></video>\n\n###### **Hugging Face**\n\nThis is Julien from HuggingFace...it was great to see a former colleague in this video showing you NLP models: from the Hugging Face hub to Amazon SageMaker...
and back!\n\n<video src=\"https://dev-media.amazoncloud.cn/08b93a7192274d0ea56677cec7cdd0f9_NLP%20models%EF%BC%9A%20from%20the%20Hugging%20Face%20hub%20to%20Amazon%20SageMaker...%20and%20back%21.mp4\" class=\"manvaVedio\" controls=\"controls\" style=\"width:160px;height:160px\"></video>\n\n##### **Events for your diary**\n###### **Databricks | Amazon Web Services Lakehouse Dev Day Live Workshop**\n###### **November 16th 9:00 AM PT**\n\nDelta Lake is an open source storage layer that provides ACID transactions and scalable metadata handling, and unifies streaming and batch data processing. You can use Delta Lake on top of your existing data lake. During this workshop you will learn how to:\n\n- Make your existing Amazon S3 data lakes into a lakehouse with Delta Lake.\n- Provide an easy-to-use platform for analysts to directly query data on your data lake using SQL Analytics.\n- Simplify and automate data pipelines for streaming and batch data to lower costs and boost productivity for your data teams.\n\n++[Read more and sign up here.](https://aws-oss.beachgeek.co.uk/zs)++\n\n##### **Stay in touch with open source at Amazon**\nI hope this summary has been useful. Remember to check out the ++[Open Source homepage](https://aws.amazon.com/opensource/?opensource-all.sort-by=item.additionalFields.startDate&opensource-all.sort-order=asc)++, and keep up to date with all our activity in open source by following us on ++[@AWSOpen](https://twitter.com/AWSOpen)++.\n\n\n\n\n\n","render":"<h5><a id=\"November_1st_2021__Instalment_88_0\"></a><strong>November 1st, 2021 - Instalment #88</strong></h5>\n<p>Newsletter #88.</p>\n<p>Number 88 symbolises fortune and good luck in Chinese culture (I hope that is correct, so please let me know if it is not) and I hope you will feel all the luckier for chancing upon this week's selection of open source projects and posts.</p>\n<p>This week, the big news was the publishing of the Babelfish repository.
Check out the launch post as well as additional content and links to this interesting open source project. We have plenty of other new open source projects such as nixtla, an open source time series forecasting library ready to roll in your Amazon Web Services environment, shrimp and s3sha256sum, tools that will help you work with large files over slow or unreliable connections, Amazon Web Services-sso-cli, a project to help you configure SSO within your command line, and many more.</p>\n<p>We have community and Amazon content covering OpenSearch, PyDeequ, ConsoleMe, Apache DistCp, Apache Airflow, Delta Sharing, Apache Solr, Synchro Charts, Amazon SAM, Bottlerocket and more. We also have a bumper set of videos; this week's topics include building Open Data Lakes, contributing to Terraform, Hugging Face and how to get started with GitHub Actions.</p>\n<p>Before diving in, make sure you check out the Bug Bust article just below. My team are also hiring a couple of open source folks, so check out the job descriptions and feel free to DM me if you want a chat or to get any more details - could you be the one?</p>\n<h5><a id=\"Celebrating_open_source_contributors_11\"></a><strong>Celebrating open source contributors</strong></h5>\n<p>The articles posted in this series are only possible thanks to contributors and project maintainers and so I would like to shout out and thank those folks who really do power open source and enable us all to build on top of what they have created.</p>\n<p>So thank you to the following open source heroes: Stefan Sundin, Aaron Turner, Saul Magnusson, Sarat Vemulapalli, Robert Hafner, Tyler Lynch, Drew Mullen, Richard Boyd, Sheetal Joshi, Frank Munz, Sebastien Stormacq, Andrew Lee, Hammad Ausaf, Eric Johnson, Mahendar Gajula, Nitin Srivastava, Sean Tracey, Brian Diehr, Andrea Di Simone, Samuel Passman, Kinnar Kumar Sen, Anjan Biswas, Ananth Raghavendra, Gary Stafford, Guilherme Sena Zuza, Suman Kumar Gangopadhyay, Sebastian Bille and Anton
Rubin</p>\n<p>Make sure you find and follow these builders and keep up to date with their open source projects and contributions.</p>\n<h5><a id=\"Community_noticeboard_19\"></a><strong>Community noticeboard</strong></h5>\n<h6><a id=\"Bug_Bust_20\"></a><strong>Bug Bust</strong></h6>\n<p>As re:Invent fast approaches, it was interesting to read about the Amazon Web Services BugBust re:Invent Challenge at this year’s Amazon re:Invent conference. It will be an attempt to create a new World Record for “Largest Bug Fixing Competition” as recognised by Guinness World Records. As part of this, Amazon Web Services will be including a myriad of open source projects that developers will be able to patch and contribute to throughout the event. Bugs can range from security issues, to duplicate code, to resource leaks and more. To find out more, and how you might be able to register interest for your open source projects, read on in <ins><a href=\"https://aws-oss.beachgeek.co.uk/11c\" target=\"_blank\">Help Make BugBusting History at Amazon Web Services re:Invent 2021</a></ins></p>\n<h6><a id=\"Jobs_24\"></a><strong>Jobs</strong></h6>\n<p>A couple of openings have come up in the open source team, so if you are looking for a change, and want to work with a really awesome bunch of people, then check these out. Feel free to ping me directly if you want to discuss in more details or ask any other question about the role, the team or working at Amazon Web Services in general.</p>\n<h6><a id=\"Principal_Evangelist_Open_Source_Open_Source_Strategy__Marketing_28\"></a><strong>Principal Evangelist, Open Source, Open Source Strategy & Marketing</strong></h6>\n<p>The OSS Technical Evangelist will be responsible for defining, leading and contributing to the open source and community engagement strategy for service teams across Amazon Web Services. 
You will combine your passion and enthusiasm for cloud technology and open source with your unmatched creativity to generate grass-roots attention and support for Amazon Web Services among key open source communities, industry opinion makers and technologists. You will work closely with the product marketing leadership to translate the business priorities of the service teams into solid engagements with key open source communities, foundations and open source technology partners. You will also be the voice of the community, building a feedback loop to teams across Amazon to help them understand the dynamics and needs of the open source community and build mechanisms to measure impact of our open source engagements. <ins><a href=\"https://aws-oss.beachgeek.co.uk/11h\" target=\"_blank\">Read more here</a></ins>.</p>\n<h6><a id=\"Sr_Open_Source_Program_Mgr_Open_Source_Strategy__Marketing_33\"></a><strong>Sr. Open Source Program Mgr, Open Source Strategy & Marketing</strong></h6>\n<p>Amazon Web Services Open Source is seeking an experienced Program Manager to join the team. The successful candidate will be a key member of the Open Source Strategy and Marketing team, which is responsible for driving high-visibility, strategic programs that directly impact customer experience and perception. The ideal candidate will have a software development background as well as Program management experience and will own and execute complex projects and drive key operational process improvement activities. You will be comfortable in a fast-paced multi-tasked environment, with the ability to drive the program’s strategy and roadmap, collaborate with business and development teams across the company to analyse the cost/benefit of project selection, and manage all aspects of the project execution. 
<ins><a href=\"https://aws-oss.beachgeek.co.uk/11i\" target=\"_blank\">Read more here</a></ins>.</p>\n<h5><a id=\"Latest_open_source_projects_37\"></a><strong>Latest open source projects</strong></h5>\n<h6><a id=\"Babelfish_38\"></a><strong>Babelfish</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/115\" target=\"_blank\">Babelfish</a></ins> last week saw the availability of the <ins><a href=\"https://aws-oss.beachgeek.co.uk/114\" target=\"_blank\">Babelfish for PostgreSQL</a></ins> open source project. As a reminder, Babelfish for PostgreSQL is an open source project available under the Apache 2.0 and PostgreSQL licenses that provides the capability for PostgreSQL to understand queries from applications written for Microsoft SQL Server. You can get started by reading my colleague Sebastien Stormacq's post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/117\" target=\"_blank\">Goodbye Microsoft SQL Server, Hello Babelfish</a></ins>. We also had Chandra sekhar Pathivada and Yogi Barot share this post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11r\" target=\"_blank\">Modify SSIS packages from SQL Server to Babelfish for Aurora PostgreSQL</a></ins>, showing you how to modify your existing Microsoft SSIS package connection from SQL Server to Babelfish.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/4c9623080da54317a866567df012a366_image.png\" alt=\"image.png\" /></p>\n<h6><a id=\"nixtla_44\"></a><strong>nixtla</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/11u\" target=\"_blank\">nixtla</a></ins> this open source library from Nixtla looks interesting. It is an open source time series forecasting library. You can use the service that Nixtla provide, or use the documentation within the README to host it yourself on Amazon Web Services.
Either way, if you are exploring your choices for a time series forecasting library, you should add this to your list.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/0f6ea47af43546b9bfb2f026d5604c86_image.png\" alt=\"image.png\" /></p>\n<p>Stefan Sundin has been busy, putting together a couple of open source projects and a blog post to show you how to use them, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11v\" target=\"_blank\">Introducing shrimp and s3sha256sum</a></ins></p>\n<h6><a id=\"shrimp_52\"></a><strong>shrimp</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/11x\" target=\"_blank\">shrimp</a></ins> this open source tool helps you upload large files to Amazon S3 over slow connections. Shrimp is optimised for this use case.</p>\n<h6><a id=\"s3sha256sum_56\"></a><strong>s3sha256sum</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/11w\" target=\"_blank\">s3sha256sum</a></ins> this open source tool is a small program that calculates SHA256 checksums of objects stored on Amazon S3. Use it to verify the integrity of your objects. Perfect for checking that those files have copied successfully (large files that you may have copied using Shrimp perhaps!)</p>\n<p>Nice work, Stefan!</p>\n<h6><a id=\"Amazon_Web_Servicesssocli_62\"></a><strong>Amazon Web Services-sso-cli</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/11s\" target=\"_blank\">amazon-sso-cli</a></ins> this open source, GPL 3 project from Aaron Turner is a secure replacement for using the Amazon configure sso wizard with a focus on security and ease of use for organisations with many Amazon Web Services Accounts and/or users with many IAM Roles to assume.
Detailed docs and examples, worth checking out.</p>\n<h6><a id=\"amazoncdkgoexamplestaticsite_66\"></a><strong>amazon-cdk-go-example-static-site</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/11t\" target=\"_blank\">amazon-cdk-go-example-static-site</a></ins> if you are looking for an example of how to deploy static sites via Amazon Web Services CDK, then check out this project from Saul Magnusson. This example launches a secure static site hosted in an S3 bucket, distributed by CloudFront, protected by an ACM certificate, and with URIs automatically rewritten by a CloudFront Function.</p>\n<h6><a id=\"dsdashboard_70\"></a><strong>ds-dashboard</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/11g\" target=\"_blank\">ds-dashboard</a></ins> Andrea Di Simone and Samuel Passman have put together this project that helps you to deploy and operate a simple data collection and distribution mechanism for your data science projects, which can be used for use cases such as generating dashboards. For a detailed walk through, check out the blog post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11f\" target=\"_blank\">Implementing a hub and spoke dashboard for multi-account data science projects</a></ins>.</p>\n<h6><a id=\"amazonopensearchservicemonitor_74\"></a><strong>amazon-opensearch-service-monitor</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/11o\" target=\"_blank\">amazon-opensearch-service-monitor</a></ins> if you are using OpenSearch, then check out this repository that contains a step-by-step demonstration of how to set up a monitoring stack for Amazon OpenSearch Service domains across all specified regions. 
This example uses Amazon Web Services CDK and Python.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/6c342cc073e14d12a79ed90f2986da18_image.png\" alt=\"image.png\" /></p>\n<h5><a id=\"Amazon_Web_Services_and_Community_blog_posts_80\"></a><strong>Amazon Web Services and Community blog posts</strong></h5>\n<h6><a id=\"Consoleme_81\"></a><strong>Consoleme</strong></h6>\n<p>In the post <ins><a href=\"https://aws-oss.beachgeek.co.uk/11k\" target=\"_blank\">Achieving least-privilege at FollowAnalytics with Repokid, Aardvark and ConsoleMe</a></ins>, Guilherme Sena Zuza shares how he used a number of open source tools to help get closer to the principle of least privilege, and how he deployed these tools at FollowAnalytics. [hands on]</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/7bb219e0b89f49c29acc3dcef537bc56_image.png\" alt=\"image.png\" /></p>\n<h6><a id=\"OpenSearch_87\"></a><strong>OpenSearch</strong></h6>\n<p>A couple of OpenSearch related posts this week.</p>\n<p>First up we have Anton Rubin from Eliatra, who explains some of the core concepts of OpenSearch security in the post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11n\" target=\"_blank\">Partner Highlight: Eliatra presents OpenSearch Security Concepts</a></ins>. The post links to other, deeper dives if you want to gain even more understanding.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/58f0fd74d6e844809610dccf8c08f28d_image.png\" alt=\"image.png\" /></p>\n<p>Following that we have the post <ins><a href=\"https://aws-oss.beachgeek.co.uk/11q\" target=\"_blank\">Moving from open source Elasticsearch to OpenSearch</a></ins>, where Sarat Vemulapalli shares an overview of how to move from open source Elasticsearch to OpenSearch.</p>\n<h6><a id=\"Apache_Airflow_97\"></a><strong>Apache Airflow</strong></h6>\n<p>I missed this the first time around, but caught it last week. 
Suman Kumar Gangopadhyay provides a step-by-step guide to help you set up a pipeline that can automate running Spark jobs from an edge node to a Spark cluster, in the blog post <ins><a href=\"https://aws-oss.beachgeek.co.uk/11l\" target=\"_blank\">Airflow, Spark & S3, stitching it all together</a></ins> [hands on]</p>\n<h6><a id=\"Apache_DistCp_101\"></a><strong>Apache DistCp</strong></h6>\n<p>In the post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11a\" target=\"_blank\">Copy large datasets from Google Cloud Storage to Amazon S3 using Amazon EMR</a></ins> Andrew Lee and Hammad Ausaf demonstrate how to configure an EMR cluster to use open source tools such as Apache DistCp and S3DistCP, and compare the performance of doing large file copies. [hands on]</p>\n<h6><a id=\"Delta_Sharing_105\"></a><strong>Delta Sharing</strong></h6>\n<p>Delta Sharing is a Linux Foundation open source framework that uses an open protocol to secure the real-time exchange of large datasets and enables secure data sharing across products for the first time. It was great, therefore, to read <ins><a href=\"https://aws-oss.beachgeek.co.uk/118\" target=\"_blank\">Delta Sharing on Amazon Web Services</a></ins> from former colleague Frank Munz, who is now a Developer Advocate at Databricks. [hands on]</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/bdbb32debd5148a3bbe64ca8a8f0d495_image.png\" alt=\"image.png\" /></p>\n<h6><a id=\"Apache_Solr_111\"></a><strong>Apache Solr</strong></h6>\n<p>Apache Solr is an open source enterprise search platform built on Apache Lucene. Kinnar Kumar Sen, Anjan Biswas, and Ananth Raghavendra have collaborated on this post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11j\" target=\"_blank\">Deploying and scaling Apache Solr on Kubernetes</a></ins> that shows you how to deploy a highly available, scalable, and fault-tolerant enterprise-grade search platform with Apache Solr using Amazon Elastic Kubernetes Service (Amazon EKS). 
[hands on]</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/99ae7a4a9d5e4114bcffc35579f9a933_image.png\" alt=\"image.png\" /></p>\n<h6><a id=\"PyDeequ_117\"></a><strong>PyDeequ</strong></h6>\n<p>PyDeequ is an open-source Python wrapper over Deequ (an open-source tool developed and used at Amazon, to help data scientists and engineers with data quality and testing capabilities from Python and PySpark). In the post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11b\" target=\"_blank\">Accelerate large-scale data migration validation using PyDeequ</a></ins> Mahendar Gajula and Nitin Srivastava show you how you can use this and walk through a step-by-step process to validate large datasets. [hands on]</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/f506794dbfd349d19251b2c58c01979f_image.png\" alt=\"image.png\" /></p>\n<h6><a id=\"Synchro_Charts_124\"></a><strong>Synchro Charts</strong></h6>\n<p>Synchro Charts is a new open source front-end component library that provides a collection of components for application developers to visualise time-series data, with a focus on monitoring, root cause analysis, and analytics. In the post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11d\" target=\"_blank\">Visualizing time series data with the open source Synchro Charts</a></ins> Brian Diehr walks you through some of the features, which you can check out at <ins><a href=\"https://aws-oss.beachgeek.co.uk/11e\" target=\"_blank\">synchrocharts.com</a></ins></p>\n<h6><a id=\"Bottlerocket_128\"></a><strong>Bottlerocket</strong></h6>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/116\" target=\"_blank\">Amazon EKS adds native support for Bottlerocket in Managed Node Groups</a></ins> from Sheetal Joshi takes a look at how to set up an Amazon EKS cluster and launch a Bottlerocket managed node group. Bottlerocket is a Linux-based open-source operating system that is purpose-built by Amazon. 
It focuses on security and maintainability, and provides a reliable, consistent, and safe platform for container-based workloads. Dive deeper into what this means by checking out this post. [hands on]</p>\n<h6><a id=\"Serverless_Framework_132\"></a><strong>Serverless Framework</strong></h6>\n<p>Amazon Web Services Community Builder Sebastian Bille takes a look at two open source projects and how you can combine these when developing and deploying your serverless application in the blog post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11m\" target=\"_blank\">Combining Serverless Framework & Amazon Web Services CDK </a></ins> [hands on]</p>\n<h6><a id=\"Amazon_Web_Services_SAM_136\"></a><strong>Amazon Web Services SAM</strong></h6>\n<p>Eric Johnson shares with you a new capability of Amazon Web Services SAM that aims to increase infrastructure accuracy for testing with sam sync, incremental builds, and aggregated feedback for developers. Amazon Web Services SAM Accelerate brings the developer to the cloud and not the cloud to the developer. In the post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/119\" target=\"_blank\">Accelerating serverless development with Amazon Web Services SAM Accelerate</a></ins> he shows you how to bypass most local emulation by testing serverless applications in the cloud against production services using Amazon Web Services SAM Accelerate. When I saw a demo of this, it was one of the most exciting things I have seen in a while. If you are a serverless developer, you should definitely check this out. 
[hands on]</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/e296e788b36e4bf5b37cf53f653a5189_image.png\" alt=\"image.png\" /></p>\n<h6><a id=\"GitHub_Actions_142\"></a><strong>GitHub Actions</strong></h6>\n<p>If you are looking to use GitHub Actions to build your open source projects and need to integrate with Amazon Web Services, then the recent release of GitHub Actions OpenID Connect will be good news for you. 
In this blog post, <ins><a href=\"https://aws-oss.beachgeek.co.uk/11p\" target=\"_blank\">Using Github Actions OpenID Connector to push to Amazon ECR without Credentials</a></ins> Robert Hafner walks you through setting this up, using an example of using Terraform to push images to Amazon ECR. Also, check out the videos where we have Richard Boyd walking you through how you can set this up.</p>\n<h5><a id=\"Quick_updates_147\"></a><strong>Quick updates</strong></h5>\n<h6><a id=\"Bottlerocket_148\"></a><strong>Bottlerocket</strong></h6>\n<p>Amazon Elastic Kubernetes Service (EKS) now adds native support for Bottlerocket in EKS managed node groups in all commercial Amazon regions. Most EKS customers today deploy their applications on worker nodes backed by operating systems that are designed for a variety of use cases. Amazon Web Services launched Bottlerocket, a minimal, Linux-based open source operating system that is purpose built and optimised to run containers. When combined, EKS managed node groups and Bottlerocket give customers a simple way to provision and manage compute capacity using the latest best practices for running containers in production. Bottlerocket is now included as a built-in AMI choice for managed node groups, enabling customers to provision container optimised worker nodes with a single click.</p>\n<p>EKS customers can easily migrate their applications to run on Bottlerocket based worker nodes and benefit from an improved node security posture. By moving to worker nodes that include only the minimal set of packages needed to run containers, customers benefit from a reduced attack surface, decreased node provisioning time and improved efficiency as more node resources are allocated to applications. This improves cluster utilisation and scale. 
Managed node groups provide notifications when newer EKS Bottlerocket AMIs are available, enabling customers to more easily update nodes to the latest versions of the software.</p>\n<h6><a id=\"PostgreSQL_154\"></a><strong>PostgreSQL</strong></h6>\n<p>A couple of quick updates this week:</p>\n<p>Following the announcement of updates to the PostgreSQL database by the open source community, we have updated Amazon Aurora PostgreSQL-Compatible Edition to support PostgreSQL 13.4, 12.8, 11.13, and 10.18. These releases contain bug fixes and improvements by the PostgreSQL community. As a reminder, Amazon Aurora PostgreSQL 9.6 will reach end of life on January 31, 2022.</p>\n<p>Following that, we saw news that Amazon Aurora PostgreSQL-Compatible Edition now supports PostGIS major version 3.1. This new version of PostGIS is available on PostgreSQL versions 13.4, 12.8, 11.13, 10.18, and higher. PostGIS allows you to store, query and analyze geospatial data within a PostgreSQL database. PostGIS 3.1 significantly improves the performance of operations such as spatial joins, which now run up to [N]X faster on PostgreSQL 13. As an example, you could use a spatial join to count the number of people living in an area defined by the reception of mobile phones from radio towers. PostGIS 3.1 is the new default version on PostgreSQL 10 and higher starting with the new minor versions. However, you can still create older versions of PostGIS in your PostgreSQL database, e.g., if you require version stability.</p>\n<h5><a id=\"Video_of_the_week_162\"></a><strong>Video of the week</strong></h5>\n<p>We have a bumper selection this week, all great and well worth watching.</p>\n<h6><a id=\"Building_Open_Data_Lakes_165\"></a><strong>Building Open Data Lakes</strong></h6>\n<p>From blog posts to YouTube, there is no place where Gary Stafford will not go in order to share his knowledge and expertise on this topic. 
In this video (grab a cup of your favourite hot beverage) he shows you how you can build a simple open data lake on Amazon Web Services using a combination of open-source software, including Debezium for change data capture (CDC), Apache Kafka, Kafka Connect, Apache Hive, Apache Spark, and Apache Hudi and Hudi’s DeltaStreamer.</p>\n<p><video src=\"https://dev-media.amazoncloud.cn/1fa078c20cda4698b7d91d41c8fbeb99_Building%20Open%20Data%20Lakes%EF%BC%9A%20Debezium%2C%20Apache%20Kafka%2C%20Hudi%2C%20Spark%2C%20and%20Hive%20on%20AWS%20%5BRaw%20Cut%5D.mp4\" controls=\"controls\"></video></p>\n<h6><a id=\"Terraform_171\"></a><strong>Terraform</strong></h6>\n<p>Fresh from the Amazon Web Services Summit in DC, we have Tyler Lynch and Drew Mullen talking about the process of contributing upstream to the Amazon Web Services provider for Terraform. Dive deep on overcoming imposter syndrome, understanding the process, validating the need, working on the code, acceptance tests, and making a pull request.</p>\n<h6><a id=\"Deploying_to_Amazon_Web_Services_with_GitHub_Actions_175\"></a><strong>Deploying to Amazon Web Services with GitHub Actions</strong></h6>\n<p>This talk from one of my colleagues, Richard Boyd, will walk you through how you can use recently announced capabilities (a new OIDC provider as well as some new custom GitHub Actions developed by Amazon Web Services) to deploy a serverless application using Amazon Web Services SAM.</p>\n<p><video src=\"https://dev-media.amazoncloud.cn/0f3fb49edc3441a1a5a89d81cf07df77_Deploying%20to%20AWS%20with%20GitHub%20Actions%20-%20GitHub%20Universe%202021.mp4\" controls=\"controls\"></video></p>\n<h6><a id=\"Hugging_Face_181\"></a><strong>Hugging Face</strong></h6>\n<p>This is Julien from HuggingFace…it was great to see a former colleague in this great video showing you NLP models: from the Hugging Face hub to Amazon SageMaker… and back!</p>\n<p><video 
src=\"https://dev-media.amazoncloud.cn/08b93a7192274d0ea56677cec7cdd0f9_NLP%20models%EF%BC%9A%20from%20the%20Hugging%20Face%20hub%20to%20Amazon%20SageMaker...%20and%20back%21.mp4\" controls=\"controls\"></video></p>\n<h5><a id=\"Events_for_your_diary_187\"></a><strong>Events for your diary</strong></h5>\n<h6><a id=\"Databricks__Amazon_Web_Services_Lakehouse_Dev_Day_Live_Workshop_188\"></a><strong>Databricks | Amazon Web Services Lakehouse Dev Day Live Workshop</strong></h6>\n<h6><a id=\"November_16th_900_AM_PT_189\"></a><strong>November 16th 9:00 AM PT</strong></h6>\n<p>Delta Lake is an open source storage layer that provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. You can use Delta Lake on top of your existing data lake. During this workshop you will learn how to:</p>\n<ul>\n<li>Make your existing Amazon S3 data lakes into a lakehouse with Delta Lake.</li>\n<li>Provide an easy-to-use platform for analysts to directly query data on your data lake using SQL Analytics.</li>\n<li>Simplify and automate data pipelines for streaming and batch data to lower costs and boost productivity for your data teams.</li>\n</ul>\n<p><ins><a href=\"https://aws-oss.beachgeek.co.uk/zs\" target=\"_blank\">Read more and sign up here.</a></ins></p>\n<h5><a id=\"Stay_in_touch_with_open_source_at_Amazon_199\"></a><strong>Stay in touch with open source at Amazon</strong></h5>\n<p>I hope this summary has been useful. Remember to check out the <ins><a href=\"https://aws.amazon.com/opensource/?opensource-all.sort-by=item.additionalFields.startDate&opensource-all.sort-order=asc\" target=\"_blank\">Open Source homepage</a></ins>, and keep up to date with all our activity in open source by following us on <ins><a href=\"https://twitter.com/AWSOpen\" target=\"_blank\">@AWSOpen</a></ins>.</p>\n"}