AWS NLP Monthly Newsletter, April 2022

Hello world. This is the monthly AWS Natural Language Processing (NLP) newsletter, covering everything related to NLP at AWS. Feel free to leave comments and share it on your social networks.

### **NLP@AWS Customer Success Story**

++**[How to Build a Scalable Chatbot and Deploy to Europe’s Largest Airline](https://www.youtube.com/watch?v=dAQhNjwkOX8)**++

![image.png](https://dev-media.amazoncloud.cn/49e3e294f00f4a3f988928165350b920_image.png)

Ryanair ran a customer care improvement initiative to give customers access to support 24x7 while letting agents spend more time with customers whose needs require a human touch. In this video, the Cation Consulting CTO explains the architecture, which uses Amazon Translate, Amazon Lex and Amazon SageMaker to help Ryanair implement a multilingual, multi-channel chatbot that optimises customer responses, with Amazon Comprehend providing sentiment analysis of customer interactions.

### **AI Language Services**

++**[Extract granular sentiment in text](https://aws.amazon.com/blogs/machine-learning/extract-granular-sentiment-in-text-with-amazon-comprehend-targeted-sentiment/)**++

![mj8rjdjf1079xdnxfjmv.png](https://awsdevweb.s3.cn-north-1.amazonaws.com.cn/c85c1f2d617b46c8af1a0ee92395f499_mj8rjdjf1079xdnxfjmv.png)

When using NLP to perform sentiment analysis on a document, typically only the dominant sentiment is determined, even though the document may contain nuanced sentiment referring to multiple entities.
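To make the idea concrete, here is a minimal sketch of how per-entity sentiment could be read out of a Targeted Sentiment result. The payload below is illustrative sample data shaped like the `detect_targeted_sentiment` response, not a live Amazon Comprehend call:

```python
# Illustrative: group per-mention sentiment by entity, from a response shaped
# like Amazon Comprehend's DetectTargetedSentiment output. In practice you
# would call boto3.client("comprehend").detect_targeted_sentiment(
#     Text=document, LanguageCode="en") instead of using hardcoded data.

def sentiment_by_entity(response):
    """Map each mentioned entity to the sentiments of its mentions."""
    result = {}
    for entity in response.get("Entities", []):
        for mention in entity.get("Mentions", []):
            result.setdefault(mention["Text"], []).append(
                mention["MentionSentiment"]["Sentiment"]
            )
    return result

# Hypothetical response for: "The screen is great but the battery is poor."
sample = {
    "Entities": [
        {"Mentions": [{"Text": "screen",
                       "MentionSentiment": {"Sentiment": "POSITIVE"}}]},
        {"Mentions": [{"Text": "battery",
                       "MentionSentiment": {"Sentiment": "NEGATIVE"}}]},
    ]
}

print(sentiment_by_entity(sample))
# {'screen': ['POSITIVE'], 'battery': ['NEGATIVE']}
```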
Businesses can exploit this level of granularity with the Amazon Comprehend Targeted Sentiment feature to understand which specific attributes of their products or services were well received by customers, and therefore should be retained or strengthened, and which attributes need to be improved.

++**[Automate email responses using Amazon Comprehend custom classification and entity detection](https://aws.amazon.com/blogs/machine-learning/automate-email-responses-using-amazon-comprehend-custom-classification-and-entity-detection/)**++
![laos4c0t2asxpp24b0f5.png](https://awsdevweb.s3.cn-north-1.amazonaws.com.cn/3b27569e855241b890858a505f6410f3_laos4c0t2asxpp24b0f5.png)

Organisations that provide customer care spend a lot of resources and manpower responding to customers’ needs and ensuring a good customer experience. An automated approach to answering customer queries can often improve the customer care experience while lowering costs. Many organisations have the requisite data assets but are hindered by a lack of AI/ML expertise.
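As one hedged sketch of what such automation might look like: once a custom classifier has been trained on historical emails, the routing step could pick a canned reply from the classifier's output. The response shape below mirrors Comprehend's `classify_document` result; the intent names and templates are hypothetical:

```python
# Illustrative routing step on a result shaped like Amazon Comprehend's
# ClassifyDocument response. In practice you would call
# boto3.client("comprehend").classify_document(
#     Text=email_body, EndpointArn=endpoint_arn).

TEMPLATES = {  # hypothetical intent -> canned-response mapping
    "BOOKING_CHANGE": "We have received your change request...",
    "REFUND": "Your refund request is being processed...",
}

def route_email(classifier_response, threshold=0.8):
    """Return a canned reply for the top class, or None to escalate to a human."""
    classes = sorted(classifier_response["Classes"],
                     key=lambda c: c["Score"], reverse=True)
    top = classes[0]
    if top["Score"] >= threshold:
        return TEMPLATES.get(top["Name"])
    return None  # low confidence: hand off to an agent

sample = {"Classes": [{"Name": "REFUND", "Score": 0.93},
                      {"Name": "BOOKING_CHANGE", "Score": 0.07}]}
print(route_email(sample))
# Your refund request is being processed...
```

The confidence threshold is the key design choice: anything the model is unsure about still reaches a human agent.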
This blog post shows how you can use Amazon Comprehend to identify the intent of customer care emails and automate the response using AWS services.

++**[Build a traceable, custom, multi-format document parsing pipeline with Amazon Textract](https://aws.amazon.com/blogs/machine-learning/build-a-traceable-custom-multi-format-document-parsing-pipeline-with-amazon-textract/)**++

![ld5janslhcn3pxzin83c.png](https://awsdevweb.s3.cn-north-1.amazonaws.com.cn/073adba9ad5c4564b4ac9371d958278f_ld5janslhcn3pxzin83c.png)

Many businesses still rely on forms as a necessary business tool, whether for compliance reasons or because digital data capture is not possible. These businesses often have to release new versions of their forms, which can break traditional OCR systems and the downstream tools that process their output. Amazon Textract is a machine learning (ML) service that automatically extracts text and handwriting from forms in minutes. Using Amazon Textract, you can build a serverless, event-driven, multi-format document parsing pipeline. This post demonstrates how to design a robust pipeline that easily handles multiple versions of forms.

++**[Enable conversational chatbots for telephony using Amazon Lex and the Amazon Chime SDK](https://aws.amazon.com/blogs/machine-learning/enable-conversational-chatbots-for-telephony-using-amazon-lex-and-the-amazon-chime-sdk/)**++
Conversational AI can deliver powerful, automated, interactive experiences through voice and text. A common application is AI-powered self-service, such as conversational interactive voice response (IVR) systems that handle voice service calls by automating informational responses. This blog post shows how you can easily build such an application in a serverless manner, without the need for an expensive IVR platform. It uses an Amazon Lex-driven chatbot with audio powered by the Amazon Chime SDK Public Switched Telephone Network (PSTN) service, and natively integrates Amazon Polly’s text-to-speech capabilities to convert text responses into speech.

### **NLP on Amazon SageMaker**

++**[Train EleutherAI GPT-J using SageMaker](https://github.com/aws/amazon-sagemaker-examples/blob/main/training/distributed_training/pytorch/model_parallel/gpt-j/train_gptj_smp_notebook.ipynb)**++
EleutherAI released GPT-J 6B as an open-source alternative to OpenAI’s GPT-3. EleutherAI’s goal was to train a GPT-3-class model and make it available to the public under an open licence; the model has since gained a lot of interest from researchers, data scientists and even software developers. This notebook shows you how to easily train and tune GPT-J using Amazon SageMaker distributed training and Hugging Face on NVIDIA GPU instances.

++**[Have fun with your own private GPT text generation playground](https://towardsdatascience.com/how-to-build-your-own-gpt-j-playground-733f4f1246e5)**++

![image.png](https://dev-media.amazoncloud.cn/efe4d3d7be4c462d933fa7837efcd313_image.png)

Want to create your very own text generation playground without paying for GPT-3 usage or even training a GPT-J model yourself? This blog shows how you can easily deploy GPT-J on Amazon SageMaker and create a web interface to interact with the model. Have fun!

++**[Accelerate BERT inference](https://huggingface.co/blog/bert-inferentia-sagemaker)**++
![1uziifzc457ayudl5x7o.png](https://awsdevweb.s3.cn-north-1.amazonaws.com.cn/a816e029fd25499088237a7bbfd9f320_1uziifzc457ayudl5x7o.png)

BERT and other Transformer-based models tend to be relatively large, complex and slow compared to traditional machine learning algorithms. One way of accelerating these models is to deploy them on AWS Inferentia-based EC2 instances. AWS Inferentia is a chip designed to deliver up to 2.3x higher throughput and up to 70% lower cost per inference than comparable current-generation GPU-based Amazon EC2 instances. This post shows you how to use the AWS Neuron SDK to convert a BERT-based model to run on AWS Inferentia-powered EC2 Inf1 instances.

++**[A practical guide to assessing text summarisation models](https://aws.amazon.com/blogs/machine-learning/part-1-set-up-a-text-summarization-project-with-hugging-face-transformers/)**++

![63jeps6ib70iqkj6aylh.png](https://awsdevweb.s3.cn-north-1.amazonaws.com.cn/1e5eb5aabb1747b983fc7265bea5ba21_63jeps6ib70iqkj6aylh.png)

Many organisations have huge amounts of text documents that often need to be summarised, and NLP-based text summarisation is one way of automating such tasks. This two-part series proposes a practical approach to assessing summarisation models at scale. The ++[first part](https://aws.amazon.com/blogs/machine-learning/part-1-set-up-a-text-summarization-project-with-hugging-face-transformers/)++ introduces the dataset and metric and evaluates a simple heuristic approach. The ++[second part](https://aws.amazon.com/blogs/machine-learning/part-2-set-up-a-text-summarization-project-with-hugging-face-transformers/)++ uses a zero-shot learning model and discusses an approach to comparing the two models that can be applied to many other models during the experimentation phase.

++**[Introduction to Hugging Face on Amazon SageMaker](https://www.youtube.com/watch?v=80ix-IyNnQI)**++

![image.png](https://dev-media.amazoncloud.cn/68779838680345f2ac775670f55aa13e_image.png)

Transformers are state-of-the-art NLP models that display almost human-level performance on NLP tasks like text generation, text classification and question answering. You can take pretrained transformer models and fine-tune them on your own data to improve your NLP models.
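Fine-tuning with Hugging Face on SageMaker typically boils down to pointing a `HuggingFace` estimator at a training script plus a handful of hyperparameters. A minimal sketch follows, with illustrative values rather than recommended defaults; the estimator call is commented out because it needs the `sagemaker` SDK, an IAM role and an AWS account:

```python
# Sketch of what a Hugging Face fine-tuning job on SageMaker might configure.
# Checkpoint name, instance type and all values are illustrative assumptions.

hyperparameters = {
    "model_name_or_path": "distilbert-base-uncased",  # pretrained checkpoint
    "epochs": 3,
    "train_batch_size": 32,
    "learning_rate": 5e-5,
}

# from sagemaker.huggingface import HuggingFace
# estimator = HuggingFace(
#     entry_point="train.py",          # your fine-tuning script
#     instance_type="ml.p3.2xlarge",   # single NVIDIA V100 GPU instance
#     instance_count=1,
#     transformers_version="4.17",     # pick a supported framework combo
#     pytorch_version="1.10",
#     py_version="py38",
#     hyperparameters=hyperparameters,
#     role=role,                       # your SageMaker execution role
# )
# estimator.fit({"train": training_input_path})  # S3 URI of training data

print(sorted(hyperparameters))
```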
This video introduces you to Hugging Face and shows how you can easily build, train and deploy state-of-the-art NLP models using Amazon SageMaker.

++**[Hugging Face on Amazon SageMaker and AWS Workshop](https://github.com/aws-samples/hugging-face-workshop)**++
Interested in using NLP to generate text in the style of your favourite poets? This workshop shows you how to fine-tune text generation models in the style of your favourite poets using Hugging Face on Amazon SageMaker.

### **NLP@AWS Community Content**

++**[Decoding Sperm Whale language using NLP models](https://www.projectceti.org/)**++

![image.png](https://dev-media.amazoncloud.cn/0079e3786eba472f8ee2f8597831185b_image.png)

Project ++[CETI](https://www.projectceti.org/)++ was selected for an ++[Amazon Research Award](https://www.amazon.science/research-awards)++ and is an ++[Imagine Grant](https://aws.amazon.com/government-education/nonprofits/aws-imagine-grant-program/)++ winner. AWS ML teams are working with the Open Data program team to help decode sperm whale language. By building NLP models to parse and interpret sperm whales’ vocalisations, and hopefully developing frameworks for communicating back, CETI aims to show that today’s most cutting-edge technologies can not only drive business impact but also enable a deeper understanding of other species on this planet. CETI’s work takes place in the Caribbean, off the coast of Dominica, and remotely with AWS via Amazon Chime.

### **Upcoming Events**

++**[Workshop: Techniques to accelerate BERT Inference](https://app.livestorm.co/hugging-face/accelerate-bert-inference-with-knowledge-distillation-and-aws-inferentia?type=detailed)**++
Learn how to apply knowledge distillation to compress a large BERT model into a small one, and then into an optimised Neuron model with AWS Inferentia. By the end of this process, the model goes from over 100ms to around 5ms latency - a 20x improvement!
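The core of the knowledge distillation technique mentioned above can be sketched in a few lines: the student model is trained to match the teacher's temperature-softened output distribution. A pure-Python illustration (real training would use PyTorch tensors and combine this with the regular task loss):

```python
# Minimal sketch of the soft-target loss used in knowledge distillation.
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's and student's soft distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

# A student whose logits match the teacher's gets the lowest possible loss.
loss_same = distillation_loss([2.0, 0.5], [2.0, 0.5])
loss_diff = distillation_loss([0.5, 2.0], [2.0, 0.5])
print(loss_same < loss_diff)  # True: closer student -> lower loss
```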
Click ++[here](https://app.livestorm.co/hugging-face/accelerate-bert-inference-with-knowledge-distillation-and-aws-inferentia?type=detailed)++ to register.

++**[AWS Summits around the globe](https://aws.amazon.com/events/summits/?awsf.events-location=*all&awsf.events-series=*all)**++
AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate and learn through technical breakout sessions, demonstrations, interactive workshops, labs and team challenges. Summits are held in major cities around the world, both virtually and in person, starting April 2022. [Register](https://aws.amazon.com/events/summits/?awsf.events-location=*all&awsf.events-series=*all) for one near you!

### **Stay in touch with NLP on AWS**

Our contact: ++[aws-nlp@amazon.com](mailto:aws-nlp@amazon.com)++
Email us to (1) tell us about your awesome NLP-on-AWS project, (2) let us know which post in the newsletter helped your NLP journey, or (3) suggest other topics you would like to see in the newsletter. Talk to you soon.