Data on correlated products and sellers helps improve demand forecasting

Forecasting product demand from customers is central to what Amazon does. It helps ensure that products reach customers on time, that inventory management in fulfillment centers is efficient, and that sellers have predictable shipping schedules.

Forecasting models may be trained on data from multiple sellers and multiple products, but their predictions are typically one-dimensional: the forecast for a given product, for a given seller, is based on past demand for that product from that seller.

But one-dimensional analyses can leave out information crucial to accurate prediction. What will it do to demand if a competitor goes out of stock on a particular product? Or, conversely, what if a competitor’s release of a new product causes the whole product space to heat up?

At this year’s European Conference on Machine Learning ([ECML](https://www.amazon.science/conferences-and-events/ecml-pkdd-2021)), we proposed [a new method](https://www.amazon.science/publications/spatio-temporal-multi-graph-networks-for-demand-forecasting-in-online-marketplaces) for predicting product demand that factors in correlations with other sellers and products at prediction time as well as training time. Unlike previous attempts at multidimensional demand forecasting, our method uses graphs to represent those correlations and graph neural networks to produce representations of the graphical information.

In experiments, our approach increased prediction accuracy by 16% relative to standard, one-dimensional analyses. The benefits were particularly pronounced in cases in which multiple sellers were offering the same product — a relative improvement of almost 30% — and “cold-start” products for which fewer than three months of historical data were available — a relative improvement of almost 25%.

#### **Graphical models**

In e-commerce, products are often related in terms of categories or sub-categories, and their demand patterns are thus correlated.
When a neural-network-based forecasting model is trained on datasets containing correlated products, it learns to extract higher-order features that implicitly account for some of those correlations.

It stands to reason, however, that looking at other time series may be beneficial at prediction time, too. In this work, we have developed a more systematic method of modeling the correlations between different entities across time series using graph neural networks.

A graph consists of nodes — usually depicted as circles — linked by edges — usually depicted as line segments connecting nodes. Edges can have associated values, which indicate relationships between nodes.

![image.png](https://dev-media.amazoncloud.cn/35490c675ff24fc08b3f52a5cc705184_image.png)

An example of a product correlation graph, with sellers (green), products (yellow), demand relations (black lines), and substitute relations (orange lines). The function XT computes the feature vector.

We represent correlations between product and seller data with a graph that has two types of nodes — products and sellers — and two types of edges — demand relations and substitute relations. Demand edges link sellers to products, while substitute edges link products to each other.

Associated with each node is a feature vector, representing particular attributes of that product or seller. Product-specific features include things like brand, product category, product sales, number of product views, and the like.
Seller-specific features include things like seller rating, seller reviews, total customer orders for the seller, and total views of all the products offered by a seller.

Associated with each demand edge is another feature vector representing relationships between sellers and products, such as total views of a particular product offered by the seller, total customer orders of that product for the seller, whether the seller went out of stock on that product, and so on.

Associated with each substitute edge is a binary value, indicating whether the products can be substituted for each other, as evidenced by customer choice.

For every time step in our time series, we construct such a graph, representing the feature set at that time step.

#### **The neural model**

The graph at each time step in our time series passes through a graph neural network, which produces a fixed-length vector representation, or embedding, of each node in the graph. That representation also accounts for the features of the node’s neighbors and of the edges connecting them.

The outputs of the graph neural network are concatenated with static features for each time step, such as the number of days until the next major holiday or special financing offers from banks.

The combined representations are then passed, in sequence, to a network with an encoder-decoder architecture. The encoder comprises a sequential network such as a temporal convolutional network (TCN) or a long short-term memory (LSTM) network, which captures characteristics of the historical demand data. The encoder’s output represents the entire time series, factoring in dependencies between successive time steps.

That representation is passed to a decoder module that produces the final prediction.
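The pipeline described above — per-time-step graph aggregation, concatenation with static features, a sequential encoder, and a decoder head — can be sketched in a few lines. This is a minimal illustration with toy dimensions and random weights, not the paper’s architecture: `graph_aggregate` stands in for the GNN (plain mean-neighbor aggregation rather than GraphSAGE or attention), and `encode` is a bare tanh RNN rather than a TCN or LSTM.

```python
import numpy as np

rng = np.random.default_rng(0)

def graph_aggregate(node_feats, adj):
    """One round of mean-neighbor aggregation: a minimal stand-in for the
    GNN that embeds each node using its own and its neighbors' features."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                     # guard isolated nodes
    neighbor_mean = adj @ node_feats / deg
    return np.tanh(np.concatenate([node_feats, neighbor_mean], axis=1))

def encode(sequence, W_h, W_x):
    """Toy recurrent encoder (tanh RNN) over per-time-step embeddings;
    its final state summarizes the whole time series."""
    h = np.zeros(W_h.shape[0])
    for x in sequence:
        h = np.tanh(W_h @ h + W_x @ x)
    return h

# Tiny example: 3 nodes (products/sellers), 4-dim features, T = 5 steps.
n_nodes, d, T, d_static, d_hidden = 3, 4, 5, 2, 8
adj = np.array([[0, 1, 1],                  # demand/substitute links,
                [1, 0, 0],                  # flattened into one adjacency
                [1, 0, 0]], dtype=float)

W_h = 0.1 * rng.normal(size=(d_hidden, d_hidden))
W_x = 0.1 * rng.normal(size=(d_hidden, 2 * d + d_static))
w_out = rng.normal(size=d_hidden)           # linear "decoder" head

# For each time step: aggregate over the graph, then append static features
# (e.g. days until the next major holiday).
target_node = 0
seq = []
for t in range(T):
    feats = rng.normal(size=(n_nodes, d))   # per-node features at step t
    static = rng.normal(size=d_static)      # step-level static features
    emb = graph_aggregate(feats, adj)[target_node]
    seq.append(np.concatenate([emb, static]))

h = encode(seq, W_h, W_x)
forecast = float(w_out @ h)                 # demand forecast for node 0
print(f"forecast for node 0: {forecast:.3f}")
```

In the real model all of these weights would be learned jointly, end to end, so that the graph and sequence components specialize toward the final forecasting objective.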
![image.png](https://dev-media.amazoncloud.cn/b54eafaa904d467aa24582117be6bd1a_image.png)

The complete neural model.

The whole model is trained end to end, so that the graph neural network and the encoder learn to produce representations that are useful for the final prediction, when conditioned on the static features.

#### **Results**

We experimented with four different types of graph neural networks (GNNs):

- homogeneous graph convolutional networks, in which node features are standardized so that all nodes are treated the same;
- GraphSAGE networks, which reduce the computational burden of processing densely connected graphs by sampling from each node’s neighbors;
- heterogeneous GraphSAGE networks, which can handle different types of nodes; and
- heterogeneous graph attention networks, which assign different weights to a given node’s neighbors.

We also experimented with different inputs to each type of GNN: nodes only; nodes and demand edges; and nodes and demand and substitute edges. Across models, the addition of more edge data improved performance significantly, demonstrating that the models were taking advantage of the graphical representation of the data. Across input types, the graph attention network performed best, so our best-performing model was the graph attention network with both types of edge information.

ABOUT THE AUTHOR

#### **[Ankit Gandhi](https://www.amazon.science/author/ankit-gandhi)**

Ankit Gandhi is a senior applied scientist at Amazon.