Using hyperboloids to improve product retrieval

Many machine learning models depend on the concept of embedding, or mapping data to a representational space, where it can be manipulated or measured in useful ways. Usually, a data embedding is a point in the space: a vector.

In recent years, researchers at Amazon and elsewhere have been investigating the idea of [hyperbolic embedding](https://www.amazon.science/tag/hyperbolic-embedding), or embedding data not as points in space but as higher-dimensional analogues of rectangles on a curved surface. This has numerous advantages, one of which is the ability to capture hierarchical relationships between data points.

At this year’s International Conference on Web Search and Data Mining ([WSDM](https://www.amazon.science/conferences-and-events/wsdm-2022)), we and our colleagues are presenting a paper on the use of hyperbolic embeddings for product retrieval. Because product catalogues are often organized hierarchically, with individual products belonging to a succession of more and more general categories (e.g., tablet/computer/electronics), hyperbolic embeddings are particularly well suited to this task.

In our approach, we represent a query (say, “Fire TV”) as a rectangle in hyperbolic space, known as a hyperboloid. Query matches are those products whose vector embeddings lie within the hyperboloid’s boundaries.

![download.gif](https://dev-media.amazoncloud.cn/7e6b990c96e04e9789e4eb12258ad8ef_%E4%B8%8B%E8%BD%BD.gif)

A new product retrieval method embeds queries as hyperboloids, or higher-dimensional analogues of rectangles on a curved surface. Each hyperboloid is represented by two vectors: a centroid vector, which defines the hyperboloid’s center, and a limit vector, which defines the distance from the center to the hyperboloid’s edge. The embedding of a multi-term query is the intersection (red polygon) of the embeddings of its component terms.

In experiments, we compared this approach to nine different methods that use vector embeddings and one method that embeds data as rectangular boxes in Euclidean space (essentially, non-curved versions of hyperboloids).

We used two different datasets and five different measures of retrieval accuracy and found that our approach was the best performer across the board. In some cases, the improvements were dramatic: as much as 33% relative to the best vector embedding method and 27% relative to the Euclidean box embedding.

Our approach also aids in model interpretability, as we use an attention mechanism to determine which elements of a query string are most relevant to which attributes of a product. The attention values for a given query provide an easy way to visualize the model’s rationale for selecting a certain product.

For instance, one experiment showed that when the query included the phrase “daily moisturizer”, the model attended to the word “moisturizer” when selecting products that had the word “lotion” in their titles.

#### **Hyperbolic embeddings**

An advantage of both Euclidean box embeddings and hyperbolic embeddings is that they can expand and contract according to the generality of a query. With either approach, for instance, the embedding corresponding to the query “Fire” (which would also encompass Fire tablets and Fire cubes) would be larger than the embedding corresponding to the query “Fire TV”.

By the same token, both approaches offer an efficient way to combine queries. For instance, the embedding of the query “Fire TV stick with Alexa” would be the intersection of the embeddings corresponding to “Fire TV stick” and “Alexa”, while the embedding of the query “Fire or Kindle” would be the union of the embeddings for “Fire” and “Kindle”.

Where hyperbolic space has an advantage over Euclidean space is in representing hierarchies. Hyperbolic space is intrinsically curved, which means it gives you the representational capacity of curvature for free.

For instance, a hierarchical tree can be mapped onto a ball such that the root of the tree is at the center of the ball, its leaf nodes are on the surface, and the other layers of the tree fall at regular distances in between. In Euclidean space, representing that ball requires three dimensions, but in hyperbolic space, it requires only two. This dimensionality reduction enables hyperboloids to model hierarchical relationships efficiently, even when the hierarchical trees are enormous.

In our paper, we define hyperboloids using two vectors: one vector indicates the center (centroid) of the hyperboloid, and the other indicates the distance from the centroid to the hyperboloid’s edge. This compact representation further increases the efficiency of computing in hyperbolic space.

#### **The model**

Our machine learning model takes as inputs both a product query and the titles of candidate products. All the input texts are then broken into overlapping three-character chunks, or trigrams.

An encoder maps the trigrams, for both query and products, to hyperbolic space. The query mappings are hyperboloids, while the product mappings are hyperbolic vectors. An intersection layer then produces a new set of hyperboloids by finding the intersection of every pair of trigram embeddings from the query.

Both the query trigrams and their intersections then pass to an attention layer, which, during training, learns which query elements are most relevant to which product titles. The embedding of each product title also passes to a self-attention layer, which learns which title elements tend to be most pertinent to product retrieval queries.

![image.png](https://dev-media.amazoncloud.cn/7e47ff8b2d824f35bd3d4f464f00767e_image.png)

The ANTHEM architecture.

From the attention values, the model computes a new set of vectors, representing the centroids of new query embedding hyperboloids and new embeddings of product titles, all biased toward the features the attention model identifies as most important. The intersection of hyperboloids and product vectors determines which products are presented to the customer and in what order.

Note that we don’t train the model directly on representations of data hierarchies. To the extent that it is using hierarchical relationships, it simply learns them from training data.

![image.png](https://dev-media.amazoncloud.cn/a5328e62c8a2426ebe3fe8522cda89ea_image.png)

The weights computed by the attention mechanism provide a way to visualize the rationales for the product retrieval model’s decisions. In these figures, the y-axis represents trigrams of the query, and the x-axis represents trigrams of the product title. Sometimes, the mechanism finds lexical matches, such as “leatherer” with “leatherer” in the first grid. But often, the matches are semantic, such as “lotion” with “moisturizer” or “driver” with “clubs”.

In our experiments, we measured the performance of our model and ten baselines using five metrics. Three of the metrics were variations of normalized discounted cumulative gain (NDCG), which considers not only how many relevant results are contained in the top N but how highly they rank. We measured NDCG for the top three results (NDCG@3), the top five (NDCG@5), and the top 10 (NDCG@10). We also used mean average precision, which averages the precision (the fraction of results that are relevant) measured at the rank of each relevant result, and mean reciprocal rank, which scores each query by the reciprocal of the rank at which its first relevant result appears.

![image.png](https://dev-media.amazoncloud.cn/3c0e5d7f3f2943f6a458557fa7e51150_image.png)

ANTHEM’s experimental results.

As can be seen, on all five measures, on both a public dataset and a private dataset, our model, which we call ANTHEM (for AtteNTive Hyperbolic Entity Model), yielded the best results. On the private dataset, the gains over the best-performing vector embedding model (BERT) were consistently around 30%. On the public dataset, they were consistently around 9%.

Relative to the model that used Euclidean box embeddings (E-ANTHEM), the greatest gains came on NDCG@10: 21% on the private dataset and 8% on the public one. This is likely because of the hierarchical information that ANTHEM captures. That is, Euclidean embeddings may do a good job of finding the top matches, but ANTHEM does a better job of exploring the hierarchical product categories those matches belong to.

ABOUT THE AUTHORS

#### **[Nikhil Rao](https://www.amazon.science/author/nikhil-rao)**

Nikhil Rao is an Amazon applied scientist developing AI systems for shopping and discovery.

#### **[Chandan K. Reddy](https://www.amazon.science/author/chandan-k.-reddy)**

Chandan K. Reddy is an Amazon Scholar and a professor of computer science at Virginia Tech.
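
To make the query geometry from the “Hyperbolic embeddings” section more concrete, here is a minimal sketch of the centroid-plus-limit representation, containment, and intersection. It is deliberately a flat, Euclidean simplification (closer in spirit to the Euclidean box baseline than to ANTHEM itself) and does not implement the hyperbolic operations from the paper. The names `Box`, `contains`, and `intersect`, as well as the two-dimensional coordinates, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box stored as a centroid and a per-dimension limit.

    Assumption: a Euclidean stand-in for a hyperboloid; the limit is the
    distance from the centroid to the edge along each dimension.
    """
    centroid: Tuple[float, ...]
    limit: Tuple[float, ...]

    def bounds(self) -> Tuple[Tuple[float, ...], Tuple[float, ...]]:
        """Return the lower and upper corners of the box."""
        lower = tuple(c - l for c, l in zip(self.centroid, self.limit))
        upper = tuple(c + l for c, l in zip(self.centroid, self.limit))
        return lower, upper

    def contains(self, point: Sequence[float]) -> bool:
        """A product vector matches the query if it lies inside the box."""
        lower, upper = self.bounds()
        return all(lo <= p <= hi for lo, p, hi in zip(lower, point, upper))


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Intersection of two boxes, or None if they do not overlap."""
    a_lo, a_hi = a.bounds()
    b_lo, b_hi = b.bounds()
    lo = tuple(max(x, y) for x, y in zip(a_lo, b_lo))
    hi = tuple(min(x, y) for x, y in zip(a_hi, b_hi))
    if any(l > h for l, h in zip(lo, hi)):
        return None
    centroid = tuple((l + h) / 2 for l, h in zip(lo, hi))
    limit = tuple((h - l) / 2 for l, h in zip(lo, hi))
    return Box(centroid, limit)


# Made-up 2-D embeddings for two query terms (illustrative values only).
fire_tv_stick = Box(centroid=(0.0, 0.0), limit=(2.0, 1.0))
alexa = Box(centroid=(1.5, 0.5), limit=(1.0, 1.0))

# "Fire TV stick with Alexa" as the intersection of its component terms.
combined = intersect(fire_tv_stick, alexa)
```

A broader query corresponds to a larger `limit`, and a conjunctive query can only shrink the region, which mirrors how the embedding for “Fire” is described as larger than the one for “Fire TV”.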

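The ranking metrics used in the experiments can be sketched in a few lines. This is an illustrative implementation, not the evaluation code behind the reported numbers: it assumes linear gain in the discounted cumulative gain (some definitions use an exponential gain instead), and the function names and the example relevance list are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain assumed)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG normalized by the DCG of the ideal (best possible) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def reciprocal_rank(relevances: Sequence[float]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


# Relevance labels of one query's results, in the order the model ranked them.
ranked = [0, 1, 1]
ndcg3 = ndcg_at_k(ranked, k=3)   # below 1.0: a relevant item was ranked too low
rr = reciprocal_rank(ranked)     # 0.5: the first relevant result is at rank 2
```

Mean reciprocal rank is then the average of `reciprocal_rank` over all queries, and NDCG@3, NDCG@5, and NDCG@10 differ only in the value of `k`.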