Abstract
We propose a retrieval architecture for recommender systems in e-commerce applications, based on a multimodal representation of the items of interest (textual descriptions and images of the products), paired with a locality-sensitive hashing (LSH) indexing scheme for fast retrieval of potential recommendations.
In particular, we learn a latent multimodal representation of the items using the CLIP architecture, which combines text and images in a contrastive way. The resulting item embeddings are then searched by means of different types of LSH.
We report on the experiments we performed on two real-world datasets from e-commerce sites, containing both images and textual descriptions of the products.
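
To illustrate the indexing step, here is a minimal sketch of one common LSH family, random-hyperplane (sign random projection) hashing for cosine similarity. The abstract does not say which LSH variants were evaluated, so this choice is an assumption. The vectors are hand-written toy stand-ins for CLIP item embeddings, and every name (`make_hyperplanes`, `signature`, `build_index`, `query`) is hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: the side of the plane the vector falls on."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def build_index(items, hyperplanes):
    """Group item ids into buckets keyed by their bit signature."""
    buckets = defaultdict(list)
    for item_id, vec in items.items():
        buckets[signature(vec, hyperplanes)].append(item_id)
    return buckets


def query(vec, buckets, hyperplanes):
    """Return the candidate item ids that share the query's bucket."""
    return buckets.get(signature(vec, hyperplanes), [])


# Toy 4-dimensional vectors standing in for item embeddings (assumption:
# real CLIP embeddings have hundreds of dimensions).
items = {
    "shirt_red": [0.9, 0.1, 0.0, 0.2],
    "shirt_blue": [0.8, 0.2, 0.1, 0.3],
    "lamp": [-0.5, 0.7, 0.6, -0.1],
}
planes = make_hyperplanes(num_bits=8, dim=4)
buckets = build_index(items, planes)
candidates = query(items["lamp"], buckets, planes)
```

Items whose embeddings point in similar directions tend to share a signature, so a query only needs to be compared against the candidates in its own bucket. Practical systems usually build several such tables to trade recall against lookup cost.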
Original language | English |
---|---|
Pages | 52-60 |
Number of pages | 9 |
DOI | |
Publication status | Published - 2022 |
Event | ISMIS 2022 - Cosenza. Duration: 1 Jan 2022 → … |
Conference | ISMIS 2022 |
---|---|
City | Cosenza |
Period | 1/01/22 → … |
Keywords
- Multimodal embeddings
- Recommender systems
- Locality Sensitive Hashing