Abstract
We propose a retrieval architecture for recommender systems in e-commerce applications. It is based on a multimodal representation of the items of interest (textual descriptions and images of the products), paired with a locality-sensitive hashing (LSH) indexing scheme for fast retrieval of potential recommendations. In particular, we learn a latent multimodal representation of the items using the CLIP architecture, which combines text and images in a contrastive way. The resulting item embeddings are then searched by means of different types of LSH.
We report on the experiments we performed on two real-world datasets from e-commerce sites, containing both images and textual descriptions of the products.
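To illustrate the retrieval step, the following is a minimal sketch of one common LSH variant, random-hyperplane (sign) hashing for cosine similarity. The abstract does not say which LSH types were used, so this choice is an assumption. The class `HyperplaneLSH`, its parameters (`n_bits`, `seed`), and the random vectors standing in for CLIP item embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np
from collections import defaultdict


class HyperplaneLSH:
    """Single-table random-hyperplane LSH (illustrative, not the paper's code)."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        # Each row is the normal vector of one random hyperplane.
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = defaultdict(list)  # hash key -> list of item ids
        self.items = {}                   # item id -> unit-norm embedding

    def _key(self, v):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return ((self.planes @ v) > 0).tobytes()

    def add(self, item_id, v):
        v = np.asarray(v, dtype=float)
        self.items[item_id] = v / np.linalg.norm(v)
        self.buckets[self._key(v)].append(item_id)

    def query(self, v, k=5):
        v = np.asarray(v, dtype=float)
        v = v / np.linalg.norm(v)
        # Only items in the query's bucket are scored, not the whole catalogue.
        candidates = self.buckets.get(self._key(v), [])
        scored = sorted(((float(self.items[i] @ v), i) for i in candidates),
                        reverse=True)
        return [i for _, i in scored[:k]]


# Stand-in data: random vectors in place of real multimodal item embeddings.
rng = np.random.default_rng(42)
embeddings = rng.standard_normal((1000, 64))

index = HyperplaneLSH(dim=64, n_bits=8)
for item_id, emb in enumerate(embeddings):
    index.add(item_id, emb)

# Querying with an indexed item's own embedding returns that item first,
# followed by other items that fell into the same bucket.
result = index.query(embeddings[3], k=5)
print(result)
```

With a single table, near neighbours that fall on the other side of any hyperplane are missed. Practical setups usually trade recall against speed by using several tables or by probing neighbouring buckets.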
| Original language | English |
|---|---|
| Pages | 52-60 |
| Number of pages | 9 |
| DOI | |
| Publication status | Published - 2022 |
| Event | ISMIS 2022 - Cosenza, Duration: 1 Jan 2022 → … |
Conference
| Conference | ISMIS 2022 |
|---|---|
| City | Cosenza |
| Period | 1/01/22 → … |
Keywords
- Multimodal embeddings
- Recommender systems
- Locality Sensitive Hashing
Fingerprint
Dive into the research topics of 'Multimodal Deep Learning and Fast Retrieval for Recommendation'. Together they form a unique fingerprint.