TY - JOUR
T1 - Estimating the outcome of spreading processes on networks with incomplete information
T2 - A dimensionality reduction approach
AU - Sapienza, Anna
AU - Barrat, Alain
AU - Cattuto, Ciro
AU - Gauvin, Laetitia
N1 - Publisher Copyright: © 2018 American Physical Society.
PY - 2018/7/30
Y1 - 2018/7/30
N2 - Recent advances in data collection have facilitated the access to time-resolved human proximity data that can conveniently be represented as temporal networks of contacts between individuals. While the structural and dynamical information revealed by this type of data is fundamental to investigate how information or diseases propagate in a population, data often suffer from incompleteness, which possibly leads to biased estimations in data-driven models. A major challenge is thus to estimate the outcome of spreading processes occurring on temporal networks built from partial information. To cope with this problem, we devise an approach based on non-negative tensor factorization, a dimensionality reduction technique from multilinear algebra. The key idea is to learn a low-dimensional representation of the temporal network built from partial information and to use it to construct a surrogate network similar to the complete original network. To test our method, we consider several human-proximity networks, on which we perform resampling experiments to simulate a loss of data. Using our approach on the resulting partial networks, we build a surrogate version of the complete network for each. We then compare the outcome of a spreading process on the complete networks (nonaltered by a loss of data) and on the surrogate networks. We observe that the epidemic sizes obtained using the surrogate networks are in good agreement with those measured on the complete networks. Finally, we propose an extension of our framework that can leverage additional data, when available, to improve the surrogate network when the data loss is particularly large.
AB - Recent advances in data collection have facilitated the access to time-resolved human proximity data that can conveniently be represented as temporal networks of contacts between individuals. While the structural and dynamical information revealed by this type of data is fundamental to investigate how information or diseases propagate in a population, data often suffer from incompleteness, which possibly leads to biased estimations in data-driven models. A major challenge is thus to estimate the outcome of spreading processes occurring on temporal networks built from partial information. To cope with this problem, we devise an approach based on non-negative tensor factorization, a dimensionality reduction technique from multilinear algebra. The key idea is to learn a low-dimensional representation of the temporal network built from partial information and to use it to construct a surrogate network similar to the complete original network. To test our method, we consider several human-proximity networks, on which we perform resampling experiments to simulate a loss of data. Using our approach on the resulting partial networks, we build a surrogate version of the complete network for each. We then compare the outcome of a spreading process on the complete networks (nonaltered by a loss of data) and on the surrogate networks. We observe that the epidemic sizes obtained using the surrogate networks are in good agreement with those measured on the complete networks. Finally, we propose an extension of our framework that can leverage additional data, when available, to improve the surrogate network when the data loss is particularly large.
UR - http://www.scopus.com/inward/record.url?scp=85051231359&partnerID=8YFLogxK
U2 - 10.1103/PhysRevE.98.012317
DO - 10.1103/PhysRevE.98.012317
M3 - Article
SN - 2470-0045
VL - 98
JO - Physical Review E
JF - Physical Review E
IS - 1
M1 - 012317
ER -