TY - JOUR
T1 - Multiple Systems Estimation for Sparse Capture Data
T2 - Inferential Challenges When There Are Nonoverlapping Lists
AU - Chan, Lax
AU - Silverman, Bernard W.
AU - Vincent, Kyle
N1 - Publisher Copyright: © 2020 The Author(s). Published with license by Taylor & Francis Group, LLC.
PY - 2021
Y1 - 2021
AB - Multiple systems estimation strategies have recently been applied to quantify hard-to-reach populations, particularly when estimating the number of victims of human trafficking and modern slavery. In such contexts, it is not uncommon to see sparse or even no overlap between some of the lists on which the estimates are based. These create difficulties in model fitting and selection, and we develop inference procedures to address these challenges. The approach is based on Poisson log-linear regression modeling. Issues investigated in detail include taking proper account of data sparsity in the estimation procedure, as well as the existence and identifiability of maximum likelihood estimates. A stepwise method for choosing the most suitable parameters is developed, together with a bootstrap approach to finding confidence intervals for the total population size. We apply the strategy to two empirical datasets of trafficking in US regions, and find that the approach results in stable, reasonable estimates. An accompanying R software implementation has been made publicly available. Supplementary materials for this article are available online.
KW - Human trafficking
KW - Log-linear models
KW - Mark-recapture
KW - Model identifiability
KW - Model selection
KW - Modern slavery
UR - http://www.scopus.com/inward/record.url?scp=85079698759&partnerID=8YFLogxK
U2 - 10.1080/01621459.2019.1708748
DO - 10.1080/01621459.2019.1708748
M3 - Article
SN - 0162-1459
VL - 116
SP - 1297
EP - 1306
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 535
ER -