When survey science met web tracking: presenting an error framework for metered data

Published in Journal of the Royal Statistical Society: Series A, 2022

Recommended citation: Bosch, O.J., Revilla, M. (2022) When survey science met web tracking: presenting an error framework for metered data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 1-29: https://doi.org/10.1111/rssa.12956

Abstract: Metered data, also called “web-tracking data”, are generally collected from a sample of participants who willingly install or configure, onto their devices, technologies that track digital traces left when people go online (e.g., URLs visited). Since metered data allow for the observation of online behaviours unobtrusively, it has been proposed as a useful tool to understand what people do online and what impacts this might have on online and offline phenomena. It is crucial, nevertheless, to understand its limitations. Although some research has explored the potential errors of metered data, a systematic categorisation and conceptualisation of these errors are missing. Inspired by the Total Survey Error, we present a Total Error framework for digital traces collected with Meters (TEM). The TEM framework (1) describes the data generation and the analysis process for metered data and (2) documents the sources of bias and variance that may arise in each step of this process. Using a case study we also show how the TEM can be applied in real life to identify, quantify and reduce metered data errors. Results suggest that metered data might indeed be affected by the error sources identified in our framework and, to some extent, biased. This framework can help improve the quality of both stand-alone metered data research projects, as well as foster the understanding of how and when survey and metered data can be combined.