Surveys or digital trace data, which one should we use? Using MultiTrait-MultiMethod models to simultaneously estimate the measurement quality of surveys and digital trace data


Description: Measuring what people do online is crucial across all areas of social science research. Although self-reports are still the main instrument to measure online behaviours, there is evidence to doubt about their validity. Consequently, researchers are increasingly relying on digital trace data to measure online phenomena, assuming that it will lead to higher quality statistics. Recent evidence, nonetheless, suggests that digital trace data is also affected by measurement errors, questioning its gold standard status. Therefore, it is essential to understand the size of the measurement errors in digital trace data, and when it might be best to use each data source.

To this aim, we adapt the Generalised MultiTrait-MultiMethod (GMTMM) model created by Oberski et al. (2017) to simultaneously estimate the measurement errors in survey and digital trace data. The GTMM allows both survey and digital trace data to contain random and systematic measurement errors, while accommodating the specific characteristics of digital trace data (i.e., zero-inflation).

To simultaneously assess the measurement quality of both sources of data, we use survey and digital trace data linked at the individual level (N = 1,200), collected using a metered online opt-in panel in Spain. Using this data, we conducted three separate GMTMM models focusing on the measurement quality of survey and digital trace data when measuring three different types of online behaviours: news media exposure, online communication and entertainment. Specifically, for each type of behaviour, we measured three simple concepts (e.g., time spent reading articles about politics and current affairs) with both survey self-reports and digital traces. For each simple concept, we present the reliability and method effects of each data source.

Results provide needed evidence about the size of digital trace data errors, as well as when the use of self-reports might be justified.

Download the presentation