Advancing digital sensing in mental health research
These two workgroups addressed how to effectively organize the wide range of sensors, apps, and data being used globally and how to harmonize pipelines and methods for collecting data from devices, analyzing results, and sharing key findings (Box 1).
Device differences
Devices can be grouped by their intended use (clinical/research or consumer/wellness), although the lines between these uses are becoming more blurred every day6,13,14. Differences between devices, not only across categories but also between versions of the same device, have impeded efforts to replicate the results of digital behavioral sensing studies. Because these differences are usually not transparent to researchers, progress in the field depends on investigators gaining greater access to information on sensor components and specifications, as well as on the variables reported by each device. Differences between consumer platforms may be particularly important barriers to the equitable implementation of digital behavioral sensing, as the relative frequency of use of specific platforms varies considerably with geographic and sociodemographic factors.
Building standards for the types, formats, and parameters needed to adequately characterize data from consumer devices is one step toward meaningful comparison of data within and across studies. This process should leverage existing efforts, such as IEEE Standard 1752.1-202115, the IEEE Working Group on digital health (P1752 Open Mobile Health)16, DiMe’s Digital Health Measurement Collaborative Community (DATAcc)17, and Open mHealth18,19. To further facilitate community and collaboration, it may be valuable to develop a network of expertise and an online community accessible to those conducting research on digital sensing in mental health (along the lines of programs such as Biostars in the bioinformatics community20) and to build on communities formed through the National Science Foundation (NSF) Smart and Connected Communities program21.
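As a concrete illustration of what such a standard buys researchers, the minimal Python sketch below wraps raw heart-rate readings from two hypothetical vendors in a data point whose header/body layout follows the general shape of the Open mHealth schemas underlying IEEE 1752.1. The vendor payloads, device names, and exact field names are illustrative assumptions and should be verified against the published specifications.

```python
import uuid
from datetime import datetime, timezone

def make_heart_rate_datapoint(value_bpm: float, source_name: str) -> dict:
    """Wrap a raw heart-rate reading in an Open mHealth-style data point.

    The header/body layout follows the general shape of the published
    omh heart-rate schema; exact field names and versions should be
    checked against IEEE 1752.1 and the Open mHealth specifications.
    """
    now = datetime.now(timezone.utc).isoformat()
    return {
        "header": {
            "id": str(uuid.uuid4()),
            "creation_date_time": now,
            "schema_id": {"namespace": "omh", "name": "heart-rate", "version": "1.0"},
            # Provenance keeps device differences explicit and auditable.
            "acquisition_provenance": {"source_name": source_name, "modality": "sensed"},
        },
        "body": {
            "heart_rate": {"value": value_bpm, "unit": "beats/min"},
            "effective_time_frame": {"date_time": now},
        },
    }

# Two hypothetical vendors report the same quantity under different names;
# mapping both into the common schema makes the readings comparable.
# (Timestamp normalization is omitted for brevity.)
vendor_a = {"hr": 64}      # illustrative payload
vendor_b = {"bpm": 66.0}   # illustrative payload
harmonized = [
    make_heart_rate_datapoint(vendor_a["hr"], "VendorA Watch 3"),
    make_heart_rate_datapoint(vendor_b["bpm"], "VendorB Band 2"),
]
```

Once readings from heterogeneous devices share this envelope, downstream pipelines can compare them without device-specific parsing logic.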
Expensive infrastructure
Specifying standards is only a first step. The field will need accessible infrastructure to enable implementation of such standards in the process of large-scale data generation. Some current studies are collecting data on the scale of terabytes per participant over long time periods (in some cases several years), and the costs and complexity of this infrastructure are thus increasing rapidly. Efforts on this scale are not supportable with infrastructure that has largely developed on a bespoke, per-project basis, given rapid changes in technologies, application programming interfaces (APIs), and underlying data/communication standards. For this reason, scalable deployment of digital sensing in mental health research is now limited to a few sites with outsized resources and expertise. Even for such sites, the current landscape for funding does not adequately support the long-term maintenance of infrastructure or distribution of assets critical for research.
One mechanism for obtaining a more sustainable and equitable distribution of necessary infrastructure would be the formation of collaborative Centers of Excellence that focus on digital sensing for mental health; establish, support, and disseminate best practices in a sustainable manner; and curate shared datasets available to the scientific community. This strategy would incentivize groups to develop and maintain a set of reusable assets while advancing standards in the field. Key initiatives funded under this approach would be: 1) creating benchmark datasets; 2) maintaining a network of sites with interdisciplinary expertise; and 3) “priming the pump” for research through training and infrastructure buildout.
The creation of large benchmark datasets for digital sensing in mental health would simultaneously establish best practices that prioritize data privacy and sensitivity, while facilitating their dissemination to the research community. While mental health digital sensing data present distinct privacy concerns, other fields, such as genomics, have established standard practices for widespread sharing of similarly sensitive data. As device manufacturers may play a role in the generation of benchmark datasets, it will be important to engage these companies, from the outset of studies, in developing data sharing plans.
The interdisciplinary infrastructure of Centers of Excellence should initially focus on items that will remain scalable, such as common APIs and endpoints, and extend to include software for data aggregation and multimodal visualization (e.g., the Digital Biomarker Discovery Project22). Additionally, relying on postdoctoral scholars and graduate students to maintain software infrastructure at this scale is not feasible; Centers of Excellence will therefore need funding for research engineers over multiyear timespans. Importantly, this framework should reward investigators who enable reproducible research, prioritize underserved and underrepresented groups, and implement the FAIR23 (Findable, Accessible, Interoperable, and Reusable) and TRUST24 (Transparency, Responsibility, User focus, Sustainability, and Technology) principles.
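To make the data aggregation task concrete, the following minimal sketch aligns two sensing modalities sampled at different rates onto a common time grid, a routine step before multimodal visualization. The stream names, timestamps, and five-minute window are illustrative choices, not the API of any existing platform such as the Digital Biomarker Discovery Project.

```python
import pandas as pd

# Illustrative streams: heart rate and step counts arriving at
# irregular, modality-specific times.
heart_rate = pd.DataFrame(
    {"bpm": [62, 64, 70, 75]},
    index=pd.to_datetime(["2024-05-01 09:00", "2024-05-01 09:01",
                          "2024-05-01 09:04", "2024-05-01 09:07"]),
)
steps = pd.DataFrame(
    {"steps": [0, 12, 40, 55, 10]},
    index=pd.to_datetime(["2024-05-01 09:00:30", "2024-05-01 09:02:00",
                          "2024-05-01 09:03:30", "2024-05-01 09:05:00",
                          "2024-05-01 09:06:30"]),
)

# Resample each stream with a modality-appropriate summary, then join
# on the shared 5-minute grid for side-by-side analysis or plotting.
aligned = pd.concat(
    [heart_rate["bpm"].resample("5min").mean(),   # mean heart rate per window
     steps["steps"].resample("5min").sum()],      # total steps per window
    axis=1,
)
print(aligned)
```

Even this simple alignment embeds analytic decisions (window length, summary statistic) that shared infrastructure should document and standardize rather than leave to each project.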
Opaque data sources
Device developers must navigate between the need for financially sustainable business models (e.g., proprietary and marketable technologies) and the research community’s demands for transparency in how their products function. The workgroup participants were emphatic that fostering reproducible science does not require developers to reveal all of their proprietary technology; indeed, it could strengthen their claims regarding the validity and efficacy of their devices. The participants further suggested that facilitating more extensive formal collaborations, throughout the process of developing and testing hardware and apps, could help bridge the gap between industry and academic researchers. One of the workshop sponsors, the NIMH, has accepted this suggestion: in a recent announcement of a new funding opportunity, “Standardizing Data and Metadata from Wearable Devices,” it strongly encouraged the inclusion of device manufacturers in research teams responding to the announcement.
Reaching a state where we understand how specific device hardware and software versions influence measurements of interest (and what those measurements are) will require channels of communication between digital device companies, researchers, clinicians, funders, and users with lived experience. Continuing workshops such as the one that generated this report, and leveraging existing venues where funding agencies, industry, and academic groups already meet, can foster such engagement. These discussions can yield three main “products” that will facilitate scalable research: 1) information on how software and hardware versioning and updates influence the sensor data that researchers gather; 2) well-documented APIs for gathering digital sensing data; and 3) standards for what constitutes data fit for research25. Additionally, because current consumer devices may lack some of the sensors with the greatest potential to advance our understanding of mental health, iterative discussions between device developers and clinicians, academics, and those with lived experience could increase the chances that mental health applications are considered when manufacturers change the sensors on such devices.
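For the first of these products, one low-cost practice researchers can adopt today is to persist hardware, firmware, and algorithm versions alongside every reading, so that analysts can later stratify by, or adjust for, updates. The sketch below shows one way to structure such records; the field names and version strings are illustrative assumptions, not a published standard.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class SensorReading:
    """A sensor value stored with the version context that produced it.

    Keeping versions next to values lets analysts detect and model
    measurement shifts introduced by firmware or algorithm updates.
    """
    timestamp_utc: str
    variable: str            # e.g., "heart_rate"
    value: float
    unit: str
    device_model: str        # hardware identity
    firmware_version: str    # updates may alter raw sensing behavior
    algorithm_version: str   # updates may alter derived measures

reading = SensorReading(
    timestamp_utc="2024-05-01T09:00:00Z",
    variable="heart_rate",
    value=64.0,
    unit="beats/min",
    device_model="VendorA Watch 3",
    firmware_version="10.2.1",
    algorithm_version="hr-2.4",
)
record = asdict(reading)  # ready to serialize into a study database
```

Version fields like these are only interpretable if manufacturers disclose what changed between versions, which is precisely the kind of documentation the proposed channels of communication should secure.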