In an effort to help researchers find useful health data to fight the new coronavirus, Google has added its Search Trends symptoms dataset, which aggregates anonymised search trends for over 400 symptoms, to its Covid-19 Open Data repository.
This will help researchers better understand the spread of Covid-19 and its potential secondary health impacts.
Lack of access to useful high-quality data has posed a significant challenge, and much of the publicly available data is scattered, incomplete, or compiled in many different formats.
“To help researchers spend more of their time understanding the disease instead of wrangling data, we’ve developed a set of tools and processes to make it simpler for researchers to discover and work with normalized high-quality public datasets,” Katherine Chou, Director of Product Management, Google Health, said on Thursday.
The Covid-19 Open Data repository is a comprehensive, open-source resource of epidemiological data and related variables like economic indicators or population statistics from over 50 countries.
Each data source contains information on its origin and how it is processed, so that researchers can confirm its validity and reliability.
Covid-19 models need to account for uncertainty for their predictions to be reliable and useful.
“To help address this challenge, we’re providing researchers examples of how to implement bespoke epidemiological models using TensorFlow Probability (TFP), a library for building probabilistic models that can measure confidence in their own predictions,” Google said.
With TFP, researchers can use a range of data sources with different granularities, properties, or confidence levels, and factor that uncertainty into the overall prediction models.
This could be particularly useful in fine-tuning the increasingly complex models that epidemiologists are using to understand the spread of Covid-19, especially in gaining city- or county-level insights when only state- or national-level datasets exist.
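As a rough illustration of what factoring in the confidence of different data sources can mean, the sketch below combines two estimates of the same quantity by weighting each by its precision (the inverse of its variance). It uses plain Python rather than TFP and is not Google's method. The `Estimate` class, the `combine` function and all the numbers are hypothetical, and the approach assumes independent, roughly Gaussian errors.

```python
import math
from dataclasses import dataclass


@dataclass
class Estimate:
    """A point estimate with its uncertainty (hypothetical helper type)."""
    mean: float
    std: float  # standard deviation; larger means less confident


def combine(estimates):
    """Precision-weighted average of independent Gaussian estimates."""
    # Precision is 1 / variance, so confident sources get more weight.
    precisions = [1.0 / (e.std ** 2) for e in estimates]
    total = sum(precisions)
    mean = sum(p * e.mean for p, e in zip(precisions, estimates)) / total
    # The combined variance is the inverse of the summed precisions.
    return Estimate(mean=mean, std=math.sqrt(1.0 / total))


# Made-up daily case estimates for one region from two sources.
precise = Estimate(mean=100.0, std=10.0)  # e.g. a well-sampled source
noisy = Estimate(mean=140.0, std=20.0)    # e.g. a coarser, noisier source

combined = combine([precise, noisy])
print(f"combined estimate: {combined.mean:.1f} +/- {combined.std:.1f}")
```

The combined figure lands closer to the more confident source, and its uncertainty is smaller than that of either input. A probabilistic library such as TFP extends the same idea to full models rather than single numbers.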
Google researchers have also developed an open-source agent-based simulator that utilises real-world data to simulate populations to help public health organisations fine-tune their exposure notification parameters.
The Covid-19 Open Data repository also includes two Google datasets developed to help researchers study the impact of the disease in a privacy-preserving manner.