Why open data?

The COVID-19 pandemic has done severe harm to society, and overcoming it requires working together. Several months ago, we shared the first clinical COVID-19 cough dataset to spur innovation and collaboration.

Data source

Our data is reliable because it was collected at a hospital under physician supervision, following Standard Operating Procedures (SOPs). It is preprocessed and labeled with COVID-19 status (determined by PCR testing), along with patient demographics (age, gender, medical history).

Get started

We have provided 121 segmented clinical cough samples. The labels (COVID-19 PCR test status and patient demographics) can be found on GitHub.
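As a first step with the labels, it can help to check how many samples fall under each PCR status. The following is a minimal sketch, assuming the labels are available as a CSV table. The column names (`file_name`, `pcr_result`, `age`, `gender`) and the three rows are invented placeholders for illustration, not real patient data; the actual file layout is the one documented on GitHub.

```python
import csv
import io
from collections import Counter

# Placeholder metadata: column names and rows are hypothetical.
# Replace this with the contents of the real label file from GitHub.
SAMPLE_METADATA = """file_name,pcr_result,age,gender
cough_001.wav,positive,45,male
cough_002.wav,negative,32,female
cough_003.wav,positive,60,female
"""


def count_by_label(csv_text, label_column="pcr_result"):
    """Count how many rows carry each value of the label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


counts = count_by_label(SAMPLE_METADATA)
print(dict(counts))
```

Checking the class balance this way before training is useful with a dataset of this size, since a small sample set can easily be skewed toward one label.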


Our data was collected with informed patient consent under an IRB protocol at a university hospital.


We hope to build a collaborative community of AI researchers developing solutions for the pandemic. Please apply to join our community here!

Want to collaborate with us?

Our COVID-19 detection smartphone app has achieved state-of-the-art accuracy, and we are ready to share it freely with hospital networks and public health departments that are open to collaborating on refining and tuning our algorithm for local conditions. Our organization also relies on pro bono partner companies for support in key areas such as development and data privacy. Please contact us if your company can support our cause.