
IBM / covid19-india-data

Licence: MIT license
Publicly available structured COVID-19 data from India, extracted automatically from daily health bulletins published by state governments.


Projects that are alternatives of or similar to covid19-india-data

covid19-stream-processors
Stream Information & Example Applications for Processing JHU and CovidTracking.com COVID-19 data available as streams over Solace
Stars: ✭ 35 (+59.09%)
Mutual labels:  covid-19, covid19-data
coronavirus-dresden
Collects official SARS-CoV-2 infection statistics published by the city of Dresden.
Stars: ✭ 19 (-13.64%)
Mutual labels:  covid-19, covid19-data
corona
Look after yourselves...
Stars: ✭ 16 (-27.27%)
Mutual labels:  covid-19, covid19-data
PhoNER COVID19
COVID-19 Named Entity Recognition for Vietnamese (NAACL 2021)
Stars: ✭ 55 (+150%)
Mutual labels:  covid-19, covid19-data
covid19africa
Africa open COVID-19 data working group
Stars: ✭ 47 (+113.64%)
Mutual labels:  covid-19, covid19-data
coronainfobd
Real-time corona-virus tracker of Bangladesh 🇧🇩 which includes latest updates, data visualization, public awareness from WHO and some advice to aware people. 🥰❤
Stars: ✭ 46 (+109.09%)
Mutual labels:  covid-19, covid19-data
web-covid-api
🦠COVID-19 Coronavirus 🔥Tracker Dashboard and 🚀Super fast API's (< 200ms) 🆕Updates every 3 mins
Stars: ✭ 18 (-18.18%)
Mutual labels:  covid-19, covid19-data
open-data-covid-19
Open Data Repository for the Covid-19 dataset.
Stars: ✭ 19 (-13.64%)
Mutual labels:  covid-19, covid19-data
covid19 JHU dashboard
This script pulls the data from the JHU dashboard in real time
Stars: ✭ 12 (-45.45%)
Mutual labels:  covid-19, covid19-data
data
Collecting and organising COVID-19 data for Slovenia as they come in from various sources
Stars: ✭ 20 (-9.09%)
Mutual labels:  covid-19, covid19-data
covid19-timeseries
Covid19 timeseries data store
Stars: ✭ 38 (+72.73%)
Mutual labels:  covid-19, covid19-data
covid19-pr-api
COVID-19 Open API for Datasets in Puerto Rico
Stars: ✭ 21 (-4.55%)
Mutual labels:  covid-19, covid19-data
Covidview
A complete COVID-19 tracker cum dashboard website made by me.
Stars: ✭ 24 (+9.09%)
Mutual labels:  covid-19, covid19-data
covid-19
An app made with Flutter to track COVID-19 case counts.
Stars: ✭ 47 (+113.64%)
Mutual labels:  covid-19, covid19-data
COVID19
A web app to display the live graphical state-wise reported corona cases in India so far. It also shows the latest news for COVID-19. Stay Home, Stay Safe!
Stars: ✭ 122 (+454.55%)
Mutual labels:  covid-19, covid19-data
Co-ronaBD.info
Interactive Dashboard of Bangladesh for the Covid-19 Pandemic
Stars: ✭ 28 (+27.27%)
Mutual labels:  covid-19, covid19-data
covid-dashboard
Help welcomed if you have expertise in public health web technology, data modeling and munging, or visualization.
Stars: ✭ 106 (+381.82%)
Mutual labels:  covid-19, covid19-data
COVID-19-DETECTION
Detect Covid-19 with Chest X-Ray Data
Stars: ✭ 43 (+95.45%)
Mutual labels:  covid-19, covid19-data
covid-19-image-repository
Anonymized dataset of COVID-19 cases with a focus on radiological imaging. This includes images (x-ray / ct) with extensive metadata, such as admission-, ICU-, laboratory-, and patient master-data.
Stars: ✭ 42 (+90.91%)
Mutual labels:  covid-19, covid19-data
coronavirus-data
This repository contains data on Coronavirus Disease 2019 (COVID-19) in New York City (NYC), from the NYC Department of Health and Mental Hygiene.
Stars: ✭ 926 (+4109.09%)
Mutual labels:  covid-19, covid19-data

Covid-19 India Data 🇮🇳


Download data: CSV, JSON, Microsoft Excel, SQLite

Availability of COVID-19 data is crucial for researchers and policy makers to understand the progression of the pandemic and react to it in real time. Here is a recent plea from researchers in India for urgent access to COVID data collected by government agencies. Individual states and cities in India provide detailed information about the current COVID-19 situation in their respective locations through daily media bulletins. However, such data (usually in the form of PDF documents) is not readily accessible in structured form.

While there are fantastic crowd-sourced efforts underway to curate such data, manual approaches cannot scale to the volume of data produced over the long term. This project began in anticipation of that outcome; unfortunately, it has already come to pass.

Project Overview


In this project, we use AI-assisted document and image extraction techniques to automate the extraction of such data in structured (SQL) form from the state-level daily health bulletins, and we aim to make this data readily (and freely) available for further research and analysis. The target is to automate data extraction and curation for each Indian state, so that once the extraction process for a state is complete, we can be on "autopilot" for that state, requiring little to no continued manual curation (other than responding to changes in schema).
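As a rough illustration of what "structured (SQL) form" makes possible, the sketch below builds a tiny in-memory SQLite database and queries it. The table name `case_info`, its columns, and the numbers in it are all hypothetical placeholders chosen for this example; they are not the project's actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical per-state table."""
    conn = sqlite3.connect(":memory:")
    # Assumed layout for illustration only: one row per state per day.
    conn.execute(
        "CREATE TABLE case_info ("
        " state TEXT NOT NULL,"
        " date TEXT NOT NULL,"
        " cases_new INTEGER,"
        " PRIMARY KEY (state, date))"
    )
    # Made-up values, not real bulletin figures.
    rows = [
        ("DL", "2021-05-01", 100),
        ("DL", "2021-05-02", 120),
        ("GA", "2021-05-01", 10),
    ]
    conn.executemany("INSERT INTO case_info VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def total_new_cases(conn, state):
    """Sum newly reported cases for one state; 0 if the state has no rows."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(cases_new), 0) FROM case_info WHERE state = ?",
        (state,),
    )
    return cur.fetchone()[0]


conn = build_demo_db()
print(total_new_cases(conn, "DL"))
```

With a downloaded SQLite file, the same query pattern would apply after replacing `":memory:"` with the file path and the placeholder names with the real table and column names.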

Citing us

If you are using this data in your research, please remember to cite us. 🙏 Note that the list of authors will continue to grow over time with our OSS contributors. Please make sure to update the citation text in your future papers accordingly.

@inproceedings{agarwal2021covid,
  title={COVID-19 India Dataset: Parsing Detailed COVID-19 Data in Daily Health Bulletins from States in India},
  author={Mayank Agarwal and Tathagata Chakraborti and Sachin Grover and Arunima Chaudhary},
  booktitle={NeurIPS 2021 Workshop on Machine Learning in Public Health},
  year={2021}
}

Getting Started with the Code

There are two ways to get started:

The Backend

The most important part of this codebase is the data extraction pipeline, as described above.

  1. To set up your environment, follow the instructions here.
  2. To run the extraction pipeline, refer to instructions here.
  3. For a detailed walkthrough of using the pipeline end to end on a state, refer to our Wiki.

The Frontend

Secondary, but almost as important, is the landing page that allows users to access the data quickly and in different forms such as time series visualization, data tables, CSVs, APIs, etc. For instructions on how to contribute to the landing page, see here.

How to Contribute

The following are a few ways to get going. In general, you can pick up any unassigned issue, or issues tagged with help wanted, from the issue board.

Own a State


This is the biggest way you can contribute in the beginning stages of the project. "Owning a state" involves:

  1. Writing the data extraction code for the bulletins of the state. This repository provides the starting code and helper packages to make this as simple as possible. See here for instructions.

  2. Eventually reacting (or helping others react) to additions or changes in schema for the bulletins being put out by that state. The schemas have remained quite stable all this while but this issue may show up in a few states as the pandemic evolves.

For the project to succeed, this is the most crucial part. Once the data extraction code for a state is done, the logging of data for that state is automatic, and we can scale up to the rest of the country over time.
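To give a feel for what state-specific extraction code might involve, here is a minimal sketch. It assumes the bulletin text has already been pulled out of the PDF, and it uses invented field labels and names (`FIELD_PATTERNS`, `extract_fields`); it does not use the repository's helper packages and is not how the project's pipeline is actually implemented.

```python
import re

# Hypothetical field labels; real bulletins differ by state and over time.
FIELD_PATTERNS = {
    "cases_total": re.compile(
        r"Total\s+Positive\s+Cases\s*[:\-]\s*(\d[\d,]*)", re.IGNORECASE
    ),
    "deaths_total": re.compile(
        r"Total\s+Deaths\s*[:\-]\s*(\d[\d,]*)", re.IGNORECASE
    ),
}


def extract_fields(bulletin_text):
    """Map each known field to an int, or None when its label is absent."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(bulletin_text)
        if match:
            # Strip thousands separators before converting to int.
            record[name] = int(match.group(1).replace(",", ""))
        else:
            record[name] = None
    return record


sample = "Total Positive Cases: 1,234\nTotal Deaths - 56"
print(extract_fields(sample))
```

Returning `None` for a missing label, rather than raising, is one way to surface a possible schema change in a bulletin without halting the run for the other fields.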

😒 Data Cleaning

Data at this volume and timeline is bound to suffer from inconsistencies. We will be documenting these as and when we find them on the dedicated Anomalies Page. Help us:

  1. Remove missing data or handle missing values in the plots.
  2. Identify possible outliers and errors.
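The two cleaning tasks above can be sketched as follows, using only the Python standard library. The function names, the forward-fill strategy, and the median-absolute-deviation rule with a threshold of 5 are assumptions made for illustration, not the project's chosen method.

```python
from statistics import median


def fill_missing(values):
    """Forward-fill None entries with the most recent known value.

    Leading None entries stay None because nothing precedes them.
    """
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def flag_outliers(values, threshold=5.0):
    """Return indices whose distance from the median exceeds threshold * MAD."""
    known = [v for v in values if v is not None]
    if not known:
        return []
    med = median(known)
    mad = median(abs(v - med) for v in known)
    if mad == 0:
        # All known values sit on the median; nothing can be ranked.
        return []
    return [
        i
        for i, v in enumerate(values)
        if v is not None and abs(v - med) > threshold * mad
    ]


# Made-up daily counts with one gap and one suspicious spike.
series = [10, 12, None, 11, 500, 9]
print(fill_missing(series))
print(flag_outliers(series))
```

A flagged index is only a candidate for review: a spike may be a genuine backlog dump reported by a state rather than an extraction error.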

🤓 Analysis

Analyze the data for insights, irregularities, etc. You can publish the results of your analysis in your papers, blogs, etc. (and point to them from our landing page), or add them directly to our landing page, either as a standalone new page or in the existing Analysis Page. You can use the data to validate models developed for other countries or extend them to India [1] [2] [3]; to develop epidemiological models that integrate additional variables [4] [5] [6] [7]; or to understand various aspects of the pandemic in detail [8] [1] [9], among others.

💡 💡 💡 If you are looking for some concrete tasks to get started, find out more about Challenge Tasks here.

Current state roster

State | Link to Bulletin | Owner | Status
Andaman and Nicobar (AN) | Link | Own it! | #113
Arunachal Pradesh (AR) | Link | Own it! | #129
Assam (AS) | Link | Own it! | #130
Bihar (BR) | Link | Own it! | #126
Chhattisgarh (CT) | Link | Own it! | #131
Dadra and Nagar Haveli and Daman and Diu (DH) | Link | Own it! | #125
Delhi (DL) | Link | Mayank | COMPLETE Wiki
Goa (GA) | Link | Tathagata, Mayank | COMPLETE Wiki
Gujarat (GJ) | Link | Own it! | #121
Haryana (HR) | Link | Mayank | COMPLETE Wiki
Himachal Pradesh (HP) | Link | Own it! | #132
Jammu and Kashmir (JK) | Link | Own it! | #133
Karnataka (KA) | Link | Sushovan De, Mayank | 🚧 IN PROGRESS Wiki
Kerala (KL) | Link | Tathagata | 🚧 IN PROGRESS Wiki
Ladakh (LA) | Link | Own it! | #114
Madhya Pradesh (MP) | Link | Tathagata | 🚧 IN PROGRESS Wiki
Maharashtra (MH) | Link | Mayank | COMPLETE Wiki
Manipur (MN) | Link, Link | Own it! | #116
Meghalaya (ML) | Link | Own it! | #111
Mizoram (MZ) | Link | Own it! | #135
Nagaland (NL) | Link | Own it! | #124
Puducherry (PY) | Link | Own it! | #128
Punjab (PB) | Link | Sachin | COMPLETE Wiki
Odisha (OR) | Link | Own it! | #115
Rajasthan (RJ) | Link | |
Tamil Nadu (TN) | Link | Sachin, Tathagata | COMPLETE Wiki
Telengana (TG) | Link | Mayank | COMPLETE Wiki
Uttarakhand (UK) | Link, Link | Arunima | COMPLETE Wiki
Uttar Pradesh (UP) | Link | Own it! | #127
West Bengal (WB) | Link | Mayank | COMPLETE Wiki
Add new state

As you might have noticed, this is an incomplete list of Indian states. Not all states produce this form of data and not all bulletins are accessible. ☹️ We will continue adding new sources over time.

Interested? Join the Community

slack
