ShahinSHH / COVID-CT-MD

Licence: other
A COVID-19 CT Scan Dataset Applicable in Machine Learning and Deep Learning


COVID-CT-MD

The COVID-CT-MD dataset contains volumetric chest CT scans (DICOM files) of 169 patients positive for COVID-19 infection, 60 patients with Community-Acquired Pneumonia (CAP), and 76 normal patients. Diagnosis of COVID-19 infection is based on positive real-time Reverse Transcription Polymerase Chain Reaction (rRT-PCR) test results, clinical parameters, and CT scan manifestations identified by three experienced thoracic radiologists. Diagnosis of CAP and normal cases was confirmed using clinical laboratory tests and CT scans. A subset of 54 COVID-19 and 25 CAP cases was analyzed by the radiologists to identify and label slices with evidence of infection. The labeled subset contains 4,957 slices demonstrating infection and 18,392 slices without evidence of infection.
We are working closely with our collaborators in medical centers to collect more CT scans and introduce a larger Multi-Centre COVID-19 dataset that supports a wider range of research. This dataset will be made publicly available in the near future.

Links

The COVID-CT-MD dataset is accessible through Figshare. The associated clinical data and the labels from all three radiologists are available at the same link.

The detailed description of the dataset is available at https://www.nature.com/articles/s41597-021-00900-3

UPDATE 1 (Sep 8, 2021)

After further review of two cases (P001 and P006), our team has decided to update the labels associated with them. Updated labels can be accessed through the following files:

  • Slice-level-labels-updated-1.npy
  • Lobe-level-labels-updated-1.npy

Although the updated files contain more accurate lobe-level and slice-level labels for these two cases, the changes are minor: DL models developed with the original labels (Slice-level-labels, Lobe-level-labels) and those developed with the updated ones do not differ significantly.
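
To see which cases differ between the original and updated label files, one option is to compare the arrays case by case. The sketch below assumes both files load as NumPy arrays of identical shape with the case index on the first axis. The function name `changed_cases` is hypothetical, and mapping the returned indices back to patient IDs such as P001 still goes through Index.csv.

```python
import numpy as np


def changed_cases(original, updated):
    """Return the case indices (first axis) whose labels differ between two arrays."""
    original = np.asarray(original)
    updated = np.asarray(updated)
    if original.shape != updated.shape:
        raise ValueError(f"shape mismatch: {original.shape} vs {updated.shape}")
    # Flatten everything except the case axis, then flag cases with any difference.
    differs = (original != updated).reshape(original.shape[0], -1).any(axis=1)
    return np.flatnonzero(differs).tolist()


# Example usage (file names from this README, assumed to be in the working directory):
# old = np.load("Slice-level-labels.npy")
# new = np.load("Slice-level-labels-updated-1.npy")
# print(changed_cases(old, new))
```

The same function works for the 3D lobe-level arrays, since every axis after the first is flattened before the comparison.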

Supplementary Information

  • Collection Dates: COVID-19 cases were collected from February 2020 to April 2020, whereas CAP cases and normal cases were collected from April 2018 to December 2019 and from January 2019 to May 2020, respectively.
  • De-Identification: To respect the patients’ privacy and comply with the DICOM supplement 142 (Clinical Trial De-identification Profiles), all the CT studies in our dataset have been de-identified and only gender and age of the patients are preserved in the dataset.
  • Data Distribution: The brief distribution of the COVID-CT-MD dataset is shown in the following table:
    Group      Cases   Sex           Age (years)
    COVID-19   169     108 M/61 F    51.96 ± 14.39
    CAP        60      35 M/25 F     57.7 ± 21.7
    Normal     76      40 M/36 F     43.4 ± 14.1

Data Structure and Sample

A small sample of the dataset is available in the "Sample data" folder including DICOM files of two patients in each category to provide a quick insight into the dataset.

The hierarchical list below shows the structure of the COVID-CT-MD dataset shared through Figshare. COVID-19, CAP, and Normal subjects are placed in separate folders. Within each of these, every patient has a folder containing that patient's CT scan slices in DICOM format.

  • Main Folder
    • COVID-19 subjects
      • Subject-ID
        • Slice-ID.dcm
    • CAP subjects
      • Subject-ID
        • Slice-ID.dcm
    • Normal subjects
      • Subject-ID
        • Slice-ID.dcm
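
The folder layout above can be indexed with a short script. This is a minimal sketch using only the standard library; the function names `index_dataset` and `summarize` are hypothetical, and the code assumes that every directory directly under the main folder is a category folder and that slice files end in `.dcm`.

```python
from pathlib import Path


def index_dataset(root):
    """Map category folder -> subject folder -> sorted list of .dcm file paths."""
    index = {}
    for category_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        subjects = {}
        for subject_dir in sorted(p for p in category_dir.iterdir() if p.is_dir()):
            # File-name order only; this is NOT the anatomical slice order.
            subjects[subject_dir.name] = sorted(subject_dir.glob("*.dcm"))
        index[category_dir.name] = subjects
    return index


def summarize(index):
    """Return {category: (number of subjects, total number of slices)}."""
    return {
        category: (len(subjects), sum(len(files) for files in subjects.values()))
        for category, subjects in index.items()
    }
```

Calling `summarize(index_dataset("path/to/main/folder"))` gives a quick check that the number of subjects per category agrees with the data distribution table.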

NOTE: The correct order of slices in a CT scan does not necessarily follow the order of the Slice-IDs. When reading the data, sort the slices by the “Slice Location” value stored in each DICOM file, which is accessible through the following DICOM tag:
(0020,1041) - DS - Slice Location
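
The sorting step in the note can be sketched with pydicom as follows. `pydicom.dcmread` and the `SliceLocation` attribute are part of pydicom; the helper names `sort_slices` and `load_sorted_series` are hypothetical. The README does not state whether ascending or descending Slice Location corresponds to the label indices, so the ascending order used here is an assumption to verify against a labeled case.

```python
from pathlib import Path


def sort_slices(datasets):
    """Return datasets ordered by ascending Slice Location (0020,1041)."""
    # Assumption: every dataset carries the tag; a missing tag raises AttributeError.
    return sorted(datasets, key=lambda ds: float(ds.SliceLocation))


def load_sorted_series(subject_dir):
    """Read all .dcm files of one subject folder and return them in slice order."""
    import pydicom  # third-party; imported here so sort_slices works without it

    datasets = [pydicom.dcmread(path) for path in Path(subject_dir).glob("*.dcm")]
    return sort_slices(datasets)
```

Keeping the sort separate from file reading makes it possible to check the ordering logic on small stand-in objects before running it on a full CT volume.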

Labels

  • Index.csv : lists the patients that have slice-level and lobe-level labels. The indices assigned to patients in “Index.csv” are used in “Slice-level-labels.npy” and “Lobe-level-labels.npy” to locate each patient's slice and lobe labels.
  • Slice-level-labels.npy : a 2D binary NumPy array in which the existence of infection in a specific slice is indicated by 1 and the lack of infection by 0. The first dimension represents the case index and the second represents the slice number.
  • Lobe-level-labels.npy : a 3D binary NumPy array in which the existence of infection in a specific lobe and slice is indicated by 1 in the corresponding element of the array. As in the slice-level array, the first two dimensions represent the case index and the slice number, respectively. The third dimension represents the lobe index, defined as follows:
    • 0 : Left Lower Lobe (LLL)
    • 1 : Left Upper Lobe (LUL)
    • 2 : Right Lower Lobe (RLL)
    • 3 : Right Middle Lobe (RML)
    • 4 : Right Upper Lobe (RUL)
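
As an illustration of the array layout described above, the following sketch summarizes the labels per case. It assumes the arrays have shapes (cases, slices) and (cases, slices, 5) as described; the function names and the `LOBE_NAMES` tuple are hypothetical helpers, not part of the dataset package.

```python
import numpy as np

# Lobe indices 0-4 in the order listed above.
LOBE_NAMES = ("LLL", "LUL", "RLL", "RML", "RUL")


def infected_slice_counts(slice_labels):
    """Number of slices labeled 1 for each case (sum over the slice axis)."""
    slice_labels = np.asarray(slice_labels)
    return slice_labels.sum(axis=1).astype(int).tolist()


def lobe_involvement(lobe_labels, case_index):
    """Number of slices in which each lobe is labeled 1, for one case."""
    case = np.asarray(lobe_labels)[case_index]  # shape: (slices, 5)
    counts = case.sum(axis=0).astype(int)
    return dict(zip(LOBE_NAMES, counts.tolist()))


# Example usage (file names from this README, assumed to be in the working directory):
# slice_labels = np.load("Slice-level-labels.npy")
# lobe_labels = np.load("Lobe-level-labels.npy")
# print(infected_slice_counts(slice_labels)[:5])
# print(lobe_involvement(lobe_labels, case_index=0))
```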

IMPORTANT:

The labels match the images only if the slices are in the correct order. As noted above, this order doesn’t necessarily follow the Slice-IDs, so sort the slices by the “Slice Location” value when reading DICOM files:
(0020,1041) - DS - Slice Location
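
Once the slices of a case are sorted by Slice Location, each slice can be paired with its slice-level label by position. The sketch below is an illustration under two assumptions that this README does not confirm: the i-th entry of a case's label row belongs to the i-th sorted slice, and any entries beyond the number of slices are padding. The function name `pair_slices_with_labels` is hypothetical.

```python
def pair_slices_with_labels(sorted_slices, label_row):
    """Pair the i-th sorted slice with the i-th entry of the case's label row."""
    if len(label_row) < len(sorted_slices):
        raise ValueError("label row is shorter than the number of slices")
    # Assumption: entries beyond the number of slices are padding and are ignored.
    return [(s, int(label_row[i])) for i, s in enumerate(sorted_slices)]
```

The length check catches the most obvious mismatch, such as using the label row of the wrong case index.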

Statistical Analysis

"statistical_analysis.py" reproduces the statistical analysis provided in the data description.
Please note that your Python working directory should be set to the folder where you store the downloaded package.

Requirements:

  • pydicom
  • pandas
  • seaborn
  • tempfile (Python standard library)
  • os (Python standard library)
  • numpy
  • matplotlib
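
Before running the analysis, the requirements above can be checked with a few lines of standard-library code. This is a minimal sketch; the `REQUIRED` list simply mirrors the names in this README, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Module names as listed in the requirements above.
REQUIRED = ["pydicom", "pandas", "seaborn", "tempfile", "os", "numpy", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Anything reported as missing would need to be installed with the package manager of your choice before running "statistical_analysis.py".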

Citation

If you find this dataset and the related data description useful in your research, please consider citing:

@article{Afshar2021,
author = {Afshar, Parnian and Heidarian, Shahin and Enshaei, Nastaran and Naderkhani, Farnoosh and Rafiee, Moezedin Javad and Oikonomou, Anastasia and Fard, Faranak Babaki and Samimi, Kaveh and Plataniotis, Konstantinos N and Mohammadi, Arash},
doi = {10.1038/s41597-021-00900-3},
issn = {2052-4463},
journal = {Scientific Data},
number = {1},
pages = {121},
title = {{COVID-CT-MD, COVID-19 computed tomography scan dataset applicable in machine learning and deep learning}},
url = {https://doi.org/10.1038/s41597-021-00900-3},
volume = {8},
year = {2021}
}
