Licence: other
Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides, BCNB Dataset

Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides

Grand-Challenge | Arxiv | Dataset Page | Tweet

This repo is the official implementation and dataset introduction of our paper "Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides".

Our paper has been accepted by Frontiers in Oncology; you can also access it from Arxiv or MedRxiv.

News

  • We launched a Grand Challenge: BCNB to promote relevant research.
  • We have released our data. Please visit the homepage for download information.

Abstract

  • Objectives: To develop and validate a deep learning (DL)-based primary tumor biopsy signature for predicting axillary lymph node (ALN) metastasis preoperatively in early breast cancer (EBC) patients with clinically negative ALN.

  • Methods: A total of 1,058 EBC patients with pathologically confirmed ALN status were enrolled from May 2010 to August 2020. A DL core-needle biopsy (DL-CNB) model was built on the attention-based multiple instance-learning (AMIL) framework to predict ALN status utilizing the DL features, which were extracted from the cancer areas of digitized whole-slide images (WSIs) of breast CNB specimens annotated by two pathologists. Accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curves, and areas under the ROC curve (AUCs) were analyzed to evaluate our model.

  • Results: The best-performing DL-CNB model with VGG16_BN as the feature extractor achieved an AUC of 0.816 (95% confidence interval (CI): 0.758, 0.865) in predicting positive ALN metastasis in the independent test cohort. Furthermore, our model incorporating the clinical data, which was called DL-CNB+C, yielded the best accuracy of 0.831 (95% CI: 0.775, 0.878), especially for patients younger than 50 years (AUC: 0.918, 95% CI: 0.825, 0.971). The interpretation of DL-CNB model showed that the top signatures most predictive of ALN metastasis were characterized by the nucleus features including density (p = 0.015), circumference (p = 0.009), circularity (p = 0.010), and orientation (p = 0.012).

  • Conclusion: Our study provides a novel DL-based biomarker on primary tumor CNB slides to predict the metastatic status of ALN preoperatively for patients with EBC.

Paper results

The results in our paper are computed using the cut-off value from the ROC curve. For convenient reference, we have also recomputed the classification results with the argmax prediction rule, where the threshold for binary classification is 0.5; the detailed recomputed results are here.
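The difference between the two decision rules can be illustrated with a minimal sketch. The function names and the example numbers are hypothetical, and selecting the ROC cut-off by Youden's J statistic is an assumption made here for illustration; the paper states only that a cut-off value from the ROC curve was used.

```python
def classify(probs, threshold=0.5):
    """Label a case N(+) (1) when its positive probability reaches the threshold.

    Ties count as positive here; that is a convention of this sketch.
    """
    return [1 if p >= threshold else 0 for p in probs]


def youden_cutoff(probs, labels):
    """Pick the threshold maximizing sensitivity + specificity - 1.

    Assumption: Youden's J is only one common way to choose a ROC cut-off.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(probs)):
        preds = classify(probs, t)
        tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
        tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t


# Made-up probabilities and labels, for illustration only.
example_probs = [0.2, 0.35, 0.4, 0.8]
example_labels = [0, 1, 1, 1]
fixed_rule = classify(example_probs)                 # threshold 0.5
roc_cutoff = youden_cutoff(example_probs, example_labels)
roc_rule = classify(example_probs, roc_cutoff)
```

With a fixed threshold of 0.5 the same probabilities can yield different accuracy, sensitivity, and specificity than with a data-driven cut-off, which is why the two sets of results are reported separately.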

The performance in prediction of ALN status (N0 vs. N(+))

N0 vs. N(+)

The performance in prediction of ALN status (N0 vs. N + (1-2))

N0 vs. N + (1-2)

The performance in prediction of ALN status (N0 vs. N + (>2))

N0 vs. N + (>2)

Implementation details

Data preparation

In all our experiments, the number of patches (N) in each bag is fixed at 10. The number of bags (M) per WSI is not fixed, during either training or testing, because it depends on the resolution of the WSI; according to our statistics, M ranges from 1 to 300. The process of dataset preparation is shown in the following figure, and the details are as follows:

  • First, we cut out the annotated tumor regions of each WSI; a WSI may contain multiple annotated tumor regions.

  • Then, each extracted tumor region is cropped into non-overlapping square patches of 256 × 256 pixels, and patches with a blank ratio greater than 0.3 are filtered out.

  • Finally, for each WSI, each bag is composed of 10 (N) randomly sampled patches, and leftover patches that cannot fill a complete bag are discarded.
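The last two steps can be sketched as follows. This is a minimal illustration, not the repository's preprocessing code: the function names, the `white_level` definition of a "blank" pixel, and the choice to drop partial patches at the region border are all assumptions.

```python
import numpy as np


def crop_patches(region, patch_size=256, max_blank_ratio=0.3, white_level=220):
    """Split an RGB tumor region into non-overlapping square patches.

    Assumption: a pixel is "blank" when every channel is >= white_level.
    Patches whose blank ratio exceeds max_blank_ratio are filtered out, and
    partial patches at the right and bottom borders are skipped.
    """
    height, width = region.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = region[top:top + patch_size, left:left + patch_size]
            blank_ratio = np.mean(np.all(patch >= white_level, axis=-1))
            if blank_ratio <= max_blank_ratio:
                patches.append(patch)
    return patches


def build_bags(patches, bag_size=10, seed=0):
    """Randomly group patches into bags of bag_size; leftovers are discarded."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(patches))
    n_bags = len(patches) // bag_size
    return [
        [patches[i] for i in order[b * bag_size:(b + 1) * bag_size]]
        for b in range(n_bags)
    ]
```

For example, a region that yields 37 valid patches produces 3 bags of 10 patches each, and the remaining 7 patches are dropped.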

The 5 clinical characteristics used in our experiments are age (numerical), tumor size (numerical), ER (categorical), PR (categorical), and HER2 (categorical); all of them are provided in our BCNB Dataset.

[Figure a: dataset preparation process]

Model testing

As mentioned above, a WSI is split into multiple bags, and each bag is fed into the MIL model to obtain predicted probabilities. To obtain the overall prediction for a WSI during testing, we average the predicted probabilities of all its bags; we call this step "Result Merging".

[Figure c: result merging during model testing]
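A minimal sketch of "Result Merging", assuming each bag yields one probability for the positive class. The function name and the 0.5 threshold argument are illustrative rather than taken from the repository.

```python
def merge_bag_predictions(bag_probs, threshold=0.5):
    """Average the per-bag positive probabilities of one WSI.

    Returns the WSI-level probability and the label obtained by thresholding it.
    """
    if not bag_probs:
        raise ValueError("a WSI needs at least one bag")
    wsi_prob = sum(bag_probs) / len(bag_probs)
    return wsi_prob, int(wsi_prob >= threshold)


# Three bags from one hypothetical WSI.
wsi_prob, wsi_label = merge_bag_predictions([0.9, 0.6, 0.3])
```

Averaging keeps the WSI-level score independent of the number of bags (M), which varies from slide to slide.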

Model code and pre-trained model

We have provided the model code and a pre-trained model for inference. The code borrows heavily from AttentionDeepMIL, which is implemented in PyTorch.
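For readers unfamiliar with attention-based MIL, the pooling step it relies on can be sketched in a few lines. This is a NumPy illustration of the general formulation (attention weights a = softmax(w^T tanh(V h_k)), bag feature z = sum of a_k h_k), not the repository's PyTorch code; the function name, the dimensions, and the random weights are placeholders.

```python
import numpy as np


def attention_pool(instance_features, V, w):
    """Attention-based MIL pooling over the instances (patches) of one bag.

    instance_features: (N, D) array, one feature vector per patch.
    V: (L, D) projection matrix; w: (L,) scoring vector.
    Returns the (D,) bag feature and the (N,) attention weights.
    """
    scores = np.tanh(instance_features @ V.T) @ w   # one score per patch, (N,)
    scores = scores - scores.max()                  # stabilize the softmax
    weights = np.exp(scores) / np.exp(scores).sum() # non-negative, sums to 1
    bag_feature = weights @ instance_features       # weighted average, (D,)
    return bag_feature, weights


# Placeholder sizes: 10 patches per bag (N), 16-d features, 8-d attention space.
rng = np.random.default_rng(0)
features = rng.normal(size=(10, 16))
V = rng.normal(size=(8, 16))
w = rng.normal(size=8)
bag_feature, attention = attention_pool(features, V, w)
```

The attention weights indicate how much each patch contributes to the bag-level prediction, which is what makes this kind of model open to interpretation.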

Demo software

We have also provided software for easily checking the performance of our model to predict ALN metastasis.

Please download the software from here, and check README.txt for usage. Please note that this software is for demonstration only and must not be used for other purposes.

demo-software

BCNB Dataset

Our paper has introduced a new dataset of Early Breast Cancer Core-Needle Biopsy WSI (BCNB), which includes core-needle biopsy whole slide images (WSIs) of early breast cancer patients and the corresponding clinical data. The WSIs have been examined and annotated by two independent and experienced pathologists blinded to all patient-related information.

Based on this dataset, we have studied a deep learning algorithm for preoperatively predicting the metastatic status of axillary lymph nodes (ALN) using multiple instance learning (MIL), and have achieved a best AUC of 0.831 in the independent test cohort. For more details, please see our paper.

Download

For full access to the BCNB dataset, please visit the Dataset Page.

Description

The dataset contains WSIs of 1058 patients, and only part of the tumor regions in each WSI are annotated. In addition to the WSIs, we provide the clinical characteristics of each patient, which include age, tumor size, tumor type, ER, PR, HER2, HER2 expression, histological grading, surgical, Ki67, molecular subtype, number of lymph node metastases, and the metastatic status of ALN. The dataset has been de-identified and contains no private patient information.

The slides were scanned with an Iscan Coreo pathologic scanner, and the WSIs were viewed at 200x magnification using Image Viewer software.

The WSIs are provided in .jpg format and the clinical data in .xlsx format. The dataset was collected and organized by the experienced doctors of our research group.

Based on this dataset, we have studied the prediction of the metastatic status of axillary lymph nodes (ALN) in our paper, which is a weakly supervised classification task. However, other research based on our dataset is also feasible, such as predicting histological grading, molecular subtype, HER2, ER, and PR. We do not restrict the topic of your research, and any research based on our dataset is welcome.

Please note that the dataset may be used only for education and research; commercial and clinical use is not allowed. Use of this dataset must follow the license.

Annotation

Annotation information is stored in .json files with the following format, where vertices records the coordinates of each point of the polygonal annotated area.

{
    "positive": [
        {
            "name": "Annotation 0",
            "vertices": [
                [
                    14274,
                    10723
                ],
                [
                    14259,
                    10657
                ],
                ......
            ]
        }
    ],
    "negative": []
}
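As an illustration of how this format could be consumed, the sketch below computes the bounding box of each annotated polygon, which is a typical first step before cutting out a tumor region. The function name is hypothetical, the [x, y] order of each vertex is an assumption, and the third vertex in the example is invented so that the polygon is complete.

```python
import json


def annotation_bounding_boxes(annotation, key="positive"):
    """Return (name, x_min, y_min, x_max, y_max) for every polygon under key.

    Assumption: each vertex is stored as [x, y].
    """
    boxes = []
    for polygon in annotation.get(key, []):
        xs = [x for x, _ in polygon["vertices"]]
        ys = [y for _, y in polygon["vertices"]]
        boxes.append((polygon["name"], min(xs), min(ys), max(xs), max(ys)))
    return boxes


# The first two vertices come from the example above; the third is made up.
example = json.loads("""
{
    "positive": [
        {"name": "Annotation 0",
         "vertices": [[14274, 10723], [14259, 10657], [14300, 10690]]}
    ],
    "negative": []
}
""")
boxes = annotation_bounding_boxes(example)
```

In practice the annotation would be read from a file with `json.load`, and the same function can be called with `key="negative"` for the other list.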

Code for data preprocessing

We provide some code for data preprocessing that may be helpful: it extracts the annotated tumor regions of all WSIs and cuts fixed-size patches from the extracted regions. Please check the code for more details.

Overview

For your convenience, we have split the BCNB Dataset into a training cohort, a validation cohort, and an independent test cohort in a 6:2:2 ratio. The overall clinical characteristics of the BCNB Dataset are summarized as follows:

demo-software

WSIs

N0
N+(1-2)
N+(>2)

Clinical data

clinical-data

Citation

Please cite our paper in your publications if it helps your research.

@article{xu2021predicting,
  title={Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides},
  author={Xu, Feng and Zhu, Chuang and Tang, Wenqi and Wang, Ying and Zhang, Yu and Li, Jie and Jiang, Hongchuan and Shi, Zhongyue and Liu, Jun and Jin, Mulan},
  journal={Frontiers in Oncology},
  pages={4133},
  year={2021},
  publisher={Frontiers}
}

License

This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the data given that you agree to our license terms below:

  1. That you include a reference to the dataset in any work that makes use of the dataset. For research papers, cite our preferred publication as listed on our website; for other media cite our preferred publication as listed on our website or link to the website.
  2. That you do not distribute this dataset or modified versions. It is permissible to distribute derivative works in as far as they are abstract representations of this dataset (such as models trained on it or additional annotations that do not directly include any of our data).
  3. That you may not use the dataset or any derivative work for commercial purposes, such as licensing or selling the data, or using the data for the purpose of procuring a commercial gain.
  4. That all rights not expressly granted to you are reserved by us.

Contact

If you encounter any problems please contact us without hesitation.
