kexinhuang12345 / Clinicalbert

ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission (CHIL 2020 Workshop)

ClinicalBERT

This repo hosts pretraining and finetuning weights and relevant scripts for ClinicalBERT, a contextual representation for clinical notes.

New: Clinical XLNet and Pretraining Script

The Clinical XLNet pretrained model is available here.
Detailed step-by-step instructions for pretraining ClinicalBERT and Clinical XLNet from scratch are available here.
The predictive performance results in this version have been updated using the correct pretraining test split described in the pretraining script above. For a comparison on more clinical outcomes against more baselines, using the correct split for ClinicalBERT/XLNet, please see the Clinical XLNet paper.

Installation and Requirements

pip install pytorch-pretrained-bert

Datasets

We use MIMIC-III. Because access to MIMIC-III requires completing the CITI training program, we refer users to the link. However, since clinical notes share commonalities, users can test any clinical notes with the ClinicalBERT weights, although further fine-tuning from our checkpoint is recommended.

File system expected:

-data
  -discharge
    -train.csv
    -val.csv
    -test.csv
  -3days
    -train.csv
    -val.csv
    -test.csv
  -2days
    -test.csv

Each data file is expected to have the columns "TEXT", "ID", and "Label" (note chunks, admission ID, and readmission label).
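
A quick way to check a data file against this layout before running the scripts is sketched below. It uses only the Python standard library; the function name check_columns and the sample rows are illustrative assumptions, not part of this repository.

```python
import csv
import io

REQUIRED_COLUMNS = {"TEXT", "ID", "Label"}  # column names stated above


def check_columns(csv_file):
    """Return the set of required columns missing from an open CSV file."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS - header


# Hypothetical two-chunk sample standing in for e.g. data/discharge/train.csv
sample = io.StringIO(
    "ID,TEXT,Label\n"
    "1001,patient admitted with chest pain,1\n"
    "1001,discharged home in stable condition,1\n"
)
missing = check_columns(sample)
print("missing columns:", sorted(missing))
```

For a real file, pass an open handle instead of the in-memory sample, for example open("data/discharge/train.csv", newline="").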

ClinicalBERT Weights

Use this Google link (or this OneDrive link for users in mainland China) to download the pretrained ClinicalBERT weights along with the model weights fine-tuned on the readmission task.

The following scripts assume a model folder with the following structure:

-model
	-discharge_readmission
		-bert_config.json
		-pytorch_model.bin
	-early_readmission
		-bert_config.json
		-pytorch_model.bin
	-pretraining
		-bert_config.json
		-pytorch_model.bin
		-vocab.txt
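
After downloading, the layout above can be verified with a small script such as the following sketch. The helper name missing_model_files is an assumption made for illustration; the file names are copied from the structure listed above.

```python
from pathlib import Path
import tempfile

# Expected files per subfolder, copied from the structure listed above
EXPECTED = {
    "discharge_readmission": ["bert_config.json", "pytorch_model.bin"],
    "early_readmission": ["bert_config.json", "pytorch_model.bin"],
    "pretraining": ["bert_config.json", "pytorch_model.bin", "vocab.txt"],
}


def missing_model_files(model_dir):
    """List expected files that are absent under model_dir."""
    root = Path(model_dir)
    return [
        f"{sub}/{name}"
        for sub, names in EXPECTED.items()
        for name in names
        if not (root / sub / name).is_file()
    ]


# Demonstration on an empty temporary folder, where every file is reported
# missing; point the function at ./model after downloading the weights.
with tempfile.TemporaryDirectory() as tmp:
    demo_missing = missing_model_files(tmp)
print(len(demo_missing), "files missing")
```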

Hospital Readmission using ClinicalBERT

The scripts below run prediction of 30-day hospital readmission.

Early Notes Prediction

# Evaluates on the 3-day notes; use ./data/2days/ for the 2-day notes
python ./run_readmission.py \
  --task_name readmission \
  --readmission_mode early \
  --do_eval \
  --data_dir ./data/3days/ \
  --bert_model ./model/early_readmission \
  --max_seq_length 512 \
  --output_dir ./result_early

Discharge Summary Prediction

python ./run_readmission.py \
  --task_name readmission \
  --readmission_mode discharge \
  --do_eval \
  --data_dir ./data/discharge/ \
  --bert_model ./model/discharge_readmission \
  --max_seq_length 512 \
  --output_dir ./result_discharge

Training your own readmission prediction model from the pretrained ClinicalBERT

python ./run_readmission.py \
  --task_name readmission \
  --do_train \
  --do_eval \
  --data_dir ./data/(DATA_FILE) \
  --bert_model ./model/pretraining \
  --max_seq_length 512 \
  --train_batch_size (BATCH_SIZE) \
  --learning_rate 2e-5 \
  --num_train_epochs (EPOCHs) \
  --output_dir ./result_new

Training uses train.csv from the (DATA_FILE) folder.

The results are written to the output_dir folder, which contains:

'logits_clinicalbert.csv': logits from ClinicalBERT to compare with other models
'auprc_clinicalbert.png': Precision-Recall Curve
'auroc_clinicalbert.png': ROC Curve
'eval_results.txt': RP80, accuracy, loss
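
For readers unfamiliar with RP80: as we understand it, the metric is recall at a precision of 80%. The sketch below shows one way to compute such a value from prediction scores and binary labels. It illustrates that assumed definition and is not the code in run_readmission.py; the function name and toy data are hypothetical.

```python
def recall_at_precision(scores, labels, min_precision=0.8):
    """Highest recall over score thresholds whose precision >= min_precision."""
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    # Sweep thresholds from the highest score downward
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best, tp = 0.0, 0
    for i, (score, label) in enumerate(ranked, start=1):
        tp += label
        # Tied scores share one threshold, so evaluate only after the last tie
        if i < len(ranked) and ranked[i][0] == score:
            continue
        if tp / i >= min_precision:
            best = max(best, tp / total_pos)
    return best


# Toy data: four positives, three of them ranked at the top
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 1, 0, 0, 1]
rp80 = recall_at_precision(scores, labels)
print(rp80)  # 0.75 for this toy data
```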

Preprocessing

We provide a script for preprocessing clinical notes and merging them with admission information from MIMIC-III.

Notebooks

Attention: this notebook is a tutorial to visualize self-attention.

Gensim Word2Vec and FastText models

Please use this link to download Word2Vec and FastText models for Clinical Notes.

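To use them, simply run:

import gensim

# Load the downloaded Word2Vec model
word2vec = gensim.models.KeyedVectors.load('word2vec.model')
# Embedding vectors for every word in the vocabulary
# (gensim 3.x API; `vocab` was removed in gensim 4)
weights = word2vec[word2vec.wv.vocab]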

Contact

Please contact [email protected] for help or submit an issue.

Citation

Please cite the arXiv paper:

@article{clinicalbert,
author = {Kexin Huang and Jaan Altosaar and Rajesh Ranganath},
title = {ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission},
year = {2019},
journal = {arXiv:1904.05342},
}
