healthNER

This is the code used for the research in our paper "Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition". It is a little scattered, but fully functioning.

The code is organized into three main folders, which are independent of each other:

1.CRF: This folder implements the Conditional Random Field (CRF) model, which appears under the same name in the paper's results section (do not confuse it with the prediction layer of the B-LSTM-CRF). It contains two main parts: the HCRF2.0b open-source tool, used to train a CRF model, and a set of files for data_preparation. A further description is given in the CRF folder.

2.Bidirectional_LSTM-CRF: This folder implements the Bidirectional LSTM and Bidirectional-LSTM-CRF models, which appear under the same names in the paper's results section. The code follows that of Bidirectional-LSTM-CRF-for-Clinical-Concept-Extraction, created by Raghav Chalapathy. A further description is given in the Bidirectional_LSTM-CRF folder.

3.Bidirectional_LSTM-CRF_plus_feature_engineering: This is identical to Bidirectional_LSTM-CRF, with the single extra option of adding the hand-crafted features described in the paper. A further description is given in the Bidirectional_LSTM-CRF_plus_feature_engineering folder.

*The specialized embeddings used in the paper are available here.

NOTE: The commit number of the code used in the paper is e5dc2a0.
