scotthlee / NRC

License: Apache-2.0
Natural language generation for discrete data in EHRs

Neural Record Captioning (NRC)

This repository contains the code for the paper "Natural Language Generation for Electronic Health Records".

what's included

  1. Keras code for the NRC model.
  2. Training and testing scripts for the model.
  3. Example scripts for preprocessing EHR data to be used in the model.

getting started

  1. Install the necessary Python modules (listed under "required software" below)
  2. Use preprocessing/sparisfy.py to convert the discrete variables in your EHRs to sparse format
  3. Use preprocessing/words_to_integers.py to convert your free text field to integers
  4. Train the autoencoder on the sparse records with ae_training.py
  5. Train the NRC model with caption_training.py
  6. Generate text with caption_testing.py
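
To make steps 2 and 3 more concrete, here is a minimal, standard-library-only sketch of the two preprocessing ideas: one-hot encoding discrete variables as sparse coordinates, and mapping free-text words to integer IDs. It is not the code in preprocessing/; the function names, the record fields (age_group, sex, text), and the choice to reserve 0 for padding and 1 for unknown words are all assumptions made for illustration.

```python
from collections import Counter


def sparsify(records, columns):
    """One-hot encode discrete columns as sparse (row, column) coordinates."""
    # Assign one output column to every (variable, value) pair seen in the data.
    vocab = {}
    for rec in records:
        for col in columns:
            vocab.setdefault((col, rec[col]), len(vocab))
    # Keep only the positions of the nonzero entries (COO-style coordinates).
    coords = [(i, vocab[(col, rec[col])])
              for i, rec in enumerate(records)
              for col in columns]
    return coords, vocab


def words_to_integers(texts, min_count=1):
    """Map words to integer IDs (assumed here: 0 = padding, 1 = unknown word)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    kept = sorted(word for word, n in counts.items() if n >= min_count)
    word_ids = {word: i + 2 for i, word in enumerate(kept)}
    encoded = [[word_ids.get(word, 1) for word in text.lower().split()]
               for text in texts]
    return encoded, word_ids


# Hypothetical toy records; real EHR fields will differ.
records = [
    {"age_group": "18-44", "sex": "F", "text": "chest pain"},
    {"age_group": "65+", "sex": "M", "text": "fall with hip pain"},
]
coords, vocab = sparsify(records, ["age_group", "sex"])
encoded, word_ids = words_to_integers([r["text"] for r in records])
```

The coordinate pairs can be handed to a sparse-matrix library of your choice, and the integer sequences would still need padding to a fixed length before training.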

required software

  1. Python 3.x
  2. Keras with the TensorFlow backend
  3. Pandas, NumPy, h5py, and scikit-learn
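
If you want a quick check that these modules are installed before running the scripts, something like the following should work. It is only a convenience sketch and not part of the repository; the import names are assumptions based on how these packages are usually imported (scikit-learn as sklearn, for example).

```python
import importlib.util

# Assumed import names for the packages listed above.
REQUIRED = ["keras", "tensorflow", "pandas", "numpy", "h5py", "sklearn"]


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```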

hot tips

The default hyperparameters worked well for the data used in our paper, but they might not work as well for yours, so feel free to experiment! We also recommend a GPU for training the captioning model. We used a single NVIDIA Titan X for our experiments, and training on ~2 million records took around 6 hours.
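
Before starting a long training run, it can be worth confirming that TensorFlow actually sees a GPU. The sketch below assumes TensorFlow 2.1 or later, where tf.config.list_physical_devices is available; older versions expose this differently. The helper name count_gpus is made up for this example.

```python
def count_gpus():
    """Return the number of GPUs TensorFlow can see (0 if TensorFlow is absent)."""
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    # Assumes TensorFlow >= 2.1, where this call is part of the stable API.
    return len(tf.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    print("GPUs visible to TensorFlow:", count_gpus())
```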
