leanderme / Sytora

A sophisticated smart symptom search engine

Projects that are alternatives of or similar to Sytora

ruimtehol
R package to Embed All the Things! using StarSpace
Stars: ✭ 95 (-14.41%)
Mutual labels:  embeddings, classification
Docproduct
Medical Q&A with Deep Language Models
Stars: ✭ 495 (+345.95%)
Mutual labels:  healthcare, medical
humanapi
The easiest way to integrate health data from anywhere - https://www.humanapi.co
Stars: ✭ 21 (-81.08%)
Mutual labels:  medical, healthcare
CODER
CODER: Knowledge infused cross-lingual medical term embedding for term normalization. [JBI, ACL-BioNLP 2022]
Stars: ✭ 24 (-78.38%)
Mutual labels:  embeddings, medical
Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+869.37%)
Mutual labels:  data-analysis, classification
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-86.49%)
Mutual labels:  classifier, classification
Pyhealth
A Python Library for Health Predictive Models
Stars: ✭ 360 (+224.32%)
Mutual labels:  healthcare, medical
facerec-bias-bfw
Source code and notebooks to reproduce experiments and benchmarks on Bias Faces in the Wild (BFW).
Stars: ✭ 40 (-63.96%)
Mutual labels:  classification, data-analysis
Ml Classify Text Js
Machine learning based text classification in JavaScript using n-grams and cosine similarity
Stars: ✭ 38 (-65.77%)
Mutual labels:  classification, classifier
Awesome Fraud Detection Papers
A curated list of data mining papers about fraud detection.
Stars: ✭ 843 (+659.46%)
Mutual labels:  classification, classifier
DocProduct
Medical Q&A with Deep Language Models
Stars: ✭ 527 (+374.77%)
Mutual labels:  medical, healthcare
Multi Matcher
simple rules engine
Stars: ✭ 84 (-24.32%)
Mutual labels:  classification, classifier
HealthCare-Scan-Nearby-Hospital-Locations
I developed this android application to help beginner developers to know how to use Google Maps API and how to convert JSON data into Java Object.
Stars: ✭ 23 (-79.28%)
Mutual labels:  medical, healthcare
aarogya seva
A beautiful 😍 covid-19 app with self - assessment and more.
Stars: ✭ 118 (+6.31%)
Mutual labels:  medical, healthcare
skeleton
Composer starter project for Ambulatory.
Stars: ✭ 43 (-61.26%)
Mutual labels:  medical, healthcare
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+27.93%)
Mutual labels:  classifier, classification
Fraud-Detection-in-Online-Transactions
Detecting Frauds in Online Transactions using Anomaly Detection Techniques Such as Over Sampling and Under-Sampling as the ratio of Frauds is less than 0.00005 thus, simply applying Classification Algorithm may result in Overfitting
Stars: ✭ 41 (-63.06%)
Mutual labels:  classification, data-analysis
dl-relu
Deep Learning using Rectified Linear Units (ReLU)
Stars: ✭ 20 (-81.98%)
Mutual labels:  classifier, classification
Eda nlp
Data augmentation for NLP, presented at EMNLP 2019
Stars: ✭ 902 (+712.61%)
Mutual labels:  classification, embeddings
Graph 2d cnn
Code and data for the paper 'Classifying Graphs as Images with Convolutional Neural Networks' (new title: 'Graph Classification with 2D Convolutional Neural Networks')
Stars: ✭ 67 (-39.64%)
Mutual labels:  classification, embeddings

Sytora

Sytora is a multilingual symptom-disease classification app. Translation is managed through the UMLS coding standard. A multinomial Naive Bayes classifier is trained on a handpicked dataset, which is freely available under CC 4.0.

To get started:

  • Clone this repo
  • Install requirements
  • Run the scripts (see below) and install the npm dependencies
  • Get a UMLS license to download UMLS lexica & generate DB (umls.sh)
  • Run and check http://localhost:5001
  • Done! 🎉

Check out sytora.com for a demo.

Motivation

Finding the right diagnosis cannot be achieved by simply extracting symptoms and running a classification algorithm. The hardest part is asking the right questions, focusing on what is important in the situation, connecting other events, and much more. Despite all this, I have long been excited about writing a symptom-disease lookup system to quickly find symptoms related to a given set of symptoms, and so on. Not everything the model outputs is nonsense. In fact, it helps a lot to quickly get a list of diseases for a given set of symptoms.

Data

The data is formatted as CSV files. Example entry:

Disease,Symptom
C0162565,C0039239

Data sources:

  • DiseaseSymptomKB.csv: extracted from the Disease-Symptom Knowledge Database. This data belongs solely to its respective authors. The authors are not affiliated with this project.
  • disease-symptom.csv: created by hand. Freely available under CC 4.0.
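
As a rough illustration of how Disease,Symptom pairs in this format could feed a multinomial Naive Bayes classifier, here is a minimal standard-library sketch. It is not the code in train.py, which may well work differently (the install instructions suggest it relies on scikit-learn). Only the first data row comes from the example entry above; the other CUIs, the function names, and the smoothing parameter alpha are made up for the example.

```python
import csv
import io
import math
from collections import Counter, defaultdict

# First data row is the example entry above; the other CUIs are placeholders.
SAMPLE_CSV = """Disease,Symptom
C0162565,C0039239
C0162565,C0000001
C0000002,C0000003
"""


def load_pairs(handle):
    """Group symptom CUIs by disease CUI from a Disease,Symptom CSV."""
    symptoms_by_disease = defaultdict(list)
    for row in csv.DictReader(handle):
        symptoms_by_disease[row["Disease"]].append(row["Symptom"])
    return dict(symptoms_by_disease)


def train_mnb(symptoms_by_disease, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods per disease.

    Assumption: every CSV row counts as one training sample.
    """
    vocab = {s for symptoms in symptoms_by_disease.values() for s in symptoms}
    total_rows = sum(len(s) for s in symptoms_by_disease.values())
    model = {}
    for disease, symptoms in symptoms_by_disease.items():
        counts = Counter(symptoms)
        denom = len(symptoms) + alpha * len(vocab)
        model[disease] = (
            math.log(len(symptoms) / total_rows),  # prior from row frequency
            {s: math.log((counts[s] + alpha) / denom) for s in vocab},
        )
    return model


def rank_diseases(model, query_symptoms):
    """Return disease CUIs sorted from most to least likely."""
    scores = {}
    for disease, (log_prior, log_likelihood) in model.items():
        # Symptoms never seen during training are ignored.
        scores[disease] = log_prior + sum(
            log_likelihood[s] for s in query_symptoms if s in log_likelihood
        )
    return sorted(scores, key=scores.get, reverse=True)


pairs = load_pairs(io.StringIO(SAMPLE_CSV))
model = train_mnb(pairs)
ranking = rank_diseases(model, ["C0000003"])
print(ranking)
```

To try this on one of the real files, pass an open file handle to load_pairs instead of the in-memory sample.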

Install

Training models & generating files from data:

  1. Run cui2vec-converter.py to convert the embeddings to GloVe format. You need to get the pretrained embeddings first, available here: https://figshare.com/s/00d69861786cd0156d81. Place them in the data folder.
  2. Run generateLabels.py to create the option labels for the select fields. Languages are currently hardcoded as a list and can be extended if needed.
  3. Run train.py to train an MNB classifier (for the disease prediction). Other necessary files are generated, too.
  4. Run relatedSymptoms.py to train the model for the autosuggestion feature. This uses cui2vec. Please note that the authors of cui2vec are not affiliated with this code.
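
To give an idea of what steps 1 and 4 work with, the sketch below reads vectors in GloVe text format (one token per line, followed by its space-separated components) and ranks the nearest neighbours of a CUI by cosine similarity. This is only one plausible way to build an autosuggestion lookup on top of embeddings; relatedSymptoms.py may use a different similarity measure or library. The CUIs, the three-dimensional vectors, and the function names are invented for the example.

```python
import io
import math

# Made-up 3-dimensional vectors in GloVe text format: "<token> <v1> <v2> ...".
SAMPLE_GLOVE = """C0000001 1.0 0.0 0.0
C0000002 0.9 0.1 0.0
C0000003 0.0 1.0 0.0
"""


def load_glove(handle):
    """Parse GloVe-format lines into a dict of token -> list of floats."""
    vectors = {}
    for line in handle:
        parts = line.split()
        if parts:  # skip blank lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def related(vectors, cui, top_n=5):
    """Return the top_n (cui, similarity) pairs closest to the given CUI."""
    query = vectors[cui]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != cui
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


vectors = load_glove(io.StringIO(SAMPLE_GLOVE))
suggestions = related(vectors, "C0000001", top_n=2)
print(suggestions)
```

A linear scan like this is fine for a small demo; for the full set of pretrained embeddings, a vectorised or indexed nearest-neighbour search would likely be preferable.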

React client: cd into flaskapp and run npm install. For development, run npm run watch; for production, run npm run build.

Flask Service

A small Flask app is available to showcase the trained models. cd into the flaskapp folder and start the app:

python app.py

Deployment

Make sure to export REACT_APP_ENDPOINT with the correct address (e.g. http://yoursite.com)

Get going in ~10 min:

sudo apt update
sudo apt install python3-pip python3-dev build-essential libssl-dev libffi-dev python3-setuptools
sudo apt install python-pip python-dev
sudo apt install nodejs npm
pip install flask pandas scikit-learn numpy
pip install Flask-Limiter flask-expects-json
pip install more-itertools requests configparser gunicorn
sudo apt-get install nginx supervisor

git clone https://github.com/leanderme/sytora
cd sytora/flaskapp && npm i

vi /etc/supervisor/conf.d/sytora.conf
sudo supervisorctl reread
sudo service supervisor restart
sudo supervisorctl status

sudo vim /etc/nginx/conf.d/virtual.conf
sudo nginx -t
sudo service nginx restart

sytora.conf:

[program:sytora]
directory=/root/sytora/flaskapp
command=gunicorn app:app -b 0.0.0.0:5001
autostart=true
autorestart=true
stderr_logfile=/var/log/sytora/sytora.err.log
stdout_logfile=/var/log/sytora/sytora.out.log

virtual.conf:

server {
    listen       80;
    server_name  site.com;

    location / {
        proxy_pass http://127.0.0.1:5001;
    }
}

Don't forget to transfer umls.db to the server, e.g. scp ./umls.db [email protected]:/root/sytora/flaskapp/umls/database

Coding quality, security & stability

This project was written very quickly with no performance or stability features in mind; the code base suffered accordingly. Expect things to be cleaned up soon though.

Please note that I'm a machine learning hobbyist and a medical student. The code may not be in accordance with common conventions.

Acknowledgements

This project is heavily inspired by:
