
haozhenWu / Calibrated-Boosting-Forest

Licence: other
Original implementation of Calibrated Boosting-Forest

Programming Languages

  • python
  • shell

Projects that are alternatives to or similar to Calibrated-Boosting-Forest

egfr-att
Drug effect prediction using neural network
Stars: ✭ 17 (-5.56%)
Mutual labels:  drug-discovery
Tencent2017 Final Rank28 code
Rank 28 code for the 1st Tencent Social Advertising College Algorithm Competition (2017)
Stars: ✭ 85 (+372.22%)
Mutual labels:  xgboost
kaggle-code
A repository for some of the code I used in kaggle data science & machine learning tasks.
Stars: ✭ 100 (+455.56%)
Mutual labels:  xgboost
recsys2019
The complete code and notebooks used for the ACM Recommender Systems Challenge 2019
Stars: ✭ 26 (+44.44%)
Mutual labels:  xgboost
protwis
Protwis is the backbone of the GPCRdb. The GPCRdb contains reference data, interactive visualisation and experiment design tools for G protein-coupled receptors (GPCRs).
Stars: ✭ 20 (+11.11%)
Mutual labels:  drug-discovery
featurewiz
Use advanced feature engineering strategies and select best features from your data set with a single line of code.
Stars: ✭ 229 (+1172.22%)
Mutual labels:  xgboost
Apartment-Interest-Prediction
Predict people's interest in renting specific NYC apartments. The challenge combines structured data, geolocalization, time data, free text and images.
Stars: ✭ 17 (-5.56%)
Mutual labels:  xgboost
cbh21-protein-solubility-challenge
Template with code & dataset for the "Structural basis for solubility in protein expression systems" challenge at the Copenhagen Bioinformatics Hackathon 2021.
Stars: ✭ 15 (-16.67%)
Mutual labels:  drug-discovery
Kaggle-Competition-Sberbank
Top 1% rankings (22/3270) code sharing for Kaggle competition Sberbank Russian Housing Market: https://www.kaggle.com/c/sberbank-russian-housing-market
Stars: ✭ 31 (+72.22%)
Mutual labels:  xgboost
tensorflow kaggle house price
[Done] Master branch: developed a stacked regression (score 0.11, top 5%) based on xgboost and sklearn. Branch v1.0: developed a linear regression (score 0.45) based on TensorFlow
Stars: ✭ 25 (+38.89%)
Mutual labels:  xgboost
secure-xgboost
Secure collaborative training and inference for XGBoost.
Stars: ✭ 80 (+344.44%)
Mutual labels:  xgboost
kaggle getting started
Kaggle getting started competition examples
Stars: ✭ 18 (+0%)
Mutual labels:  xgboost
skywalkR
code for Gogleva et al manuscript
Stars: ✭ 28 (+55.56%)
Mutual labels:  drug-discovery
PyPLIF-HIPPOS
HIPPOS Is PyPLIF On Steroids. A Molecular Interaction Fingerprinting Tool for Docking Results of Autodock Vina and PLANTS
Stars: ✭ 15 (-16.67%)
Mutual labels:  drug-discovery
target-and-market
A data-driven tool to identify the best candidates for a marketing campaign and optimize it.
Stars: ✭ 19 (+5.56%)
Mutual labels:  xgboost
aws-machine-learning-university-dte
Machine Learning University: Decision Trees and Ensemble Methods
Stars: ✭ 119 (+561.11%)
Mutual labels:  xgboost
xgboost-lightgbm-hyperparameter-tuning
Bayesian Optimization and Grid Search for xgboost/lightgbm
Stars: ✭ 40 (+122.22%)
Mutual labels:  xgboost
Arch-Data-Science
Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
Stars: ✭ 92 (+411.11%)
Mutual labels:  xgboost
ai-deployment
Focused on AI model deployment and serving
Stars: ✭ 149 (+727.78%)
Mutual labels:  xgboost
awesome-small-molecule-ml
A curated list of resources for machine learning for small-molecule drug discovery
Stars: ✭ 54 (+200%)
Mutual labels:  drug-discovery

Calibrated Boosting-Forest

Calibrated Boosting-Forest (CBF) is an integrative technique that leverages both continuous and binary labels and outputs calibrated posterior probabilities. It was originally designed for ligand-based virtual screening and can be extended to other applications. Calibrated Boosting-Forest is a package created by Haozhen Wu from the Small Molecule Screening Facility
at the University of Wisconsin-Madison.

For more details, please see our paper:
Calibrated Boosting-Forest by Haozhen Wu

Key features:

  • Takes both continuous and binary labels as input (multi-label)
  • Superior ranking power over individual regression or classification models
  • Outputs well-calibrated posterior probabilities
  • Streamlined hyper-parameter tuning stage
  • Supports multiple evaluation and stopping metrics
  • Competitive benchmark results on well-known public datasets
  • XGBoost backend
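
The "well-calibrated posterior probabilities" feature can be illustrated with Platt scaling, a standard calibration technique that fits a sigmoid mapping from raw model scores to probabilities. The sketch below is a generic pure-Python illustration of that idea, not CBF's actual implementation; the function names and toy data are invented for the example:

```python
import math

def platt_scale(scores, labels, lr=0.1, n_iter=2000):
    """Fit p = 1 / (1 + exp(-(a*s + b))) to binary labels by
    gradient descent on the log loss; returns (a, b)."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            # d(logloss)/da = (p - y) * s, d(logloss)/db = (p - y)
            grad_a += (p - y) * s / n
            grad_b += (p - y) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

def calibrated(scores, a, b):
    """Map raw scores to calibrated probabilities."""
    return [1.0 / (1.0 + math.exp(-(a * s + b))) for s in scores]

# Toy raw model scores and their true binary labels.
raw = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]
a, b = platt_scale(raw, y)
probs = calibrated(raw, a, b)
```

The fitted sigmoid preserves the ranking of the raw scores while squashing them into probabilities that better match the observed label frequencies.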

Table of contents:

  • Installation
  • Testing
  • FAQ
  • Reference

Installation

We recommend using Anaconda to conveniently install packages. So far, LightChem has been tested with Python 2.7 under OS X and Linux Ubuntu Server 16.04.

  1. Download the 64-bit Python 2.7 version of Anaconda for Linux/OS X here and follow the instructions. After you install Anaconda, most of the dependencies will be ready.

  2. Install git if you do not have it:
    Linux Ubuntu:

    sudo apt-get install git-all
  3. Install scikit-learn:

    conda install scikit-learn=0.18
  4. Install the conda distribution of xgboost:

    conda install --yes -c conda-forge xgboost=0.6a2
  5. Install rdkit. Note: rdkit is only used to transform SMILES strings into fingerprints.

    conda install -c omnia rdkit
  6. Clone the Calibrated-Boosting-Forest GitHub repository:

    git clone https://github.com/haozhenWu/Calibrated-Boosting-Forest.git

    cd into the Calibrated-Boosting-Forest directory and execute:

    pip install -e .
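
As noted in step 5, rdkit is used only to turn SMILES strings into fingerprints. A minimal sketch of that step is below; it is guarded so it runs only when rdkit is available, and the example molecule ("CCO", ethanol) and Morgan fingerprint parameters are illustrative choices, not CBF defaults:

```python
from importlib import util

# Only attempt the conversion if rdkit is installed (see step 5).
if util.find_spec("rdkit") is not None:
    from rdkit import Chem
    from rdkit.Chem import AllChem

    # Parse a SMILES string into an RDKit molecule.
    mol = Chem.MolFromSmiles("CCO")
    # Compute a 2048-bit Morgan (ECFP-like) fingerprint of radius 2.
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    on_bits = list(fp.GetOnBits())
```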
    

Testing

To verify that the dependencies have been installed correctly, simply run pytest in the lightchem directory (this requires the optional pytest Python package). The current tests 1. confirm that the required dependencies exist and can be imported, and 2. confirm that the model performance results for one target, MUV-466, fall into the expected ranges.
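
The first kind of test can be sketched as a simple dependency check. The helper below is illustrative, not lightchem's actual test code, and the stdlib module names are placeholders for the project's real dependencies:

```python
from importlib import util

def missing_deps(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if util.find_spec(n) is None]

# Stdlib placeholders; swap in the project's real dependency names.
missing = missing_deps(["json", "csv", "sqlite3"])
```

`importlib.util.find_spec` returns None for an absent top-level package without importing anything, which keeps such a check fast and side-effect free.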

FAQ

  1. When I import lightchem, the error version GLIBCXX_3.4.20 not found shows up:
    Try:
    conda install libgcc

Reference

  1. [DeepChem](https://github.com/deepchem/deepchem): Deep-learning models for Drug Discovery and Quantum Chemistry