kingfengji / eForest

Licence: other
This is the official implementation for the paper 'AutoEncoder by Forest'

eForest: A random forest based autoencoder.

This is the official clone of the encoderForest (eForest) implementation. The university's web server is sometimes unstable, so the official clone is hosted here on GitHub.

Package Official Website: http://lamda.nju.edu.cn/code_eForest.ashx

This package is provided "AS IS" and free for academic usage. You can run it at your own risk. For other purposes, please contact Prof. Zhi-Hua Zhou ([email protected]).

Description: A Python implementation of encoderForest as proposed in [1]. The package contains a demo implementation of the eForest library and some demo client scripts that demonstrate how to use the code. The implementation is flexible enough to let you modify the model or fit your own datasets.

Reference: [1] J. Feng and Z.-H. Zhou. AutoEncoder by Forest. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI'18), New Orleans, Louisiana, USA, 2018.

ATTN: This package was developed and is maintained by Mr. Ji Feng (http://lamda.nju.edu.cn/fengj/). For any problem concerning the code, please feel free to contact Mr. Feng ([email protected]) or open an issue here.

Installation

Step 1: Create a virtual environment

eForest is based on a custom version of scikit-learn, in which we add some extra methods to the original forest models.
To avoid conflicts with other uses of the official version of scikit-learn, you need to create a separate environment.
Anaconda is required. Run the following commands:

conda create -n eforest python=3.5 anaconda
source activate eforest
pip uninstall scikit-learn

Step 2: Check out the scikit-learn code

The latest released version of scikit-learn at the time this code was released was v0.19.1, so you need to clone the scikit-learn repository and check out that version. That is, run the following commands (you cannot install sklearn via pip/conda):

git clone https://github.com/scikit-learn/scikit-learn.git
cd scikit-learn
git checkout tags/0.19.1 -b 0.19.1

Step 3: Merge the eForest code (this repo) into scikit-learn and install it

Exit the scikit-learn folder, go to the folder containing this package, and run the following commands:

sh copy_codes.sh
cd scikit-learn
python setup.py install

And that's it.
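
To confirm that the patched build is the one being imported, one option is to check that the forest classes expose the extra `encode` and `decode` methods. The helper below is only a minimal sketch: `has_eforest_methods` is a hypothetical name that is not part of this package, and it only checks that the two methods exist, not that they work.

```python
def has_eforest_methods(estimator_cls):
    """Return True if the class exposes callable encode() and decode().

    Hypothetical helper for illustration; it is not part of eForest.
    """
    return all(
        callable(getattr(estimator_cls, name, None))
        for name in ("encode", "decode")
    )


# Intended usage inside the "eforest" environment (not executed here):
#   import sklearn
#   from sklearn.ensemble import RandomForestClassifier
#   print(sklearn.__version__)
#   print(has_eforest_methods(RandomForestClassifier))
```

On an official scikit-learn release the check is expected to return False, because the stock forest classes have no `encode` or `decode` methods.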

Usage

The currently supported sklearn models are:

  • Supervised Model
    • sklearn.ensemble.RandomForestClassifier
    • sklearn.ensemble.RandomForestRegressor
    • sklearn.ensemble.ExtraTreesClassifier
    • sklearn.ensemble.ExtraTreesRegressor
  • Unsupervised Model
    • sklearn.ensemble.RandomTreesEmbedding
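
The practical difference between the two groups is whether `fit` receives labels: the supervised models are fitted on `(X, y)`, while `RandomTreesEmbedding` is fitted on `X` alone. The sketch below wraps that choice; `fit_forest` and `autoencode` are hypothetical helper names introduced for illustration, and they assume a model object that provides the `encode` and `decode` methods added by this package.

```python
def fit_forest(model, X, y=None):
    """Fit a forest model.

    Pass y for the supervised models; omit it for the unsupervised
    RandomTreesEmbedding. Hypothetical helper, not part of eForest.
    """
    if y is None:
        model.fit(X)
    else:
        model.fit(X, y)
    return model


def autoencode(model, X):
    """Encode X to leaf indices, then decode back to feature space."""
    codes = model.encode(X)
    return codes, model.decode(codes)


# Intended usage with the patched scikit-learn build (not executed here):
#   from sklearn.ensemble import RandomTreesEmbedding
#   model = fit_forest(RandomTreesEmbedding(n_estimators=500), x_train)
#   codes, x_reconstructed = autoencode(model, x_test)
```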

Simple Encode/Decode Demo

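# Requires the patched scikit-learn build from the Installation section;
# the official scikit-learn release does not provide encode()/decode().
from sklearn.ensemble import RandomForestClassifier
from keras.datasets import mnist

# Load MNIST and flatten each 28x28 image into a 784-dimensional vector.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape((x_train.shape[0], -1))
x_test = x_test.reshape((x_test.shape[0], -1))

# Train the forest (supervised setting).
model = RandomForestClassifier(n_estimators=1000, max_depth=None, n_jobs=-1)
model.fit(x_train, y_train)

# Encode each test sample as one leaf index per tree, then decode it back.
X_encode = model.encode(x_test)
X_decode = model.decode(X_encode)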
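
One simple way to quantify how close the decoded samples are to the inputs is the mean squared error over all entries. The function below is a dependency-free sketch; `mean_squared_error` here is a local helper written for illustration, not a function provided by this package. Values are cast to float first, which matters for MNIST because the raw pixels are unsigned 8-bit integers.

```python
def mean_squared_error(original, decoded):
    """Mean squared error over all entries of two equally shaped 2-D sequences."""
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same number of rows")
    total, count = 0.0, 0
    for row_a, row_b in zip(original, decoded):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        for a, b in zip(row_a, row_b):
            diff = float(a) - float(b)  # cast avoids unsigned-integer wraparound
            total += diff * diff
            count += 1
    if count == 0:
        raise ValueError("inputs are empty")
    return total / count


# Intended usage with the demo variables (not executed here):
#   print(mean_squared_error(x_test, X_decode))
```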

API details of model.encode

Parameters

  • X [ndarray]
    • shape = [n_samples, n_features]

Returns

  • X_encode [ndarray]
    • shape = [n_samples, n_trees]
    • X_encode[i, j] is the index of the leaf that the i-th sample reaches in the j-th tree
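
To make the shape of `X_encode` concrete, here is a toy illustration in plain Python. It does not use the package: the trees are hand-written nested dicts, the leaf labels are arbitrary, and `leaf_index` and `encode` are names made up for this sketch. It assumes the usual convention that a sample goes left when its feature value is less than or equal to the threshold.

```python
def leaf_index(tree, sample):
    """Follow split tests from the root until a leaf is reached; return its index."""
    node = tree
    while "leaf" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def encode(forest, X):
    """Return a [n_samples][n_trees] table of leaf indices."""
    return [[leaf_index(tree, x) for tree in forest] for x in X]


# Two toy trees over two features.
tree_a = {"feature": 0, "threshold": 0.5,
          "left": {"leaf": 1}, "right": {"leaf": 2}}
tree_b = {"feature": 1, "threshold": 2.0,
          "left": {"leaf": 1},
          "right": {"feature": 0, "threshold": 1.5,
                    "left": {"leaf": 3}, "right": {"leaf": 4}}}

X = [[0.2, 1.0], [0.9, 3.0], [2.0, 5.0]]
X_encode = encode([tree_a, tree_b], X)
print(X_encode)  # [[1, 1], [2, 3], [2, 4]]
```

Each row corresponds to one sample and each column to one tree, matching the `[n_samples, n_trees]` shape described above.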

API details of model.decode

Parameters

  • X_encode [ndarray]
    • shape = [n_samples, n_trees]
  • sample_method [str, default='minimal']
    • If sample_method == 'minimal', the value of each dimension of the decoded result is the minimal value allowed by the corresponding MCR. You can define your own sample method as well.
    • MCR (Maximal-Compatible Rule) is the rule defined by the decision paths of X; see the paper for more details.
  • null_value [float, default=0]
    • The value used to replace NaN when the MCR is not defined for a particular attribute.

Returns

  • X_decode [ndarray]
    • shape = [n_samples, n_features]
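
The idea behind the 'minimal' sample method can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the package's implementation: each tree's decision path is assumed to be already available as a dict mapping a feature index to a `(low, high)` interval, the MCR is taken to be the intersection of those intervals, and a feature with no finite lower bound is treated as undefined and filled with `null_value`. The names `maximal_compatible_rule` and `decode_minimal` are hypothetical.

```python
import math


def maximal_compatible_rule(path_rules, n_features):
    """Intersect per-tree interval rules into one (low, high) pair per feature."""
    low = [-math.inf] * n_features
    high = [math.inf] * n_features
    for rule in path_rules:  # one rule per tree
        for feature, (lo, hi) in rule.items():
            low[feature] = max(low[feature], lo)
            high[feature] = min(high[feature], hi)
    return list(zip(low, high))


def decode_minimal(path_rules, n_features, null_value=0.0):
    """Pick the lower bound of the MCR per feature.

    Assumption of this sketch: a feature without a finite lower bound
    counts as undefined and receives null_value.
    """
    mcr = maximal_compatible_rule(path_rules, n_features)
    return [lo if math.isfinite(lo) else null_value for lo, _hi in mcr]


# Path rules of one sample in three toy trees, over three features.
rules = [
    {0: (0.5, math.inf)},
    {1: (2.0, math.inf), 0: (-math.inf, 1.5)},
    {0: (0.7, math.inf)},
]
print(maximal_compatible_rule(rules, 3))
# [(0.7, 1.5), (2.0, inf), (-inf, inf)]
print(decode_minimal(rules, 3))
# [0.7, 2.0, 0.0]
```

Feature 2 is never tested on any path, so no rule constrains it and it falls back to `null_value`.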

More Examples

MNIST AutoEncoding Example

The following script displays the autoencoding results for the MNIST dataset.
The first row (origin) shows the original images.
The second row (supervised) shows the decoded results of eForest in the supervised setting.
The third row (unsupervised) shows the decoded results of eForest in the unsupervised setting.

python exp/mnist_autoencoder.py

[Figure: MNIST autoencoder results]

CIFAR10 AutoEncoding Example

Run the following script to display the autoencoding results for the CIFAR10 dataset:

python exp/cifar10_autoencoder.py

[Figure: CIFAR10 autoencoder results]

For citation purposes, please cite:
J. Feng and Z.-H. Zhou. AutoEncoder by Forest. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI'18), New Orleans, Louisiana, USA, 2018.

Happy Hacking.
