ClimbsRocks / Machinejs

[UNMAINTAINED] Automated machine learning: just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml.

machineJS

A fully featured default process for machine learning: all the parts are here and have functional default values in place. Modify it to your heart's delight so you can focus on the important parts for your dataset, or run it all the way through with the default values to get fully automated machine learning!

auto_ml - machineJS, but better!

I just built out v2 of this project, which now gives you analytics info from your models and is production-ready. machineJS is an amazing research project that clearly proved there's a hunger for automated machine learning.

auto_ml tackles this exact same goal, but with more features, cleaner code, and the ability to be copy/pasted into production.

Check it out! https://github.com/ClimbsRocks/auto_ml

What is machineJS?

machineJS provides a fully automated framework for applying machine learning to a dataset.

All you have to do is give it a .csv file, with some basic information about each column in the first row, and it will go off and do all the machine learning for you!

If you've already done this kind of thing before, it's useful as an outline, putting in place a working structure for you to make modifications within, rather than having to build from scratch again every time.

machineJS will tell you:

  • Which algorithms are going to be most effective for this problem
  • Which features are most useful
  • Whether this problem is solvable by machine learning at all (useful if you're not sure you've collected enough data yet)
  • How effective machine learning can be with this problem, to compare against other potential solutions (like just taking a grouped average)
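
To make the "grouped average" comparison concrete, here is a minimal sketch of such a baseline. It is not part of machineJS; the function names and the column names (neighborhood, price) are hypothetical and chosen only for illustration. The idea is to predict each row's output as the average output of its group, then measure the error so you have a number to compare a trained model against.

```javascript
// Hypothetical baseline, not machineJS code: predict each row's output
// as the mean output of its group in the training data.
function groupedAverageBaseline(trainRows, groupKey, outputKey) {
  const groups = new Map();
  let total = 0;
  for (const row of trainRows) {
    const entry = groups.get(row[groupKey]) || { sum: 0, count: 0 };
    entry.sum += row[outputKey];
    entry.count += 1;
    groups.set(row[groupKey], entry);
    total += row[outputKey];
  }
  const globalMean = total / trainRows.length;
  // Groups never seen in training fall back to the global mean.
  return (row) => {
    const entry = groups.get(row[groupKey]);
    return entry ? entry.sum / entry.count : globalMean;
  };
}

// Root-mean-squared error of a predictor on a set of labeled rows.
function rmse(rows, predict, outputKey) {
  const squaredError = rows.reduce(
    (acc, row) => acc + (predict(row) - row[outputKey]) ** 2,
    0
  );
  return Math.sqrt(squaredError / rows.length);
}

// Illustrative data only.
const train = [
  { neighborhood: 'A', price: 100 },
  { neighborhood: 'A', price: 120 },
  { neighborhood: 'B', price: 200 },
];
const predict = groupedAverageBaseline(train, 'neighborhood', 'price');
console.log(predict({ neighborhood: 'A' })); // mean of the two 'A' rows
console.log(predict({ neighborhood: 'C' })); // unseen group: global mean
```

If a trained model cannot beat the error of a baseline like this on held-out data, that is a hint the simpler solution may be good enough, or that more data is needed.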

If you haven't done much (or any) machine learning before, it does some fairly advanced stuff for you!

Installation:

As a standalone directory (recommended)

If you want to install this in its own standalone repo and work on the source code directly, then from the command line, type the following:

  1. git clone https://github.com/ClimbsRocks/machineJS.git
  2. cd machineJS
  3. npm install
  4. pip install -r requirements.txt
  5. git clone https://github.com/scikit-learn/scikit-learn.git
  6. cd scikit-learn
  7. python setup.py build
  8. sudo python setup.py install

From the command line

node machineJS.js path/to/trainData.csv --predict path/to/testData.csv
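
If you would rather kick off a run from another Node script than type the command by hand, something like the following sketch could work. The wrapper functions buildMachineJSArgs and runMachineJS are hypothetical helpers, not part of machineJS; they only assemble and launch the exact command shown above, using no flags beyond --predict.

```javascript
const { spawn } = require('child_process');

// Build the argument list for the command shown above.
// Only the --predict flag from this README is used; no other flags are assumed.
function buildMachineJSArgs(trainPath, testPath) {
  return ['machineJS.js', trainPath, '--predict', testPath];
}

// Hypothetical wrapper: run machineJS as a child process and resolve
// with its exit code once the process closes.
function runMachineJS(trainPath, testPath, cwd) {
  return new Promise((resolve, reject) => {
    const child = spawn('node', buildMachineJSArgs(trainPath, testPath), {
      cwd,              // directory that contains machineJS.js
      stdio: 'inherit', // stream machineJS output to this terminal
    });
    child.on('error', reject);
    child.on('close', resolve);
  });
}

// Print the command that would be run, without starting it.
const args = buildMachineJSArgs('path/to/trainData.csv', 'path/to/testData.csv');
console.log(['node', ...args].join(' '));
```

Calling runMachineJS with the two file paths and the machineJS directory would start the actual run; the last two lines only show the assembled command.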

Format of Data Files:

We use the data-formatter module to automatically format your data, and even perform some basic feature engineering on it. Please refer to data-formatter's docs for information on how to label each column to be ready for machineJS.

How to customize/dive in deeper:

machineJS is designed to be super easy to use without diving into any of the internals. Be a conjurer: just give it data and let it run! That said, it's super powerful once you start customizing it.

It's designed to be relatively easy to modify, and well-documented. The obvious place to start is inside processArgs.js. Here we set nearly all the parameters that are used throughout the project.

The other obvious area many people will be interested in is adding in new models, and different hyperparameter search spaces. This can be found in the pySetup folder. The exact steps are listed in stepsToAddNewClassifier.txt.

What types of problems does this library work on?

machineJS works on both regression and categorical (classification) problems, as long as there is a single output column in the training data. This includes multi-category (frequently called multi-class) problems, where the category you are predicting is one of many possible categories. There are no immediate plans to support multiple output columns in the training data. If you have three output columns you're interested in predicting, and they cannot be combined into a single column in the training data, you could run machineJS once for each of those three columns.
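
As a rough sketch of the "run it once per output column" workaround, the helper below splits one table with several output columns into several tables that each keep a single output. The helper name splitByOutputColumn and the example column names are hypothetical, and this is not machineJS code. Each resulting table would still need its columns labeled as described in data-formatter's docs before being written to a .csv file and passed to machineJS.

```javascript
// Hypothetical helper: for each output column, build a copy of the table
// that keeps every feature column plus that single output column.
function splitByOutputColumn(header, rows, outputColumns) {
  const tables = {};
  for (const target of outputColumns) {
    // Keep a column if it is the current target or is not an output at all.
    const keep = header
      .map((name, index) => ({ name, index }))
      .filter(({ name }) => name === target || !outputColumns.includes(name))
      .map(({ index }) => index);
    tables[target] = {
      header: keep.map((i) => header[i]),
      rows: rows.map((row) => keep.map((i) => row[i])),
    };
  }
  return tables;
}

// Illustrative data only: two features and three outputs.
const header = ['sqft', 'bedrooms', 'price', 'daysOnMarket', 'sold'];
const rows = [
  [1200, 2, 300000, 14, 1],
  [800, 1, 180000, 45, 0],
];
const tables = splitByOutputColumn(header, rows, ['price', 'daysOnMarket', 'sold']);
console.log(tables.price.header); // features plus the 'price' column only
```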

This library is well-tested on Macs. I've designed it to work on PCs as well, but I haven't tested that at all yet. If you're a PC user, I'd love some issues or Pull Requests to make this work for PCs!

Note: This library is designed to run across all but one of the cores on the host machine. What this means for you:

  1. Please plug in.
  2. Close all programs and restart right before invoking (this will clear out as much RAM as possible).
  3. Expect some noise from your fan- you're finally putting your computer to use!
  4. Don't expect to be able to do anything intense while this is running. Internet browsing or code editing is fine, but watching a movie may get challenging.
  5. Please don't run any other Python scripts while this is running.
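
For a sense of what "all but one of the cores" means on your machine, here is a small sketch using Node's built-in os module. The function workerCount is a hypothetical illustration of the policy described above, not the code machineJS actually uses.

```javascript
const os = require('os');

// Leave one core free for the rest of the system,
// but never drop below a single worker.
function workerCount(totalCores) {
  return Math.max(1, totalCores - 1);
}

const totalCores = os.cpus().length;
console.log(`Would use ${workerCount(totalCores)} of ${totalCores} cores`);
```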

Thanks for inviting us along on your machine learning journey!
