scitime / scitime

Licence: BSD-3-Clause license
Training time estimation for scikit-learn algorithms

scitime

Training time estimation for scikit-learn algorithms. Method explained in this article

Currently supporting:

  • RandomForestRegressor
  • SVC
  • KMeans
  • RandomForestClassifier

Environment setup

Python version: 3.7

Package dependencies:

  • scikit-learn (~=0.24.1)
  • pandas (~=1.1.5)
  • joblib (~=1.0.1)
  • psutil (~=5.8.0)
  • scipy (~=1.5.4)

Install scitime

❱ pip install scitime
or 
❱ conda install -c conda-forge scitime

Usage

How to compute a runtime estimation

  • Example for RandomForestRegressor
from sklearn.ensemble import RandomForestRegressor
import numpy as np
import time

from scitime import RuntimeEstimator

# example for rf regressor
estimator = RuntimeEstimator(meta_algo='RF', verbose=3)
rf = RandomForestRegressor()

X, y = np.random.rand(100000, 10), np.random.rand(100000, 1)
# run the estimation
estimation, lower_bound, upper_bound = estimator.time(rf, X, y)

# compare to the actual training time
start_time = time.time()
rf.fit(X, y)
elapsed_time = time.time() - start_time
# use fixed-point formatting so long runs are not printed in scientific notation
print("elapsed time: {:.2f}".format(elapsed_time))
  • Example for KMeans
from sklearn.cluster import KMeans
import numpy as np
import time

from scitime import RuntimeEstimator

# example for kmeans clustering
estimator = RuntimeEstimator(meta_algo='RF', verbose=3)
km = KMeans()

X = np.random.rand(100000,10)
# run the estimation
estimation, lower_bound, upper_bound = estimator.time(km, X)

# compare to the actual training time
start_time = time.time()
km.fit(X)
elapsed_time = time.time() - start_time
# use fixed-point formatting so long runs are not printed in scientific notation
print("elapsed time: {:.2f}".format(elapsed_time))
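
Both examples end with the same timing steps. A minimal sketch of a reusable helper follows. The names `timed` and `within_interval` are hypothetical and not part of scitime, and the sketch uses `time.perf_counter` instead of `time.time` because it is monotonic. The workload is a stand-in so the snippet runs without scikit-learn.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(result):
    """Store the wall-clock duration of the with-block in result['elapsed']."""
    start = time.perf_counter()
    try:
        yield result
    finally:
        result["elapsed"] = time.perf_counter() - start


def within_interval(elapsed, lower_bound, upper_bound):
    """Return True if the measured time falls inside the estimated interval."""
    return lower_bound <= elapsed <= upper_bound


# Stand-in workload; in practice this would be rf.fit(X, y) or km.fit(X).
timing = {}
with timed(timing):
    sum(i * i for i in range(100_000))

print("elapsed time: {:.2f}".format(timing["elapsed"]))
# Illustrative bounds only; real ones come from estimator.time(...).
print(within_interval(timing["elapsed"], 0.0, 60.0))
```

With a real model, the bounds passed to `within_interval` would be the `lower_bound` and `upper_bound` returned by `estimator.time`.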

The RuntimeEstimator class arguments:

  • meta_algo: The meta model used to predict the training time, either RF (random forest) or NN (neural network)
  • verbose: Controls the amount of log output (0, 1, 2 or 3)
  • confidence: Confidence level of the prediction interval (defaults to 95%)

Parameters of the estimator.time function:

  • algo: the scikit-learn algorithm whose training runtime is to be predicted
  • X: np.array of training inputs
  • y: np.array of training outputs (set to None for unsupervised algorithms)

--- FOR TESTERS / CONTRIBUTORS ---

Local Testing

Inside virtualenv (with pytest>=3.2.1):

(env)$ python -m pytest

How to use _data.py to generate data / fit models?

$ python _data.py --help

usage: _data.py [-h] [--drop_rate DROP_RATE] [--meta_algo {RF,NN}]
                [--verbose VERBOSE]
                [--algo {RandomForestRegressor,RandomForestClassifier,SVC,KMeans}]
                [--generate_data] [--fit FIT] [--save]

Gather & Persist Data of model training runtimes

optional arguments:
  -h, --help            show this help message and exit
  --drop_rate DROP_RATE
                        drop rate of number of data generated (from all param
                        combinations taken from _config.json). Default is
                        0.999
  --meta_algo {RF,NN}   meta algo used to fit the meta model (NN or RF) -
                        default is RF
  --verbose VERBOSE     verbose mode (0, 1, 2 or 3)
  --algo {RandomForestRegressor,RandomForestClassifier,SVC,KMeans}
                        algo to train data on
  --generate_data       do you want to generate & write data in a dedicated
                        csv?
  --fit FIT             do you want to fit the model? If so indicate the csv
                        name
  --save                (only used for model fit) do you want to save /
                        overwrite the meta model from this fit?

(_data.py uses _model.py behind the scenes)
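
To make the flag combinations concrete, here is a sketch of an argparse parser that mirrors the help text above. It is a reconstruction for illustration only, not the actual contents of _data.py. The argument types (float, int, str) are assumptions inferred from the help output, and the csv file name is hypothetical.

```python
import argparse

ALGOS = ["RandomForestRegressor", "RandomForestClassifier", "SVC", "KMeans"]


def build_parser():
    """Build a parser with the same flags as the help text (types are assumed)."""
    parser = argparse.ArgumentParser(
        prog="_data.py",
        description="Gather & Persist Data of model training runtimes",
    )
    parser.add_argument("--drop_rate", type=float, default=0.999)
    parser.add_argument("--meta_algo", choices=["RF", "NN"], default="RF")
    parser.add_argument("--verbose", type=int)  # default not stated in the help
    parser.add_argument("--algo", choices=ALGOS)  # default not stated in the help
    parser.add_argument("--generate_data", action="store_true")
    parser.add_argument("--fit", type=str)
    parser.add_argument("--save", action="store_true")
    return parser


# Generate runtime data for KMeans with the default drop rate.
gen_args = build_parser().parse_args(
    ["--generate_data", "--algo", "KMeans", "--verbose", "3"]
)
print(gen_args.generate_data, gen_args.algo, gen_args.drop_rate)

# Fit a meta model from an existing csv and save it (file name is hypothetical).
fit_args = build_parser().parse_args(
    ["--fit", "runtimes.csv", "--algo", "KMeans", "--save"]
)
print(fit_args.fit, fit_args.meta_algo, fit_args.save)
```

The two parsed argument lists correspond to the two modes the help text describes: generating data with --generate_data, and fitting a meta model from a csv with --fit, optionally persisting it with --save.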

How to run _model.py?

After pulling the master branch (git pull origin master) and setting up the environment (described above), start ipython and run:

from scitime._model import RuntimeModelBuilder

# example of data generation for rf regressor
trainer = RuntimeModelBuilder(drop_rate=0.99999, verbose=3, algo='RandomForestRegressor')
inputs, outputs, _ = trainer._generate_data()

# then fitting the meta model
meta_algo = trainer.model_fit(generate_data=False, inputs=inputs, outputs=outputs)
# this should not locally overwrite the pickle file located at scitime/models/{your_model}
# if you want to save the model, set the argument save_model to True
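
The drop_rate values used above (0.999, 0.99999) are easier to reason about with a small example. The sketch below assumes that drop_rate is the probability with which each parameter combination is skipped during data generation. That reading follows from the help text but is not confirmed here. The grid is hypothetical (the real one is read from _config.json), and `sample_combinations` is an illustrative name, not a scitime function.

```python
import itertools
import random

# Hypothetical parameter grid; the real one comes from _config.json.
grid = {
    "num_rows": [100, 1000, 10000],
    "num_features": [10, 50, 100],
    "n_estimators": [10, 100],
}


def sample_combinations(grid, drop_rate, seed=None):
    """Yield each combination of grid values, skipping it with probability drop_rate."""
    rng = random.Random(seed)
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        # rng.random() is in [0, 1), so drop_rate=0 keeps all and drop_rate=1 keeps none.
        if rng.random() >= drop_rate:
            yield dict(zip(keys, values))


# Total number of combinations in the full grid.
total = 1
for values in grid.values():
    total *= len(values)

kept = list(sample_combinations(grid, drop_rate=0.5, seed=0))
print("kept {} of {} combinations".format(len(kept), total))
```

Under this assumption, a drop rate close to 1 keeps only a tiny random sample of a very large grid, which is why the defaults are so high.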