scitime / scitime

License: BSD-3-Clause

Training time estimation for scikit-learn algorithms

Programming Languages

python

139335 projects - #7 most used programming language

shell

77523 projects

Projects that are alternatives of or similar to scitime

imbalanced-ensemble

Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible.

Stars: ✭ 199 (+67.23%)

Mutual labels: scikit-learn

Trajectory-Analysis-and-Classification-in-Python-Pandas-and-Scikit-Learn

Formed trajectories of sets of points. Experimented on finding similarities between trajectories based on DTW (Dynamic Time Warping) and LCSS (Longest Common SubSequence) algorithms. Modeled trajectories as strings based on a Grid representation. Benchmarked KNN, Random Forest, Logistic Regression classification algorithms to classify efficiently t…

Stars: ✭ 41 (-65.55%)

Mutual labels: scikit-learn

scikit-hyperband

A scikit-learn compatible implementation of hyperband

Stars: ✭ 68 (-42.86%)

Mutual labels: scikit-learn

pygrams

Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence

Stars: ✭ 52 (-56.3%)

Mutual labels: scikit-learn

Movie-Recommendation-Chatbot

Movie Recommendation Chatbot provides information about a movie like plot, genre, revenue, budget, IMDb rating, IMDb links, etc. The model was trained with Kaggle’s movies metadata dataset. To give a recommendation of similar movies, Cosine Similarity and a TF-IDF vectorizer were used. Slack API was used to provide a Front End for the chatbot. IBM W…

Stars: ✭ 33 (-72.27%)

Mutual labels: scikit-learn

langx-java

Java tools, helper, common utilities. A replacement of guava, apache-commons, hutool

Stars: ✭ 50 (-57.98%)

Mutual labels: timer

UnityTimer

Powerful and convenient library for running actions after a delay in Unity3D. Fork from akbiggs/UnityTimer. Add some useful functions.

Stars: ✭ 26 (-78.15%)

Mutual labels: timer

favloader

Vanilla JavaScript library for loading animation in favicon (favicon loader)

Stars: ✭ 20 (-83.19%)

Mutual labels: timer

hub

Public reusable components for Polyaxon

Stars: ✭ 8 (-93.28%)

Mutual labels: scikit-learn

PredictionAPI

Tutorial on deploying machine learning models to production

Stars: ✭ 56 (-52.94%)

Mutual labels: scikit-learn

Chronity

⌛ Library for running functions after a delay by creating timers in Unity3D.

Stars: ✭ 40 (-66.39%)

Mutual labels: timer

meditation-timer

🧘 Progressive web application for timing your meditations

Stars: ✭ 23 (-80.67%)

Mutual labels: timer

timer

🕚 A simple, beautiful cubing timer.

Stars: ✭ 26 (-78.15%)

Mutual labels: timer

100DaysOfMLCode

I am taking up the #100DaysOfMLCode Challenge 😎

Stars: ✭ 12 (-89.92%)

Mutual labels: scikit-learn

scikit-learn-intelex

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

Stars: ✭ 887 (+645.38%)

Mutual labels: scikit-learn

intro-to-ml

A basic introduction to machine learning (one day training).

Stars: ✭ 15 (-87.39%)

Mutual labels: scikit-learn

Splitter

A speedrunning timer for macOS

Stars: ✭ 34 (-71.43%)

Mutual labels: timer

CreditCard Fraud Detection

Credit card fraud detection using logistic regression

Stars: ✭ 31 (-73.95%)

Mutual labels: scikit-learn

sklearn-matlab

Machine learning in Matlab using scikit-learn syntax

Stars: ✭ 27 (-77.31%)

Mutual labels: scikit-learn

sktime-tutorial-pydata-amsterdam-2020

Introduction to Machine Learning with Time Series at PyData Festival Amsterdam 2020

Stars: ✭ 115 (-3.36%)

Mutual labels: scikit-learn


scitime

Training time estimation for scikit-learn algorithms. Method explained in this article

Currently supporting:

RandomForestRegressor
SVC
KMeans
RandomForestClassifier

Environment setup

Python version: 3.7

Package dependencies:

scikit-learn (~=0.24.1)
pandas (~=1.1.5)
joblib (~=1.0.1)
psutil (~=5.8.0)
scipy (~=1.5.4)

Install scitime

❱ pip install scitime
or 
❱ conda install -c conda-forge scitime

Usage

How to compute a runtime estimation

Example for RandomForestRegressor

import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor

from scitime import RuntimeEstimator

# example for rf regressor
# meta_algo='RF' selects the random forest meta-model that predicts the training time
estimator = RuntimeEstimator(meta_algo='RF', verbose=3)
rf = RandomForestRegressor()

# synthetic data: 100,000 rows and 10 features
X, y = np.random.rand(100000, 10), np.random.rand(100000, 1)
# run the estimation: returns the point estimate and the interval bounds
estimation, lower_bound, upper_bound = estimator.time(rf, X, y)

# compare to the actual training time
start_time = time.time()
rf.fit(X, y)
elapsed_time = time.time() - start_time
print("elapsed time: {:.2f}".format(elapsed_time))
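
The "compare to the actual training time" step above can be factored into a small reusable timer. The following is a minimal sketch using only the standard library; the names `timed`, `label`, and `results` are hypothetical and not part of scitime, and the workload is a stand-in for `rf.fit(X, y)`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Measure the wall-clock duration of the enclosed block.

    `label` and `results` are hypothetical names for this sketch: the
    elapsed seconds are stored in the `results` dict under `label`.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the block raises.
        results[label] = time.perf_counter() - start


timings = {}
with timed("fit", timings):
    # Stand-in workload; in practice this would be rf.fit(X, y).
    sum(i * i for i in range(100_000))

print("elapsed time: {:.2f}s".format(timings["fit"]))
```

`time.perf_counter()` is used here instead of `time.time()` because it is a monotonic clock intended for measuring short durations.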

Example for KMeans

import time

import numpy as np
from sklearn.cluster import KMeans

from scitime import RuntimeEstimator

# example for kmeans clustering
# meta_algo='RF' selects the random forest meta-model that predicts the training time
estimator = RuntimeEstimator(meta_algo='RF', verbose=3)
km = KMeans()

# synthetic data: 100,000 rows and 10 features (no y for unsupervised algorithms)
X = np.random.rand(100000, 10)
# run the estimation: returns the point estimate and the interval bounds
estimation, lower_bound, upper_bound = estimator.time(km, X)

# compare to the actual training time
start_time = time.time()
km.fit(X)
elapsed_time = time.time() - start_time
print("elapsed time: {:.2f}".format(elapsed_time))
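
Both examples return an estimate with a lower and an upper bound, then measure the real training time. A small helper can make the comparison explicit. This is a sketch only: `summarize_estimate` is a hypothetical function, it assumes all values are expressed in seconds, and the numbers in the usage line are illustrative, not measured results.

```python
def summarize_estimate(estimation, lower_bound, upper_bound, elapsed):
    """Compare a runtime estimate with a measured training time.

    All arguments are assumed to be in seconds. Returns the absolute error,
    the relative error, and whether the measurement fell inside the interval.
    """
    abs_error = abs(estimation - elapsed)
    rel_error = abs_error / elapsed if elapsed > 0 else float("inf")
    return {
        "abs_error": abs_error,
        "rel_error": rel_error,
        "within_interval": lower_bound <= elapsed <= upper_bound,
    }


# Illustrative numbers only, not measured results.
summary = summarize_estimate(estimation=12.0, lower_bound=8.0,
                             upper_bound=20.0, elapsed=10.0)
print(summary)
```

In practice the four arguments would be `estimation`, `lower_bound`, `upper_bound`, and `elapsed_time` from the examples above.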

RuntimeEstimator class arguments:

meta_algo: The meta-model used to predict the training time, either RF (random forest) or NN (neural network)
verbose: Controls the amount of log output (0, 1, 2 or 3)
confidence: Confidence level for the prediction interval (defaults to 95%)
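
To illustrate what a confidence level means for the returned bounds, here is one common way to turn a set of predictions into a central interval: take the matching lower and upper percentiles. This is a generic sketch, not a description of how scitime computes its interval; `percentile` and `interval_from_predictions` are hypothetical helpers, and the predictions are made-up values.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile of an ascending list, with q in [0, 100]."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q / 100.0
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    fraction = position - low
    return sorted_values[low] + (sorted_values[high] - sorted_values[low]) * fraction


def interval_from_predictions(predictions, confidence=0.95):
    """Return the central interval covering `confidence` of the predictions."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    ordered = sorted(predictions)
    # Split the excluded mass evenly between the two tails.
    tail = (1.0 - confidence) / 2.0 * 100.0
    return percentile(ordered, tail), percentile(ordered, 100.0 - tail)


# Made-up runtime predictions in seconds, e.g. one per ensemble member.
predictions = [float(v) for v in range(101)]
lower, upper = interval_from_predictions(predictions, confidence=0.95)
print("interval: [{:.2f}, {:.2f}]".format(lower, upper))
```

A higher confidence level widens the interval, since less of the prediction spread is excluded in each tail.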

Parameters of the estimator.time function:

algo: the scikit-learn algorithm whose training time the user wants to predict
X: np.array of training inputs
y: np.array of training outputs (set to None for unsupervised algorithms)

--- FOR TESTERS / CONTRIBUTORS ---

Local Testing

Inside virtualenv (with pytest>=3.2.1):

(env)$ python -m pytest

How to use _data.py to generate data / fit models?

$ python _data.py --help

usage: _data.py [-h] [--drop_rate DROP_RATE] [--meta_algo {RF,NN}]
                [--verbose VERBOSE]
                [--algo {RandomForestRegressor,RandomForestClassifier,SVC,KMeans}]
                [--generate_data] [--fit FIT] [--save]

Gather & Persist Data of model training runtimes

optional arguments:
  -h, --help            show this help message and exit
  --drop_rate DROP_RATE
                        drop rate of number of data generated (from all param
                        combinations taken from _config.json). Default is
                        0.999
  --meta_algo {RF,NN}   meta algo used to fit the meta model (NN or RF) -
                        default is RF
  --verbose VERBOSE     verbose mode (0, 1, 2 or 3)
  --algo {RandomForestRegressor,RandomForestClassifier,SVC,KMeans}
                        algo to train data on
  --generate_data       do you want to generate & write data in a dedicated
                        csv?
  --fit FIT             do you want to fit the model? If so indicate the csv
                        name
  --save                (only used for model fit) do you want to save /
                        overwrite the meta model from this fit?

(_data.py uses _model.py behind the scenes)
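
For readers who want to see how the options in the help text fit together, the following is an `argparse` reconstruction based only on that help output. It is a sketch, not the project's actual source: the argument types and the `--verbose` default are assumptions, and `build_parser` is a hypothetical name.

```python
import argparse

SUPPORTED_ALGOS = ["RandomForestRegressor", "RandomForestClassifier", "SVC", "KMeans"]


def build_parser():
    """Build a parser mirroring the options listed in the help output."""
    parser = argparse.ArgumentParser(
        prog="_data.py",
        description="Gather & Persist Data of model training runtimes",
    )
    parser.add_argument("--drop_rate", type=float, default=0.999,
                        help="drop rate of number of data generated")
    parser.add_argument("--meta_algo", choices=["RF", "NN"], default="RF",
                        help="meta algo used to fit the meta model")
    # The help text does not state a default for --verbose; 0 is an assumption.
    parser.add_argument("--verbose", type=int, default=0,
                        help="verbose mode (0, 1, 2 or 3)")
    parser.add_argument("--algo", choices=SUPPORTED_ALGOS,
                        help="algo to train data on")
    parser.add_argument("--generate_data", action="store_true",
                        help="generate & write data in a dedicated csv")
    parser.add_argument("--fit", default=None,
                        help="csv name to fit the model on")
    parser.add_argument("--save", action="store_true",
                        help="save / overwrite the meta model from this fit")
    return parser


# Equivalent to: python _data.py --algo KMeans --generate_data --drop_rate 0.99
args = build_parser().parse_args(
    ["--algo", "KMeans", "--generate_data", "--drop_rate", "0.99"]
)
print(args)
```

Flags such as `--generate_data` and `--save` take no value, while `--fit` expects the name of a csv file.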

How to run _model.py?

After pulling the master branch (git pull origin master) and setting up the environment (described above), start ipython and run:

from scitime._model import RuntimeModelBuilder

# example of data generation for rf regressor
trainer = RuntimeModelBuilder(drop_rate=0.99999, verbose=3, algo='RandomForestRegressor')
inputs, outputs, _ = trainer._generate_data()

# then fitting the meta model
meta_algo = trainer.model_fit(generate_data=False, inputs=inputs, outputs=outputs)
# this should not locally overwrite the pickle file located at scitime/models/{your_model}
# if you want to save the model, set the argument save_model to True
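
The `drop_rate` argument is described above as the drop rate applied to all parameter combinations taken from _config.json. One plausible reading is that each combination is skipped with probability `drop_rate`. The sketch below illustrates that reading only; it is not taken from scitime's code, `sample_param_combinations` is a hypothetical helper, and the grid values are invented rather than read from _config.json.

```python
import itertools
import random


def sample_param_combinations(param_grid, drop_rate, seed=None):
    """Keep each combination of the grid with probability 1 - drop_rate."""
    rng = random.Random(seed)
    names = sorted(param_grid)
    kept = []
    for values in itertools.product(*(param_grid[name] for name in names)):
        # rng.random() is in [0, 1), so drop_rate=0 keeps all and drop_rate=1 keeps none.
        if rng.random() >= drop_rate:
            kept.append(dict(zip(names, values)))
    return kept


# Invented grid for illustration; the real one lives in _config.json.
grid = {
    "n_estimators": [10, 50, 100],
    "max_depth": [3, 10, None],
    "num_rows": [100, 1000, 10000],
}
sampled = sample_param_combinations(grid, drop_rate=0.5, seed=0)
print("{} of 27 combinations kept".format(len(sampled)))
```

Under this reading, a value close to 1 such as 0.99999 keeps only a tiny fraction of a large grid, which keeps data generation fast.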
