
danielhanchen / sciblox

Licence: MIT license
sciblox - Easier Data Science and Machine Learning

sciblox

An all-in-one Python 3 data science package for easy visualisation, data mining, data preparation and machine learning.

Please check the Jupyter Notebook for instructions on how to use it. You can also check sciblox out on https://danielhanchen.github.io/

https://pypi.python.org/pypi/sciblox

Install:

[sudo] pip install sciblox

NOTE: If you intend to remove linearly dependent rows or use KNN or SVD imputation, also install:

[sudo] pip install fancyimpute sympy theano

If the fancyimpute install fails, please install a C++ compiler (for example MinGW on Windows) and try again.
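To confirm that the optional packages named in the note above are importable before relying on the imputation features, a small standard-library check can help. This is only a sketch: the helper name `missing_optional` is hypothetical and is not part of sciblox.

```python
from importlib.util import find_spec

# Optional packages named in the install note above.
OPTIONAL_PACKAGES = ("fancyimpute", "sympy", "theano")


def missing_optional(packages=OPTIONAL_PACKAGES):
    """Return the names of packages that cannot be found on the import path."""
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional()
    if missing:
        print("Missing optional packages:", ", ".join(missing))
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All optional packages are available.")
```

The check only looks the packages up; it does not import them, so it stays fast even when a heavy dependency is present.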

WHAT'S NEW?

  1. Faster (10x) BPCA fill
  2. Better analyser
  3. New modules: Machine Learning

Features include:

  1. MICE and BPCA missing data imputation with Random Forests, XGBoost and Linear Regression support
  2. Automatic data plotting
  3. Word extraction and frequency plots
  4. Sequential text processing
  5. CARET-like processes, including ZeroVarCheck, FreqRatios, etc.
  6. Discretization and Continuisation
  7. Easy data structure changes such as Hcat, Vcat and reversing
  8. Easy CARET-like Machine Learning modules
  9. Automatic best-graph plotting
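To illustrate what CARET-style checks such as ZeroVarCheck and FreqRatios (item 5) typically compute, here is a rough standard-library sketch. It is not sciblox's implementation: the function names `zero_var_columns` and `freq_ratio` and the sample data are assumptions made for illustration. Following CARET's definition, the frequency ratio is the count of the most common value divided by the count of the second most common value.

```python
from collections import Counter


def freq_ratio(values):
    """Count of the most common value divided by the count of the runner-up.

    Returns float('inf') when the column has fewer than two distinct values.
    """
    counts = Counter(values).most_common(2)
    if len(counts) < 2:
        return float("inf")
    return counts[0][1] / counts[1][1]


def zero_var_columns(table):
    """Return the names of columns that hold at most one distinct value."""
    return [name for name, values in table.items() if len(set(values)) <= 1]


# Hypothetical data: column name -> list of values.
table = {
    "constant": [1, 1, 1, 1, 1, 1],
    "skewed": ["a", "a", "a", "a", "a", "b"],
    "balanced": ["x", "y", "x", "y", "x", "y"],
}

print(zero_var_columns(table))
for name, values in table.items():
    print(name, freq_ratio(values))
```

Columns with zero variance, or with a very high frequency ratio, carry little information and are common candidates for removal before modelling.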

IN CONSTRUCTION:

  1. Advanced text extraction methods
  2. Automatic Machine Learning methods

For easier calling:

from sciblox import *
%matplotlib notebook

If you are not installing with pip, copy sciblox.py into the main directory of your Python 3 project, then import it the same way as above.

Screenshots (images not shown): Analysing, Preprocessing, Analytics, Plotting.
