3828 open source projects that are alternatives to or similar to Setl

Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+4726.58%)
Mutual labels:  spark, big-data
Labs
Research on distributed systems
Stars: ✭ 73 (-7.59%)
Mutual labels:  spark, big-data
Neuraxle
A Sklearn-like Framework for Hyperparameter Tuning and AutoML in Deep Learning projects. Finally have the right abstractions and design patterns to properly do AutoML. Let your pipeline steps have hyperparameter spaces. Enable checkpoints to cut duplicate calculations. Go from research to production environment easily.
Stars: ✭ 377 (+377.22%)
Mutual labels:  pipeline, framework
Listenbrainz Server
Server for the ListenBrainz project
Stars: ✭ 420 (+431.65%)
Mutual labels:  spark, big-data
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+6878.48%)
Mutual labels:  spark, big-data
Mlbox
MLBox is a powerful Automated Machine Learning Python library.
Stars: ✭ 1,199 (+1417.72%)
Mutual labels:  data-science, pipeline
Magellan
Geo Spatial Data Analytics on Spark
Stars: ✭ 507 (+541.77%)
Mutual labels:  spark, big-data
Knowledge Repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Stars: ✭ 4,956 (+6173.42%)
Mutual labels:  data-science, data-analysis
Awesome R
A curated list of awesome R packages, frameworks and software.
Stars: ✭ 4,858 (+6049.37%)
Mutual labels:  data-science, data-analysis
Luigi Warehouse
A Luigi-powered analytics/warehouse stack
Stars: ✭ 72 (-8.86%)
Mutual labels:  spark, etl
Awesome Twitter Data
A list of Twitter datasets and related resources.
Stars: ✭ 533 (+574.68%)
Mutual labels:  data-science, dataset
Nipype
Workflows and interfaces for neuroimaging packages
Stars: ✭ 557 (+605.06%)
Mutual labels:  data-science, big-data
Data Science
Collection of useful data science topics along with code and articles
Stars: ✭ 315 (+298.73%)
Mutual labels:  data-science, data-analysis
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+646.84%)
Mutual labels:  data-science, pipeline
Imbalanced Learn
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
Stars: ✭ 5,617 (+7010.13%)
Mutual labels:  data-science, data-analysis
Alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+6708.86%)
Mutual labels:  spark, data-analysis
Hyperlearn
50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster
Stars: ✭ 1,204 (+1424.05%)
Mutual labels:  data-science, data-analysis
Elki
ELKI Data Mining Toolkit
Stars: ✭ 613 (+675.95%)
Mutual labels:  data-science, data-analysis
Tsrepr
TSrepr: R package for time series representations
Stars: ✭ 75 (-5.06%)
Mutual labels:  data-science, data-analysis
Nfstream
NFStream: a Flexible Network Data Analysis Framework.
Stars: ✭ 622 (+687.34%)
Mutual labels:  data-science, data-analysis
Go Streams
A lightweight stream processing library for Go
Stars: ✭ 615 (+678.48%)
Mutual labels:  pipeline, etl
Dataproofer
A proofreader for your data
Stars: ✭ 628 (+694.94%)
Mutual labels:  data-science, data-analysis
Data Analysis And Machine Learning Projects
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
Stars: ✭ 5,166 (+6439.24%)
Mutual labels:  data-science, data-analysis
Data Science Career
Repository of career resources for data science, machine learning, big data, and business analytics
Stars: ✭ 630 (+697.47%)
Mutual labels:  data-science, big-data
Sciblog support
Support content for my blog
Stars: ✭ 694 (+778.48%)
Mutual labels:  data-science, big-data
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+843.04%)
Mutual labels:  spark, big-data
Getting Started
This repository is a getting started guide to Singer.
Stars: ✭ 734 (+829.11%)
Mutual labels:  data-analysis, etl
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+903.8%)
Mutual labels:  spark, data-engineering
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+9970.89%)
Mutual labels:  data-science, data-engineering
Osint collection
Maintained collection of OSINT related resources. (All Free & Actionable)
Stars: ✭ 809 (+924.05%)
Mutual labels:  data-science, dataset
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (-10.13%)
Mutual labels:  spark, big-data
Dataframe
C++ DataFrame for statistical, financial, and ML analysis in modern C++, using native types and continuous memory storage, with no pointers involved
Stars: ✭ 828 (+948.1%)
Mutual labels:  data-science, data-analysis
Awesome Python Data Science
Probably the best curated list of data science software in Python.
Stars: ✭ 812 (+927.85%)
Mutual labels:  data-science, data-analysis
Skdata
Python tools for data analysis
Stars: ✭ 16 (-79.75%)
Mutual labels:  data-science, data-analysis
Wedatasphere
WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+370.89%)
Mutual labels:  spark, etl
Cookbook 2nd
IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (+791.14%)
Mutual labels:  data-science, data-analysis
Datastream.io
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
Stars: ✭ 814 (+930.38%)
Mutual labels:  data-science, dataset
Phila Airflow
Stars: ✭ 16 (-79.75%)
Mutual labels:  pipeline, etl
Spring2017 proffosterprovost
Introduction to Data Science
Stars: ✭ 18 (-77.22%)
Mutual labels:  data-science, data-analysis
Pretzel
JavaScript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-67.09%)
Mutual labels:  data-science, big-data
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-67.09%)
Mutual labels:  data-science, spark
Cookbook
The Data Engineering Cookbook
Stars: ✭ 9,829 (+12341.77%)
Mutual labels:  big-data, data-engineering
Resources
PyMC3 educational resources
Stars: ✭ 930 (+1077.22%)
Mutual labels:  data-science, data-analysis
Socrat
A Dynamic Web Toolbox for Interactive Data Processing, Analysis, and Visualization
Stars: ✭ 26 (-67.09%)
Mutual labels:  data-science, data-analysis
Autodl
Automated Deep Learning without ANY human intervention. 1st solution for AutoDL
Stars: ✭ 854 (+981.01%)
Mutual labels:  data-science, big-data
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+10443.04%)
Mutual labels:  data-science, data-analysis
Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-86.08%)
Mutual labels:  spark, big-data
Dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+1412.66%)
Mutual labels:  spark, etl
Mobius
C# and F# language bindings and extensions for Apache Spark
Stars: ✭ 929 (+1075.95%)
Mutual labels:  dataset, spark
Steppy Toolkit
Curated set of transformers that make your work with steppy faster and more effective 🔭
Stars: ✭ 21 (-73.42%)
Mutual labels:  data-science, pipeline
Tedsds
Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-82.28%)
Mutual labels:  dataset, spark
Spark
Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+39922.78%)
Mutual labels:  spark, big-data
Janitor
Simple tools for data cleaning in R
Stars: ✭ 981 (+1141.77%)
Mutual labels:  data-science, data-analysis
Dataconfs
A list of conferences connected with data worldwide.
Stars: ✭ 36 (-54.43%)
Mutual labels:  data-science, dataset
Countly Sdk Cordova
Countly Product Analytics SDK for Cordova, Icenium and Phonegap
Stars: ✭ 69 (-12.66%)
Mutual labels:  data-analysis, big-data
Mlcourse.ai
Open Machine Learning Course
Stars: ✭ 7,963 (+9979.75%)
Mutual labels:  data-science, data-analysis
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (+1143.04%)
Mutual labels:  data-science, pipeline
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+1163.29%)
Mutual labels:  data-science, spark
Attaca
Robust, distributed version control for large files.
Stars: ✭ 41 (-48.1%)
Mutual labels:  data-science, big-data
Ether sql
A Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-48.1%)
Mutual labels:  data-analysis, etl
61-120 of 3828 similar projects