1470 open source projects that are alternatives to or similar to prosto

machine-learning-data-pipeline
Pipeline module for parallel real-time data processing for machine learning models development and production purposes.
Stars: ✭ 22 (-59.26%)
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+850%)
Mutual labels:  workflow, spark, olap
Pandera
A light-weight, flexible, and expressive pandas data validation library
Stars: ✭ 506 (+837.04%)
Mutual labels:  pandas, data-processing
Repository
A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (+70.37%)
Mutual labels:  spark, olap
Cape Python
Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (+131.48%)
Mutual labels:  spark, pandas
foofah
Foofah: programming-by-example data transformation program synthesizer
Stars: ✭ 24 (-55.56%)
Mutual labels:  data-wrangling, data-preparation
Cboard
An easy to use, self-service open BI reporting and BI dashboard platform.
Stars: ✭ 2,795 (+5075.93%)
Mutual labels:  business-intelligence, olap
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+40729.63%)
Mutual labels:  spark, pandas
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1725.93%)
Mutual labels:  spark, data-wrangling
Expand
DevExpress XAF extension framework. Follow linkedin.expandframework.com, youtube.expandframework.com, and Twitter @expandframework, or simply star/watch this repository to get notified by GitHub.
Stars: ✭ 158 (+192.59%)
Mutual labels:  workflow, business-intelligence
Data-Wrangling-with-Python
Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
Stars: ✭ 90 (+66.67%)
Mutual labels:  pandas, data-wrangling
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+1033.33%)
Mutual labels:  pandas, feature-engineering
Udacity-Data-Analyst-Nanodegree
Repository for the projects needed to complete the Data Analyst Nanodegree.
Stars: ✭ 31 (-42.59%)
Mutual labels:  pandas, data-wrangling
Data-Science-101
Notes and tutorials on how to use Python, pandas, seaborn, NumPy, matplotlib, and SciPy for data science.
Stars: ✭ 19 (-64.81%)
Mutual labels:  pandas, data-wrangling
Mining
Business Intelligence (BI) in Python, OLAP
Stars: ✭ 1,128 (+1988.89%)
Mutual labels:  business-intelligence, olap
OLAP-cube
A hypercube of data
Stars: ✭ 23 (-57.41%)
Mutual labels:  business-intelligence, olap
The-Data-Visualization-Workshop
A New, Interactive Approach to Learning Data Visualization
Stars: ✭ 59 (+9.26%)
Mutual labels:  pandas, data-wrangling
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (+461.11%)
Mutual labels:  spark, pandas
Guitar
A Simple and Efficient Distributed Multidimensional BI Analysis Engine.
Stars: ✭ 86 (+59.26%)
Mutual labels:  business-intelligence, olap
Data Forge Ts
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 967 (+1690.74%)
Mutual labels:  pandas, data-wrangling
Luigi Warehouse
A Luigi-powered analytics/warehouse stack
Stars: ✭ 72 (+33.33%)
Mutual labels:  workflow, spark
Dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+2112.96%)
Mutual labels:  workflow, spark
whyqd
Data wrangling with simplicity, complete audit transparency, and speed
Stars: ✭ 16 (-70.37%)
Mutual labels:  pandas, data-wrangling
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+2401.85%)
Mutual labels:  data-wrangling, data-preparation
sparklanes
A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-68.52%)
Market-Mix-Modeling
Market Mix Modelling for an eCommerce firm to estimate the impact of various marketing levers on sales
Stars: ✭ 31 (-42.59%)
spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 286 (+429.63%)
Mutual labels:  spark, business-intelligence
Transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Stars: ✭ 2,084 (+3759.26%)
Mutual labels:  spark, feature-engineering
Handyspark
HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+192.59%)
Mutual labels:  spark, pandas
Machine Learning Workflow With Python
A comprehensive set of ML techniques with Python: define the problem, specify inputs and outputs, data collection, exploratory data analysis, data preprocessing, model design, training, and evaluation
Stars: ✭ 157 (+190.74%)
Mutual labels:  workflow, feature-engineering
xplore
A Python package built for data scientists/analysts and AI/ML engineers for exploring the features of a dataset in a minimal number of lines of code, for quick analysis before data wrangling and feature extraction.
Stars: ✭ 21 (-61.11%)
veridical-flow
Making it easier to build stable, trustworthy data-science pipelines.
Stars: ✭ 28 (-48.15%)
Mutual labels:  workflow, pandas
Retentioneering Tools
Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation with customer segments in Python. Opensource analytics, predictive analytics over clickstream, sentiment analysis, AB tests, machine learning, and Monte Carlo Markov Chain simulations, extending Pandas, Networkx and sklearn.
Stars: ✭ 291 (+438.89%)
Mutual labels:  pandas, business-intelligence
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+5537.04%)
Mutual labels:  spark, pandas
Spark Druid Olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 282 (+422.22%)
Mutual labels:  spark, business-intelligence
Ibis
A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+2918.52%)
Mutual labels:  spark, pandas
Redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Stars: ✭ 20,147 (+37209.26%)
Mutual labels:  spark, business-intelligence
Data Forge Js
JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 139 (+157.41%)
Mutual labels:  pandas, data-wrangling
visions
Type System for Data Analysis in Python
Stars: ✭ 136 (+151.85%)
Mutual labels:  spark, pandas
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (+1.85%)
Mutual labels:  spark, data-processing
Distributed Dataset
A distributed data processing framework in Haskell.
Stars: ✭ 108 (+100%)
Mutual labels:  spark, data-processing
SumStatsRehab
GWAS summary statistics files QC tool
Stars: ✭ 19 (-64.81%)
Datacompy
Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (+172.22%)
Mutual labels:  spark, pandas
pandas-workshop
An introductory workshop on pandas with notebooks and exercises for following along.
Stars: ✭ 161 (+198.15%)
Mutual labels:  pandas, data-wrangling
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-7.41%)
Mutual labels:  spark, data-processing
Data-Analyst-Nanodegree
Kai Sheng Teh - Udacity Data Analyst Nanodegree
Stars: ✭ 42 (-22.22%)
Mutual labels:  pandas, data-wrangling
alfred-workflow
No description or website provided.
Stars: ✭ 26 (-51.85%)
Mutual labels:  workflow
blog
blog entries
Stars: ✭ 39 (-27.78%)
Mutual labels:  spark
alfred-latex-symbols-workflow
🔎 Alfred 3-4 workflow to search for LaTeX symbol commands
Stars: ✭ 33 (-38.89%)
Mutual labels:  workflow
mimir
Data-ish exploration through SQL+Uncertainty
Stars: ✭ 26 (-51.85%)
Mutual labels:  data-wrangling
my curd
An ultra-lightweight rapid-development scaffold and workflow platform.
Stars: ✭ 38 (-29.63%)
Mutual labels:  workflow
SparkV
🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-55.56%)
Mutual labels:  spark
ibis
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Stars: ✭ 48 (-11.11%)
Mutual labels:  workflow
zen-do-r
A book about programming for non-programmers.
Stars: ✭ 24 (-55.56%)
Mutual labels:  workflow
baleen3
Baleen 3 is a data processing tool based on the Annot8 framework
Stars: ✭ 15 (-72.22%)
Mutual labels:  data-processing
action-sync-node-meta
GitHub Action that syncs package.json with the repository metadata.
Stars: ✭ 25 (-53.7%)
Mutual labels:  workflow
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-74.07%)
Mutual labels:  spark
tukio
Tukio is an event based workflow generator library
Stars: ✭ 27 (-50%)
Mutual labels:  workflow
Covid19Tracker
A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (+20.37%)
Mutual labels:  spark
Tesseract
A set of libraries for rapidly developing Pipeline driven micro/macroservices.
Stars: ✭ 20 (-62.96%)
Mutual labels:  workflow