2609 open source projects that are alternatives or similar to Pachyderm

Data Science Live Book
An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (-96.36%)
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (-13.65%)
Sciblog support
Support content for my blog
Stars: ✭ 694 (-86.92%)
Mutual labels:  data-science, analytics, big-data
nebula
A distributed block-based data storage and compute engine
Stars: ✭ 127 (-97.61%)
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (-98.21%)
Mutual labels:  big-data, analytics, pachyderm
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+703.66%)
Mutual labels:  data-science, analytics, data-analysis
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-98.51%)
Mutual labels:  data-science, data-analysis, big-data
My Journey In The Data Science World
📢 Ready to learn or review your knowledge!
Stars: ✭ 1,175 (-77.85%)
Mutual labels:  data-science, data-analysis, big-data
Tennis Crystal Ball
Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-97.98%)
Mutual labels:  data-science, data-analysis, big-data
Courses
Quizzes & assignments from Coursera
Stars: ✭ 454 (-91.44%)
Mutual labels:  data-science, data-analysis, big-data
Data Science Career
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (-88.12%)
Mutual labels:  data-science, analytics, big-data
Model Describer
model-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (-99.59%)
Mutual labels:  data-science, analytics, data-analysis
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-74.78%)
Mutual labels:  data-science, data-analysis, big-data
Countly Sdk Cordova
Countly Product Analytics SDK for Cordova, Icenium and Phonegap
Stars: ✭ 69 (-98.7%)
Mutual labels:  analytics, data-analysis, big-data
Spark R Notebooks
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-97.95%)
Mutual labels:  data-science, data-analysis, big-data
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (-83.9%)
Mutual labels:  data-science, data-analysis, big-data
Pythondata
repo for code published on pythondata.com
Stars: ✭ 113 (-97.87%)
Mutual labels:  data-science, data-analysis, big-data
Datasciencevm
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (-97.12%)
Mutual labels:  data-science, data-analysis, big-data
golearn
🔥 Golang basics and hands-on practice (including: crawler, distributed-systems, data-analysis, redis, etcd, raft, crontab-task)
Stars: ✭ 36 (-99.32%)
hotmap
WebGL Heatmap Viewer for Big Data and Bioinformatics
Stars: ✭ 13 (-99.75%)
Mutual labels:  big-data, data-analysis
covid-19
COVID-19 World is yet another project to build a dashboard-like app that showcases data related to COVID-19 (coronavirus).
Stars: ✭ 28 (-99.47%)
Mutual labels:  analytics, data-analysis
Service Fabric
Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
Stars: ✭ 2,874 (-45.82%)
Mutual labels:  distributed-systems, containers
leaflet heatmap
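A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved to offline computation. The data is first processed in parallel with Apache Spark, the heatmap is then rendered with Apache Spark, and leafletjs loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering currently runs on Apache Spark, but the parallel computation is slower than single-machine computation, either because Apache Spark is not well suited to this kind of work or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .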
Stars: ✭ 13 (-99.75%)
Mutual labels:  big-data, data-analysis
Xlearn
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
Stars: ✭ 2,968 (-44.05%)
Mutual labels:  data-science, data-analysis
Introduction Datascience Python Book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
Stars: ✭ 275 (-94.82%)
Mutual labels:  data-science, analytics
Knowledge Repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Stars: ✭ 4,956 (-6.58%)
Mutual labels:  data-science, data-analysis
Dapy
Easy-to-use data analysis / manipulation framework for humans
Stars: ✭ 523 (-90.14%)
Mutual labels:  data-science, data-analysis
Sealion
The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.
Stars: ✭ 278 (-94.76%)
Mutual labels:  data-science, data-analysis
Oie Resources
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (-94.67%)
Mutual labels:  data-science, big-data
nebula
A distributed, fast open-source graph database featuring horizontal scalability and high availability
Stars: ✭ 8,196 (+54.5%)
Mutual labels:  distributed-systems, big-data
twitter-analytics-wrapper
A simple Python wrapper to download tweets data from the Twitter Analytics platform. Particularly interesting for the impressions metrics that are unavailable on current Twitter API. Also works for the videos data.
Stars: ✭ 44 (-99.17%)
Mutual labels:  analytics, data-analysis
growthbook
Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (-55.85%)
Mutual labels:  analytics, data-analysis
Data Science Hacks
Data Science Hacks consists of tips and tricks to help you become a better data scientist. Data science hacks are for everyone, from beginner to advanced, and cover Python, Jupyter notebook, pandas hacks and so on.
Stars: ✭ 273 (-94.85%)
Mutual labels:  data-science, data-analysis
Awesome Datascience
📝 An awesome Data Science repository to learn and apply for real world problems.
Stars: ✭ 17,520 (+230.25%)
Mutual labels:  data-science, analytics
Cryptocurrency Analysis Python
Open-Source Tutorial For Analyzing and Visualizing Cryptocurrency Data
Stars: ✭ 278 (-94.76%)
Mutual labels:  data-science, data-analysis
Awesome Distributed Deep Learning
A curated list of awesome Distributed Deep Learning resources.
Stars: ✭ 277 (-94.78%)
Pydataroad
open source for wechat-official-account (ID: PyDataLab)
Stars: ✭ 302 (-94.31%)
Mutual labels:  data-science, data-analysis
Knowage Server
Knowage is the professional open source suite for modern business analytics over traditional sources and big data systems.
Stars: ✭ 276 (-94.8%)
Mutual labels:  data-analysis, big-data
Dagster
An orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (-22.73%)
Mutual labels:  data-science, analytics
Datascience course
Data Science course in Portuguese
Stars: ✭ 294 (-94.46%)
Mutual labels:  data-science, data-analysis
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (-42.62%)
Mutual labels:  data-science, big-data
Delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (-26.43%)
Mutual labels:  analytics, big-data
Urs
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Stars: ✭ 275 (-94.82%)
Mutual labels:  data-science, data-analysis
Crate
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.
Stars: ✭ 3,254 (-38.66%)
Mutual labels:  analytics, big-data
Ai Learn
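An artificial intelligence learning roadmap that collects nearly 200 hands-on cases and projects, with free accompanying teaching materials, taking beginners from zero to job-ready practice! Covers popular areas including: Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch tensorflow machine-learning, deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv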
Stars: ✭ 4,387 (-17.3%)
Mutual labels:  data-science, data-analysis
Akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open source financial data interface library.
Stars: ✭ 4,334 (-18.3%)
Mutual labels:  data-science, data-analysis
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (-93.18%)
Mutual labels:  big-data, distributed-systems
Quantitative Notebooks
Educational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy
Stars: ✭ 356 (-93.29%)
Mutual labels:  data-science, data-analysis
Pandas Summary
An extension to pandas dataframes describe function.
Stars: ✭ 361 (-93.2%)
Mutual labels:  data-science, data-analysis
Dataexplorer
Automate Data Exploration and Treatment
Stars: ✭ 362 (-93.18%)
Mutual labels:  data-science, data-analysis
Scikit Mobility
scikit-mobility: mobility analysis in Python
Stars: ✭ 339 (-93.61%)
Mutual labels:  data-science, data-analysis
Articles
A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
Stars: ✭ 350 (-93.4%)
Mutual labels:  data-science, data-analysis
Data Science
Collection of useful data science topics along with code and articles
Stars: ✭ 315 (-94.06%)
Mutual labels:  data-science, data-analysis
Datacleaner
The premier open source Data Quality solution
Stars: ✭ 391 (-92.63%)
Mutual labels:  data-science, data-analysis
Stats Maths With Python
General statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python
Stars: ✭ 381 (-92.82%)
Mutual labels:  data-science, analytics
Rumale
Rumale is a machine learning library in Ruby
Stars: ✭ 526 (-90.08%)
Mutual labels:  data-science, data-analysis
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-92.21%)
Mutual labels:  data-science, analytics
Beeva Best Practices
Best Practices and Style Guides in BEEVA
Stars: ✭ 335 (-93.69%)
Mutual labels:  analytics, big-data
Prettypandas
A Pandas Styler class for making beautiful tables
Stars: ✭ 376 (-92.91%)
Mutual labels:  data-science, data-analysis
The Elements Of Statistical Learning Python Notebooks
A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book
Stars: ✭ 405 (-92.37%)
Mutual labels:  data-science, data-analysis