JumbuneJumbune is an open source BigData APM & Data Quality Management Platform for Data Clouds. The enterprise feature offering is available at http://jumbune.com. More details of the open source offering are at,
Stars: ✭ 64 (-98.81%)
Data Science Ipython NotebooksData science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+309.89%)
Cube.js📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+122.77%)
SetlA simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-98.53%)
visionsType System for Data Analysis in Python
Stars: ✭ 136 (-97.47%)
bigkubeMinikube for big data with Scala and Spark
Stars: ✭ 16 (-99.7%)
God Of BigdataFocused on big data learning and interview preparation; start your path to big data mastery. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+11.69%)
BigdataguideBig data learning: learn big data from scratch, including study videos and interview materials for each stage of learning
Stars: ✭ 817 (-84.81%)
basinBasin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-99.54%)
Big WhaleScheduling of offline tasks (Spark, Flink, etc.) and monitoring of real-time tasks
Stars: ✭ 163 (-96.97%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-97.94%)
TrinoOfficial repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (-14.84%)
CoursesQuizzes and assignments from Coursera
Stars: ✭ 454 (-91.56%)
Docker practiceLearn and understand Docker technologies, with real DevOps practice!
Stars: ✭ 19,768 (+267.5%)
PandastableTable analysis in Tkinter using pandas DataFrames.
Stars: ✭ 376 (-93.01%)
CdapAn open source framework for building data analytic applications.
Stars: ✭ 509 (-90.54%)
Presto EthereumPresto Ethereum Connector -- SQL on Ethereum
Stars: ✭ 450 (-91.63%)
BapBayesian Analysis with Python (Second Edition)
Stars: ✭ 379 (-92.95%)
PrettypandasA Pandas Styler class for making beautiful tables
Stars: ✭ 376 (-93.01%)
TensorflowonsparkTensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Stars: ✭ 3,748 (-30.32%)
Cookbook 2nd CodeCode of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Stars: ✭ 541 (-89.94%)
MagellanGeo Spatial Data Analytics on Spark
Stars: ✭ 507 (-90.57%)
BigdataieA collection of big data blogs, written-test questions, tutorials, projects, and interview experiences
Stars: ✭ 445 (-91.73%)
HiveApache Hive
Stars: ✭ 4,031 (-25.06%)
Data ScienceCollection of useful data science topics along with code and articles
Stars: ✭ 315 (-94.14%)
Weibospider⚡ A distributed crawler for Weibo, built with Celery and Requests.
Stars: ✭ 4,670 (-13.18%)
SparkmeasureThis is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
Stars: ✭ 368 (-93.16%)
SidekickHigh Performance HTTP Sidecar Load Balancer
Stars: ✭ 366 (-93.2%)
DataexplorerAutomate Data Exploration and Treatment
Stars: ✭ 362 (-93.27%)
KyuubiKyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (-93.25%)
Qs ledgerQuantified Self Personal Data Aggregator and Data Analysis
Stars: ✭ 559 (-89.61%)
JustenoughscalaforsparkA tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (-90%)
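Antd Umi SysEnterprise BI system and data visualization platform. Main technologies: react, antd, umi, dva, es6, less, and others. Let's encourage and learn from each other; if you like it, please star ⭐.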
Stars: ✭ 503 (-90.65%)
Jupyter pivottablejsDrag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
Stars: ✭ 428 (-92.04%)
ArticlesA repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
Stars: ✭ 350 (-93.49%)
MetorikkuA simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (-93.29%)
Pandas SummaryAn extension to the pandas DataFrame describe function.
Stars: ✭ 361 (-93.29%)
SparklerSpark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (-93.27%)
Awesome RA curated list of awesome R packages, frameworks and software.
Stars: ✭ 4,858 (-9.69%)
Iclr2020 OpenreviewdataScript that crawls meta data from ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.
Stars: ✭ 426 (-92.08%)
Quantitative NotebooksEducational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy
Stars: ✭ 356 (-93.38%)
SparkstreamingReal-time log analysis and statistics with Spark Streaming+Flume+Kafka+HBase+Hadoop+Zookeeper; data visualization with SpringBoot+Echarts
Stars: ✭ 349 (-93.51%)
Dji Firmware ToolsTools for handling firmwares of DJI products, with focus on quadcopters.
Stars: ✭ 424 (-92.12%)
OapOptimized Analytics Package for Spark* Platform
Stars: ✭ 343 (-93.62%)
SparklensQubole Sparklens tool for performance tuning Apache Spark
Stars: ✭ 345 (-93.59%)
LopqTraining of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (-90.15%)
Bigdata💎🔥Big data study notes
Stars: ✭ 488 (-90.93%)
MoonboxMoonbox is a DVtaaS (Data Virtualization as a Service) Platform
Stars: ✭ 424 (-92.12%)
ScalnetA Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs
Stars: ✭ 342 (-93.64%)
IqlAn ad hoc query service based on the Spark SQL engine.
Stars: ✭ 341 (-93.66%)
FeatranA Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (-92.19%)
NotebooksInteractive notebooks from Planet Engineering
Stars: ✭ 339 (-93.7%)
Gis Tools For HadoopThe GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.
Stars: ✭ 485 (-90.98%)
LearningsparkScala examples for learning to use Spark
Stars: ✭ 421 (-92.17%)
Scikit Mobilityscikit-mobility: mobility analysis in Python
Stars: ✭ 339 (-93.7%)