10152 open source projects that are alternatives to or similar to Griffon Vm

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (+17.19%)

Mutual labels: jupyter-notebook, big-data, hadoop, apache-spark, database

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+3478.91%)

Mutual labels: data-science, big-data, hadoop, database

Etl with python

ETL with Python - Taught at DWH course 2017 (TAU)

Stars: ✭ 68 (-46.87%)

Mutual labels: jupyter-notebook, data-science, database, mysql

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+4318.75%)

Mutual labels: jupyter-notebook, data-science, big-data, hadoop

Courses

Coursera quizzes and assignments

Stars: ✭ 454 (+254.69%)

Mutual labels: jupyter-notebook, data-science, big-data

Hive

Apache Hive

Stars: ✭ 4,031 (+3049.22%)

Mutual labels: big-data, hadoop, database

Datasets For Recommender Systems

A repository of high-quality, topic-centric public data sources for Recommender Systems (RS)

Stars: ✭ 564 (+340.63%)

Mutual labels: jupyter-notebook, data-science, database

Spark R Notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 109 (-14.84%)

Mutual labels: jupyter-notebook, data-science, big-data

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (+67.97%)

Mutual labels: big-data, hadoop, apache-spark

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (+25.78%)

Mutual labels: jupyter-notebook, hadoop, mysql

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Stars: ✭ 39 (-69.53%)

Mutual labels: big-data, apache-spark, hadoop

Elastic

R client for the Elasticsearch HTTP API

Stars: ✭ 227 (+77.34%)

Mutual labels: data-science, database, elasticsearch

Web Database Analytics

Web scraping and related analytics using Python tools

Stars: ✭ 175 (+36.72%)

Mutual labels: jupyter-notebook, data-science, database

Tennis Crystal Ball

Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction

Stars: ✭ 107 (-16.41%)

Mutual labels: data-science, big-data, database

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+17125%)

Mutual labels: data-science, big-data, hadoop

My Journey In The Data Science World

📢 Ready to learn or review your knowledge!

Stars: ✭ 1,175 (+817.97%)

Mutual labels: jupyter-notebook, data-science, big-data

Sciblog support

Support content for my blog

Stars: ✭ 694 (+442.19%)

Mutual labels: jupyter-notebook, data-science, big-data

Datasciencevm

Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)

Stars: ✭ 153 (+19.53%)

Mutual labels: jupyter-notebook, data-science, big-data

Haproxy Configs

80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.

Stars: ✭ 106 (-17.19%)

Mutual labels: hadoop, mysql, elasticsearch

Orc

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads

Stars: ✭ 389 (+203.91%)

Mutual labels: big-data, hadoop, database

Ignite

Apache Ignite

Stars: ✭ 4,027 (+3046.09%)

Mutual labels: big-data, hadoop, database

Datax

DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, and SQL Server

Stars: ✭ 116 (-9.37%)

Mutual labels: hadoop, database, mysql

Pdf

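Programming e-books and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithm series, computer science, design patterns, software testing, refactoring and optimization, and more categories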

Stars: ✭ 12,009 (+9282.03%)

Mutual labels: hadoop, mysql, elasticsearch

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (+222.66%)

Mutual labels: jupyter-notebook, data-science, apache-spark

Phpmyfaq

phpMyFAQ - Open Source FAQ web application for PHP and MySQL, PostgreSQL and other databases

Stars: ✭ 494 (+285.94%)

Mutual labels: database, mysql, elasticsearch

Dist Keras

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

Stars: ✭ 613 (+378.91%)

Mutual labels: data-science, hadoop, apache-spark

Nagios Plugins

450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...

Stars: ✭ 1,000 (+681.25%)

Mutual labels: hadoop, mysql, elasticsearch

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (+545.31%)

Mutual labels: hadoop, mysql, elasticsearch

leaflet heatmap

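A simple visualization of Huzhou call data. Assuming the data volume is too large to draw the heatmap directly in the browser, the heatmap-drawing step is moved to offline computation and analysis. After computing the data in parallel with Apache Spark, Apache Spark is used again to draw the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to achieve good interactivity. With the drawing currently implemented in Apache Spark, the parallel computation is slower than single-machine computation, possibly because Apache Spark is not good at this kind of computation or because I did not design the algorithm well. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .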

Stars: ✭ 13 (-89.84%)

Mutual labels: big-data, apache-spark, hadoop

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (+38.28%)

Mutual labels: big-data, hadoop, apache-spark

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-13.28%)

Mutual labels: big-data, apache-spark, hadoop

Pythondata

repo for code published on pythondata.com

Stars: ✭ 113 (-11.72%)

Mutual labels: jupyter-notebook, data-science, big-data

Ebean

Ebean ORM

Stars: ✭ 1,172 (+815.63%)

Mutual labels: database, mysql, elasticsearch

Learn machine learning

Road to Machine Learning

Stars: ✭ 81 (-36.72%)

Mutual labels: jupyter-notebook, data-science, hadoop

sparkucx

A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer

Stars: ✭ 32 (-75%)

Mutual labels: big-data, apache-spark, hadoop

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (+945.31%)

Mutual labels: jupyter-notebook, data-science, big-data

Openml R

R package to interface with OpenML

Stars: ✭ 81 (-36.72%)

Mutual labels: jupyter-notebook, data-science, database

Antsdb

AntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase

Stars: ✭ 99 (-22.66%)

Mutual labels: hadoop, database, mysql

Models

DLTK Model Zoo

Stars: ✭ 101 (-21.09%)

Mutual labels: jupyter-notebook, data-science

Csv2db

The CSV to database command line loader

Stars: ✭ 102 (-20.31%)

Mutual labels: database, mysql

Python Data Science Handbook

A Chinese translation of Jake VanderPlas' "Python Data Science Handbook", available as online Jupyter notebooks

Stars: ✭ 102 (-20.31%)

Mutual labels: jupyter-notebook, data-science

Emkc

Engineer Man Knowledge Center

Stars: ✭ 104 (-18.75%)

Mutual labels: mysql, elasticsearch

Codesearchnet

Datasets, tools, and benchmarks for representation learning of code.

Stars: ✭ 1,378 (+976.56%)

Mutual labels: jupyter-notebook, data-science

Sigmoidal ai

Python, Data Science, Machine Learning and Deep Learning tutorials - Sigmoidal

Stars: ✭ 103 (-19.53%)

Mutual labels: jupyter-notebook, data-science

Spring Boot 2.x Examples

Spring Boot 2.x code examples

Stars: ✭ 104 (-18.75%)

Mutual labels: mysql, elasticsearch

Mysql Haskell

Pure Haskell MySQL driver

Stars: ✭ 106 (-17.19%)

Mutual labels: database, mysql

Mysql perf analyzer

MySQL performance monitoring and analysis.

Stars: ✭ 1,423 (+1011.72%)

Mutual labels: big-data, mysql

Torchbear

🔥🐻 The Speakeasy Scripting Engine Which Combines Speed, Safety, and Simplicity

Stars: ✭ 128 (+0%)

Mutual labels: data-science, database

Vizuka

Explore high-dimensional datasets and how your algo handles specific regions.

Stars: ✭ 100 (-21.87%)

Mutual labels: data-science, big-data

Yabox

Yet another black-box optimization library for Python

Stars: ✭ 103 (-19.53%)

Mutual labels: jupyter-notebook, data-science

Flink Learning

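Flink learning blog. http://www.54tianzhisheng.cn/ Covers an introduction to Flink, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and others, plus case studies of large-scale Flink production projects (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Your support for my column "Big Data Real-Time Computing Engine: Flink in Action and Performance Optimization" is welcome.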

Stars: ✭ 11,378 (+8789.06%)

Mutual labels: mysql, elasticsearch

Kangaroo

SQL client and admin tool for popular databases

Stars: ✭ 127 (-0.78%)

Mutual labels: database, mysql

Mall

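The mall project is an e-commerce system comprising a storefront mall system and a back-office management system, implemented with SpringBoot+MyBatis and deployed in Docker containers. The storefront system includes modules for the home portal, product recommendations, product search, product display, shopping cart, order flow, member center, customer service, and help center. The back-office management system includes modules for product management, order management, member management, promotion management, operations management, content management, statistical reports, financial management, permission management, and settings.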

Stars: ✭ 54,797 (+42710.16%)

Mutual labels: mysql, elasticsearch

Blog

My diary

Stars: ✭ 110 (-14.06%)

Mutual labels: mysql, elasticsearch

Grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

Stars: ✭ 45,930 (+35782.81%)

Mutual labels: mysql, elasticsearch

Hass Data Detective

Explore and analyse your Home Assistant data

Stars: ✭ 109 (-14.84%)

Mutual labels: jupyter-notebook, data-science

Ml Da Coursera Yandex Mipt

Machine Learning and Data Analysis Coursera Specialization from Yandex and MIPT

Stars: ✭ 108 (-15.62%)

Mutual labels: jupyter-notebook, data-science

Directus is a real-time API and App dashboard for managing SQL database content. 🐰

Stars: ✭ 111 (-13.28%)