Docker Spark: 🚢 Docker image for Apache Spark
Stars: ✭ 78 (-48.68%)
Src: A lightweight distributed stream computing framework for Golang
Stars: ✭ 67 (-55.92%)
Xlearning Xdml: Extremely distributed machine learning
Stars: ✭ 113 (-25.66%)
Hadoop cookbook: Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-46.05%)
Moosefs: MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+574.34%)
Datax: DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (-23.68%)
Docker Hadoop: Apache Hadoop Docker image
Stars: ✭ 1,190 (+682.89%)
Spydra: Ephemeral Hadoop clusters using Google Compute Platform
Stars: ✭ 128 (-15.79%)
Haproxy Configs: 80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Stars: ✭ 106 (-30.26%)
Jsr203 Hadoop: A Java NIO file system provider for HDFS
Stars: ✭ 35 (-76.97%)
Drill: Apache Drill is a distributed MPP query layer for self-describing data
Stars: ✭ 1,619 (+965.13%)
Camus: Mirror of LinkedIn's Camus
Stars: ✭ 81 (-46.71%)
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+980.26%)
Tf Yarn: Train TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (-50%)
Xlearning: AI on Hadoop
Stars: ✭ 1,709 (+1024.34%)
Waimak: Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-60.53%)
Hadoop Solr: Code to index HDFS to Solr using MapReduce
Stars: ✭ 51 (-66.45%)
Hadoopcryptoledger: Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-17.11%)
Wifi: A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-38.82%)
Akkeeper: An easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-80.26%)
Hdfs Shell: HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (-23.03%)
Ibis: A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+972.37%)
Eel Sdk: Big Data Toolkit for the JVM
Stars: ✭ 140 (-7.89%)
Chukwa: Mirror of Apache Chukwa
Stars: ✭ 77 (-49.34%)
Asakusafw: Asakusa Framework
Stars: ✭ 114 (-25%)
Dataspherestudio: DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+686.18%)
Airflow Pipeline: An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-15.79%)
Apache Spark Hands On: Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-51.32%)
Parquet Go: Go package to read and write Parquet files. Parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Stars: ✭ 114 (-25%)
Atsd: Axibase Time Series Database Documentation
Stars: ✭ 68 (-55.26%)
Hadoop: Apache Hadoop
Stars: ✭ 12,177 (+7911.18%)
Jumbune: Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-57.89%)
Avro Hadoop Starter: Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Stars: ✭ 110 (-27.63%)
Likelike: An implementation of locality-sensitive hashing with Hadoop
Stars: ✭ 58 (-61.84%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (-15.79%)
Docker Hadoop: A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-64.47%)
Waterdrop: Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1121.05%)
Base: https://www.researchgate.net/profile/Rajah_Iyer
Stars: ✭ 48 (-68.42%)
Nagios Plugins: 450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc.
Stars: ✭ 1,000 (+557.89%)
Parquet4s: Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: ✭ 125 (-17.76%)
Antsdb: AntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase
Stars: ✭ 99 (-34.87%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-1.32%)
Parquet Rs: Apache Parquet implementation in Rust
Stars: ✭ 144 (-5.26%)
Dynamometer: A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (-19.74%)
Repository: A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-39.47%)