Tf Yarn: Train TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (-99.38%)
Waimak: Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-99.51%)
Camus: Mirror of LinkedIn's Camus
Stars: ✭ 81 (-99.33%)
Hadoopcryptoledger: Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-98.97%)
Hadoop Solr: Code to index HDFS to Solr using MapReduce
Stars: ✭ 51 (-99.58%)
Hadoop cookbook: Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-99.33%)
Data Algorithms Book: MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (-92.21%)
Datax: DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (-99.05%)
Docker Spark: 🚢 Docker image for Apache Spark
Stars: ✭ 78 (-99.36%)
Spydra: Ephemeral Hadoop clusters using Google Compute Platform
Stars: ✭ 128 (-98.95%)
Xlearning Xdml: Extremely distributed machine learning
Stars: ✭ 113 (-99.07%)
Src: A lightweight distributed stream computing framework for Golang
Stars: ✭ 67 (-99.45%)
Haproxy Configs: 80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Stars: ✭ 106 (-99.13%)
Moosefs: MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (-91.58%)
Dynamometer: A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (-99%)
Jsr203 Hadoop: A Java NIO file system provider for HDFS
Stars: ✭ 35 (-99.71%)
Repository: A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-99.24%)
Storm Camel Example: Real-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.
Stars: ✭ 28 (-99.77%)
Ibis: A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (-86.61%)
Airflow Pipeline: An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-98.95%)
Asakusafw: Asakusa Framework
Stars: ✭ 114 (-99.06%)
Chukwa: Mirror of Apache Chukwa
Stars: ✭ 77 (-99.37%)
Dataspherestudio: DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (-90.19%)
Parquet Go: Go package to read and write Parquet files. Parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Stars: ✭ 114 (-99.06%)
Apache Spark Hands On: Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-99.39%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (-98.95%)
Atsd: Axibase Time Series Database Documentation
Stars: ✭ 68 (-99.44%)
Avro Hadoop Starter: Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Stars: ✭ 110 (-99.1%)
Jumbune: Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-99.47%)
Eel Sdk: Big Data Toolkit for the JVM
Stars: ✭ 140 (-98.85%)
Likelike: An implementation of locality sensitive hashing with Hadoop
Stars: ✭ 58 (-99.52%)
Waterdrop: Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (-84.76%)
Docker Hadoop: A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-99.56%)
Parquet4s: Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: ✭ 125 (-98.97%)
Base: https://www.researchgate.net/profile/Rajah_Iyer
Stars: ✭ 48 (-99.61%)
Nagios Plugins: 450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc.
Stars: ✭ 1,000 (-91.79%)
Antsdb: AntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase
Stars: ✭ 99 (-99.19%)
Akkeeper: An easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-99.75%)
Hdfs Shell: HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (-99.04%)
Wifi: A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-99.24%)
Parquet Rs: Apache Parquet implementation in Rust
Stars: ✭ 144 (-98.82%)
Xlearning: AI on Hadoop
Stars: ✭ 1,709 (-85.97%)
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (-86.52%)
Drill: Apache Drill is a distributed MPP query layer for self describing data
Stars: ✭ 1,619 (-86.7%)