CamusMirror of LinkedIn's Camus
Stars: ✭ 81 (-47.74%)
Hadoop SolrCode to index HDFS to Solr using MapReduce
Stars: ✭ 51 (-67.1%)
DrillApache Drill is a distributed MPP query layer for self-describing data
Stars: ✭ 1,619 (+944.52%)
Tf YarnTrain TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (-50.97%)
GafferA large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+959.35%)
WaimakWaimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-61.29%)
RepositoryPersonal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-40.65%)
DynamometerA tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (-21.29%)
Hadoop cookbookCookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-47.1%)
Docker Spark🚢 Docker image for Apache Spark
Stars: ✭ 78 (-49.68%)
DataxDataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto(Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (-25.16%)
Docker HadoopApache Hadoop docker image
Stars: ✭ 1,190 (+667.74%)
Parquet RsApache Parquet implementation in Rust
Stars: ✭ 144 (-7.1%)
SrcA light-weight distributed stream computing framework for Golang
Stars: ✭ 67 (-56.77%)
Xlearning Xdmlextremely distributed machine learning
Stars: ✭ 113 (-27.1%)
SpydraEphemeral Hadoop clusters using Google Compute Platform
Stars: ✭ 128 (-17.42%)
MoosefsMooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+561.29%)
Haproxy Configs80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Stars: ✭ 106 (-31.61%)
AntsdbAntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase
Stars: ✭ 99 (-36.13%)
Parquet4sRead and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: ✭ 125 (-19.35%)
WifiA big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-40%)
Hdfs ShellHDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (-24.52%)
HadoopApache Hadoop
Stars: ✭ 12,177 (+7756.13%)
IbisA pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+951.61%)
ChukwaMirror of Apache Chukwa
Stars: ✭ 77 (-50.32%)
DataspherestudioDataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+670.97%)
AsakusafwAsakusa Framework
Stars: ✭ 114 (-26.45%)
Apache Spark Hands OnEducational notes, hands-on problems w/ solutions for the Hadoop ecosystem
Stars: ✭ 74 (-52.26%)
Hadoop HdfsMirror of Apache Hadoop HDFS
Stars: ✭ 152 (-1.94%)
AtsdAxibase Time Series Database Documentation
Stars: ✭ 68 (-56.13%)
Parquet GoGo package to read and write Parquet files. Parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Stars: ✭ 114 (-26.45%)
JumbuneJumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-58.71%)
Airflow PipelineAn Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-17.42%)
LikelikeAn implementation of locality sensitive hashing with Hadoop
Stars: ✭ 58 (-62.58%)
Avro Hadoop StarterExample MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Stars: ✭ 110 (-29.03%)
Docker HadoopA Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-65.16%)
Eel SdkBig Data Toolkit for the JVM
Stars: ✭ 140 (-9.68%)
Basehttps://www.researchgate.net/profile/Rajah_Iyer
Stars: ✭ 48 (-69.03%)
WaterdropProduction Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1097.42%)
Nagios Plugins450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (+545.16%)
Griffon VmGriffon Data Science Virtual Machine
Stars: ✭ 128 (-17.42%)
Movie recommendA Spark-based movie recommendation system, including a crawler project, a website, an admin back-end system, and a Spark recommendation system
Stars: ✭ 2,092 (+1249.68%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-3.23%)
XlearningAI on Hadoop
Stars: ✭ 1,709 (+1002.58%)
HadoopcryptoledgerHadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-18.71%)