Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learni…

Stars: ✭ 12,277 (+32207.89%)

Mutual labels: hadoop

learning-hadoop-and-spark

Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning

Stars: ✭ 146 (+284.21%)

Mutual labels: hadoop

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (+323.68%)

Mutual labels: hadoop

hadoop-ecosystem

Visualizations of the Hadoop Ecosystem

Stars: ✭ 20 (-47.37%)

Mutual labels: hadoop

Hadoop Common

Mirror of Apache Hadoop common

Stars: ✭ 155 (+307.89%)

Mutual labels: hadoop

openPDC

Open Source Phasor Data Concentrator

Stars: ✭ 109 (+186.84%)

Mutual labels: hadoop

Hadoop Hdfs

Mirror of Apache Hadoop HDFS

Stars: ✭ 152 (+300%)

Mutual labels: hadoop

pyspark-ML-in-Colab

PySpark in Google Colab: A simple machine learning (linear regression) model

Stars: ✭ 32 (-15.79%)

Mutual labels: hadoop

Hadoop

Apache Hadoop

Stars: ✭ 12,177 (+31944.74%)

Mutual labels: hadoop

dpkb

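A collection of big data material, covering distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse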

Stars: ✭ 123 (+223.68%)

Mutual labels: hadoop

Eel Sdk

Big Data Toolkit for the JVM

Stars: ✭ 140 (+268.42%)

Mutual labels: hadoop

rastercube

rastercube is a Python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)

Stars: ✭ 15 (-60.53%)

Mutual labels: hadoop

Hbaseclient

HBase client data management software

Stars: ✭ 135 (+255.26%)

Mutual labels: hadoop

yarn-prometheus-exporter

Export Hadoop YARN (resource-manager) metrics in Prometheus format

Stars: ✭ 44 (+15.79%)

Mutual labels: hadoop

Calcite Avatica

Mirror of Apache Calcite - Avatica

Stars: ✭ 130 (+242.11%)

Mutual labels: hadoop

big-data-exploration

[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product

Stars: ✭ 43 (+13.16%)

Mutual labels: hadoop

Airflow Pipeline

An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR

Stars: ✭ 128 (+236.84%)

Mutual labels: hadoop

teraslice

Scalable data processing pipelines in JavaScript

Stars: ✭ 48 (+26.32%)

Mutual labels: hadoop

Griffon Vm

Griffon Data Science Virtual Machine

Stars: ✭ 128 (+236.84%)

Mutual labels: hadoop

wasp

WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-50%)

Mutual labels: hadoop

Parquet4s

Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

Stars: ✭ 125 (+228.95%)

Mutual labels: hadoop

beanszoo

Distributed Java micro-services using ZooKeeper

Stars: ✭ 12 (-68.42%)

Mutual labels: hadoop

Hdfs Shell

HDFS Shell is an HDFS manipulation tool for working with the functions integrated in Hadoop DFS

Stars: ✭ 117 (+207.89%)

Mutual labels: hadoop

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Stars: ✭ 39 (+2.63%)

Mutual labels: hadoop

Datax

DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server

Stars: ✭ 116 (+205.26%)

Mutual labels: hadoop

gradle-consistent-versions

Compact, constraint-friendly lockfiles for your dependencies

Stars: ✭ 92 (+142.11%)

Mutual labels: octo-correct-managed

Tensorflowonyarn

Support TensorFlow on YARN

Stars: ✭ 114 (+200%)

Mutual labels: hadoop

go-baseapp

A lightweight starting point for Go web servers

Stars: ✭ 61 (+60.53%)

Mutual labels: octo-correct-managed

Xlearning Xdml

Extremely distributed machine learning

Stars: ✭ 113 (+197.37%)

Mutual labels: hadoop

palantir-java-format

A modern, lambda-friendly, 120 character Java formatter.

Stars: ✭ 203 (+434.21%)

Mutual labels: octo-correct-managed

Introtohadoopandmr udacity course

🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"

Stars: ✭ 110 (+189.47%)

Mutual labels: hadoop

witchcraft-go-server

A highly opinionated Go embedded application server for RESTy APIs

Stars: ✭ 47 (+23.68%)

Mutual labels: octo-correct-managed

Haproxy Configs

80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.

Stars: ✭ 106 (+178.95%)

Mutual labels: hadoop

amalgomate

Go tool for combining multiple different main packages into a single program or library

Stars: ✭ 19 (-50%)

Mutual labels: octo-correct-managed

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+28823.68%)

Mutual labels: hadoop

hadoop-etl-udfs

The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL

Stars: ✭ 17 (-55.26%)

Mutual labels: hadoop

Repository

Personal study knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (+142.11%)

Mutual labels: hadoop

ambari-hdp-docker

Dockerfiles and Docker Compose for HDP 2.6 with Blueprints

Stars: ✭ 23 (-39.47%)

Mutual labels: hadoop

Data-pipeline-project

Data pipeline project

Stars: ✭ 18 (-52.63%)

Mutual labels: hadoop

Hadoop cookbook

Cookbook to install Hadoop 2.0+ using Chef

Stars: ✭ 82 (+115.79%)

Mutual labels: hadoop

phoenix

Apache Phoenix / HBase Spring Boot Microservices

Stars: ✭ 23 (-39.47%)

Mutual labels: hadoop

Camus

Mirror of LinkedIn's Camus

Stars: ✭ 81 (+113.16%)

Mutual labels: hadoop

jmx exporter-cloudera-hadoop

Prometheus jmx_exporter configurations for Cloudera Hadoop

Stars: ✭ 33 (-13.16%)

Mutual labels: hadoop

bigdatatutorial

Stars: ✭ 34 (-10.53%)

Mutual labels: hadoop

Devops Bash Tools

550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...

Stars: ✭ 226 (+494.74%)

Mutual labels: hadoop

presto

Teradata Distribution of Presto -- A Distributed SQL Query Engine for Big Data

Stars: ✭ 91 (+139.47%)

Mutual labels: hadoop

memex-gate

General Architecture for Text Engineering

Stars: ✭ 47 (+23.68%)

Mutual labels: hadoop

skein

A tool and library for easily deploying applications on Apache YARN