DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server

Stars: ✭ 116 (-99.24%)

Mutual labels: hadoop

Tensorflowonyarn

Support TensorFlow on YARN

Stars: ✭ 114 (-99.25%)

Mutual labels: hadoop

Xlearning

AI on Hadoop

Stars: ✭ 1,709 (-88.78%)

Mutual labels: hadoop

Deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learni…

Stars: ✭ 12,277 (-19.37%)

Mutual labels: hadoop

Calcite Avatica

Mirror of Apache Calcite - Avatica

Stars: ✭ 130 (-99.15%)

Mutual labels: hadoop

Quartzite

Quartzite is a thin idiomatic Clojure layer on top of the Quartz Scheduler

Stars: ✭ 194 (-98.73%)

Mutual labels: scheduling

Griffon Vm

Griffon Data Science Virtual Machine

Stars: ✭ 128 (-99.16%)

Mutual labels: hadoop

Presto

The official home of the Presto distributed SQL query engine for big data

Stars: ✭ 12,957 (-14.9%)

Mutual labels: hadoop

React Native Alarm Notification

Schedule alarms and local notifications in React Native

Stars: ✭ 122 (-99.2%)

Mutual labels: scheduling

Shifu

An end-to-end machine learning and data mining framework on Hadoop

Stars: ✭ 207 (-98.64%)

Mutual labels: hadoop

Drill

Apache Drill is a distributed MPP query layer for self-describing data

Stars: ✭ 1,619 (-89.37%)

Mutual labels: hadoop

Hadoop Hdfs

Mirror of Apache Hadoop HDFS

Stars: ✭ 152 (-99%)

Mutual labels: hadoop

Taskpacker

🎒 Simple schedule optimization library for Python

Stars: ✭ 115 (-99.24%)

Mutual labels: scheduling

Optaplanner

AI constraint solver in Java to optimize the vehicle routing problem, employee rostering, task assignment, maintenance scheduling, conference scheduling and other planning problems.

Stars: ✭ 2,454 (-83.88%)

Mutual labels: scheduling

Parquet Rs

Apache Parquet implementation in Rust

Stars: ✭ 144 (-99.05%)

Mutual labels: hadoop

Xlearning Xdml

Extremely distributed machine learning

Stars: ✭ 113 (-99.26%)

Mutual labels: hadoop

Avro Hadoop Starter

Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.

Stars: ✭ 110 (-99.28%)

Mutual labels: hadoop

Smart Industry

🏭 Open source Manufacturing Execution System for job-shop type manufacturers.

Stars: ✭ 138 (-99.09%)

Mutual labels: scheduling

Pai

Resource scheduling and cluster management for AI

Stars: ✭ 2,223 (-85.4%)

Mutual labels: scheduling

Hbaseclient

HBase client data management software

Stars: ✭ 135 (-99.11%)

Mutual labels: hadoop

Awesome Learning

Practice source code repository: https://github.com/jast90/bigdata. Search for Jast on WeChat and follow the official account for the latest technical posts 😯.

Stars: ✭ 197 (-98.71%)

Mutual labels: hadoop

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (-99.13%)

Mutual labels: hadoop

Big Whale

Scheduling of offline jobs such as Spark and Flink, and monitoring of real-time jobs

Stars: ✭ 163 (-98.93%)

Mutual labels: hadoop

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (-89.22%)

Mutual labels: hadoop

Facebook Hive Udfs

Facebook's Hive UDFs

Stars: ✭ 213 (-98.6%)

Mutual labels: hadoop

Spydra

Ephemeral Hadoop clusters using Google Compute Platform

Stars: ✭ 128 (-99.16%)

Mutual labels: hadoop

Bookstore

📚 Notebook storage and publishing workflows for the masses

Stars: ✭ 162 (-98.94%)

Mutual labels: scheduling

Hivedscheduler

Kubernetes Scheduler for Deep Learning

Stars: ✭ 126 (-99.17%)

Mutual labels: scheduling

Nutch

Apache Nutch is an extensible and scalable web crawler

Stars: ✭ 2,277 (-85.05%)

Mutual labels: hadoop

Parquet4s

Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

Stars: ✭ 125 (-99.18%)

Mutual labels: hadoop

Cylc Flow

Cylc: a workflow engine for cycling systems. Repository master branch: core meta-scheduler component of cylc-8 (in development); Repository 7.8.x branch: full cylc-7 system.

Stars: ✭ 154 (-98.99%)

Mutual labels: scheduling

Dynamometer

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.

Stars: ✭ 122 (-99.2%)

Mutual labels: hadoop

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (-98.59%)

Mutual labels: hadoop

Hdfs Shell

HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (-99.23%)

Mutual labels: hadoop

Movie recommend

A Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation system

Stars: ✭ 2,092 (-86.26%)

Mutual labels: hadoop

Ibis

A pandas-like deferred expression system, with first-class SQL support

Stars: ✭ 1,630 (-89.29%)

Mutual labels: hadoop

Unified Hosts Autoupdate

Quickly and easily install, uninstall, and set up automatic updates for any of Steven Black's unified hosts files.

Stars: ✭ 185 (-98.78%)

Mutual labels: scheduling

Optaweb Employee Rostering

Web application for solving Employee Rostering using OptaPlanner

Stars: ✭ 115 (-99.24%)

Mutual labels: scheduling

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-99.01%)

Mutual labels: hadoop

Asakusafw

Asakusa Framework

Stars: ✭ 114 (-99.25%)

Mutual labels: hadoop

Javaorbigdata Interview

A collection of interview topics for Java developers and big data developers

Stars: ✭ 203 (-98.67%)

Mutual labels: hadoop

Parquet Go

Go package to read and write Parquet files. Parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.

Stars: ✭ 114 (-99.25%)

Mutual labels: hadoop

Hadoop

Apache Hadoop

Stars: ✭ 12,177 (-20.02%)

Mutual labels: hadoop

Liteflow

liteflow is a distributed task-flow scheduling system built on task versions

Stars: ✭ 112 (-99.26%)

Mutual labels: scheduling

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL

Stars: ✭ 177 (-98.84%)

Mutual labels: hadoop

Introtohadoopandmr udacity course

🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"

Stars: ✭ 110 (-99.28%)

Mutual labels: hadoop

Nn dataflow

Explore energy-efficient dataflow scheduling for neural networks.

Stars: ✭ 141 (-99.07%)

Mutual labels: scheduling

Hadoop Connectors

Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.