Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learni…

Stars: ✭ 12,277 (+5531.65%)

Mutual labels: hadoop

Spydra

Ephemeral Hadoop clusters using Google Compute Platform

Stars: ✭ 128 (-41.28%)

Mutual labels: hadoop

Geomancer

Automated feature engineering for geospatial data

Stars: ✭ 194 (-11.01%)

Mutual labels: bigquery

Spark Bigquery Connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

Stars: ✭ 126 (-42.2%)

Mutual labels: bigquery

Presto

The official home of the Presto distributed SQL query engine for big data

Stars: ✭ 12,957 (+5843.58%)

Mutual labels: hadoop

Beast

Load data from Kafka to any data warehouse

Stars: ✭ 119 (-45.41%)

Mutual labels: bigquery

Shifu

An end-to-end machine learning and data mining framework on Hadoop

Stars: ✭ 207 (-5.05%)

Mutual labels: hadoop

Cube.js

📊 Cube — Open-Source Analytics API for Building Data Apps

Stars: ✭ 11,983 (+5396.79%)

Mutual labels: bigquery

Hadoop Hdfs

Mirror of Apache Hadoop HDFS

Stars: ✭ 152 (-30.28%)

Mutual labels: hadoop

Parquet Go

Go package to read and write parquet files. parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.

Stars: ✭ 114 (-47.71%)

Mutual labels: hadoop

Quix

Quix Notebook Manager

Stars: ✭ 184 (-15.6%)

Mutual labels: bigquery

Eel Sdk

Big Data Toolkit for the JVM

Stars: ✭ 140 (-35.78%)

Mutual labels: hadoop

Waterdrop

Production-ready data integration product

Stars: ✭ 1,856 (+751.38%)

Mutual labels: hadoop

Bigdata Notebook

Stars: ✭ 100 (-54.13%)

Mutual labels: hadoop

Go Bqstreamer

Stream data into Google BigQuery concurrently using InsertAll()

Stars: ✭ 133 (-38.99%)

Mutual labels: bigquery

Bitcoin Etl

ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ

Stars: ✭ 174 (-20.18%)

Mutual labels: bigquery

Calcite Avatica

Mirror of Apache Calcite - Avatica

Stars: ✭ 130 (-40.37%)

Mutual labels: hadoop

Awesome Learning

Practice source code repository: https://github.com/jast90/bigdata. Search for Jast on WeChat and follow the official account for the latest technical posts 😯.

Stars: ✭ 197 (-9.63%)

Mutual labels: hadoop

Airflow Pipeline

An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR

Stars: ✭ 128 (-41.28%)

Mutual labels: hadoop

Big Whale

Scheduling of offline jobs such as Spark and Flink, and monitoring of real-time jobs

Stars: ✭ 163 (-25.23%)

Mutual labels: hadoop

Griffon Vm

Griffon Data Science Virtual Machine

Stars: ✭ 128 (-41.28%)

Mutual labels: hadoop

Mprove

Open source Business Intelligence tool 🎉

Stars: ✭ 212 (-2.75%)

Mutual labels: bigquery

Parquet4s

Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

Stars: ✭ 125 (-42.66%)

Mutual labels: hadoop

Gpt2 Bert Reddit Bot

A bot that generates realistic replies using a combination of pretrained GPT-2 and BERT models

Stars: ✭ 158 (-27.52%)

Mutual labels: bigquery

Mais

Universalizing access to data in Brazil. Docs: https://basedosdados.github.io/mais/

Stars: ✭ 122 (-44.04%)

Mutual labels: bigquery

Nutch

Apache Nutch is an extensible and scalable web crawler

Stars: ✭ 2,277 (+944.5%)

Mutual labels: hadoop

Professional Services

Common solutions and tools developed by Google Cloud's Professional Services team

Stars: ✭ 1,923 (+782.11%)

Mutual labels: bigquery

Hadoop Common

Mirror of Apache Hadoop common

Stars: ✭ 155 (-28.9%)

Mutual labels: hadoop

Hdfs Shell

HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (-46.33%)

Mutual labels: hadoop

Calcite

Apache Calcite

Stars: ✭ 2,816 (+1191.74%)

Mutual labels: hadoop

Ibis

A pandas-like deferred expression system, with first-class SQL support

Stars: ✭ 1,630 (+647.71%)

Mutual labels: hadoop

Movie recommend

A Spark-based movie recommendation system, including a crawler project, a website, an admin back-end system, and the Spark recommendation system

Stars: ✭ 2,092 (+859.63%)

Mutual labels: hadoop

Datax

DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto(Trino), PostgreSQL, SQL Server

Stars: ✭ 116 (-46.79%)

Mutual labels: hadoop

Bigquery Grafana

Google BigQuery Datasource Plugin for Grafana.

Stars: ✭ 188 (-13.76%)

Mutual labels: bigquery

Tensorflowonyarn

Support TensorFlow on YARN

Stars: ✭ 114 (-47.71%)

Mutual labels: hadoop

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-31.19%)

Mutual labels: hadoop

Xlearning Xdml

extremely distributed machine learning

Stars: ✭ 113 (-48.17%)

Mutual labels: hadoop

Javaorbigdata Interview

A compilation of interview topics for Java developers and big data developers

Stars: ✭ 203 (-6.88%)

Mutual labels: hadoop

Introtohadoopandmr udacity course

🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"

Stars: ✭ 110 (-49.54%)

Mutual labels: hadoop

Parquet Rs

Apache Parquet implementation in Rust

Stars: ✭ 144 (-33.94%)

Mutual labels: hadoop

Haproxy Configs

80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.

Stars: ✭ 106 (-51.38%)

Mutual labels: hadoop

Scio

A Scala API for Apache Beam and Google Cloud Dataflow.

Stars: ✭ 2,247 (+930.73%)

Mutual labels: bigquery

Gcp Variant Transforms

GCP Variant Transforms

Stars: ✭ 100 (-54.13%)

Mutual labels: bigquery

Xlearning

AI on Hadoop

Stars: ✭ 1,709 (+683.94%)

Mutual labels: hadoop

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (-1.38%)

Mutual labels: hadoop

Facebook Hive Udfs

Facebook's Hive UDFs

Stars: ✭ 213 (-2.29%)

Mutual labels: hadoop

Recommendsys

Recommendation project (real-time and offline recommendation)

Stars: ✭ 198 (-9.17%)

Mutual labels: hadoop

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (-18.81%)

Mutual labels: hadoop
