📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL-related work.

Stars: ✭ 21 (-91.73%)

Mutual labels: bigdata

bqv

The simplest tool to manage views of BigQuery.

Stars: ✭ 22 (-91.34%)

Mutual labels: bigdata

Covid19Tracker

A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.

Stars: ✭ 65 (-74.41%)

Mutual labels: spark

vulkn

Love your Data. Love the Environment. Love VULKИ.

Stars: ✭ 43 (-83.07%)

Mutual labels: bigdata

docker-spark

Apache Spark docker container image (Standalone mode)

Stars: ✭ 34 (-86.61%)

Mutual labels: spark

BigDataTools

tools for bigData

Stars: ✭ 36 (-85.83%)

Mutual labels: bigdata

jigsaw-seed

This is the seed project for the Jigsaw (七巧板) component library (https://github.com/rdkmaster/jigsaw). All new apps are advised to start from this project as their seed.

Stars: ✭ 17 (-93.31%)

Mutual labels: bigdata

UnROOT.jl

Native Julia I/O package to work with CERN ROOT files

Stars: ✭ 52 (-79.53%)

Mutual labels: bigdata

Python Master Courses

Life is short, I use Python

Stars: ✭ 61 (-75.98%)

Mutual labels: spark

cds

Data syncing in golang for ClickHouse.

Stars: ✭ 839 (+230.31%)

Mutual labels: bigdata

SparkV

🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.

Stars: ✭ 24 (-90.55%)

Mutual labels: spark

meetups-archivos

Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Semilleros …

Stars: ✭ 60 (-76.38%)

Mutual labels: bigdata

spark-acid

ACID Data Source for Apache Spark based on Hive ACID

Stars: ✭ 91 (-64.17%)

Mutual labels: spark

hadoopoffice

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

Stars: ✭ 56 (-77.95%)

Mutual labels: bigdata

daf-kylo

Kylo integration with PDND (previously DAF).

Stars: ✭ 20 (-92.13%)

Mutual labels: spark

learning-spark

Tidy up Spark and Hadoop tutorials.

Stars: ✭ 28 (-88.98%)

Mutual labels: bigdata

spark-word2vec

A parallel implementation of word2vec based on Spark

Stars: ✭ 24 (-90.55%)

Mutual labels: spark

columnify

Convert record-oriented data to columnar format.

Stars: ✭ 28 (-88.98%)

Mutual labels: bigdata

trembita

Model complex data transformation pipelines easily

Stars: ✭ 44 (-82.68%)

Mutual labels: spark

shamash

Autoscaling for Google Cloud Dataproc

Stars: ✭ 31 (-87.8%)

Mutual labels: spark

Notes

This is a learning note | Java basics, JVM, source code, big data, interview experiences

Stars: ✭ 69 (-72.83%)

Mutual labels: bigdata

gan deeplearning4j

Automatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.

Stars: ✭ 19 (-92.52%)

Mutual labels: bigdata

qs-hadoop

Learning the big data ecosystem

Stars: ✭ 18 (-92.91%)

Mutual labels: bigdata

laravel-spark-camera

Profile Photo Camera support for Laravel Spark

Stars: ✭ 30 (-88.19%)

Mutual labels: spark

sparkProjectTemplate.g8

Template for Spark Projects

Stars: ✭ 77 (-69.69%)

Mutual labels: spark

dllib

dllib is a distributed deep learning library running on Apache Spark

Stars: ✭ 32 (-87.4%)

Mutual labels: spark

spark-extension

A library that provides useful extensions to Apache Spark and PySpark.

Stars: ✭ 25 (-90.16%)

Mutual labels: spark

Spark-and-Kafka IoT-Data-Processing-and-Analytics

Final Project for IoT: Big Data Processing and Analytics class. Analyzing U.S. nationwide temperature from IoT sensors in real time

Stars: ✭ 42 (-83.46%)

Mutual labels: bigdata

anovos

Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark

Stars: ✭ 77 (-69.69%)

Mutual labels: bigdata

dockerfiles

Multi docker container images for main Big Data Tools. (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )

Stars: ✭ 29 (-88.58%)

Mutual labels: bigdata

Search Ads Web Service

Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]

Stars: ✭ 30 (-88.19%)

Mutual labels: spark

StreamBench

Measuring the performance of popular streaming engines with Yahoo's Streaming Benchmark

Stars: ✭ 52 (-79.53%)

Mutual labels: bigdata

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-94.49%)

Mutual labels: spark

the-apache-ignite-book

All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above

Stars: ✭ 65 (-74.41%)

Mutual labels: bigdata

zdh web

Big data collection and extraction platform

Stars: ✭ 292 (+14.96%)

Mutual labels: bigdata

jhdf

A pure Java HDF5 library

Stars: ✭ 83 (-67.32%)

Mutual labels: bigdata

spark-gradle-template

Apache Spark in your IDE with gradle

Stars: ✭ 39 (-84.65%)

Mutual labels: spark

amas

Amas is a recursive acronym for “Amas, monitor alert system”.

Stars: ✭ 77 (-69.69%)

Mutual labels: bigdata

dt-sql-parser

SQL Parsers for BigData, built with antlr4.

Stars: ✭ 135 (-46.85%)

Mutual labels: bigdata

Spotify-Song-Recommendation-ML

UC Berkeley team's submission for RecSys Challenge 2018

Stars: ✭ 70 (-72.44%)

Mutual labels: spark

Casper

A compiler for automatically re-targeting sequential Java code to Apache Spark.

Stars: ✭ 45 (-82.28%)

Mutual labels: spark

spark-util

low-level helpers for Apache Spark libraries and tests

Stars: ✭ 16 (-93.7%)

Mutual labels: spark

163-bigdate-note

bigdata note

Stars: ✭ 38 (-85.04%)

Mutual labels: bigdata

greycat

GreyCat - Data Analytics, Temporal data, What-if, Live machine learning

Stars: ✭ 104 (-59.06%)

Mutual labels: bigdata

openverse-catalog

Identifies and collects data on cc-licensed content across web crawl data and public apis.

Stars: ✭ 27 (-89.37%)

Mutual labels: spark

2019 egu workshop jupyter notebooks

Short course on interactive analysis of Big Earth Data with Jupyter Notebooks

Stars: ✭ 29 (-88.58%)

Mutual labels: bigdata

smolder

HL7 Apache Spark Datasource

Stars: ✭ 33 (-87.01%)

Mutual labels: spark

lectures-hse-spark

Scalable machine learning and big data analysis with Apache Spark

Stars: ✭ 20 (-92.13%)

Mutual labels: bigdata

bigdata-doc

Big data study notes, learning roadmap, and a collection of technical case studies.

Stars: ✭ 37 (-85.43%)

Mutual labels: bigdata

TiBigData

TiDB connectors for Flink/Hive/Presto

Stars: ✭ 192 (-24.41%)

Mutual labels: bigdata

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (-62.6%)

Mutual labels: spark

awesome-coder-resources

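A refueling station on your programming journey! [Continuously updated... stars welcome, come back often...] [Contents: programming/learning/reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]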

Stars: ✭ 54 (-78.74%)

Mutual labels: bigdata

chatnoir-resiliparse

A robust web archive analytics toolkit

Stars: ✭ 26 (-89.76%)

Mutual labels: bigdata

Book

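This project collects some good books I have read or heard about over the years. I came across them while tidying up my files; deleting them seemed a pity, yet keeping them wasted space, so in the spirit of sharing I am making them available. This gives readers who need these books access to them, and it also serves as a kind of growing knowledge base.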

Stars: ✭ 47 (-81.5%)

Mutual labels: spark

61-120 of 529 similar projects
