Big Data Rosetta Code
Code snippets for solving common big data problems on various platforms. Inspired by Rosetta Code.
Stars: ✭ 254 (-86.31%)
Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-99.41%)
Kontextfrei
Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-96.39%)
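Hadoop study
Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized training code. Continuously updated!)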
Stars: ✭ 567 (-69.45%)
bandar-log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-98.92%)
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (-94.13%)
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (-4.09%)
Lehar
Visualize data using relative ordering
Stars: ✭ 81 (-95.64%)
Sparklearning
Learning Apache Spark, including code and data. Most parts can run locally.
Stars: ✭ 558 (-69.94%)
daf-kylo
Kylo integration with PDND (previously DAF).
Stars: ✭ 20 (-98.92%)
Stormtweetssentimentd3viz
Computes and visualizes sentiment analysis of tweets from US states in real time using Storm.
Stars: ✭ 25 (-98.65%)
Src
A light-weight distributed stream computing framework for Golang
Stars: ✭ 67 (-96.39%)
hadoop-docker-lite
Docker build project to set up a lightweight Hadoop cluster containing Hadoop, Pig, ZooKeeper, HBase, Phoenix, Storm, Kafka, and Kafka Manager
Stars: ✭ 24 (-98.71%)
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-95.37%)
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-99.14%)
Covid19Tracker
A Robinhood-style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (-96.5%)
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-96.5%)
blog
Blog entries
Stars: ✭ 39 (-97.9%)
spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-98.65%)
Stetl
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Stars: ✭ 64 (-96.55%)
Spark Daria
Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (-70.2%)
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (-71.01%)
Casper
A compiler for automatically re-targeting sequential Java code to Apache Spark.
Stars: ✭ 45 (-97.58%)
visions
Type System for Data Analysis in Python
Stars: ✭ 136 (-92.67%)
Hale
(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Stars: ✭ 84 (-95.47%)
incubator-linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+32.49%)
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-96.55%)
Spark-PMoF
Spark Shuffle Optimization with RDMA+AEP
Stars: ✭ 28 (-98.49%)
Spark Gbtlr
Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
Stars: ✭ 81 (-95.64%)
Awesome Flink
😎 A curated list of amazingly awesome Flink and Flink ecosystem resources
Stars: ✭ 530 (-71.44%)
Lopq
Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (-71.44%)
Spark Submit Ui
A Play Framework-based application for submitting Spark apps
Stars: ✭ 53 (-97.14%)
Magellan
Geospatial Data Analytics on Spark
Stars: ✭ 507 (-72.68%)
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-96.61%)
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-27.91%)
Bigdata
💎🔥 Big data study notes
Stars: ✭ 488 (-73.71%)
sentry-spark
Apache Spark Sentry Integration
Stars: ✭ 14 (-99.25%)
Spark Redis
A connector for Spark that allows reading from and writing to a Redis cluster
Stars: ✭ 773 (-58.35%)
spark-acid
ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-95.1%)
Gis Tools For Hadoop
The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.
Stars: ✭ 485 (-73.87%)
Hadoop Solr
Code to index HDFS to Solr using MapReduce
Stars: ✭ 51 (-97.25%)
School Of Sre
At LinkedIn, we are using this curriculum for onboarding our entry-level talent into the SRE role.
Stars: ✭ 5,141 (+176.99%)
Kgtk
Knowledge Graph Toolkit
Stars: ✭ 81 (-95.64%)
Spark Nkp
Natural Korean Processor for Apache Spark
Stars: ✭ 50 (-97.31%)
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (-74.14%)