cobra-policytoolManage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-79.49%)
Dji Firmware ToolsTools for handling firmwares of DJI products, with focus on quadcopters.
Stars: ✭ 424 (+443.59%)
flokkrDocumentation placeholder and utilities for all the other containers.
Stars: ✭ 30 (-61.54%)
hadoop-deployment-bashCode for the deployment of Hadoop clusters, written in Bourne or Bourne Again shell.
Stars: ✭ 31 (-60.26%)
FeatranA Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (+438.46%)
MLHadoopThis repository contains Machine-Learning MapReduce codes for Hadoop which are written from scratch (without using any package or library). E.g. Prediction (Linear and Logistic Regression), Clustering (K-Means), Classification (KNN) etc.
Stars: ✭ 50 (-35.9%)
prostoProsto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-30.77%)
DaFlowApache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple categories of transformation rules.
Stars: ✭ 24 (-69.23%)
Nagios Plugins450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (+1182.05%)
bigkubeMinikube for big data with Scala and Spark
Stars: ✭ 16 (-79.49%)
UBAUEBA Solution for Insider Security. This repo is archived. Thanks!
Stars: ✭ 36 (-53.85%)
Agile data code 2Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+429.49%)
implyrSQL backend to dplyr for Impala
Stars: ✭ 74 (-5.13%)
TedsdsApache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-82.05%)
datasqueezeHadoop utility to compact small files
Stars: ✭ 18 (-76.92%)
prestoTeradata Distribution of Presto -- A Distributed SQL Query Engine for Big Data
Stars: ✭ 91 (+16.67%)
Awesome PulsarA curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-26.92%)
liquibase-impalaLiquibase extension to add Impala Database support
Stars: ✭ 23 (-70.51%)
memex-gateGeneral Architecture for Text Engineering
Stars: ✭ 47 (-39.74%)
hadoopofficeHadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-28.21%)
KontextfreiWriting application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-14.1%)
xxhadoopData Analysis Using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. This is My Daily Notes/Code/Demo. Don't fork, Just star !
Stars: ✭ 37 (-52.56%)
Sparkling TitanicTraining models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-84.62%)
corcAn ORC File Scheme for the Cascading data processing platform.
Stars: ✭ 14 (-82.05%)
IgniteApache Ignite
Stars: ✭ 4,027 (+5062.82%)
pyspark-ML-in-ColabPyspark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (-58.97%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-29.49%)
big-data-exploration[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (-44.87%)
Docker practiceLearn and understand Docker technologies, with real DevOps practice!
Stars: ✭ 19,768 (+25243.59%)
MareMaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-85.9%)
confluent-spark-avroSpark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-76.92%)
dockerfilesMulti docker container images for main Big Data Tools. (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (-62.82%)
TensorflowonsparkTensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Stars: ✭ 3,748 (+4705.13%)
Lpa DetectorOptimize and improve the Label propagation algorithm
Stars: ✭ 75 (-3.85%)
ScriptisScriptis is for interactive data analysis with script development(SQL, Pyspark, HiveQL), task submission(Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+792.31%)
Covid19TrackerA Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (-16.67%)
iisInformation Inference Service of the OpenAIRE system
Stars: ✭ 16 (-79.49%)
knitDeprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead
Stars: ✭ 53 (-32.05%)
learning-hadoop-and-sparkCompanion to Learning Hadoop and Learning Spark courses on Linked In Learning
Stars: ✭ 146 (+87.18%)
SparkmeasureThis is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
Stars: ✭ 368 (+371.79%)
Utils4sscala、spark使用过程中,各种测试用例以及相关资料整理
Stars: ✭ 1,070 (+1271.79%)
PixiedustPython Helper library for Jupyter Notebooks
Stars: ✭ 998 (+1179.49%)
Winutilswinutils.exe hadoop.dll and hdfs.dll binaries for hadoop windows
Stars: ✭ 657 (+742.31%)
blogblog entries
Stars: ✭ 39 (-50%)
SparkV🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-69.23%)
trembitaModel complex data transformation pipelines easily
Stars: ✭ 44 (-43.59%)
RoffildlibraryLibrary for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-19.23%)
Pyspark Example ProjectExample project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+711.54%)
spark-extensionA library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-67.95%)