Hadoop PotA scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-90.24%)
shamashAutoscaling for Google Cloud Dataproc
Stars: ✭ 31 (-62.2%)
hiveql-parserHiveQL Parser. Parses HiveQL code and prints the AST in JSON format on success; otherwise prints a well-formed syntax error message.
Stars: ✭ 25 (-69.51%)
WirbelsturmWirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+304.88%)
Meme GeneratorMemeGen is a web application where the user supplies an image and the tool generates a meme in one click.
Stars: ✭ 57 (-30.49%)
Gather DeploymentGathers scalable TensorFlow and infrastructure deployment
Stars: ✭ 326 (+297.56%)
Tiledb VcfEfficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-68.29%)
W2vWord2Vec models with Twitter data using Spark.
Stars: ✭ 64 (-21.95%)
HouseProof of Concept and Research repository.
Stars: ✭ 37 (-54.88%)
Spark DariaEssential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+574.39%)
Search Ads Web ServiceOnline search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]
Stars: ✭ 30 (-63.41%)
chef-chromeChef cookbook to install Google Chrome browser
Stars: ✭ 16 (-80.49%)
KazooKazoo is a high-level Python library that makes it easier to use Apache Zookeeper.
Stars: ✭ 1,161 (+1315.85%)
Dis Seckill👊A distributed, high-concurrency flash-sale (seckill) system for products, built with SpringBoot + Zookeeper + Dubbo
Stars: ✭ 315 (+284.15%)
np-flinkDetailed Flink learning and practice
Stars: ✭ 26 (-68.29%)
Spark SwaggerSpark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-69.51%)
SparklintA tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (+285.37%)
big-data-exploration[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (-47.56%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-32.93%)
CookFair job scheduler on Kubernetes and Mesos for batch workloads and Spark
Stars: ✭ 314 (+282.93%)
datalake-etl-pipelineSimplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-52.44%)
ChoregraphieChoregraphie offers primitives to coordinate the convergence of Chef resources.
Stars: ✭ 24 (-70.73%)
MleapMLeap: Deploy ML Pipelines to Production
Stars: ✭ 1,232 (+1402.44%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1102.44%)
Javafamily[Java interview + Java study guide] A guide covering the core knowledge that most Java programmers need to master.
Stars: ✭ 28,668 (+34860.98%)
Cp Helm ChartsThe Confluent Platform Helm charts enable you to deploy Confluent Platform services on Kubernetes for development, test, and proof of concept environments.
Stars: ✭ 539 (+557.32%)
go-solrSolr Go client from SendGrid; ZooKeeper-aware, with built-in retries
Stars: ✭ 39 (-52.44%)
openverse-catalogIdentifies and collects data on CC-licensed content across web crawl data and public APIs.
Stars: ✭ 27 (-67.07%)
DeltaAn open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+4659.76%)
last fmA simple app to demonstrate a testable, maintainable, and scalable architecture for Flutter. flutter_bloc, get_it, hive, and a REST API are some of the technologies used in this project.
Stars: ✭ 134 (+63.41%)
Spark GbtlrHybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
Stars: ✭ 81 (-1.22%)
EasyrpcEasyRpc is a simple, high-performance, easy-to-use RPC framework based on Netty, ZooKeeper and ProtoStuff.
Stars: ✭ 79 (-3.66%)
Lpa DetectorOptimizes and improves the label propagation algorithm
Stars: ✭ 75 (-8.54%)
JustenoughscalaforsparkA tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+556.1%)
aixResources for AIX hosts
Stars: ✭ 22 (-73.17%)
Awesome AdaA curated list of awesome resources related to the Ada and SPARK programming languages
Stars: ✭ 299 (+264.63%)
ActivemqDevelopment repository for activemq Chef Cookbook
Stars: ✭ 19 (-76.83%)
LopqTraining of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+546.34%)
sitecore-packerPacker templates for Sitecore development with IIS, SOLR and SQL Server on Windows
Stars: ✭ 19 (-76.83%)
docker-repoA repository that stores Dockerfiles and docker-compose files for quickly starting a service or service cluster.
Stars: ✭ 26 (-68.29%)
awesome-AI-kubernetes❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (+15.85%)
Chef Vaultchef-vault cookbook
Stars: ✭ 63 (-23.17%)
kzmonitorkafka zookeeper monitor
Stars: ✭ 34 (-58.54%)
SpartaReal Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+525.61%)
spark-druid-olapSparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 286 (+248.78%)