Spark Flamegraph: Easy CPU Profiling for Apache Spark applications
Stars: ✭ 30 (-43.4%)
Mlfeature: Feature engineering toolkit for Spark MLlib.
Stars: ✭ 12 (-77.36%)
Spark Swagger: Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-52.83%)
Live log analyzer spark: Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-73.58%)
Play Reactive Slick: A Play template with a nice user interface. If you want to use Play as your web framework and Postgres as your database, this demo project can serve as a starting point for your application.
Stars: ✭ 40 (-24.53%)
Heracles: High performance HBase / Spark SQL engine
Stars: ✭ 27 (-49.06%)
Digitrecognizer: Java Convolutional Neural Network example for handwritten digit recognition
Stars: ✭ 23 (-56.6%)
Flint: A Time Series Library for Apache Spark
Stars: ✭ 878 (+1556.6%)
Urhox: Urho3D extension library
Stars: ✭ 13 (-75.47%)
Sparkjni: A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-79.25%)
Vagrant Projects: Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Stars: ✭ 34 (-35.85%)
Tiledb Vcf: Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-50.94%)
Pujangga: Pujangga - Indonesian Natural Language Processing Tool with REST API, an Interface for InaNLP and Deeplearning4j's Word2Vec
Stars: ✭ 47 (-11.32%)
Chronicler: Scala toolchain for InfluxDB
Stars: ✭ 24 (-54.72%)
Pucket: Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-45.28%)
Pixiedust: Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+1783.02%)
Playaccesslogger: Generates access logs compatible with Apache httpd (enhanced combined format)
Stars: ✭ 21 (-60.38%)
Sparkling Water: Sparkling Water provides H2O functionality inside a Spark cluster
Stars: ✭ 887 (+1573.58%)
Play Silhouette: Silhouette is an authentication library for Play Framework applications that supports several authentication methods, including OAuth1, OAuth2, OpenID, CAS, 2FA, TOTP, Credentials, Basic Authentication or custom authentication schemes.
Stars: ✭ 826 (+1458.49%)
Lagom Example: Example usage of the Lagom Framework for writing Java-based microservices
Stars: ✭ 20 (-62.26%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1760.38%)
Tedsds: Apache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-73.58%)
Delta Architecture: Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Stars: ✭ 43 (-18.87%)
Sparkling Titanic: Training models with Apache Spark and PySpark for the Titanic Kaggle competition
Stars: ✭ 12 (-77.36%)
Awesome Recommendation Engine: A tiny project that puts together the know-how I learned in the Big Data Expert course from formacionhadoop.com. The idea is to show how to work with Apache Spark Streaming, Kafka, Mongo, and Spark machine learning algorithms.
Stars: ✭ 47 (-11.32%)
Mare: MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-79.25%)
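Bigdata Interview: 🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, together with the author's own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.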
Stars: ✭ 857 (+1516.98%)
Gatk: Official code repository for GATK versions 4 and up
Stars: ✭ 1,002 (+1790.57%)
Dockerfiles: 50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+1498.11%)
Spark Nkp: Natural Korean Processor for Apache Spark
Stars: ✭ 50 (-5.66%)
Mobius: C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+1652.83%)
Sparkmagic: Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+1700%)
Kylo: Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (+1628.3%)
Data Algorithms Book: MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+1690.57%)
Spark: Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+59556.6%)
Bigdataguide: Big data learning. Learn big data from scratch, including study videos and interview materials for each stage of learning.
Stars: ✭ 817 (+1441.51%)
Spark Tda: SparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.
Stars: ✭ 45 (-15.09%)
Snappydata: Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
Stars: ✭ 995 (+1777.36%)
Play Json Extra: playframework2 JSON extra module. Provides convenience functions for defining Format, Reads, and Writes.
Stars: ✭ 20 (-62.26%)