399 open source projects that are alternatives to or similar to spark-http-stream

This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.

Stars: ✭ 123 (+623.53%)

Mutual labels: spark

Hail

Scalable genomic data analysis.

Stars: ✭ 706 (+4052.94%)

Mutual labels: spark

yuzhouwan

Code Library for My Blog

Stars: ✭ 39 (+129.41%)

Mutual labels: spark

Scriptis

Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.

Stars: ✭ 696 (+3994.12%)

Mutual labels: spark

Deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Stars: ✭ 2,020 (+11782.35%)

Mutual labels: spark

Pyspark Example Project

Example project implementing best practices for PySpark ETL jobs and applications.

Stars: ✭ 633 (+3623.53%)

Mutual labels: spark

Hydro Serving

MLOps Platform

Stars: ✭ 213 (+1152.94%)

Mutual labels: spark

Dev Setup

macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.

Stars: ✭ 5,590 (+32782.35%)

Mutual labels: spark

Eat pyspark in 10 days

pyspark🍒🥭 is delicious, just eat it!😋😋

Stars: ✭ 116 (+582.35%)

Mutual labels: spark

Datafusion

DataFusion has now been donated to the Apache Arrow project

Stars: ✭ 611 (+3494.12%)

Mutual labels: spark

Spotify-Song-Recommendation-ML

UC Berkeley team's submission for RecSys Challenge 2018

Stars: ✭ 70 (+311.76%)

Mutual labels: spark

Mongo Spark

The MongoDB Spark Connector

Stars: ✭ 588 (+3358.82%)

Mutual labels: spark

Teddy

Spark Streaming monitoring platform, supporting job deployment, alerting, and automatic startup

Stars: ✭ 120 (+605.88%)

Mutual labels: spark

Sparklearning

Learning Apache Spark, including code and data. Most parts can run locally.

Stars: ✭ 558 (+3182.35%)

Mutual labels: spark

Spark Knn

k-Nearest Neighbors algorithm on Spark

Stars: ✭ 205 (+1105.88%)

Mutual labels: spark

Justenoughscalaforspark

A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.

Stars: ✭ 538 (+3064.71%)

Mutual labels: spark

Elassandra

Elassandra = Elasticsearch + Apache Cassandra

Stars: ✭ 1,610 (+9370.59%)

Mutual labels: spark

Sparta

Real Time Analytics and Data Pipelines based on Spark Streaming

Stars: ✭ 513 (+2917.65%)

Mutual labels: spark

spark-util

low-level helpers for Apache Spark libraries and tests

Stars: ✭ 16 (-5.88%)

Mutual labels: spark

Magellan

Geo Spatial Data Analytics on Spark

Stars: ✭ 507 (+2882.35%)

Mutual labels: spark

Spark Lucenerdd

Spark RDD with Lucene's query and entity linkage capabilities

Stars: ✭ 114 (+570.59%)

Mutual labels: spark

Pdf

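Programming e-books, e-books, and programming books, including C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithm series, computer science, design patterns, software testing, refactoring and optimization, and more categories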

Stars: ✭ 12,009 (+70541.18%)

Mutual labels: spark

Mmlspark

Simple and Distributed Machine Learning

Stars: ✭ 2,899 (+16952.94%)

Mutual labels: spark

Bdp Dataplatform

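Data platform for big data ecosystem solutions: a big data solution built on big data, data platform, microservices, machine learning, e-commerce, automated operations, DevOps, container deployment platform, data platform collection, data platform storage, data platform computation, data platform development, and data platform application building.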

Stars: ✭ 456 (+2582.35%)

Mutual labels: spark

Spark Mllib Twitter Sentiment Analysis

🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib

Stars: ✭ 113 (+564.71%)

Mutual labels: spark

Bigdataie

A collection of big data blogs, written test questions, tutorials, projects, and interview experiences

Stars: ✭ 445 (+2517.65%)

Mutual labels: spark

spark-demos

Collection of different demo applications using Apache Spark

Stars: ✭ 15 (-11.76%)

Mutual labels: spark

High Performance Spark Examples

Examples for High Performance Spark

Stars: ✭ 436 (+2464.71%)

Mutual labels: spark

Python Bigdata

Data science and Big Data with Python

Stars: ✭ 112 (+558.82%)

Mutual labels: spark

Dji Firmware Tools

Tools for handling firmwares of DJI products, with focus on quadcopters.

Stars: ✭ 424 (+2394.12%)

Mutual labels: spark

Ballista

Distributed compute platform implemented in Rust, and powered by Apache Arrow.

Stars: ✭ 2,274 (+13276.47%)

Mutual labels: spark

Featran

A Scala feature transformation library for data science and machine learning

Stars: ✭ 420 (+2370.59%)

Mutual labels: spark

Elephas

Distributed Deep learning with Keras & Spark

Stars: ✭ 1,521 (+8847.06%)

Mutual labels: spark

Listenbrainz Server

Server for the ListenBrainz project

Stars: ✭ 420 (+2370.59%)

Mutual labels: spark

data processing course

Some class materials for a data processing course using PySpark

Stars: ✭ 50 (+194.12%)

Mutual labels: spark

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (+2329.41%)

Mutual labels: spark

Waterdrop

Production Ready Data Integration Product, documentation:

Stars: ✭ 1,856 (+10817.65%)

Mutual labels: spark

Marmaray

Generic Data Ingestion & Dispersal Library for Hadoop

Stars: ✭ 414 (+2335.29%)

Mutual labels: spark

Js Spark

Realtime calculation distributed system. AKA distributed lodash

Stars: ✭ 187 (+1000%)

Mutual labels: spark

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+2288.24%)

Mutual labels: spark

Bigdataclass

Two-day workshop that covers how to use R to interact with databases and Spark

Stars: ✭ 110 (+547.06%)

Mutual labels: spark

Big data architect skills

Skills that a big data architect should master

Stars: ✭ 400 (+2252.94%)

Mutual labels: spark

Covid19Tracker

A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.

Stars: ✭ 65 (+282.35%)

Mutual labels: spark

Redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Stars: ✭ 20,147 (+118411.76%)

Mutual labels: spark

Distributed Dataset

A distributed data processing framework in Haskell.

Stars: ✭ 108 (+535.29%)

Mutual labels: spark

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+22329.41%)

Mutual labels: spark

Kotlin Spark Api

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x.

Stars: ✭ 183 (+976.47%)

Mutual labels: spark

Wedatasphere

WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (+2088.24%)

Mutual labels: spark

Hnswlib

Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs

Stars: ✭ 108 (+535.29%)

Mutual labels: spark

Sparkmeasure

This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.

Stars: ✭ 368 (+2064.71%)

Mutual labels: spark

spark-druid-olap

Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Stars: ✭ 286 (+1582.35%)

Mutual labels: spark

Spark As Service Using Embedded Server

This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server

Stars: ✭ 46 (+170.59%)

Mutual labels: spark

Seldon Server

Machine Learning Platform and Recommendation Engine built on Kubernetes

Stars: ✭ 1,435 (+8341.18%)

Mutual labels: spark

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser

Stars: ✭ 25 (+47.06%)

Mutual labels: spark

dllib

dllib is a distributed deep learning library running on Apache Spark

Stars: ✭ 32 (+88.24%)

Mutual labels: spark

prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Stars: ✭ 54 (+217.65%)

Mutual labels: spark

spark-extension

A library that provides useful extensions to Apache Spark and PySpark.

Stars: ✭ 25 (+47.06%)

Mutual labels: spark

Python Master Courses

Life is short, I use Python

Stars: ✭ 61 (+258.82%)

Mutual labels: spark

Dpark

Python clone of Spark, a MapReduce-like framework in Python