macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.

Stars: ✭ 5,590 (+2842.11%)

Mutual labels: spark

Milvus

An open-source vector database for embedding similarity search and AI applications.

Stars: ✭ 9,015 (+4644.74%)

Mutual labels: nearest-neighbor-search

Smile

Statistical Machine Intelligence & Learning Engine

Stars: ✭ 5,412 (+2748.42%)

Mutual labels: nearest-neighbor-search

Laravel Spark Google2fa

Google Authenticator support for Laravel Spark

Stars: ✭ 86 (-54.74%)

Mutual labels: spark

Mongo Spark

The MongoDB Spark Connector

Stars: ✭ 588 (+209.47%)

Mutual labels: spark

Opaque

An encrypted data analytics platform

Stars: ✭ 129 (-32.11%)

Mutual labels: spark

Sparklearning

Learning Apache Spark, including code and data. Most parts can run locally.

Stars: ✭ 558 (+193.68%)

Mutual labels: spark

Flint

Webex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)

Stars: ✭ 85 (-55.26%)

Mutual labels: spark

Spark Daria

Essential Spark extensions and helper methods ✨😲

Stars: ✭ 553 (+191.05%)

Mutual labels: spark

Sparkmonitor

Monitor Apache Spark from Jupyter Notebook

Stars: ✭ 154 (-18.95%)

Mutual labels: spark

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+2876.84%)

Mutual labels: spark

Spark States

Custom state store providers for Apache Spark

Stars: ✭ 83 (-56.32%)

Mutual labels: spark

Cdap

An open source framework for building data analytic applications.

Stars: ✭ 509 (+167.89%)

Mutual labels: spark

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+805.79%)

Mutual labels: spark

Pointblank

Data validation and organization of metadata for data frames and database tables

Stars: ✭ 480 (+152.63%)

Mutual labels: spark

Spark Dependencies

Spark job for dependency links

Stars: ✭ 82 (-56.84%)

Mutual labels: spark

Spark

Cross-platform real-time collaboration client optimized for business and organizations.

Stars: ✭ 471 (+147.89%)

Mutual labels: spark

Spark Structured Streaming Examples

Spark Structured Streaming / Kafka / Cassandra / Elastic

Stars: ✭ 168 (-11.58%)

Mutual labels: spark

Bdp Dataplatform

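Big data ecosystem solution data platform: a big data solution built on big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computing, development, and application building.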

Stars: ✭ 456 (+140%)

Mutual labels: spark

Lehar

Visualize data using relative ordering

Stars: ✭ 81 (-57.37%)

Mutual labels: spark

Bigdataie

A collection of big data blog posts, written test questions, tutorials, projects, and interview experiences

Stars: ✭ 445 (+134.21%)

Mutual labels: spark

Airflow Pipeline

An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR

Stars: ✭ 128 (-32.63%)

Mutual labels: spark

High Performance Spark Examples

Examples for High Performance Spark

Stars: ✭ 436 (+129.47%)

Mutual labels: spark

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-58.42%)

Mutual labels: spark

Dji Firmware Tools

Tools for handling firmwares of DJI products, with focus on quadcopters.

Stars: ✭ 424 (+123.16%)

Mutual labels: spark

Spark.jl

Julia binding for Apache Spark

Stars: ✭ 153 (-19.47%)

Mutual labels: spark

Featran

A Scala feature transformation library for data science and machine learning

Stars: ✭ 420 (+121.05%)

Mutual labels: spark

Home

ApacheCN open source organization: announcements, introduction, members, activities, and contact channels

Stars: ✭ 1,199 (+531.05%)

Mutual labels: spark

Listenbrainz Server

Server for the ListenBrainz project

Stars: ✭ 420 (+121.05%)

Mutual labels: spark

Spring Boot Quick

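🌿 Quick-start learning examples based on Spring Boot, integrating open source frameworks the author has encountered, such as: rabbitmq (delayed queues), Kafka, jpa, redis, oauth2, swagger, jsp, docker, spring-batch, exception handling, log output, multi-module development, multi-environment packaging, caching, web crawlers, jwt, GraphQL, dubbo, zookeeper, Async, and more 📌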

Stars: ✭ 1,819 (+857.37%)

Mutual labels: spark

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (+117.37%)

Mutual labels: spark

Gann

gann(go-approximate-nearest-neighbor) is a library for Approximate Nearest Neighbor Search written in Go

Stars: ✭ 75 (-60.53%)

Mutual labels: nearest-neighbor-search

Marmaray

Generic Data Ingestion & Dispersal Library for Hadoop

Stars: ✭ 414 (+117.89%)

Mutual labels: spark

Xsql

Unified SQL Analytics Engine Based on SparkSQL

Stars: ✭ 176 (-7.37%)

Mutual labels: spark

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+113.68%)

Mutual labels: spark

Ds Cheatsheets

List of Data Science Cheatsheets to rule the world

Stars: ✭ 9,452 (+4874.74%)

Mutual labels: spark

Big data architect skills

Skills a big data architect should master

Stars: ✭ 400 (+110.53%)

Mutual labels: spark

Lift

The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning workflows.

Stars: ✭ 127 (-33.16%)

Mutual labels: spark

Redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Stars: ✭ 20,147 (+10503.68%)

Mutual labels: spark

Apache Spark Hands On

Educational notes and hands-on problems with solutions for the Hadoop ecosystem

Stars: ✭ 74 (-61.05%)

Mutual labels: spark

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+1906.84%)

Mutual labels: spark

Spark Tsne

Distributed t-SNE via Apache Spark

Stars: ✭ 151 (-20.53%)

Mutual labels: spark

Tensorflowonspark

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

Stars: ✭ 3,748 (+1872.63%)

Mutual labels: spark

Spark Twitter Stream Example

"Sentiment analysis" on a live Twitter feed with Apache Spark and Apache Bahir

Stars: ✭ 73 (-61.58%)

Mutual labels: spark

Spark Structured Streaming Book

The Internals of Spark Structured Streaming

Stars: ✭ 371 (+95.26%)

Mutual labels: spark

Hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

Stars: ✭ 126 (-33.68%)

Mutual labels: spark

Kamu Cli

Next generation tool for decentralized exchange and transformation of semi-structured data

Stars: ✭ 69 (-63.68%)

Mutual labels: spark

Js Spark

Realtime calculation distributed system. AKA distributed lodash

Stars: ✭ 187 (-1.58%)

Mutual labels: spark

Kotlin Spark Api

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x

Stars: ✭ 183 (-3.68%)

Mutual labels: spark

Sparkstreaming

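💥 🚀 Wraps sparkstreaming to dynamically adjust the batch time (computation runs whenever data is available); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.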

Stars: ✭ 179 (-5.79%)

Mutual labels: spark

Transmogrifai

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

Stars: ✭ 2,084 (+996.84%)

Mutual labels: spark

Learningapachespark

LearningApacheSpark

Stars: ✭ 155 (-18.42%)

Mutual labels: spark

Knn Matting

Source Code for KNN Matting, CVPR 2012 / TPAMI 2013. MATLAB code ready to run. Simple and robust implementation under 40 lines.

Stars: ✭ 130 (-31.58%)

Mutual labels: nearest-neighbor-search

Spark python ml examples

Spark 2.0 Python Machine Learning examples