Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (+66.01%)

Mutual labels: spark, hadoop

Big Data Rosetta Code

Code snippets for solving common big data problems on various platforms. Inspired by Rosetta Code.

Stars: ✭ 254 (+25.12%)

Mutual labels: spark, bigdata

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+1778.33%)

Mutual labels: spark, hadoop

Ibis

A pandas-like deferred expression system, with first-class SQL support

Stars: ✭ 1,630 (+702.96%)

Mutual labels: hadoop, spark

Airflow Pipeline

An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR

Stars: ✭ 128 (-36.95%)

Mutual labels: spark, hadoop

Marmaray

Generic Data Ingestion & Dispersal Library for Hadoop

Stars: ✭ 414 (+103.94%)

Mutual labels: spark, hadoop

Alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

Stars: ✭ 5,379 (+2549.75%)

Mutual labels: spark, hadoop

Pdf

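Programming e-books and books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories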

Stars: ✭ 12,009 (+5815.76%)

Mutual labels: spark, hadoop

Kafka Storm Starter

Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.

Stars: ✭ 728 (+258.62%)

Mutual labels: spark, storm

Bdp Dataplatform

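A data platform for the big data ecosystem: a big data solution covering big data, data platforms, microservices, machine learning, an online store, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computation, development, and application building.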

Stars: ✭ 456 (+124.63%)

Mutual labels: spark, storm

Hadoop For Geoevent

ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.

Stars: ✭ 5 (-97.54%)

Mutual labels: hadoop, bigdata

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+747.78%)

Mutual labels: spark, bigdata

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-65.02%)

Mutual labels: spark, bigdata

Xlearning Xdml

extremely distributed machine learning

Stars: ✭ 113 (-44.33%)

Mutual labels: spark, hadoop

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (+708.87%)

Mutual labels: spark, hadoop

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (-34.98%)

Mutual labels: spark, hadoop

Algorithms Leetcode Javascript

Algorithms resolution in Javascript. Leetcode - Geeksforgeeks - Careercup

Stars: ✭ 157 (-22.66%)

Mutual labels: interview

Software Engineer Interview Questions

A lot of questions and links to prepare yourself for an interview.

Stars: ✭ 176 (-13.3%)

Mutual labels: interview

Geni

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (-25.12%)

Mutual labels: spark

Hadoop Common

Mirror of Apache Hadoop common

Stars: ✭ 155 (-23.65%)

Mutual labels: hadoop

Hive Jdbc Uber Jar

Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version

Stars: ✭ 188 (-7.39%)

Mutual labels: hadoop

Spark Kafka Writer

Write your Spark data to Kafka seamlessly

Stars: ✭ 175 (-13.79%)

Mutual labels: spark

Learningapachespark

LearningApacheSpark

Stars: ✭ 155 (-23.65%)

Mutual labels: spark

Interview

Interview summaries for Ant, Toutiao, and Pinduoduo, written after 2019

Stars: ✭ 155 (-23.65%)

Mutual labels: interview

Kraps Rpc

An RPC framework leveraging Spark RPC module

Stars: ✭ 175 (-13.79%)

Mutual labels: spark

Nmflibrary

MATLAB library for non-negative matrix factorization (NMF): Version 1.8.1

Stars: ✭ 153 (-24.63%)

Mutual labels: bigdata

Sparkmonitor

Monitor Apache Spark from Jupyter Notebook

Stars: ✭ 154 (-24.14%)

Mutual labels: spark

Ballista

Distributed compute platform implemented in Rust, and powered by Apache Arrow.

Stars: ✭ 2,274 (+1020.2%)

Mutual labels: spark

Js Spark

Realtime calculation distributed system. AKA distributed lodash

Stars: ✭ 187 (-7.88%)

Mutual labels: spark

Spark

Firely's open source FHIR server

Stars: ✭ 174 (-14.29%)

Mutual labels: spark

Javainterview

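The most complete collection of Java technical knowledge points, plus Java source code analysis. Doing my part to contribute to open source.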

Stars: ✭ 154 (-24.14%)

Mutual labels: bigdata

Movie recommend

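A Spark-based movie recommendation system, comprising a crawler project, a website, an admin back-end system, and the Spark recommendation system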

Stars: ✭ 2,092 (+930.54%)

Mutual labels: hadoop

Spark Nlp

State of the Art Natural Language Processing

Stars: ✭ 2,518 (+1140.39%)

Mutual labels: spark

Quill

Compile-time Language Integrated Queries for Scala

Stars: ✭ 1,998 (+884.24%)

Mutual labels: spark

Spark.jl

Julia binding for Apache Spark

Stars: ✭ 153 (-24.63%)

Mutual labels: spark

Interview Questions

List of all the Interview questions practiced from online resources and books

Stars: ✭ 187 (-7.88%)

Mutual labels: interview

Interview

Everything you need to prepare for your technical interview

Stars: ✭ 14,788 (+7184.73%)

Mutual labels: interview

61-120 of 1094 similar projects
