Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

yuantiku / Ytk Learn

Licence: mit

Ytk-learn is a distributed machine learning library which implements most of popular machine learning algorithms(GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Programming Languages

java

68154 projects - #9 most used programming language

Labels

machine-learning spark distributed hadoop logistic-regression factorization-machines

Projects that are alternatives of or similar to Ytk Learn

Xlearning Xdml

extremely distributed machine learning

Stars: ✭ 113 (-66.47%)

Mutual labels: spark, hadoop, distributed

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+1578.34%)

Mutual labels: spark, hadoop, distributed

Ruby Spark

Ruby wrapper for Apache Spark

Stars: ✭ 221 (-34.42%)

Mutual labels: spark, distributed

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-94.07%)

Mutual labels: spark, hadoop

yuzhouwan

Code Library for My Blog

Stars: ✭ 39 (-88.43%)

Mutual labels: spark, hadoop

Ballista

Distributed compute platform implemented in Rust, and powered by Apache Arrow.

Stars: ✭ 2,274 (+574.78%)

Mutual labels: spark, distributed

Javaorbigdata Interview

Java开发者或者大数据开发者面试知识点整理

Stars: ✭ 203 (-39.76%)

Mutual labels: spark, hadoop

spark-util

low-level helpers for Apache Spark libraries and tests

Stars: ✭ 16 (-95.25%)

Mutual labels: spark, hadoop

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (-52.23%)

Mutual labels: spark, hadoop

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-67.06%)

Mutual labels: spark, hadoop

BigData-News

基于Spark2.2新闻网大数据实时系统项目

Stars: ✭ 36 (-89.32%)

Mutual labels: spark, hadoop

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-95.85%)

Mutual labels: spark, hadoop

Js Spark

Realtime calculation distributed system. AKA distributed lodash

Stars: ✭ 187 (-44.51%)

Mutual labels: spark, distributed

Deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…

Stars: ✭ 12,277 (+3543.03%)

Mutual labels: spark, hadoop

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (-36.2%)

Mutual labels: spark, hadoop

Big Whale

Spark、Flink等离线任务的调度以及实时任务的监控

Stars: ✭ 163 (-51.63%)

Mutual labels: spark, hadoop

swordfish

Open-source distribute workflow schedule tools, also support streaming task.

Stars: ✭ 35 (-89.61%)

Mutual labels: spark, hadoop

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-11.57%)

Mutual labels: spark, hadoop

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (-60.83%)

Mutual labels: spark, hadoop

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-55.49%)

Mutual labels: spark, hadoop

View All Similar Projects ➔

Ytk-learn is a distributed machine learning library which implements most of popular machine learning algorithms. It runs on single, multiple machines and major distributed environments(hadoop, spark)，and supports major operating systems(Linux, Windows, Mac OS)，the communication of distributed environments is implemented based on ytk-mp4j which is pure java, mpi-like message passing interface.

Features

Supports most of operating systems: Linux, Mac OS, Windows
Supports various platforms: single machine, common cluster, hadoop, spark
Supports local file system and hdfs file system
Provides uniform file system interface and can be applied to other file systems easily.
Provides user friendly codes for online prediction.
Without complex installation, only needs Java SE Runtime Environment 8 installation.

For more details, refer to features

Documents

Experiments

We compare our GBDT with XGBoost and LightGBM, see gbdt experiments for more details.

Environment Requirements

To run or develop ytk-learn，just install JRE 8 or JDK 8 and set JAVA_HOME.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 337

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (1) 🔗