
yuantiku / Ytk Learn

License: MIT
Ytk-learn is a distributed machine learning library that implements most of the popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Programming Languages

java

Projects that are alternatives to or similar to Ytk Learn

Xlearning Xdml
extremely distributed machine learning
Stars: ✭ 113 (-66.47%)
Mutual labels:  spark, hadoop, distributed
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+1578.34%)
Mutual labels:  spark, hadoop, distributed
Ruby Spark
Ruby wrapper for Apache Spark
Stars: ✭ 221 (-34.42%)
Mutual labels:  spark, distributed
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-94.07%)
Mutual labels:  spark, hadoop
yuzhouwan
Code Library for My Blog
Stars: ✭ 39 (-88.43%)
Mutual labels:  spark, hadoop
Ballista
Distributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+574.78%)
Mutual labels:  spark, distributed
Javaorbigdata Interview
A collection of interview knowledge points for Java developers and big data developers
Stars: ✭ 203 (-39.76%)
Mutual labels:  spark, hadoop
spark-util
low-level helpers for Apache Spark libraries and tests
Stars: ✭ 16 (-95.25%)
Mutual labels:  spark, hadoop
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (-52.23%)
Mutual labels:  spark, hadoop
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-67.06%)
Mutual labels:  spark, hadoop
BigData-News
Real-time big data system project for a news website, based on Spark 2.2
Stars: ✭ 36 (-89.32%)
Mutual labels:  spark, hadoop
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-95.85%)
Mutual labels:  spark, hadoop
Js Spark
Realtime calculation distributed system. AKA distributed lodash
Stars: ✭ 187 (-44.51%)
Mutual labels:  spark, distributed
Deeplearning4j
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…
Stars: ✭ 12,277 (+3543.03%)
Mutual labels:  spark, hadoop
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-36.2%)
Mutual labels:  spark, hadoop
Big Whale
Scheduling of offline tasks such as Spark and Flink jobs, plus monitoring of real-time tasks
Stars: ✭ 163 (-51.63%)
Mutual labels:  spark, hadoop
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-89.61%)
Mutual labels:  spark, hadoop
Elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-11.57%)
Mutual labels:  spark, hadoop
Aliyun Emapreduce Datasources
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
Stars: ✭ 132 (-60.83%)
Mutual labels:  spark, hadoop
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-55.49%)
Mutual labels:  spark, hadoop

Ytk-learn is a distributed machine learning library that implements most of the popular machine learning algorithms. It runs on a single machine, on multiple machines, and in major distributed environments (Hadoop, Spark), and it supports the major operating systems (Linux, Windows, Mac OS). Communication in distributed environments is built on ytk-mp4j, a pure-Java, MPI-like message passing interface.

Features

  • Supports most operating systems: Linux, Mac OS, Windows
  • Supports various platforms: single machine, common cluster, Hadoop, Spark
  • Supports the local file system and HDFS
  • Provides a uniform file system interface that can be adapted easily to other file systems
  • Provides user-friendly code for online prediction
  • Needs no complex installation; only Java SE Runtime Environment 8 is required

For more details, refer to features

Documents

Experiments

We compare our GBDT with XGBoost and LightGBM; see gbdt experiments for more details.

Environment Requirements

To run or develop ytk-learn, just install JRE 8 or JDK 8 and set JAVA_HOME.
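
A quick way to confirm the environment before running ytk-learn is to print the Java version and JAVA_HOME that the JVM actually sees. The following is a minimal sketch and not part of ytk-learn: the class name EnvCheck and the helper majorVersion are hypothetical, and treating any version of 8 or later as acceptable is an assumption, since the requirement above names only version 8.

```java
// Hypothetical helper, not part of ytk-learn: reports the Java version
// and JAVA_HOME visible to the current JVM.
class EnvCheck {

    // Extracts the major version from a java.version-style string,
    // e.g. "1.8.0_292" -> 8, "11.0.2" -> 11, "17" -> 17.
    public static int majorVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Java 8 and earlier report themselves as "1.x".
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        String javaHome = System.getenv("JAVA_HOME");

        System.out.println("java.version = " + version);
        System.out.println("JAVA_HOME    = " + (javaHome == null ? "(not set)" : javaHome));

        // Assumption: versions below 8 cannot run ytk-learn; newer
        // versions are not covered by the requirement stated above.
        if (majorVersion(version) < 8) {
            System.out.println("Warning: Java 8 is required.");
        }
        if (javaHome == null) {
            System.out.println("Warning: JAVA_HOME is not set.");
        }
    }
}
```

Compile it with javac EnvCheck.java and run it with java EnvCheck; both warnings should be absent on a machine set up as described above.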
