saurfang / Spark Tsne

Licence: apache-2.0
Distributed t-SNE via Apache Spark

Programming Languages

scala

spark-tsne

Join the chat at https://gitter.im/saurfang/spark-tsne

Distributed t-SNE with Apache Spark. Work in progress.

t-SNE is a dimensionality reduction technique that is particularly good for visualizing high-dimensional data. This project is an attempt to implement the algorithm on Spark so that it can take advantage of distributed computing.
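To make the idea concrete, here is a minimal sketch of the two quantities at the core of t-SNE: the low-dimensional similarities computed with a Student-t kernel, and the KL divergence between the high- and low-dimensional similarity matrices that the algorithm minimizes. It is plain Scala without Spark and is not taken from this project's code; the names `TsneSketch`, `sqDist`, `lowDimAffinities` and `klDivergence` are made up for illustration.

```scala
// Illustrative sketch only: plain Scala, no Spark, not the spark-tsne implementation.
object TsneSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Low-dimensional similarities q_ij: Student-t kernel with one degree of
  // freedom, normalised over all pairs i != j so that the entries sum to 1.
  def lowDimAffinities(y: Array[Point]): Array[Array[Double]] = {
    val n = y.length
    val num = Array.tabulate(n, n) { (i, j) =>
      if (i == j) 0.0 else 1.0 / (1.0 + sqDist(y(i), y(j)))
    }
    val z = num.map(_.sum).sum
    num.map(_.map(_ / z))
  }

  // KL(P || Q), the cost that t-SNE minimises; zero entries of P are skipped.
  def klDivergence(p: Array[Array[Double]], q: Array[Array[Double]]): Double = {
    val terms = for {
      i <- p.indices
      j <- p(i).indices
      if p(i)(j) > 0.0
    } yield p(i)(j) * math.log(p(i)(j) / q(i)(j))
    terms.sum
  }

  def main(args: Array[String]): Unit = {
    // Three hypothetical 2-D embedding points.
    val y = Array(Array(0.0, 0.0), Array(1.0, 0.0), Array(0.0, 2.0))
    val q = lowDimAffinities(y)
    println(f"sum of q = ${q.map(_.sum).sum}%.6f")
    println(f"KL(Q || Q) = ${klDivergence(q, q)}%.6f")
  }
}
```

In a distributed setting the pairwise terms would presumably be computed per partition and then aggregated, but how this project does that is not described here.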

The project is still working toward replicating the reference implementations from the original papers. Spark-specific optimizations will be the next goal once correctness is verified.

Currently I'm showcasing this using the standard MNIST handwritten digit dataset. I have created a WebGL player (built using pixi.js) to visualize the inner workings as well as the final results of t-SNE. If WebGL is unavailable for you, you may check out the d3.js player instead.
