1. Distributed Graph Analytics: Distributed Graph Analytics (DGA) is a compendium of graph analytics written for Bulk-Synchronous-Parallel (BSP) processing frameworks such as Giraph and GraphX. The analytics included are High Betweenness Set Extraction, Weakly Connected Components, PageRank, Leaf Compression, and Louvain Modularity.
2. Correlation Approximation: A Spark implementation of the Google Correlate algorithm for quickly finding highly correlated vectors in huge datasets.
5. graphene
6. zephyr: Zephyr is a big data, platform-agnostic ETL API, with Hadoop MapReduce, Storm, and other big data bindings.
7. watchman: An open-source social-media event-detection system.