razrLeLe / fastwalk

License: MIT
A multithreaded implementation of the node2vec random walk.

Programming Languages

C++
Makefile

fastwalk

Introduction

This repository provides a multithreaded implementation of the node2vec random walk. Alias tables are held in an LRU cache, which keeps memory usage bounded and makes it possible to walk a very large graph on a single machine.
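
For readers unfamiliar with alias tables, the sketch below shows the general technique (Vose's alias method), which allows sampling a neighbor from a weighted distribution in constant time after linear-time setup. It is an illustration only: the type name AliasTable and its layout are hypothetical and are not taken from fastwalk's source.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Alias table for O(1) sampling from a discrete distribution (Vose's method).
// Hypothetical type: the name and layout are illustrative, not fastwalk's own.
struct AliasTable {
    std::vector<double> prob;        // acceptance probability of each slot
    std::vector<std::size_t> alias;  // fallback outcome of each slot

    explicit AliasTable(const std::vector<double>& weights)
        : prob(weights.size(), 1.0), alias(weights.size(), 0) {
        const std::size_t n = weights.size();
        assert(n > 0);
        double total = 0.0;
        for (double w : weights) {
            assert(w >= 0.0);
            total += w;
        }
        assert(total > 0.0);

        // Scale the weights so that the average slot has mass exactly 1.
        std::vector<double> scaled(n);
        std::vector<std::size_t> small, large;
        for (std::size_t i = 0; i < n; ++i) {
            scaled[i] = weights[i] * n / total;
            (scaled[i] < 1.0 ? small : large).push_back(i);
        }
        // Top up each under-full slot with mass taken from an over-full one.
        while (!small.empty() && !large.empty()) {
            const std::size_t s = small.back(); small.pop_back();
            const std::size_t l = large.back(); large.pop_back();
            prob[s] = scaled[s];
            alias[s] = l;
            scaled[l] -= 1.0 - scaled[s];
            (scaled[l] < 1.0 ? small : large).push_back(l);
        }
        // Leftover slots keep prob = 1.0 and simply return themselves.
        for (std::size_t i : small) alias[i] = i;
        for (std::size_t i : large) alias[i] = i;
    }

    // Draws one outcome index: pick a slot uniformly, then flip a biased coin.
    template <class Rng>
    std::size_t sample(Rng& rng) const {
        std::uniform_int_distribution<std::size_t> slot(0, prob.size() - 1);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        const std::size_t i = slot(rng);
        return coin(rng) < prob[i] ? i : alias[i];
    }
};
```

Each table costs memory proportional to the number of neighbors it covers, which is why keeping every table resident becomes expensive on a large graph.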

Tested on a graph with 23 thousand nodes and 23 million edges, using the parameters

--walk_length=80 --num_walks=10 --workers=20 --max_nodes=50000 --max_edges=100000 --p=10 --q=0.01

the walk used only 11 GB of memory and finished within 2 hours.
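
As background on the --p and --q values: in the node2vec paper, a walk that has just moved from a previous node to the current node weights each candidate next node by its edge weight, divided by p if the candidate is the previous node, left unchanged if the candidate is also a neighbor of the previous node, and divided by q otherwise. The sketch below computes these unnormalized weights. The names transition_weights, Node, and Neighbor are hypothetical and chosen for illustration; this is not fastwalk's actual code.

```cpp
#include <cassert>
#include <unordered_set>
#include <utility>
#include <vector>

using Node = int;                          // hypothetical node id type
using Neighbor = std::pair<Node, double>;  // (neighbor id, edge weight)

// Unnormalized node2vec transition weights for a walk that just moved
// from `prev` to the current node, whose adjacency is `cur_neighbors`:
//   w / p  if the candidate is `prev` (step back),
//   w      if the candidate is also adjacent to `prev`,
//   w / q  otherwise (step further away from `prev`).
std::vector<double> transition_weights(
        Node prev,
        const std::vector<Neighbor>& cur_neighbors,
        const std::unordered_set<Node>& prev_neighbors,
        double p, double q) {
    assert(p > 0.0 && q > 0.0);
    std::vector<double> out;
    out.reserve(cur_neighbors.size());
    for (const Neighbor& nb : cur_neighbors) {
        const Node next = nb.first;
        const double w = nb.second;
        if (next == prev) {
            out.push_back(w / p);
        } else if (prev_neighbors.count(next) > 0) {
            out.push_back(w);
        } else {
            out.push_back(w / q);
        }
    }
    return out;
}
```

With p=10 and q=0.01, stepping back is weighted ten times lower and stepping away a hundred times higher than moving to a shared neighbor, so the walk is pushed strongly outward.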

Visit https://blog.razrlele.com/p/2650 for more.

Prerequisites

  • g++ 4.8+.

Usage

Prepare the input edge list in the following format, one edge per line:

node1 node2 [edge_weight]
node2 node3 [edge_weight]
...

edge_weight is optional and defaults to 1.0.
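
To make the format concrete, here is a minimal sketch of parsing one such line with whitespace as the delimiter. The Edge struct and parse_edge function are hypothetical, and treating node identifiers as opaque string tokens is an assumption; fastwalk's own reader may differ.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical record for one line of the edge list.
struct Edge {
    std::string src;
    std::string dst;
    double weight;
};

// Parses "node1 node2 [edge_weight]" separated by whitespace.
// Returns false if an endpoint is missing or the weight is not a number.
bool parse_edge(const std::string& line, Edge& edge) {
    std::istringstream in(line);
    if (!(in >> edge.src >> edge.dst)) {
        return false;  // both endpoints are required
    }
    edge.weight = 1.0;  // default when the third column is absent
    std::string token;
    if (in >> token) {
        std::size_t used = 0;
        try {
            edge.weight = std::stod(token, &used);
        } catch (const std::exception&) {
            return false;  // not numeric, or out of range
        }
        if (used != token.size()) {
            return false;  // trailing garbage such as "1.5x"
        }
    }
    return true;
}
```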

Compile:

make

Run with:

./fastwalk --edge_list <path_to_edgelist> --output <path_to_output> --delimiter space --p 10 --q 0.01 --max_nodes 50000 --max_edges 50000 --workers 10

To walk faster, add more workers. To reduce memory consumption, decrease max_nodes or max_edges. For more options, run:

./fastwalk --help
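
The memory trade-off can be pictured with a fixed-capacity LRU cache: once the cache is full, inserting a new entry evicts the least recently used one, so memory stays bounded at the cost of recomputing evicted entries. The sketch below is a generic, single-threaded illustration. The LruCache class is hypothetical, and the idea that max_nodes and max_edges act as capacities of such caches is an assumption, not something confirmed by fastwalk's source.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

// Fixed-capacity least-recently-used cache (not thread-safe).
// Hypothetical class for illustration; not fastwalk's own implementation.
template <class Key, class Value>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns a pointer to the cached value and marks it most recently used,
    // or nullptr on a miss.
    Value* get(const Key& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        items_.splice(items_.begin(), items_, it->second);  // move to front
        return &it->second->second;
    }

    // Inserts or overwrites an entry, evicting the least recently used
    // entry when the cache is already full.
    void put(const Key& key, Value value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = std::move(value);
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (index_.size() == capacity_) {
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.emplace_front(key, std::move(value));
        index_[key] = items_.begin();
    }

    std::size_t size() const { return index_.size(); }

private:
    using Item = std::pair<Key, Value>;
    std::size_t capacity_;
    std::list<Item> items_;  // front = most recently used
    std::unordered_map<Key, typename std::list<Item>::iterator> index_;
};
```

A cache shared between worker threads would additionally need locking or per-thread instances, which this sketch leaves out.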

Reference

node2vec: Scalable Feature Learning for Networks.
Aditya Grover and Jure Leskovec.
Knowledge Discovery and Data Mining, 2016.
https://arxiv.org/abs/1607.00653

Acknowledgements

I would like to thank Yuanyuan Zhu for discussions on the performance of node2vec, and Weichen Shen for his great work.
