houchengbin / NetEmb-Datasets

Licence: MIT License
A collection of real-world networks/graphs for Network Embedding

NetEmb Datasets

A collection of real-world networks/graphs for Network Embedding

by Chengbin HOU 2018

Why this repository

For a beginner who has just entered this field, finding datasets across different websites is time-consuming, and converting their various formats into the one a tool requires can be painful. This repository provides the datasets directly in one commonly used format, the same one used by OpenANE and OpenNE. We hope this saves you time.

One commonly used format:

*--------------- Structural Info (each row) --------------------*
adjlist: node_id1 node_id2 node_id3 ... (neighbors of node_id1)
or edgelist: node_id1 node_id2 weight (weight is optional)
*--------------- Attribute Info (each row) ---------------------*
node_id1 attr1 attr2 ...
*--------------- Label Info (each row) -------------------------*
node_id1 label1 label2 ...
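
As a rough illustration of how files in this layout could be read, here is a minimal Python sketch. It is not the loader used by OpenANE or OpenNE. The function names are hypothetical, and the sketch assumes whitespace-separated rows, node ids kept as strings, numeric attribute values, and a default edge weight of 1.0 when the optional weight is missing.

```python
def _rows(lines):
    """Yield the whitespace-separated tokens of every non-blank row."""
    for line in lines:
        tokens = line.split()
        if tokens:
            yield tokens


def parse_adjlist(lines):
    """adjlist row: node_id1 node_id2 node_id3 ... (neighbors of node_id1)."""
    return {tokens[0]: tokens[1:] for tokens in _rows(lines)}


def parse_edgelist(lines, default_weight=1.0):
    """edgelist row: node_id1 node_id2 [weight]; default_weight is an assumption."""
    edges = []
    for tokens in _rows(lines):
        if len(tokens) < 2:
            raise ValueError(f"edge row needs two node ids: {tokens!r}")
        weight = float(tokens[2]) if len(tokens) > 2 else default_weight
        edges.append((tokens[0], tokens[1], weight))
    return edges


def parse_attributes(lines):
    """attribute row: node_id1 attr1 attr2 ... (values assumed numeric)."""
    return {tokens[0]: [float(x) for x in tokens[1:]] for tokens in _rows(lines)}


def parse_labels(lines):
    """label row: node_id1 label1 label2 ... (a node may carry several labels)."""
    return {tokens[0]: tokens[1:] for tokens in _rows(lines)}
```

Each function accepts any iterable of text rows, so an open file handle can be passed directly, for example `parse_edgelist(open(path))` with a path of your choosing.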

Please consider citing the following paper(s) if this repository is useful for your research.
For static networks:

@article{hou2020RoSANE,
  title={Ro{SANE}: Robust and Scalable Attributed Network Embedding for Sparse Networks},
  author={Hou, Chengbin and He, Shan and Tang, Ke},
  journal={Neurocomputing},
  year={2020},
  publisher={Elsevier},
  url={https://doi.org/10.1016/j.neucom.2020.05.080},
  doi={10.1016/j.neucom.2020.05.080},
}

For dynamic networks:

@article{hou2020glodyne,
    title={GloDyNE: Global Topology Preserving Dynamic Network Embedding},
    author={Hou, Chengbin and Zhang, Han and He, Shan and Tang, Ke},
    journal={IEEE Transactions on Knowledge and Data Engineering},
    year={2020},
    doi={10.1109/TKDE.2020.3046511}
}
@article{hou2021robust,
  title={Robust Dynamic Network Embedding via Ensembles},
  author={Hou, Chengbin and Fu, Guoji and Yang, Peng and He, Shan and Tang, Ke},
  journal={arXiv preprint arXiv:2105.14557},
  year={2021}
}

Original datasets

Due to GitHub's storage limit, we provide only the transformed files, in the format described above.
We also offer hyperlinks to the corresponding original datasets before transformation: cora, citeseer, pubmed, dblp, and mit, stanford, nyu, uIllinois.

Contact me at [email protected] if you need the Python script for this transformation or have any other questions.
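
The author's transformation script is not part of this page. Purely as an illustration of one step such a conversion might involve, the sketch below turns edge-list rows into adjacency-list rows in the format described above. The function name and the `directed` flag are hypothetical, the input is assumed to be whitespace-separated, and any edge weights are dropped because the adjlist format has no field for them.

```python
from collections import defaultdict


def edgelist_to_adjlist(edge_rows, directed=False):
    """Convert 'src dst [weight]' rows into 'node nbr1 nbr2 ...' rows."""
    neighbors = defaultdict(list)
    for row in edge_rows:
        tokens = row.split()
        if len(tokens) < 2:
            continue  # skip blank or malformed rows
        src, dst = tokens[0], tokens[1]
        # For an undirected graph, record the edge in both directions.
        pairs = [(src, dst)] if directed else [(src, dst), (dst, src)]
        for a, b in pairs:
            if b not in neighbors[a]:  # avoid duplicate neighbors
                neighbors[a].append(b)
    # Nodes appear in the order they were first seen as a row's source.
    return [" ".join([node] + nbrs) for node, nbrs in neighbors.items()]
```

With `directed=True`, a node that only ever appears as a target gets no row of its own, so add empty rows for such nodes if a downstream tool expects every node to be listed.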

Contribution

Please consider contributing if you have a dataset in the format described above. We will announce your contribution in this repository.

Useful links to network/graph datasets

http://konect.cc/
http://networkrepository.com/
https://snap.stanford.edu/data/
http://snap.stanford.edu/biodata/index.html
https://linqs.soe.ucsc.edu/data
https://aminer.org/data
http://socialcomputing.asu.edu/pages/datasets
http://cnets.indiana.edu/resources/data-repository/
https://sites.google.com/site/ucinetsoftware/datasets
http://vlado.fmf.uni-lj.si/pub/networks/data/
http://www.sociopatterns.org/datasets/
http://networksciencebook.com/translations/en/resources/data.html
