
malllabiisc / NeuralDater

License: Apache-2.0
ACL 2018: Dating Documents using Graph Convolution Networks


Projects that are alternatives to or similar to NeuralDater

DSTGCN
Code for Deep Spatio-Temporal Graph Convolutional Network for Traffic Accident Prediction
Stars: ✭ 37 (-38.33%)
Mutual labels:  graph-convolutional-networks
Spatio-Temporal-papers
A collection of recent research on new infrastructure and urban computing, including white papers, academic papers, AI labs, datasets, etc.
Stars: ✭ 180 (+200%)
Mutual labels:  graph-convolutional-networks
ets
Command output timestamper
Stars: ✭ 71 (+18.33%)
Mutual labels:  timestamping
awesome-efficient-gnn
Code and resources on scalable and efficient Graph Neural Networks
Stars: ✭ 498 (+730%)
Mutual labels:  graph-convolutional-networks
GNN-Recommender-Systems
An index of recommendation algorithms that are based on Graph Neural Networks.
Stars: ✭ 505 (+741.67%)
Mutual labels:  graph-convolutional-networks
graphml-tutorials
Tutorials for Machine Learning on Graphs
Stars: ✭ 125 (+108.33%)
Mutual labels:  graph-convolutional-networks
AliNet
Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation, AAAI 2020
Stars: ✭ 89 (+48.33%)
Mutual labels:  graph-convolutional-networks
Extremely-Fine-Grained-Entity-Typing
PyTorch implementation of our paper "Imposing Label-Relational Inductive Bias for Extremely Fine-Grained Entity Typing" (NAACL19)
Stars: ✭ 89 (+48.33%)
Mutual labels:  graph-convolutional-networks
chainer-graph-cnn
Chainer implementation of 'Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering' (https://arxiv.org/abs/1606.09375)
Stars: ✭ 67 (+11.67%)
Mutual labels:  graph-convolutional-networks
kglib
TypeDB-ML is the Machine Learning integrations library for TypeDB
Stars: ✭ 523 (+771.67%)
Mutual labels:  graph-convolutional-networks
text gcn tutorial
A tutorial & minimal example (8min on CPU) for Graph Convolutional Networks for Text Classification. AAAI 2019
Stars: ✭ 23 (-61.67%)
Mutual labels:  graph-convolutional-networks
SimP-GCN
Implementation of the WSDM 2021 paper "Node Similarity Preserving Graph Convolutional Networks"
Stars: ✭ 43 (-28.33%)
Mutual labels:  graph-convolutional-networks
resolutions-2019
A list of data mining and machine learning papers that I implemented in 2019.
Stars: ✭ 19 (-68.33%)
Mutual labels:  graph-convolutional-networks
SelfGNN
A PyTorch implementation of "SelfGNN: Self-supervised Graph Neural Networks without explicit negative sampling" paper, which appeared in The International Workshop on Self-Supervised Learning for the Web (SSL'21) @ the Web Conference 2021 (WWW'21).
Stars: ✭ 24 (-60%)
Mutual labels:  graph-convolutional-networks
Representation Learning on Graphs with Jumping Knowledge Networks
Representation Learning on Graphs with Jumping Knowledge Networks
Stars: ✭ 31 (-48.33%)
Mutual labels:  graph-convolutional-networks
L2-GCN
[CVPR 2020] L2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks
Stars: ✭ 26 (-56.67%)
Mutual labels:  graph-convolutional-networks
PaiConvMesh
Official repository for the paper "Learning Local Neighboring Structure for Robust 3D Shape Representation"
Stars: ✭ 19 (-68.33%)
Mutual labels:  graph-convolutional-networks
STEP
Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits
Stars: ✭ 39 (-35%)
Mutual labels:  graph-convolutional-networks
pb-gcn
Code for the BMVC paper (http://bmvc2018.org/contents/papers/1003.pdf)
Stars: ✭ 32 (-46.67%)
Mutual labels:  graph-convolutional-networks
TextCategorization
⚡ Using deep learning (MLP, CNN, Graph CNN) to classify text in TensorFlow.
Stars: ✭ 30 (-50%)
Mutual labels:  graph-convolutional-networks

Dating Documents using Graph Convolution Networks

Conference Paper Slides Poster

Source code and dataset for ACL 2018 paper: Document Dating using Graph Convolution Networks.

Overview of NeuralDater (proposed method). NeuralDater exploits the syntactic and temporal structure in a document to learn effective representations, which are in turn used to predict the document's timestamp. NeuralDater uses a Bi-directional LSTM (Bi-LSTM) and two Graph Convolution Networks (GCNs) – one over the dependency tree and the other over the document's temporal graph – along with a softmax classifier, all trained jointly end-to-end. Please refer to the paper for more details.
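The core building block mentioned above, a graph-convolution step, can be sketched as H' = f(Â·H·W), where Â is the adjacency matrix with self-loops. The following is a minimal NumPy illustration of that propagation rule only; the graph, feature sizes, and weights are toy values, not the paper's actual S-GCN/T-GCN configuration.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: add self-loops, normalize by degree, apply ReLU."""
    A_hat = A + np.eye(A.shape[0])             # adjacency with self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # inverse degree matrix
    return np.maximum(0.0, D_inv @ A_hat @ H @ W)

# Toy graph: 3 nodes in a chain 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.rand(3, 4)   # node features (conceptually, Bi-LSTM outputs)
W = np.random.rand(4, 8)   # learnable layer weights
H_next = gcn_layer(A, H, W)
print(H_next.shape)        # (3, 8)
```

Each node's new representation aggregates its neighbors' features, which is how the dependency-tree and temporal-graph structure flows into the classifier.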

Dependencies

  • Compatible with TensorFlow 1.x and Python 3.x.
  • Dependencies can be installed using requirements.txt.

Dataset:

  • Download the processed version (includes dependency and temporal graphs of each document) of the NYT and APW datasets.

  • Unzip the .pkl file into the data directory.

  • The documents are originally taken from the NYT and APW sections of the Gigaword Corpus, 5th edition.

  • The structure of the processed input data is as follows.

    {
        "voc2id":   {"w1": 0, "w2": 1, ...},
        "et2id":    {"NONE":0, "INCLUDES": 1, "BEFORE":2, "IS_INCLUDED":3 ...},
        "de2id":	{"subj":0, "obj":1, "conj":3 ...},
        "train":    {
          "X":        [[s1_w1, s1_w2, ...], [s2_w1, s2_w2, ...], ...],
          "Y":        [s1_time_stamp, s2_time_stamp, s3_time_stamp, ...],
          "DepEdges": [[s1_dep_edges], [s2_dep_edges] ...],
          "ETEdges":  [[s1_et_edges], [s2_et_edges], ...],
          "ETIdx":    [[s1_et1, s1_et2, ...], [s2_et1, s2_et2, ...], ...],
          "ET":       [[s1_et1_type, s1_et2_type, ...], [s2_et1_type, s2_et2_type, ...], ...],
        },
        "test": {same as "train"},
        "valid": {same as "train"}
    }
    • voc2id is the mapping of words to their unique identifiers.
    • et2id is the mapping of temporal graph edge types to their unique identifiers.
    • de2id is the mapping of dependency graph edge types to their unique identifiers.
    • Each entry of train, test and valid is a bag of sentences, where
      • X is the list of sentences, each represented as a list of word indices.
      • Y is the timestamp associated with each sentence.
      • DepEdges is the edge list of the dependency parse for each sentence (required for S-GCN).
      • ETEdges is the edge list of the temporal graph for each sentence (required for T-GCN).
      • ETIdx gives the position indices of events/time expressions in each sentence.
      • ET gives the type of each word in a sentence: 0 denotes a normal word, 1 an event, and 2 a time expression.
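To make the schema above concrete, here is a hypothetical miniature of the processed-data dictionary. All values (the tiny vocabulary, the 1945 timestamp, the (src, dst, edge_type) tuple layout for edges) are illustrative assumptions, not taken from the real .pkl files, which are far larger.

```python
# Toy stand-in for the processed-data dictionary; field names follow the
# schema in the README, but every value here is invented for illustration.
sample = {
    "voc2id": {"the": 0, "war": 1, "ended": 2},
    "et2id":  {"NONE": 0, "INCLUDES": 1, "BEFORE": 2, "IS_INCLUDED": 3},
    "de2id":  {"subj": 0, "obj": 1},
    "train": {
        "X": [[0, 1, 2]],           # one sentence: "the war ended"
        "Y": [1945],                # timestamp label for that sentence
        "DepEdges": [[(1, 2, 0)]],  # assumed layout: (src, dst, de2id type)
        "ETEdges":  [[(0, 1, 2)]],  # assumed layout: (src, dst, et2id type)
        "ETIdx":    [[2]],          # position of the event token "ended"
        "ET":       [[0, 0, 1]],    # 0 = normal word, 1 = event, 2 = time expr.
    },
}

# Recover the sentence text from the word indices via the inverted vocabulary.
id2voc = {v: k for k, v in sample["voc2id"].items()}
print(" ".join(id2voc[i] for i in sample["train"]["X"][0]))  # the war ended
```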

Preprocessing:

To generate the temporal graph for new documents, follow these steps:

  • Setup CAEVO and CATENA as explained in their respective repositories.

  • To extract the event and time mentions of a document:

    • ./runcaevoraw.sh <path_of_document>

    • The above command generates an .xml file. CATENA uses this file to extract the temporal graph; it also contains the document's dependency parse information, which can be extracted using the following command:

      python preprocess/read_caevo_out.py <caevo_out_path> <destination_path>
  • To make the generated .xml file compatible with CATENA's input format, use the following script:

    python preprocess/make_catena_input.py <caevo_out_path> <destination_path>
  • The .xml file generated above is given as input to CATENA to obtain the temporal graph of the document.

     java -Xmx6G -jar ./target/CATENA-1.0.3.jar -i <path_to_xml> \
     	--tlinks ./data/TempEval3.TLINK.txt \
     	--clinks ./data/Causal-TimeBank.CLINK.txt \
     	-l ./models/CoNLL2009-ST-English-ALL.anna-3.3.lemmatizer.model \
     	-g ./models/CoNLL2009-ST-English-ALL.anna-3.3.postagger.model \
     	-p ./models/CoNLL2009-ST-English-ALL.anna-3.3.parser.model \
     	-x ./tools/TextPro2.0/ -d ./models/catena-event-dct.model \
     	-t ./models/catena-event-timex.model \
     	-e ./models/catena-event-event.model \
     	-c ./models/catena-causal-event-event.model > <destination_path>

    The above command outputs the list of links in the temporal graph, which are given as input to NeuralDater. The output file can be read using the following command:

    python preprocess/read_catena_out.py <catena_out_path> <destination_path>
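Conceptually, the last step above turns CATENA's link list into the et2id-indexed edge list that NeuralDater consumes. The sketch below shows that conversion for a made-up tab-separated "source<TAB>target<TAB>RELATION" format; the real output format is handled by the repo's preprocess/read_catena_out.py, and this parser is only an illustration of the mapping.

```python
# Map relation names to et2id indices (subset of the labels shown in the
# data-format section above).
et2id = {"NONE": 0, "INCLUDES": 1, "BEFORE": 2, "IS_INCLUDED": 3}

def parse_links(lines, et2id):
    """Convert hypothetical 'src<TAB>dst<TAB>RELATION' lines into
    (src, dst, edge_type_id) tuples, skipping unknown relations."""
    edges = []
    for line in lines:
        src, dst, rel = line.strip().split("\t")
        if rel in et2id:
            edges.append((src, dst, et2id[rel]))
    return edges

sample_output = ["e1\tt1\tBEFORE", "e2\te1\tINCLUDES"]
print(parse_links(sample_output, et2id))
# [('e1', 't1', 2), ('e2', 'e1', 1)]
```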

Usage:

  • After installing the Python dependencies from requirements.txt, execute sh setup.sh to download the GloVe embeddings.

  • neural_dater.py contains TensorFlow (1.x) based implementation of NeuralDater (proposed method).

  • To start training:

    python neural_dater.py -data data/nyt_processed_data.pkl -class 10 -name test_run
    • -class denotes the number of classes in the dataset: 10 for NYT and 16 for APW.
    • -name is an arbitrary name for the run.
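For reference, the command-line interface above can be mirrored with a small argparse sketch. This is only an illustration of the documented flags (-data, -class, -name), not the repo's actual argument handling in neural_dater.py; the dest name num_classes is our own choice to avoid shadowing the Python keyword class.

```python
import argparse

def build_parser():
    """Sketch of a parser matching the documented NeuralDater flags."""
    p = argparse.ArgumentParser(description="NeuralDater training (sketch)")
    p.add_argument("-data", required=True,
                   help="path to the processed .pkl data file")
    p.add_argument("-class", dest="num_classes", type=int, required=True,
                   help="number of timestamp classes (10 for NYT, 16 for APW)")
    p.add_argument("-name", required=True,
                   help="arbitrary name for this run")
    return p

# Parse the example invocation from the README.
args = build_parser().parse_args(
    ["-data", "data/nyt_processed_data.pkl", "-class", "10", "-name", "test_run"])
print(args.data, args.num_classes, args.name)
```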

Citing:

Please cite the following paper if you use this code in your work.

@InProceedings{neuraldater2018,
  author = "Vashishth, Shikhar and Dasgupta, Shib Sankar and Ray, Swayambhu Nath and Talukdar, Partha",
  title = "Dating Documents using Graph Convolution Networks",
  booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
  year = "2018",
  publisher = "Association for Computational Linguistics",
  pages = "1605--1615",
  location = "Melbourne, Australia",
  url = "http://aclweb.org/anthology/P18-1149"
}