mast-group / convolutional-attention

Licence: BSD-3-Clause
Repository for the code of the "A Convolutional Attention Network for Extreme Summarization of Source Code" paper

Convolutional Attention Network

Code related to the paper:

@inproceedings{allamanis2016convolutional,
  title={A Convolutional Attention Network for Extreme Summarization of Source Code},
  author={Allamanis, Miltiadis and Peng, Hao and Sutton, Charles},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2016}
}

For more information and the data of the paper, see here.

The project depends on Theano and uses Python 2.7.

Usage Instructions

To train the copy_attention model on the data, run:

> python copy_conv_rec_learner.py <training_file> <max_num_epochs> <D> <test_file>

where D is the dimension of the embedding space (128 in the paper). The best model will be saved at <training_file>.pkl.

To evaluate an existing model, re-run with exactly the same parameters, except for <max_num_epochs>, which should be zero.
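The relationship between the training and evaluation invocations can be sketched with a small helper that assembles the argument list in the order given in the usage line. This is only an illustration: the helper names `build_command` and `saved_model_path` and the file names `train.json` and `test.json` are hypothetical and not part of the project.

```python
def build_command(training_file, max_num_epochs, d, test_file):
    """Assemble the arguments in the order given in the usage line."""
    return ["python", "copy_conv_rec_learner.py", training_file,
            str(max_num_epochs), str(d), test_file]


def saved_model_path(training_file):
    """The best model is saved next to the training file with a .pkl suffix."""
    return training_file + ".pkl"


# Hypothetical file names, used only for illustration.
train_cmd = build_command("train.json", 50, 128, "test.json")
# Evaluation: exactly the same parameters, but zero epochs.
eval_cmd = build_command("train.json", 0, 128, "test.json")

print(" ".join(train_cmd))
print(" ".join(eval_cmd))
print(saved_model_path("train.json"))
```

A list built this way could be passed to `subprocess.call`. The point of the sketch is that the two commands differ only in the epoch argument.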

The following code generates names from a pre-trained model and a test_file containing code examples:

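import numpy as np
# ConvolutionalCopyAttentionalRecurrentLearner must also be imported; presumably it is defined in copy_conv_rec_learner.py.

# model_fname: path to a saved model (<training_file>.pkl); test_file: path to the test data
model = ConvolutionalCopyAttentionalRecurrentLearner.load(model_fname)
test_data, original_names = model.naming_data.get_data_in_recurrent_copy_convolution_format(test_file, model.padding_size)
test_name_targets, test_code_sentences, test_code, test_target_is_unk, test_copy_vectors = test_data

idx = 2  # pick an example from test_file
res = model.predict_name(np.atleast_2d(test_code[idx]))
# original names are stored as comma-separated tokens
print "original name:", ' '.join(original_names[idx].split(','))
print "code:", ' '.join(test_code[idx])
print "generated names:"
for r, v in res:  # v is printed first, followed by the tokens in r
    print v, ' '.join(r)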
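The printing loop above can be separated from the model so that the formatting can be tried without Theano or a trained model. The sketch below assumes, based only on that loop, that `res` is a sequence of (tokens, value) pairs. The helper `format_suggestions` and the stand-in data `fake_res` are hypothetical.

```python
def format_suggestions(suggestions):
    """Return one 'value token token ...' line per (tokens, value) pair."""
    return ["%s %s" % (value, " ".join(tokens)) for tokens, value in suggestions]


# Hypothetical stand-in for the result of model.predict_name(...).
fake_res = [(["get", "name"], 0.5), (["name"], 0.25)]

lines = format_suggestions(fake_res)
for line in lines:
    print(line)
```

The helper keeps the order of the input and does not sort by value, since the ordering of the real results is not stated here.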