davidjurgens / citation-function

Licence: other
Measuring the Evolution of a Scientific Field through Citation Frames

Overview

This repository contains the code and resources for the paper Measuring the Evolution of a Scientific Field through Citation Frames, by David Jurgens, Srijan Kumar, Raine Hoover, Dan McFarland, and Dan Jurafsky, published in Transactions of the Association for Computational Linguistics (TACL), 2018.

For full details, see the project website, which links to all the data.

Requirements

This project uses Python 2 and requires the following packages:

pycorenlp==0.2.0
fuzzywuzzy
joblib
scikit-learn (imported as sklearn)
ftfy==4.4.3
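
Before running anything, it can help to confirm that the packages above are importable in the active environment. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` and the `REQUIRED_MODULES` list are illustrative, and the list assumes the usual import names for the packages above. It only checks that each module imports; it does not check the pinned versions.

```python
from __future__ import print_function

import importlib

# Import names assumed for the packages listed above
# (scikit-learn is imported as `sklearn`).
REQUIRED_MODULES = ["pycorenlp", "fuzzywuzzy", "joblib", "sklearn", "ftfy"]


def missing_modules(names):
    """Return the entries of `names` that cannot be imported, in order."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when every package is installed; any names it returns still need to be installed.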

The program was developed using Stanford CoreNLP 3.6.0. Later versions may work but have not been tested.

If you're running the preprocessing steps from scratch, you'll need the Stanford CoreNLP server running on port 8999. See https://stanfordnlp.github.io/CoreNLP/corenlp-server.html for instructions.
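
A quick way to confirm the server is up before starting the preprocessing is to probe the port and, if it answers, send one test sentence. This is a hedged sketch rather than the project's own code: the function names, the `localhost` host, and the choice of annotators are assumptions for illustration; only port 8999 and the use of pycorenlp come from this README.

```python
from __future__ import print_function

import socket


def corenlp_server_reachable(host="localhost", port=8999, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Connection refused, timed out, or host not resolvable.
        return False
    conn.close()
    return True


def annotate_sample(text, url="http://localhost:8999"):
    """Send `text` to the server and return the parsed JSON response.

    The annotator list here is an illustrative assumption, not the
    configuration used by the pipeline scripts.
    """
    # Imported lazily so the port check above works without pycorenlp.
    from pycorenlp import StanfordCoreNLP

    nlp = StanfordCoreNLP(url)
    return nlp.annotate(
        text,
        properties={"annotators": "tokenize,ssplit,pos", "outputFormat": "json"},
    )
```

If `corenlp_server_reachable()` returns False, start the server as described at the link above before running the preprocessing scripts; once it returns True, `annotate_sample("This is a test.")` should return a dictionary with a `sentences` key.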

Getting things running

The project is a series of scripts that convert and classify the ACL Anthology. They are all listed, in order, in code/run-pipeline.sh, which should let you replicate the full set of experiments. Approximate versions of the code that generates the figures and tables for each part of the paper are in the Jupyter notebooks in analysis/.

Contact

For general questions, contact the first author. For code issues, please file an issue and we'll debug it from there.

Citing

  @article{jurgens2018citation,
           title={Measuring the Evolution of a Scientific Field through Citation Frames},
           author={Jurgens, David and Kumar, Srijan and Hoover, Raine and McFarland, Dan and Jurafsky, Dan},
           journal={Transactions of the Association for Computational Linguistics},
           year={2018}
  }