milangritta / WhatsMissingInGeoparsing

Licence: GPL-3.0 License
The accompanying code and data for the Springer 2017 publication "What's missing in geographical parsing?" in Language Resources and Evaluation.

Programming Languages

python

What's Missing In Geoparsing?


NEWS UPDATE 31.9.2019 - We have a LONG FOLLOW-UP PAPER OUT NOW that greatly expands on this topic. The title is "A Pragmatic Guide to Geoparsing Evaluation". It has now been published in the Springer journal Language Resources and Evaluation (LREV). For the project/paper repository, follow this link.


"Science is a wonderful thing if one does not have to earn one's living at it." -- Albert Einstein

Summary

Thanks for stopping by! In this repository, you will find the accompanying code and data for the publication "What's missing in geographical parsing?" in the journal Language Resources and Evaluation. In the unlikely case that any files are missing, please track me down and I'll upload them 👍

What's included

  1. data - The output of all systems on both datasets (2 * 5 files) plus the gold standard (2 files).
  2. WikToR(SciPaper).xml - The original WikToR dataset as described and used in the paper.
  3. lgl.xml - The LGL dataset, which is also used for evaluation.
  4. Essential experiment files (plus supporting scripts).

How to replicate

You will need some basic Python libraries to start with, such as NumPy, NLTK and Matplotlib (if you want graphics).

  • methods.py is the main Python script for running the experiments (it requires the yahoo.py script).
  • Please install GeoPy to calculate the distances between coordinates.
  • Also install Wikipedia for Python, a nice API wrapper 👍
  • Scroll down to the end of methods.py to see example usage; I included all necessary instructions and comments.
  • Enjoy!
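
To give a feel for the distance step mentioned above, here is a minimal sketch of how the error between a predicted and a gold coordinate could be measured. It is not taken from methods.py: it uses a plain haversine formula from the standard library as a stand-in for GeoPy, and the names haversine_km and error_distances, as well as the sample coordinates, are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def error_distances(predicted, gold):
    """Pairwise error in km between predicted and gold coordinates."""
    return [haversine_km(p, g) for p, g in zip(predicted, gold)]


# Hypothetical coordinates, for illustration only (not from the datasets).
predicted = [(0.0, 0.0), (10.0, 20.0)]
gold = [(0.0, 1.0), (10.0, 20.0)]

for error in error_distances(predicted, gold):
    print(f"{error:.1f} km")
```

With GeoPy installed, a call such as geopy.distance.great_circle(a, b).km should be able to replace haversine_km here; which distance function the experiments actually use is best checked in methods.py itself.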

How to (re)create and modify WikToR

The dataset (WikToR) can be created (and unit tested) from scratch, extended, or reduced, with more or fewer sentences added, etc. If you wish to do that, great! Here's what you need:

  • The wiktor.py file is the Python script used to (re)generate and unit test WikToR.
  • Download the allCountries.txt data dump from GeoNames and save it in the same directory as the script.
  • Please sign up for a GeoNames account to get a USERNAME, which you will need to fill in on line 42 to ensure the API query works.
  • The first half of wiktor.py is for CORPUS CREATION, the second half is for CORPUS TESTING.
  • Enjoy!
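
If you want to look inside the GeoNames dump before running anything, the sketch below shows one way to read it. It is not how wiktor.py does it; it only assumes the documented layout of allCountries.txt (tab-separated, no header, 19 columns with the name in column 2, latitude and longitude in columns 5 and 6, and the country code in column 9). The helper names build_gazetteer and make_row and the sample rows are invented for illustration.

```python
import csv
from collections import defaultdict

# Zero-based column positions in the GeoNames dump (tab-separated, no header).
NAME, LATITUDE, LONGITUDE, COUNTRY_CODE = 1, 4, 5, 8


def build_gazetteer(lines):
    """Map each place name to a list of (lat, lon, country code) candidates."""
    gazetteer = defaultdict(list)
    # QUOTE_NONE: quote characters inside names are treated as ordinary text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= COUNTRY_CODE:
            continue  # skip malformed or truncated lines
        gazetteer[row[NAME]].append(
            (float(row[LATITUDE]), float(row[LONGITUDE]), row[COUNTRY_CODE])
        )
    return gazetteer


def make_row(geoname_id, name, lat, lon, country):
    """Build a fake 19-column dump line (hypothetical data, not real entries)."""
    fields = [""] * 19
    fields[0] = str(geoname_id)
    fields[NAME] = name
    fields[LATITUDE] = str(lat)
    fields[LONGITUDE] = str(lon)
    fields[COUNTRY_CODE] = country
    return "\t".join(fields)


# Made-up rows standing in for the real file.
sample = [
    make_row(1, "Springfield", 10.0, 20.0, "AA"),
    make_row(2, "Springfield", -30.0, 40.0, "BB"),
    make_row(3, "Uniqueville", 5.0, 5.0, "AA"),
]

gazetteer = build_gazetteer(sample)
# Names with more than one candidate location are ambiguous.
ambiguous = {name: places for name, places in gazetteer.items() if len(places) > 1}
print(sorted(ambiguous))
```

To try it on the real dump, pass an open file instead of the sample list, for example open("allCountries.txt", encoding="utf-8", newline=""). Bear in mind that the full file is large, so holding every entry in memory may not be practical.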

"The science of today is the technology of tomorrow." -- Edward Teller
