
Novetta / CLAVIN-NERD

License: GPL-3.0
Stanford NLP Implementation of the CLAVIN LocationTagger



CLAVIN-NERD


CLAVIN-NERD is a GPL-licensed "wrapper project" that connects the Apache-licensed CLAVIN geoparser with the GPL-licensed Stanford CoreNLP NER entity extractor.

Using CLAVIN with Stanford NER (i.e., the CLAVIN-NERD distribution) results in significantly higher accuracy than with the default Apache OpenNLP NameFinder entity extractor. We recommend using CLAVIN-NERD or Novetta's AdaptNLP over OpenNLP. Stanford NER is not included in the standard CLAVIN release because Stanford NER is GPL-licensed and we are committed to distributing CLAVIN itself via the Apache License. Thus, the GPL-licensed CLAVIN-NERD distribution makes CLAVIN available for use with Stanford NER while preserving the freedom of the core CLAVIN source code under the terms of the Apache License.

Novetta also maintains the CLAVIN-Rest project, which provides a RESTful microservice wrapper around CLAVIN or CLAVIN-NERD. To use CLAVIN-NERD with CLAVIN-Rest, you only need to edit the CLAVIN-Rest POM. CLAVIN-Rest is configured to build and run this package as a Docker image, and provides instructions for doing so.

Breaking changes

This release includes breaking changes: all namespaces have been renamed from com.bericotech to com.novetta, reflecting a change in corporate ownership and a realignment to our new domain.
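
If you are upgrading existing code, the rename mostly comes down to rewriting package prefixes in import statements and fully qualified class names. The following is a minimal sketch of that rewrite, not part of CLAVIN-NERD itself; the class name NamespaceMigration and the method migrate are hypothetical, and the sketch assumes the only change needed is the com.bericotech to com.novetta prefix described above.

```java
import java.util.regex.Pattern;

public class NamespaceMigration {
    // Matches the old prefix only at a package boundary, so longer
    // identifiers such as "com.bericotechnology" are left untouched.
    private static final Pattern OLD_PREFIX =
            Pattern.compile("(?<![\\w.])com\\.bericotech(?=[.;\\s]|$)");

    /** Returns the source text with com.bericotech rewritten to com.novetta. */
    public static String migrate(String source) {
        return OLD_PREFIX.matcher(source).replaceAll("com.novetta");
    }

    public static void main(String[] args) {
        // Hypothetical input line; apply migrate() to each source file's text.
        String before = "import com.bericotech.clavin.GeoParser;";
        System.out.println(migrate(before));
    }
}
```

Remember that the same prefix also appears in command-line arguments such as -Dexec.mainClass, so build scripts may need the same treatment.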

How to build and use CLAVIN-NERD:

CLAVIN-NERD relies on CLAVIN to build its Lucene index. If you are new to CLAVIN, refer to its getting-started instructions before working with CLAVIN-NERD. Here are the instructions for building the index using CLAVIN-NERD:

  1. Check out a copy of the source code:
     git clone https://github.com/Novetta/CLAVIN-NERD.git
  2. Move into the newly-created CLAVIN-NERD directory:
     cd CLAVIN-NERD
  3. Download the latest version of the allCountries.zip gazetteer file from GeoNames.org:
     curl -O http://download.geonames.org/export/dump/allCountries.zip
  4. Unzip the GeoNames gazetteer file:
     unzip allCountries.zip
  5. Package the source code:
     mvn clean package
  6. Create the Lucene index (this one-time process will take several minutes):
     MAVEN_OPTS="-Xmx4g" mvn exec:java -Dexec.mainClass="com.novetta.clavin.index.IndexDirectoryBuilder"
  7. Run the example program:

Once you've used CLAVIN to build the required Lucene index with the GeoNames.org gazetteer, consult WorkflowDemoNERD.java for multiple examples of different ways to use CLAVIN-NERD. You can run the CLAVIN-NERD demo from the command line with the following command:

MAVEN_OPTS="-Xmx2g" mvn exec:java -Dexec.mainClass="com.novetta.clavin.nerd.WorkflowDemoNERD"

The main difference between using CLAVIN and CLAVIN-NERD is in the arguments passed to the GeoParserFactory class to instantiate a GeoParser object. With CLAVIN-NERD, we need to specify that we want to use the StanfordExtractor to extract location names from text.

Here's an example call to GeoParserFactory where we specify that the StanfordExtractor should be used, as seen in the WorkflowDemoNERD class:

GeoParserFactory.getDefault("./IndexDirectory", new StanfordExtractor(), 1, 1, false);

Don't forget: Loading the worldwide gazetteer uses a non-trivial amount of memory. When using CLAVIN-NERD in your own programs, if you encounter Java heap space errors, bump up the maximum heap size for your JVM. Allocating 2GB (e.g., -Xmx2g) is a good place to start.
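
To catch an undersized heap before the gazetteer is loaded, your program can inspect the JVM's configured maximum at startup. The sketch below uses only the standard Runtime API; the class name HeapCheck, the method isHeapSufficient, and the 1800 MB threshold are assumptions chosen for illustration, not part of CLAVIN-NERD. The threshold sits a little below 2 GB because Runtime.maxMemory() can report somewhat less than the -Xmx value under some garbage collectors.

```java
public class HeapCheck {
    // Assumed threshold: slightly under the suggested 2 GB (-Xmx2g), since
    // maxMemory() may report less than the configured -Xmx value.
    static final long MIN_HEAP_BYTES = 1800L * 1024 * 1024;

    /** True when the given maximum heap size meets the assumed threshold. */
    static boolean isHeapSufficient(long maxHeapBytes) {
        return maxHeapBytes >= MIN_HEAP_BYTES;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        if (!isHeapSufficient(maxHeap)) {
            System.err.printf(
                "Max heap is %d MB; consider restarting with -Xmx2g or more.%n",
                maxHeap / (1024 * 1024));
        }
    }
}
```

A warning like this is cheaper to diagnose than a Java heap space error raised partway through loading the index.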

Get it from Maven Central:

<dependency>
    <groupId>com.novetta</groupId>
    <artifactId>CLAVIN-nerd</artifactId>
    <version>3.0.0</version>
</dependency>

License:

Since the Stanford CoreNLP NER library is licensed via the GPL, CLAVIN-NERD is as well. However, CLAVIN itself remains under the Apache License, version 2.


CLAVIN-NERD Copyright (C) 2012-2020 Novetta

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
