iwpnd / flashgeotext

License: MIT
Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex.

flashgeotext 🌍

Extract and count countries and cities (plus their synonyms) from text, like GeoText on steroids, using FlashText, an Aho-Corasick implementation. Flashgeotext is a fast, batteries-included (and BYOD), native Python library that extracts one or more sets of given city and country names (plus synonyms) from an input text.

documentation: https://flashgeotext.iwpnd.pw/
introductory blogpost: https://iwpnd.pw/articles/2020-02/flashgeotext-library
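
To give a feel for how synonym lookup can work without regular expressions, here is a minimal sketch of trie-based keyword matching. It is an illustration only, not the code used by flashgeotext or FlashText: the names `build_trie`, `extract`, and `END`, and the small `lookup` dictionary, are all hypothetical. It assumes case-sensitive matching and treats any non-alphanumeric character as a word boundary.

```python
from collections import defaultdict

END = "_end_"  # sentinel key marking the end of a stored synonym (hypothetical name)


def build_trie(lookup):
    """Build a character trie that maps every synonym to its canonical name."""
    trie = {}
    for canonical, synonyms in lookup.items():
        for synonym in synonyms:
            node = trie
            for char in synonym:
                node = node.setdefault(char, {})
            node[END] = canonical
    return trie


def extract(text, trie):
    """Walk the text left to right, keeping the longest match at each word start."""
    found = defaultdict(lambda: {"count": 0, "span_info": [], "found_as": []})
    i, n = 0, len(text)
    while i < n:
        # Only start matching at the beginning of a word.
        at_word_start = text[i].isalnum() and (i == 0 or not text[i - 1].isalnum())
        match_end, canonical = None, None
        if at_word_start:
            node, j = trie, i
            while j < n and text[j] in node:
                node = node[text[j]]
                j += 1
                # Accept only matches that end on a word boundary.
                if END in node and (j == n or not text[j].isalnum()):
                    match_end, canonical = j, node[END]
        if match_end is None:
            i += 1
            continue
        entry = found[canonical]
        entry["count"] += 1
        entry["span_info"].append((i, match_end))
        entry["found_as"].append(text[i:match_end])
        i = match_end  # continue after the matched synonym
    return dict(found)


# Hypothetical lookup data: canonical name -> list of synonyms.
lookup = {
    "Shanghai": ["Shanghai"],
    "Washington, D.C.": ["Washington", "Washington, D.C."],
    "United States": ["US", "USA", "United States"],
}
trie = build_trie(lookup)
print(extract("Shanghai trades with the US. Washington welcomes it.", trie))
```

Because every synonym is stored in one shared trie, adding more names does not add more patterns to try at each position, which is the main reason this style of lookup scales better than a long regex alternation.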

Usage

from flashgeotext.geotext import GeoText

geotext = GeoText()

input_text = '''Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans
                to cut tariffs on $75 billion worth of goods that the country
                imports from the US. Washington welcomes the decision.'''

geotext.extract(input_text=input_text)
>> {
    'cities': {
        'Shanghai': {
            'count': 2,
            'span_info': [(0, 8), (45, 53)],
            'found_as': ['Shanghai', 'Shanghai'],
        },
        'Washington, D.C.': {
            'count': 1,
            'span_info': [(175, 185)],
            'found_as': ['Washington'],
        },
    },
    'countries': {
        'China': {
            'count': 1,
            'span_info': [(64, 69)],
            'found_as': ['China'],
        },
        'United States': {
            'count': 1,
            'span_info': [(171, 173)],
            'found_as': ['US'],
        },
    },
}
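
The nested result is easy to post-process with plain Python. The sketch below copies the output from the usage example into a literal (so it runs without flashgeotext installed) and flattens it into rows ranked by mention count. The helper name `flatten` is hypothetical and not part of the library.

```python
# Result dictionary copied from the usage example.
result = {
    "cities": {
        "Shanghai": {
            "count": 2,
            "span_info": [(0, 8), (45, 53)],
            "found_as": ["Shanghai", "Shanghai"],
        },
        "Washington, D.C.": {
            "count": 1,
            "span_info": [(175, 185)],
            "found_as": ["Washington"],
        },
    },
    "countries": {
        "China": {
            "count": 1,
            "span_info": [(64, 69)],
            "found_as": ["China"],
        },
        "United States": {
            "count": 1,
            "span_info": [(171, 173)],
            "found_as": ["US"],
        },
    },
}


def flatten(result):
    """Turn the nested result into (category, name, count) rows, most frequent first."""
    rows = [
        (category, name, info["count"])
        for category, places in result.items()
        for name, info in places.items()
    ]
    # Sort by descending count, then by category and name for a stable order.
    return sorted(rows, key=lambda row: (-row[2], row[0], row[1]))


for category, name, count in flatten(result):
    print(f"{category:10} {name:18} {count}")
```

Each `span_info` tuple is a `(start, end)` character offset into the input text, so the length of a span equals the length of the matching `found_as` entry.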

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Installing

pip:

pip install flashgeotext

conda:

conda install flashgeotext

for development:

git clone https://github.com/iwpnd/flashgeotext.git
cd flashgeotext/
poetry install

Running the tests

poetry run pytest . -v

Authors

  • Benjamin Ramser - Initial work - iwpnd

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

The demo data for cities comes from http://www.geonames.org and is licensed under the Creative Commons Attribution 3.0 License.
