Yomguithereal / Talisman

Licence: MIT
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.

Talisman

Full documentation

Talisman is a JavaScript library collecting algorithms, functions and various building blocks for fuzzy matching, information retrieval and natural language processing.

Installation

You can install Talisman through npm:

npm install talisman

Documentation

The library's full documentation can be found here.

Bibliography

An extensive bibliography of the methods & functions implemented by the library can be found here.

Goals

  • 📦 Modular: the library is completely modular. This means that if you only need to compute a Levenshtein distance, you will only load the relevant code.
  • 💡 Straightforward & simple: just want to compute a Jaccard index? No need to instantiate a class and call two methods to pass options before finally getting the index. Just apply the jaccard function and get going.
  • 🍡 Consistent API: the library's API is fully consistent and one should not struggle to understand how to apply two different distance metrics.
  • 📯 Functional: except for cases where classes might be useful (clustering notably), Talisman only uses functions, consumes raw data and orders functions' arguments to make partial application, currying etc. as easy as possible.
  • ⚡️ Performant: the library should be as performant as possible for a high-level programming language library.
  • 🌐 Cross-platform: the library is cross-platform and can be used both with Node.js and in the browser.
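
To make the "straightforward" and "functional" goals more concrete, here is a minimal, self-contained sketch of what a plain-function Jaccard index and its partial application can look like. It is an illustration only, not Talisman's source code: the `jaccard` implementation, the `similarityTo` helper and the empty-input convention are assumptions made for this example, and the library's real import paths and signatures are described in its documentation.

```javascript
// Illustrative sketch only: NOT Talisman's implementation.
// Jaccard index = |A ∩ B| / |A ∪ B| over the distinct elements of two sequences.
// Works on any iterable (strings are compared character by character).
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);

  // Assumption of this sketch: two empty sequences count as identical.
  if (setA.size === 0 && setB.size === 0) return 1;

  let intersection = 0;
  for (const item of setA) {
    if (setB.has(item)) intersection++;
  }

  const union = setA.size + setB.size - intersection;
  return intersection / union;
}

// Hypothetical helper: fixing the first argument yields a reusable
// one-argument scorer, which is what partial application buys you.
const similarityTo = (reference) => (candidate) => jaccard(reference, candidate);

const scoreAgainstContext = similarityTo('context');
const scores = ['contact', 'content'].map(scoreAgainstContext);

console.log(jaccard('context', 'contact'), scores);
```

Because the function takes raw data (strings or arrays) and returns a number, it composes directly with `map`, `filter` or `sort` without any class instantiation or configuration step.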

How to cite

Talisman has been published as a paper in the Journal of Open Source Software (JOSS).

Contribution

Contributions are of course welcome :)

Be sure to lint & pass the unit tests before submitting your pull request.

# Cloning the repo
git clone git@github.com:Yomguithereal/talisman.git
cd talisman

# Installing the deps
npm install

# Running the tests
npm test

# Linting the code
npm run lint

License

This project is available as open source under the terms of the MIT License.
