1057 open-source projects that are alternatives to or similar to Tokenizer

Geotext
Geotext extracts country and city mentions from text
Stars: ✭ 91 (-31.06%)
Deep Nlp Seminars
Materials for deep NLP course
Stars: ✭ 113 (-14.39%)
Dat8
General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+1048.48%)
Open Semantic Entity Search Api
Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of entities like persons, organizations and places for (semi)automatic semantic tagging & analysis of documents by linked data knowledge graph like SKOS thesaurus, RDF ontology, database(s) or list(s) of names
Stars: ✭ 98 (-25.76%)
Syntok
Text tokenization and sentence segmentation (segtok v2)
Stars: ✭ 123 (-6.82%)
Mutual labels:  tokenizer
Bert As Service
Mapping a variable-length sentence to a fixed-length vector using a BERT model
Stars: ✭ 9,779 (+7308.33%)
Commonsense Rc
Code for Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Relational Knowledge for Commonsense Machine Comprehension
Stars: ✭ 112 (-15.15%)
Rasa Chatbot Templates
RASA chatbot use case boilerplate
Stars: ✭ 127 (-3.79%)
Niutrans.smt
NiuTrans.SMT is an open-source statistical machine translation system developed by a joint team from the NLP Lab at Northeastern University and the NiuTrans Team. The NiuTrans system is developed entirely in C++, so it runs fast and uses less memory. It currently supports phrase-based, hierarchical phrase-based and syntax-based (string-to-tree, tree-to-string and tree-to-tree) models for research-oriented studies.
Stars: ✭ 90 (-31.82%)
Mutual labels:  machine-translation
Nlp Papers
Papers and books to look at when starting NLP 📚
Stars: ✭ 111 (-15.91%)
Bible text gcn
PyTorch implementation of "Graph Convolutional Networks for Text Classification"
Stars: ✭ 90 (-31.82%)
Nlp Pretrained Model
A collection of Natural language processing pre-trained models.
Stars: ✭ 122 (-7.58%)
Multiffn Nli
Implementation of the multi feed-forward network architecture by Parikh et al. (2016) for Natural Language Inference.
Stars: ✭ 89 (-32.58%)
Awesome Emotion Recognition In Conversations
A comprehensive reading list for Emotion Recognition in Conversations
Stars: ✭ 111 (-15.91%)
Character Mining
Mining individual characters in multiparty dialogue
Stars: ✭ 89 (-32.58%)
Prenlp
Preprocessing Library for Natural Language Processing
Stars: ✭ 130 (-1.52%)
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-33.33%)
Xlnet extension tf
XLNet Extension in TensorFlow
Stars: ✭ 109 (-17.42%)
Spf
Cornell Semantic Parsing Framework
Stars: ✭ 87 (-34.09%)
Cs230 Code Examples
Code examples in PyTorch and TensorFlow for CS230
Stars: ✭ 1,701 (+1188.64%)
Ml
A high-level machine learning and deep learning library for the PHP language.
Stars: ✭ 1,270 (+862.12%)
The Nlp Pandect
A comprehensive reference for all topics related to Natural Language Processing
Stars: ✭ 1,349 (+921.97%)
Djurl
Simple yet helpful library for writing Django URLs in an easy, short and intuitive way.
Stars: ✭ 85 (-35.61%)
Mutual labels:  tokenizer
Neuraldialog Larl
PyTorch implementation of latent space reinforcement learning for E2E dialog published at NAACL 2019. It is released by Tiancheng Zhao (Tony) from Dialog Research Center, LTI, CMU
Stars: ✭ 127 (-3.79%)
Ofxfontstash
Easy (and fast) Unicode string rendering addon for OpenFrameworks. FontStash is made by Andreas Krinke and Mikko Mononen
Stars: ✭ 84 (-36.36%)
Mutual labels:  unicode
Papernotes
My personal notes and surveys on DL, CV and NLP papers.
Stars: ✭ 108 (-18.18%)
Scanrefer
[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
Stars: ✭ 84 (-36.36%)
Ratel
RAT-el is an open source penetration test tool that allows you to take control of a Windows machine. It works on the client-server model: the server sends commands, and the client executes them and sends the results back to the server. The client is completely undetectable by anti-virus software.
Stars: ✭ 121 (-8.33%)
Mutual labels:  unicode
U2c
Unicode To Chinese (U2C): a Burp Suite extender that converts Unicode-encoded text to Chinese
Stars: ✭ 83 (-37.12%)
Mutual labels:  unicode
Transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+42128.79%)
Sentence Splitter
Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.
Stars: ✭ 82 (-37.88%)
Mutual labels:  tokenizer
Persian Stopwords
Persian (Farsi) Stop Words List
Stars: ✭ 131 (-0.76%)
Unicode
Unicode normalization library. (Mirror of Yoshida-san's code base to maintain the RubyGem.)
Stars: ✭ 81 (-38.64%)
Mutual labels:  unicode
Nltk
NLTK Source
Stars: ✭ 10,309 (+7709.85%)
Lehar
Visualize data using relative ordering
Stars: ✭ 81 (-38.64%)
Mutual labels:  unicode
Nlpcc Wordseg Weibo
NLPCC 2016 Weibo word segmentation evaluation project
Stars: ✭ 120 (-9.09%)
Mimic Code
MIMIC Code Repository: Code shared by the research community for the MIMIC-III database
Stars: ✭ 1,225 (+828.03%)
Mutual labels:  icu
Chatbot
Russian-language chatbot
Stars: ✭ 106 (-19.7%)
Tensorflow 1.4 Billion Password Analysis
Deep Learning model to analyze a large corpus of clear text passwords.
Stars: ✭ 1,720 (+1203.03%)
Gtos
Code for AAAI2020 paper "Graph Transformer for Graph-to-Sequence Learning"
Stars: ✭ 129 (-2.27%)
Mutual labels:  machine-translation
Spacy Dev Resources
💫 Scripts, tools and resources for developing spaCy
Stars: ✭ 123 (-6.82%)
Cogcomp Nlpy
CogComp's light-weight Python NLP annotators
Stars: ✭ 115 (-12.88%)
Jupyterlab Prodigy
🧬 A JupyterLab extension for annotating data with Prodigy
Stars: ✭ 97 (-26.52%)
Cluedatasetsearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
Stars: ✭ 2,112 (+1500%)
Mutual labels:  machine-translation
Transformers without tears
Transformers without Tears: Improving the Normalization of Self-Attention
Stars: ✭ 80 (-39.39%)
Mutual labels:  machine-translation
Ios ml
List of Machine Learning, AI, NLP solutions for iOS. The most recent version of this article can be found on my blog.
Stars: ✭ 1,409 (+967.42%)
Ucdn
Unicode Database and Normalization
Stars: ✭ 78 (-40.91%)
Mutual labels:  unicode
Discobert
Code for paper "Discourse-Aware Neural Extractive Text Summarization" (ACL20)
Stars: ✭ 120 (-9.09%)
Text Dependency Parser
🏄 Dependency parsing, NLP, natural language processing
Stars: ✭ 78 (-40.91%)
Textaugmentation Gpt2
Fine-tuned pre-trained GPT2 for custom topic-specific text generation. Such a system can be used for text augmentation.
Stars: ✭ 104 (-21.21%)
Multimodal Toolkit
Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data
Stars: ✭ 78 (-40.91%)
Textacy
NLP, before and after spaCy
Stars: ✭ 1,849 (+1300.76%)
Abigsurvey
A collection of 500+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML)
Stars: ✭ 1,203 (+811.36%)
Magnitude
A fast, efficient universal vector embedding utility package.
Stars: ✭ 1,394 (+956.06%)
Monkeylearn Ruby
Official Ruby client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Ruby apps.
Stars: ✭ 76 (-42.42%)
Tokenizer
Source code tokenizer
Stars: ✭ 119 (-9.85%)
Mutual labels:  tokenizer
Bond
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Stars: ✭ 96 (-27.27%)
Chaos
Proof of concept, general purpose pastejacker for GNU/Linux
Stars: ✭ 115 (-12.88%)
Mutual labels:  unicode
Botfuel Dialog
Botfuel SDK to build highly conversational chatbots
Stars: ✭ 96 (-27.27%)
Sentence Similarity
PyTorch implementations of various deep learning models for paraphrase detection, semantic similarity, and textual entailment
Stars: ✭ 96 (-27.27%)